Python: upsampling Pandas data with the right index included

Posted by jvidinwx on 2023-02-21 in Python

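I have the following data, which I want to resample (upsample) to 30-minute intervals in some cases, 15-minute intervals in others, and 5-minute intervals in others.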

TIME              VALUE
0 2023-01-02 01:00:00              94.73
1 2023-01-02 02:00:00              95.30
2 2023-01-02 03:00:00              67.16

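However, if I use the pandas .resample() method, the upsampling stops at the last index: no rows are generated after the final timestamp. Is there a way to achieve this?
What I have tried: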

>>> df.set_index('TIME').resample('30T').ffill()
                   TIME              VALUE
0  2023-01-02 01:00:00              94.73
1  2023-01-02 01:30:00              94.73
2  2023-01-02 02:00:00              95.30
3  2023-01-02 02:30:00              95.30
4  2023-01-02 03:00:00              67.16

What I want:

TIME              VALUE
0  2023-01-02 01:00:00              94.73
1  2023-01-02 01:30:00              94.73
2  2023-01-02 02:00:00              95.30
3  2023-01-02 02:30:00              95.30
4  2023-01-02 03:00:00              67.16
5  2023-01-02 03:30:00              67.16

sxpgvts3 #1

One way to do it is as follows:

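import pandas as pd

# Sample data from the question
data = {'TIME': ['2023-01-02 01:00:00', '2023-01-02 02:00:00', '2023-01-02 03:00:00'],
        'VALUE': [94.73, 95.30, 67.16]}
df = pd.DataFrame(data)

# resample() needs a DatetimeIndex, so convert the strings after setting the index
df.set_index('TIME', inplace=True)
df.index = pd.to_datetime(df.index)

# Upsample to 30 minutes and forward-fill; the result still ends at 03:00.
# '30min' is equivalent to '30T' (the 'T' alias is deprecated since pandas 2.2).
df = df.resample('30min', closed='right', label='right').ffill()

# Append one extra row 30 minutes after the last timestamp, repeating the last value
last_index = df.index[-1]
if last_index + pd.Timedelta('30 min') not in df.index:
    last_row = pd.DataFrame({'VALUE': df.iloc[-1, 0]}, index=[last_index + pd.Timedelta('30 min')])
    df = pd.concat([df, last_row])

# The appended row's index is unnamed, so the time column comes back as 'index'
df = df.reset_index()

print(df)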

This gives:

index  VALUE
0 2023-01-02 01:00:00  94.73
1 2023-01-02 01:30:00  94.73
2 2023-01-02 02:00:00  95.30
3 2023-01-02 02:30:00  95.30
4 2023-01-02 03:00:00  67.16
5 2023-01-02 03:30:00  67.16
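
The code above appends exactly one extra row, which is enough for the 30-minute case. At 15 or 5 minutes, several rows would be needed after 03:00. One possible generalization is sketched below. It is not part of the original answer. It assumes the source rows are evenly spaced, and the function name `upsample_inclusive` and its `source_period` parameter are hypothetical names introduced only for illustration. The idea is to build the target index explicitly with `pd.date_range`, ending one target step before the next source timestamp would fall, and then forward-fill with `reindex`.

```python
import pandas as pd


def upsample_inclusive(df, freq, source_period="1h", time_col="TIME"):
    """Forward-fill upsample that also covers the period after the last row.

    source_period is an assumption: the spacing of the original rows.
    """
    # Parse the time column and use it as a DatetimeIndex
    s = df.assign(**{time_col: pd.to_datetime(df[time_col])}).set_index(time_col)

    step = pd.Timedelta(freq)
    # Last generated timestamp: one target step before the next source row would start
    end = s.index[-1] + pd.Timedelta(source_period) - step
    new_index = pd.date_range(s.index[0], end, freq=step, name=time_col)

    # Each new timestamp takes the most recent source value at or before it
    return s.reindex(new_index, method="ffill").reset_index()


df = pd.DataFrame({
    "TIME": ["2023-01-02 01:00:00", "2023-01-02 02:00:00", "2023-01-02 03:00:00"],
    "VALUE": [94.73, 95.30, 67.16],
})

out30 = upsample_inclusive(df, "30min")
out15 = upsample_inclusive(df, "15min")
print(out30)
```

With the sample data, the 30-minute call should end at 03:30 and the 15-minute call at 03:45. Because the index is named before `reset_index()`, the time column keeps the name TIME rather than becoming `index`.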
