How do I add empty/dummy rows with a continuous datetime index in Pandas?

Posted by btxsgosb on 2022-11-27 in Other

This is my DataFrame:

                               consumption  hour
start_time
2022-09-30 14:00:00+02:00            199.0  14.0
2022-09-30 15:00:00+02:00            173.0  15.0
2022-09-30 16:00:00+02:00            173.0  16.0
2022-09-30 17:00:00+02:00            156.0  17.0
2022-09-30 18:00:00+02:00            142.0  18.0
2022-09-30 19:00:00+02:00            163.0  19.0
2022-09-30 20:00:00+02:00            138.0  20.0
2022-09-30 21:00:00+02:00            183.0  21.0
2022-09-30 22:00:00+02:00            138.0  22.0
2022-09-30 23:00:00+02:00            143.0  23.0

I want the output to look like this:

                               consumption  hour
start_time
2022-09-30 14:00:00+02:00            199.0  14.0
2022-09-30 15:00:00+02:00            173.0  15.0
2022-09-30 16:00:00+02:00            173.0  16.0
2022-09-30 17:00:00+02:00            156.0  17.0
2022-09-30 18:00:00+02:00            142.0  18.0
2022-09-30 19:00:00+02:00            163.0  19.0
2022-09-30 20:00:00+02:00            138.0  20.0
2022-09-30 21:00:00+02:00            183.0  21.0
2022-09-30 22:00:00+02:00            138.0  22.0
2022-09-30 23:00:00+02:00            143.0  23.0
*2022-10-01 00:00:00+02:00           00.0   00.0*
*2022-10-01 01:00:00+02:00           00.0   01.0*

My index is a datetime (start_time), and I want to append rows whose timestamps continue the hourly sequence, filled with dummy or zero values.

2izufjch  #1

Create a helper DataFrame and append it to the original with concat:

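import pandas as pd

N = 2  # number of hourly rows to append
# Helper frame: N zero-consumption rows starting one hour after the last timestamp.
# Note: pandas 2.2+ deprecates the 'H' alias in favor of 'h'.
df1 = (pd.DataFrame({'consumption': 0},
                    index=pd.date_range(df.index.max() + pd.Timedelta('1h'),
                                        df.index.max() + pd.Timedelta(f'{N}h'),
                                        freq='H'))
         .assign(hour=lambda x: x.index.hour))  # derive hour from the new index

df = pd.concat([df, df1])
print(df)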
                           consumption  hour
2022-09-30 14:00:00+02:00        199.0  14.0
2022-09-30 15:00:00+02:00        173.0  15.0
2022-09-30 16:00:00+02:00        173.0  16.0
2022-09-30 17:00:00+02:00        156.0  17.0
2022-09-30 18:00:00+02:00        142.0  18.0
2022-09-30 19:00:00+02:00        163.0  19.0
2022-09-30 20:00:00+02:00        138.0  20.0
2022-09-30 21:00:00+02:00        183.0  21.0
2022-09-30 22:00:00+02:00        138.0  22.0
2022-09-30 23:00:00+02:00        143.0  23.0
2022-10-01 00:00:00+02:00          0.0   0.0
2022-10-01 01:00:00+02:00          0.0   1.0
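
For anyone who wants to try the concat approach without the original data, here is a minimal self-contained sketch. The sample frame is rebuilt from the question; the `Europe/Berlin` time zone (to get the +02:00 offset) and the helper name `append_zero_rows` are assumptions made for illustration only. The step is passed as a `pd.Timedelta` rather than a string alias so the snippet does not depend on the 'H' versus 'h' spelling.

```python
import pandas as pd

# Sample data mirroring the question; the time zone is an assumption.
idx = pd.date_range("2022-09-30 14:00", periods=10,
                    freq=pd.Timedelta(hours=1),
                    tz="Europe/Berlin", name="start_time")
df = pd.DataFrame(
    {"consumption": [199.0, 173.0, 173.0, 156.0, 142.0,
                     163.0, 138.0, 183.0, 138.0, 143.0]},
    index=idx,
)
df["hour"] = df.index.hour.astype(float)


def append_zero_rows(frame, n, step=pd.Timedelta(hours=1)):
    """Hypothetical helper: return frame plus n trailing rows spaced by step."""
    last = frame.index.max()
    # New timestamps start one step after the last existing one.
    new_index = pd.date_range(last + step, periods=n, freq=step,
                              name=frame.index.name)
    extra = pd.DataFrame({"consumption": 0.0}, index=new_index)
    extra["hour"] = extra.index.hour.astype(float)  # hour comes from the index
    return pd.concat([frame, extra])


out = append_zero_rows(df, 2)
print(out.tail(3))
```

Because the new rows are built separately and then concatenated, the existing rows are left untouched, including any gaps they may already contain.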

Alternatively, use DataFrame.reindex with a new index that extends N hours past the last timestamp:

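N = 2  # number of hourly rows to append
# Reindex onto a complete hourly range ending N hours past the last timestamp;
# rows absent from df are filled with 0, then hour is recomputed from the index.
# Note: pandas 2.2+ deprecates the 'H' alias in favor of 'h'.
df = (df.reindex(pd.date_range(df.index.min(),
                               df.index.max() + pd.Timedelta(f'{N}h'),
                               freq='H'), fill_value=0)
        .assign(hour=lambda x: x.index.hour))

print(df)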
                           consumption  hour
2022-09-30 14:00:00+02:00        199.0    14
2022-09-30 15:00:00+02:00        173.0    15
2022-09-30 16:00:00+02:00        173.0    16
2022-09-30 17:00:00+02:00        156.0    17
2022-09-30 18:00:00+02:00        142.0    18
2022-09-30 19:00:00+02:00        163.0    19
2022-09-30 20:00:00+02:00        138.0    20
2022-09-30 21:00:00+02:00        183.0    21
2022-09-30 22:00:00+02:00        138.0    22
2022-09-30 23:00:00+02:00        143.0    23
2022-10-01 00:00:00+02:00          0.0     0
2022-10-01 01:00:00+02:00          0.0     1
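
One difference worth keeping in mind: since reindex rebuilds the whole hourly range from the first timestamp onward, any hour missing inside the existing data also becomes a zero-filled row, not only the N trailing ones. The sketch below illustrates this with made-up data in which the 15:00 row is deliberately absent; the values and the `Europe/Berlin` time zone are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data with the 15:00 row missing on purpose.
idx = pd.DatetimeIndex(
    ["2022-09-30 14:00", "2022-09-30 16:00", "2022-09-30 17:00"],
    tz="Europe/Berlin", name="start_time",
)
df = pd.DataFrame({"consumption": [199.0, 173.0, 156.0]}, index=idx)

N = 2
step = pd.Timedelta(hours=1)
# Complete hourly range from the first timestamp to N steps past the last.
full_index = pd.date_range(df.index.min(), df.index.max() + N * step,
                           freq=step, name="start_time")

# Timestamps not present in df (interior gap and trailing rows) get 0.
out = df.reindex(full_index, fill_value=0).assign(hour=lambda x: x.index.hour)
print(out)
```

If interior gaps should stay as they are, the concat variant is the safer choice. Note also that reindex requires the existing index to be free of duplicates.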
