python 填写缺失的日期,逐行逐列添加列表,并在panda数据框中填写

xe55xuns  于 2023-02-28  发布在  Python
关注(0)|答案(1)|浏览(133)

我想在下面的数据框中填充一个范围内的日期,并填充所有列。当我完成此操作时,我想在“威尔斯”列中追加列表,以便它继续按日期添加项目。
按日期展开的数据框

StartDate     Wells                     count Sum_Cumm   vol
0   1967-10-01  [MUN-523, MUN-354, MUN-2660] 50   50    8503.323620
1   1968-01-01  [MUN-152]                     1   51    8336.591784
2   1968-03-01  [MUN-1032]                    1   52    8176.272712
3   1968-10-01  [MUN-16128]                   1   53    9191.110200

我正在编写的代码

newdf = (newdf.set_index('StartDate').reindex(pd.date_range('10-01-1967', '12-31-1994', freq='MS')).rename_axis(['StartDate']).reset_index()).ffill(newdf['vol'])

我希望最终使用的数据框架

StartDate   Wells                                        count Sum_Cumm    vol
0   1967-10-01  [MUN-523, MUN-354, MUN-2660]                     50   50    8503.323620
1   1967-11-01  [MUN-523, MUN-354, MUN-2660]                     1    51    8503.323620    
2   1967-12-01  [MUN-523, MUN-354, MUN-2660]                     1    51    8503.323620
3   1968-01-01  [MUN-523, MUN-354, MUN-2660,MUN-152]             1    52    8336.591784
4   1968-02-01  [MUN-523, MUN-354, MUN-2660,MUN-152]             1    53    8336.591784
5   1968-03-01  [MUN-523, MUN-354, MUN-2660,MUN-152,MUN-1032]    1    53    8176.272712
6   1968-04-01  [MUN-523, MUN-354, MUN-2660,MUN-152,MUN-1032]    1    53    8176.272712
c90pui9n

c90pui9n1#

您可以使用period_range创建一个新的索引,并重新索引现有的df以创建一个新的 Dataframe ,然后填充。
对于Wells列,执行cumsum,然后应用np.unique

new_idx = pd.to_datetime(
    pd.period_range(df["StartDate"].min(), df["StartDate"].max(), freq="M")
    .asfreq("D", how="S")
    .strftime("%Y-%m-%d")
)
df2 = df.set_index("StartDate").reindex(new_idx)
df2 = df2.ffill(downcast="infer")
df2["Wells"] = df2["Wells"].cumsum().apply(np.unique)
df2 = df2.rename_axis("StartDate").reset_index()

print(df2)

    StartDate                                              Wells  count  \
0  1967-10-01                       [MUN-2660, MUN-354, MUN-523]     50   
1  1967-11-01                       [MUN-2660, MUN-354, MUN-523]     50   
2  1967-12-01                       [MUN-2660, MUN-354, MUN-523]     50   
3  1968-01-01              [MUN-152, MUN-2660, MUN-354, MUN-523]      1   
4  1968-02-01              [MUN-152, MUN-2660, MUN-354, MUN-523]      1   
5  1968-03-01    [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523]      1   
6  1968-04-01    [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523]      1   
7  1968-05-01    [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523]      1   
8  1968-06-01    [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523]      1   
9  1968-07-01    [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523]      1   
10 1968-08-01    [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523]      1   
11 1968-09-01    [MUN-1032, MUN-152, MUN-2660, MUN-354, MUN-523]      1   
12 1968-10-01  [MUN-1032, MUN-152, MUN-16128, MUN-2660, MUN-3...      1   

    Sum_Cumm          vol  
0         50  8503.323620  
1         50  8503.323620  
2         50  8503.323620  
3         51  8336.591784  
4         51  8336.591784  
5         52  8176.272712  
6         52  8176.272712  
7         52  8176.272712  
8         52  8176.272712  
9         52  8176.272712  
10        52  8176.272712  
11        52  8176.272712  
12        53  9191.110200

相关问题