Pandas groupby, then add a new row for each group

Posted by 6tdlim6h on 2023-04-08 in Python


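I have a pandas DataFrame that I would like to group by the "id" column, and then add an extra row at the bottom of each group, where that row's date is one business day after the date of the group's last row.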

df = pd.DataFrame(data={'d': [datetime.date(2010,12,30), datetime.date(2010,12,31), datetime.date(2010,12,30),datetime.date(2010,12,31)], 'id': [1,1,2,2], 'val': [10,200, 90,420]})

I have:

         Date  id  val
0  2010-12-30   1   10
1  2010-12-31   1  200
2  2010-12-30   2   90
3  2010-12-31   2  420

I want:

         Date  id  val
0  2010-12-30   1   10
1  2010-12-31   1  200
2  2011-01-01   1  NaN
3  2010-12-30   2   90
4  2010-12-31   2  420
5  2011-01-01   2  NaN

The solution shown here looks like it should work:
Pandas: add row to each group depending on condition
I tried to adapt it to my case, but I could not get it to work:

def add_row(x):
    from pandas.tseries.offsets import BDay

    last_row = x.iloc[-1]
    last_row['Date'] = x.Date + BDay(1)
    
    return x.append(last_row)   

df.groupby('id').apply(add_row)

AttributeError: 'DataFrame' object has no attribute 'Date'

I don't just want to fix this particular error message; I want to solve the underlying problem.

python-3.x

Source: https://stackoverflow.com/questions/75958627/pandas-groupby-and-then-add-new-row-for-each-group

1 Answer


y4ekin9u1#

I would use:

df['d'] = pd.to_datetime(df['d'])

out = pd.concat([df,
                 (df.loc[df.groupby('id')['d'].idxmax(), ['d', 'id']]
                    .assign(d=lambda x: x['d'].add(pd.DateOffset(days=1)))
                 )
                ]
                ).sort_index(kind='stable', ignore_index=True)

Note: for business days, replace pd.DateOffset(days=1) with pd.offsets.BusinessDay(1).

Output:

           d  id    val
0 2010-12-30   1   10.0
1 2010-12-31   1  200.0
2 2011-01-01   1    NaN
3 2010-12-30   2   90.0
4 2010-12-31   2  420.0
5 2011-01-01   2    NaN
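
For reference, here is a minimal sketch of the business-day variant mentioned in the note, run on the sample data from the question. The names extra and out_bday are only illustrative. Since 2010-12-31 is a Friday, the added rows should land on Monday 2011-01-03 rather than on 2011-01-01 as in the calendar-day output.

```python
import datetime

import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'd': [datetime.date(2010, 12, 30), datetime.date(2010, 12, 31),
          datetime.date(2010, 12, 30), datetime.date(2010, 12, 31)],
    'id': [1, 1, 2, 2],
    'val': [10, 200, 90, 420],
})
df['d'] = pd.to_datetime(df['d'])

# Row with the latest date in each group, keeping only the date and id columns
extra = df.loc[df.groupby('id')['d'].idxmax(), ['d', 'id']]

# Move the date forward by one business day (Saturday and Sunday are skipped)
extra = extra.assign(d=extra['d'] + pd.offsets.BusinessDay(1))

# The new rows reuse the index label of each group's last row, so a stable
# sort on the index places each one directly after that row
out_bday = pd.concat([df, extra]).sort_index(kind='stable', ignore_index=True)
print(out_bday)
```

The new rows have no val entry, so that column is filled with NaN and becomes float.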

Alternative:

last = df.groupby('id')['d'].idxmax()

out = df.loc[df.index.repeat(df.index.isin(last)+1)]

m = out.index.duplicated()

out.loc[m, 'd'] += pd.DateOffset(days=1)
out.loc[m, 'val'] = float('nan')
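
If you prefer to stay close to the per-group function from the question, the following is one possible sketch rather than a definitive fix. The original attempt fails for several independent reasons: the sample DataFrame names its date column 'd', not 'Date'; x.Date + BDay(1) would produce a whole Series rather than a single date; and DataFrame.append was removed in pandas 2.0. The helper name add_next_bday_row and the variable out_loop are hypothetical, and the sketch assumes each group is already ordered by date.

```python
import datetime

import pandas as pd
from pandas.tseries.offsets import BDay

# Sample data from the question
df = pd.DataFrame({
    'd': [datetime.date(2010, 12, 30), datetime.date(2010, 12, 31),
          datetime.date(2010, 12, 30), datetime.date(2010, 12, 31)],
    'id': [1, 1, 2, 2],
    'val': [10, 200, 90, 420],
})
df['d'] = pd.to_datetime(df['d'])


def add_next_bday_row(group):
    """Hypothetical helper: return the group plus one extra trailing row."""
    # Double brackets keep a one-row DataFrame instead of a Series
    last = group.iloc[[-1]].copy()
    # One business day after the group's last date
    last['d'] = last['d'] + BDay(1)
    # The new row has no value yet
    last['val'] = float('nan')
    return pd.concat([group, last])


# Process each group separately, then stitch the pieces back together
parts = [add_next_bday_row(g) for _, g in df.groupby('id', sort=False)]
out_loop = pd.concat(parts, ignore_index=True)
print(out_loop)
```

Iterating over the groups and concatenating is usually slower than the vectorized versions above on large frames, but it keeps the per-group logic easy to read and extend.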

Answered 2023-04-08
