在pandas中使用groupby和cumsum获取新列

vnjpjtjt  于 12个月前  发布在  其他
关注(0)|答案(4)|浏览(88)

我有以下dataframe:
| 类|接收|发布|
| --|--|--|
| FD| 10 | 0 |
| FD| 0 | 2 |
| RM| 5 | 0 |
| RM| 0 | 3 |
| FD| 0 | 2 |
| 下午| 5 | 0 |
| 下午| 1 | 0 |
| RM| 1 | 0 |
| FD| 4 | 0 |
我需要下面的dataframe:
| 类|接收|发布|剩余量|
| --|--|--|--|
| FD| 10 | 0 | 10 |
| FD| 0 | 2 | 8 |
| RM| 5 | 0 | 5 |
| RM| 0 | 3 | 2 |
| FD| 0 | 2 | 6 |
| 下午| 5 | 0 | 5 |
| 下午| 1 | 0 | 6 |
| RM| 1 | 0 | 3 |
| FD| 4 | 0 | 10 |
剩余数量栏为每类收发的累计数量()。我试过不同的方法,但我没有得到它。

vkc1a9a2

vkc1a9a21#

另一个解决方案:

df["Remaining Quatity"] = (g := df.groupby("Class").cumsum())["Received"] - g["Issued"]
print(df)

图纸:

Class  Received  Issued  Remaining Quatity
0    FD        10       0                 10
1    FD         0       2                  8
2    RM         5       0                  5
3    RM         0       3                  2
4    FD         0       2                  6
5    PM         5       0                  5
6    PM         1       0                  6
7    RM         1       0                  3
8    FD         4       0                 10

OR:使用.pipe

df["Remaining Quatity"] = df.groupby("Class").cumsum().pipe(lambda g: g["Received"] - g["Issued"])

OR:使用.eval

df["Remaining Quatity"] = df.groupby("Class").cumsum().eval("Received - Issued")
cigdeys3

cigdeys32#

df['Remaining Quantity'] = df.groupby('Class').apply(
    lambda x: x['Received'].cumsum() - x['Issued'].cumsum()
    ).reset_index(level=0, drop=True)

输出量:

Class  Received  Issued  Remaining Quantity
0    FD        10       0                  10
1    FD         0       2                   8
2    RM         5       0                   5
3    RM         0       3                   2
4    FD         0       2                   6
5    PM         5       0                   5
6    PM         1       0                   6
7    RM         1       0                   3
8    FD         4       0                  10
wlsrxk51

wlsrxk513#

另一种可能的解决方案:

df["Remaining Quatity"] = (
    df.eval("tmp=Received-Issued").groupby("Class")["tmp"].cumsum()
)

输出量:

print(df)

  Class  Received  Issued  Remaining Quatity
0    FD        10       0                 10
1    FD         0       2                  8
2    RM         5       0                  5
3    RM         0       3                  2
4    FD         0       2                  6
5    PM         5       0                  5
6    PM         1       0                  6
7    RM         1       0                  3
8    FD         4       0                 10
qxsslcnc

qxsslcnc4#

一种方法是使用.stack计算差值,然后沿着索引将值赋值。

df['Remaining Quality'] = df.assign(
            Issued=df['Issued'] * -1).set_index('Class',append=True)\
           .stack().groupby(level=1).cumsum().unstack(-1).droplevel(1,0)['Issued']

print(df)

  Class  Received  Issued  Remaining Quality
0    FD        10       0                 10
1    FD         0       2                  8
2    RM         5       0                  5
3    RM         0       3                  2
4    FD         0       2                  6
5    PM         5       0                  5
6    PM         1       0                  6
7    RM         1       0                  3
8    FD         4       0                 10

相关问题