How can I create a column of unique details, with a condition on whether `fruit` in the type column is followed by `fruit -2`? detail1 or detail2 can be NaN.
df
   type      detail1  detail2    name
0  fruit                         apple
1  fruit -2  best     best       apple
2            yellow   yellowish  apple
3            green               apple
4  fruit                         banana
5  sub
6  fruit -2  best     best       banana
7            yellow   orange     banana
8            green    brown      banana
Expected output
df
   type      detail1  detail2    name    unique_detail
0  fruit                         apple   [best, yellow, yellowish, green]
1  fruit -2  best     best       apple   [best, yellow, yellowish, green]
2            yellow   yellowish  apple   [best, yellow, yellowish, green]
3            green               apple   [best, yellow, yellowish, green brown]
4  fruit                         banana  sub:[yellow, orange, green, brown]
5  sub
6  fruit -2                      banana  sub:[yellow, orange, green, brown]
7            yellow   orange     banana  sub:[yellow, orange, green, brown]
8            green    brown      banana  sub:[yellow, orange, green, brown]
I tried:
m = df.type.eq("fruit") & df.type.shift(-1).ne("fruit -2")
df["detail"] = df.detail1 + df.detail2
df["detail"] = df.groupby("type").transform("unique")
df["detail"] = df["detail"].mask(m, "sub:"+df.detail)
1 Answer
avwztpqn1#
The exact logic isn't entirely clear, but you should use `groupby.apply` with a custom function. A different key can also be used as the grouper.
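As a rough illustration of the `groupby.apply` idea, here is a minimal sketch. It rests on several assumptions that are not confirmed by the question: empty cells are treated as missing values, a new group is assumed to start at every row whose type is exactly `fruit`, and a group gets the `sub:` prefix when the row right after its `fruit` row is not `fruit -2`. The helper name `unique_details` and the cumulative-sum grouper are hypothetical choices made for this example. Under this reading, `best` from row 6 is included in the banana group and row 5 receives the same value as the rest of its group, which differs from the expected output shown in the question.

```python
import pandas as pd

# Sample data rebuilt from the question; empty cells are assumed to be missing.
df = pd.DataFrame({
    "type": ["fruit", "fruit -2", None, None, "fruit", "sub", "fruit -2", None, None],
    "detail1": [None, "best", "yellow", "green", None, None, "best", "yellow", "green"],
    "detail2": [None, "best", "yellowish", None, None, None, "best", "orange", "brown"],
    "name": ["apple", "apple", "apple", "apple", "banana", None, "banana", "banana", "banana"],
})


def unique_details(group):
    """Hypothetical helper: unique non-missing details of one group, in row order."""
    # Flatten detail1/detail2 row by row, then drop missing values and duplicates.
    flat = group[["detail1", "detail2"]].to_numpy().ravel()
    values = pd.Series(flat).dropna().unique().tolist()
    # Assumed rule: the group is a "sub" group when the row right after
    # the "fruit" row is not "fruit -2".
    second_type = group["type"].iloc[1] if len(group) > 1 else None
    if second_type != "fruit -2":
        return "sub:[" + ", ".join(values) + "]"
    return values


# Assumed grouper: every "fruit" row opens a new group (1, 1, 1, 1, 2, 2, ...).
key = df["type"].eq("fruit").cumsum()

# One result per group, then broadcast back to the rows of that group.
per_group = df.groupby(key)[["type", "detail1", "detail2"]].apply(unique_details)
df["unique_detail"] = key.map(per_group)

print(df)
```

Grouping on a cumulative sum rather than on `name` keeps row 5, which has no name, attached to the group it sits in. If rows should instead be grouped strictly by fruit name, `df["name"]` could be passed as the grouper, at the cost of dropping rows whose name is missing.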