Fill adjacent values in a Pandas column when a string is duplicated

vmdwslir  posted on 2022-11-27  in Other

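I am trying to overwrite the value in a column named "Group" whenever the value in the adjacent "Keyword" column is a duplicate.
For example, because the string "commercial office cleaning services" appears twice, I want the Group value next to the repeated keyword to be overwritten with "commercial cleaning services".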

Sample data

Desired output

Minimal reproducible example

import pandas as pd

data = [
    ["commercial cleaning services", "commercial cleaning services"],
    ["commercial office cleaning services", "commercial cleaning services"],
    ["janitorial cleaning services", "commercial cleaning services"],
    ["commercial office cleaning services", "commercial cleaning"],
]
df = pd.DataFrame(data, columns=["Keyword", "Group"])
print(df)

I am quite new to Pandas and not sure where to start. I have hit a dead end after Googling and searching Stack Overflow.

taor4pac  #1

IIUC, use duplicated together with mask and ffill:

# Is the keyword a repeat of an earlier row?
m = df['Keyword'].duplicated()

# Blank out Group on the repeated rows, then fill from the row above
df['Group'] = df['Group'].mask(m).ffill()
# Output:
print(df)

                               Keyword                         Group
0         commercial cleaning services  commercial cleaning services
1  commercial office cleaning services  commercial cleaning services
2         janitorial cleaning services  commercial cleaning services
3  commercial office cleaning services  commercial cleaning services
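To make the three steps easier to follow, here is a minimal sketch that keeps each intermediate result in its own variable. It assumes the repeated keyword in the sample data is "commercial office cleaning services", as described in the question. The names masked, filled, and by_first are only illustrative. The last step is not part of the answer above: it is an alternative using groupby with transform("first"), shown because ffill copies the Group from the row directly above, which is not necessarily the Group of the keyword's first occurrence.

```python
import pandas as pd

data = [
    ["commercial cleaning services", "commercial cleaning services"],
    ["commercial office cleaning services", "commercial cleaning services"],
    ["janitorial cleaning services", "commercial cleaning services"],
    ["commercial office cleaning services", "commercial cleaning"],
]
df = pd.DataFrame(data, columns=["Keyword", "Group"])

# Step 1: True for every row whose Keyword already appeared in an earlier row
m = df["Keyword"].duplicated()
print(m.tolist())  # [False, False, False, True]

# Step 2: replace Group with NaN wherever the mask is True
masked = df["Group"].mask(m)

# Step 3: forward-fill, so each NaN takes the Group of the row above it
filled = masked.ffill()

# Alternative (an assumption about the intent, not the answer's method):
# give every row the Group of the first occurrence of its Keyword
by_first = df.groupby("Keyword")["Group"].transform("first")

print(filled.equals(by_first))  # True for this sample data
```

On this sample both approaches give the same column. They can differ when the row just above a repeated keyword belongs to a different group than the keyword's first occurrence, so which one fits depends on what "adjacent" is meant to cover.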
