pandas: Extract keywords from a phrase and sum the cost in Python (equivalent to SUMIF in Excel)

Posted by wnavrhmk on 2023-08-01 in Python


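I have a DataFrame that contains sentences of several words and a cost for each sentence. I want to split the sentences into keywords and then, for each keyword, sum the cost of every sentence that contains it. For example:

| Sentence | Cost |
| -------------- | ---- |
| Hi how are you | 2.5 |
| Hi you there | 5 |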
The result should be:

| Keyword | Cost |
| ------- | ---- |
| Hi | 7.5 |
| you | 7.5 |
| how | 2.5 |
| are | 2.5 |
| there | 5 |
I was able to split the sentences into keywords. For the SUMIF part I tried the following, but it does not seem to work:

```python
word["Cost"]=sentence['Sentence'].str.contains(row["keyword"], na = False) 
    ['cost'].sum()
```

Can you think of a way to make it work?


pandas

Source: https://stackoverflow.com/questions/76760039/extract-keyword-from-a-phrase-and-sum-the-cost-in-python-equivalent-to-sumif-i

3 Answers


yacmzcpb1#

Full code

```python
(df.assign(Sentence=df['Sentence'].str.split(' '))
   .explode('Sentence')
   .groupby('Sentence').sum())
```

Output:

```
          Cost
Sentence
Hi         7.5
are        2.5
how        2.5
there      5.0
you        7.5
```

Step 1

Split the Sentence column into lists of words:

```python
df.assign(Sentence=df['Sentence'].str.split(' '))
```

Output:

```
              Sentence  Cost
0  [Hi, how, are, you]   2.5
1     [Hi, you, there]   5.0
```

Step 2

Explode the result of step 1 so that each word gets its own row:

```python
df.assign(Sentence=df['Sentence'].str.split(' ')).explode('Sentence')
```

Output:

```
  Sentence  Cost
0       Hi   2.5
0      how   2.5
0      are   2.5
0      you   2.5
1       Hi   5.0
1      you   5.0
1    there   5.0
```

Step 3

Group the result of step 2 by Sentence and sum the costs. This is the same as the full code.
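One caveat with the explode approach: a word that occurs twice in the same sentence contributes that sentence's cost twice, whereas a SUMIF on "sentence contains word" would count it once. The sketch below is a variant, not part of the answer above. It removes duplicate words within each sentence before exploding. The repeated "you" in the second sample sentence is invented purely to show the difference.

```python
import pandas as pd

# Sample data; the second row repeats "you" (hypothetical, for illustration).
df = pd.DataFrame({'Sentence': ['Hi how are you', 'Hi you there you'],
                   'Cost': [2.5, 5.0]})

# Split on whitespace, then keep each word at most once per sentence
# (dict.fromkeys preserves the order of first appearance).
words = df['Sentence'].str.split().apply(lambda ws: list(dict.fromkeys(ws)))

result = (df.assign(Sentence=words)
            .explode('Sentence')
            .groupby('Sentence', sort=False)['Cost']
            .sum())
print(result)
```

Without the deduplication step, "you" would total 12.5 for this input instead of 7.5.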

Answered 2023-08-01

bbmckpt72#

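`['cost']` is not declared; it is simply incorrect syntax. The variable is called `word`.
If you want to add up everything in `word`, you can do it like this:

```python
# Assumes word maps each sentence to its cost (for example a dict or a Series).
costs = 0
for sentence, cost in word.items():
    costs += cost
```

Depending on the data type of `cost`, you may need to add a conversion from str to float.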
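For reference, the attempt in the question could be repaired along the following lines. This is a sketch, not the answerer's code. It assumes the two-row sample data from the question, converts `Cost` with `pd.to_numeric` in case it was read as strings, and matches each keyword as a whole word, since a plain `str.contains('Hi')` would also match a word such as `This`. The names `keywords` and `totals` are illustrative.

```python
import re
import pandas as pd

# Sample data from the question; Cost is stored as strings on purpose.
df = pd.DataFrame({'Sentence': ['Hi how are you', 'Hi you there'],
                   'Cost': ['2.5', '5']})

# Convert str -> float; unparseable values become NaN and are skipped by sum().
df['Cost'] = pd.to_numeric(df['Cost'], errors='coerce')

# Unique keywords in order of first appearance.
keywords = list(dict.fromkeys(' '.join(df['Sentence']).split()))

totals = {}
for kw in keywords:
    # \b...\b restricts the match to whole words; re.escape guards
    # against regex metacharacters in the keyword.
    mask = df['Sentence'].str.contains(rf'\b{re.escape(kw)}\b', na=False)
    totals[kw] = df.loc[mask, 'Cost'].sum()

word = pd.DataFrame({'keyword': list(totals), 'Cost': list(totals.values())})
print(word)
```

This loops over the keywords and scans every sentence for each one, so it mirrors SUMIF closely but scales worse than a split-and-group approach on large inputs.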

Answered 2023-08-01

qij5mzcb3#

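The most efficient approach is to avoid pandas and rely on pure Python (pandas is not very efficient for string operations):

```python
import pandas as pd

# Accumulate the cost of each sentence under every word it contains.
d = {}
for s, c in zip(df['Sentence'], df['Cost']):
    for word in s.split():
        d[word] = d.get(word, 0) + c

out = pd.DataFrame({'Sentence': list(d.keys()), 'Cost': list(d.values())})

# or as Series:
# out = pd.Series(d)
```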
Output:

```
  Sentence  Cost
0       Hi   7.5
1      how   2.5
2      are   2.5
3      you   7.5
4    there   5.0
```
Reproducible input:

```python
df = pd.DataFrame({'Sentence': ['Hi how are you', 'Hi you there'],
                   'Cost': [2.5, 5.0]})
```
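The efficiency claim in this answer can be checked on your own data. The following sketch wraps both approaches in functions, confirms that they agree, and times them with `timeit`. The function names `sum_pandas` and `sum_python`, and the repetition factor of 1000, are illustrative choices. Timings depend on data size and pandas version, so no figures are quoted here.

```python
import timeit
import numpy as np
import pandas as pd

df = pd.DataFrame({'Sentence': ['Hi how are you', 'Hi you there'],
                   'Cost': [2.5, 5.0]})
# Repeat the sample rows to get a less trivial input size.
big = pd.concat([df] * 1000, ignore_index=True)

def sum_pandas(frame):
    # split -> explode -> groupby, as in the first answer
    return (frame.assign(Sentence=frame['Sentence'].str.split())
                 .explode('Sentence')
                 .groupby('Sentence')['Cost'].sum())

def sum_python(frame):
    # plain dict accumulation, as in this answer
    d = {}
    for s, c in zip(frame['Sentence'], frame['Cost']):
        for w in s.split():
            d[w] = d.get(w, 0) + c
    return pd.Series(d, name='Cost')

# Compare the two results after sorting by word.
a = sum_pandas(big).sort_index()
b = sum_python(big).sort_index()
agree = list(a.index) == list(b.index) and np.allclose(a.values, b.values)
print('results agree:', agree)

for fn in (sum_pandas, sum_python):
    print(fn.__name__, timeit.timeit(lambda: fn(big), number=5))
```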

Answered 2023-08-01
