Replace part of a string in a Pandas data column: replace doesn't work

Posted by fcg9iug3 on 2023-04-28 in Other


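I have been trying to clean a data column by removing part of the text in each value, but I can't figure out how.
I tried the .str.replace method on the pandas Series, but it doesn't seem to have any effect: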

df['Salary Estimate'].str.replace(' (Glassdoor est.)', '',regex=True)

0       $53K-$91K (Glassdoor est.)
1      $63K-$112K (Glassdoor est.)
2       $80K-$90K (Glassdoor est.)
3       $56K-$97K (Glassdoor est.)
4      $86K-$143K (Glassdoor est.)
                  ...             
922                             -1
925                             -1
928    $59K-$125K (Glassdoor est.)
945    $80K-$142K (Glassdoor est.)
948    $62K-$113K (Glassdoor est.)
Name: Salary Estimate, Length: 600, dtype: object

What I expected is:

0       $53K-$91K
1      $63K-$112K
2       $80K-$90K
3       $56K-$97K
4      $86K-$143K
                  ...             
922                             -1
925                             -1
928    $59K-$125K
945    $80K-$142K
948    $62K-$113K
Name: Salary Estimate, Length: 600, dtype: object

pandas

Source: https://stackoverflow.com/questions/76057225/replace-a-part-of-string-in-pandas-data-column-replace-doesnt-work

1 Answer


iibxawm41#

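If regex is enabled, you have to escape regex metacharacters such as (, ) or . because otherwise the parentheses are read as a capture group and the dot as "any character", so the pattern never matches the literal text: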

import re

>>> df['Salary Estimate'].str.replace(re.escape(r' (Glassdoor est.)'), '',regex=True)
0     $53K-$91K
1    $63K-$112K
2     $80K-$90K
3     $56K-$97K
4    $86K-$143K
Name: Salary Estimate, dtype: object

# Or without importing the re module
>>> df['Salary Estimate'].str.replace(r' \(Glassdoor est\.\)', '',regex=True)
0     $53K-$91K
1    $63K-$112K
2     $80K-$90K
3     $56K-$97K
4    $86K-$143K
Name: Salary Estimate, dtype: object
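
As a rough illustration of the point above, the sketch below compares the unescaped pattern with two working alternatives. The three-row Series is made-up sample data modelled on the question, and the variable names (`s`, `suffix`, `unescaped`, `literal`, `escaped`) are only for illustration. Passing `regex=False` is another option not mentioned in the answer: it treats the pattern as plain text, so no escaping is needed.

```python
import re

import pandas as pd

# Made-up sample mirroring the 'Salary Estimate' column from the question.
s = pd.Series(
    ["$53K-$91K (Glassdoor est.)", "$63K-$112K (Glassdoor est.)", "-1"],
    name="Salary Estimate",
)

suffix = " (Glassdoor est.)"

# Unescaped regex: "(...)" is a capture group and "." matches any character,
# so the pattern looks for " Glassdoor est" plus one character. The data has
# " (Glassdoor", so nothing matches and the values come back unchanged.
unescaped = s.str.replace(suffix, "", regex=True)

# Literal matching: the pattern is treated as plain text.
literal = s.str.replace(suffix, "", regex=False)

# Escaped regex, as in the answer above.
escaped = s.str.replace(re.escape(suffix), "", regex=True)

print(unescaped.tolist())
print(literal.tolist())
print(escaped.tolist())
```

Note that `str.replace` returns a new Series rather than modifying the column in place, so the result would normally be assigned back, for example `df['Salary Estimate'] = df['Salary Estimate'].str.replace(...)`.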

You can also extract the numbers directly:

>>> df['Salary Estimate'].str.extract(r'\$(?P<min>\d+)K-\$(?P<max>\d+)K')
  min  max
0  53   91
1  63  112
2  80   90
3  56   97
4  86  143
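
The extracted columns hold strings, and rows that do not match the pattern (such as the `-1` entries in the question) come back as missing values. A minimal sketch of turning them into numbers, assuming made-up sample data and an illustrative variable name `bounds`, could look like this:

```python
import pandas as pd

# Made-up sample mirroring the 'Salary Estimate' column from the question.
s = pd.Series(
    ["$53K-$91K (Glassdoor est.)", "$63K-$112K (Glassdoor est.)", "-1"],
    name="Salary Estimate",
)

# Named groups become the column names of the resulting DataFrame.
bounds = s.str.extract(r"\$(?P<min>\d+)K-\$(?P<max>\d+)K")

# Convert the captured digit strings to numbers. Rows without a match
# stay NaN, which is why the columns end up as floats.
bounds = bounds.apply(pd.to_numeric)

print(bounds)
```

Bracket access (`bounds["min"]`, `bounds["max"]`) is the safer way to read these columns, since `min` and `max` are also DataFrame method names.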

Answered on 2023-04-28
