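I have been trying to clean up a data column by removing part of the text. Unfortunately, I can't figure it out.
I tried using the .str.replace method on the pandas Series, but it doesn't seem to work: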
df['Salary Estimate'].str.replace(' (Glassdoor est.)', '',regex=True)
0 $53K-$91K (Glassdoor est.)
1 $63K-$112K (Glassdoor est.)
2 $80K-$90K (Glassdoor est.)
3 $56K-$97K (Glassdoor est.)
4 $86K-$143K (Glassdoor est.)
...
922 -1
925 -1
928 $59K-$125K (Glassdoor est.)
945 $80K-$142K (Glassdoor est.)
948 $62K-$113K (Glassdoor est.)
Name: Salary Estimate, Length: 600, dtype: object
What I expected is:
0 $53K-$91K
1 $63K-$112K
2 $80K-$90K
3 $56K-$97K
4 $86K-$143K
...
922 -1
925 -1
928 $59K-$125K
945 $80K-$142K
948 $62K-$113K
Name: Salary Estimate, Length: 600, dtype: object
1 Answer
If regex is enabled, you must escape regex metacharacters such as (, ) and . in the pattern.
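The answer's original snippet did not survive here, so the following is only a minimal sketch of the escaping it describes. The three-row DataFrame is a stand-in built from values in the question. Unescaped, the parentheses form a regex group and the dot matches any character, so the pattern looks for a space followed directly by "Glassdoor" and never matches the literal " (Glassdoor est.)" text.

```python
import pandas as pd

# Small stand-in for the question's column (values copied from the post).
df = pd.DataFrame(
    {"Salary Estimate": ["$53K-$91K (Glassdoor est.)", "-1", "$59K-$125K (Glassdoor est.)"]}
)

# Option 1: keep regex=True and escape the metacharacters ( ) and .
escaped = df["Salary Estimate"].str.replace(r" \(Glassdoor est\.\)", "", regex=True)

# Option 2: switch regex off so the pattern is matched as literal text.
literal = df["Salary Estimate"].str.replace(" (Glassdoor est.)", "", regex=False)

print(escaped)
```

Note that str.replace returns a new Series, so the result has to be assigned back (for example to df['Salary Estimate']) for the change to persist.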
Alternatively, you can extract the numbers.
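The extraction code from the answer is also missing, so this is one possible way to do it with Series.str.extract, not necessarily what the answerer wrote. The column names min_k and max_k are hypothetical, and the pattern assumes every real estimate has the $<low>K-$<high>K shape shown in the question.

```python
import pandas as pd

# Small stand-in for the question's column (values copied from the post).
df = pd.DataFrame(
    {"Salary Estimate": ["$53K-$91K (Glassdoor est.)", "-1", "$59K-$125K (Glassdoor est.)"]}
)

# Two capture groups: lower and upper bound, in thousands of dollars.
# Rows that do not match the pattern (such as "-1") come back as NaN.
bounds = df["Salary Estimate"].str.extract(r"\$(\d+)K-\$(\d+)K")
bounds.columns = ["min_k", "max_k"]  # hypothetical column names
bounds = bounds.astype(float)

print(bounds)
```

Extracting into numeric columns skips the text cleanup entirely and leaves the placeholder -1 rows as missing values, which may or may not be what you want.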