Python regex replace function not working (I assigned the return value back to the variable) [closed]

Posted by 7vhp5slm 12 months ago in Python

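Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.

This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed last month.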
This is my DataFrame before applying replace: (image: df before replace)
Here is my code:

movies_df['year'] = movies_df.title.str.extract('(\(\d\d\d\d\))',expand=False)
movies_df['year'] = movies_df.year.str.extract('(\d\d\d\d)',expand=False)

movies_df['title'] = movies_df['title'].str.replace('(\(\d\d\d\d\))', '')
movies_df['title'] = movies_df['title'].apply(lambda x: x.strip())

movies_df.head()

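I use a regular expression to select the part of the title column that consists of parentheses around four digits.
As you can see, I assigned the return value of replace() back to movies_df['title'], but it still does not work: (image: df after replace)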
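One likely explanation, offered as an assumption because the question does not state the pandas version: since pandas 2.0, `Series.str.replace` defaults to `regex=False`, so the pattern is searched for as literal text and never matches. The sketch below passes `regex` explicitly both ways to show the difference; the sample titles are hypothetical stand-ins for the real `movies_df`.

```python
import pandas as pd

# Hypothetical sample data standing in for the movies_df from the question
movies_df = pd.DataFrame({"title": ["Toy Story (1995)", "Jumanji (1995)"]})

# Raw string so the backslashes reach the regex engine unchanged
pattern = r"(\(\d\d\d\d\))"

# regex=False: the pattern is treated as literal text, so nothing is replaced
literal = movies_df["title"].str.replace(pattern, "", regex=False)

# regex=True: the pattern is interpreted as a regular expression
as_regex = movies_df["title"].str.replace(pattern, "", regex=True).str.strip()

print(literal.tolist())
print(as_regex.tolist())
```

If this is the cause, adding `regex=True` to the `str.replace` call in the question should be enough.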


Answer #1 (qgzx9mmu)

You can do it like this to create a separate column and remove part of the string:

import pandas as pd

col = ['movieID','title','genre']
no = ['1','2','3']
movie = ['M1 (1961)','M2 (1965)','M3 (1978)']
genre = ['comedy |fantasy','comedy','fantacy']

df = pd.DataFrame(list(zip(no, movie, genre)), columns=col)
print(df, '\n')

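# Take the last six characters of each title, e.g. "(1961)", as the year column
df['year'] = df['title'].str.slice(-6)
# Replace the parenthesized digits with a space; the raw string keeps the backslashes intact
df['title'] = df['title'].str.replace(r"\(\d+\)", " ", regex=True)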
print(df)

Output:

  movieID      title            genre
0       1  M1 (1961)  comedy |fantasy
1       2  M2 (1965)           comedy
2       3  M3 (1978)          fantacy 

  movieID title            genre    year
0       1  M1    comedy |fantasy  (1961)
1       2  M2             comedy  (1965)
2       3  M3            fantacy  (1978)
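The output above keeps the parentheses in the year column and leaves trailing whitespace in the title. As a possible variation (a sketch, not the answerer's method), a capture group in `str.extract` can pull out only the four digits, and the replacement pattern can consume the whitespace before the parenthesis. The four-digit pattern `\d{4}` is an assumption about the data.

```python
import pandas as pd

df = pd.DataFrame({"title": ["M1 (1961)", "M2 (1965)", "M3 (1978)"]})

# The inner group captures only the four digits, without the parentheses
df["year"] = df["title"].str.extract(r"\((\d{4})\)", expand=False)

# Remove the parenthesized year together with any whitespace before it
df["title"] = df["title"].str.replace(r"\s*\(\d{4}\)", "", regex=True).str.strip()

print(df)
```

The year column holds strings here; `df["year"].astype(int)` would convert it if every title contains a year.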
