How do I use pyspellchecker to autocorrect spelling errors in a Pandas column?

Posted by euoag5mw on 2023-01-28 in Other


I have the following DataFrame:

df = pd.DataFrame({'id':[1,2,3],'text':['a foox juumped ovr the gate','teh car wsa bllue','why so srious']})

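I want to use the pyspellchecker library to generate a new column with the spelling errors fixed.
I tried the following, but it did not correct any of the spelling errors: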

import pandas as pd
from spellchecker import SpellChecker

spell = SpellChecker()

def correct_spelling(word):
    corrected_word = spell.correction(word)
    if corrected_word is not None:
        return corrected_word
    else:
        return word

df['corrected_text'] = df['text'].apply(correct_spelling)

Here is the expected output DataFrame:

pd.DataFrame({'id':[1,2,3],'text':['a foox juumped ovr the gate','teh car wsa bllue','why so srious'],
              'corrected_text':['a fox jumped over the gate','the car was blue','why so serious']})

pandas

Source: https://stackoverflow.com/questions/75215866/how-to-use-pyspellchecker-to-autocorrect-spelling-errors-in-a-pandas-column

1 Answer


webghufk1#

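I don't know anything about this package (or how to tune its accuracy), but spell.correction works on one word at a time, so you can split each row's string into a list of words and then iterate over the resulting list of lists.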

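# Correct each word separately; correction() may return None when it finds
# no candidate, so fall back to the original word before joining.
df["text"] = [[spell.correction(word) or word for word in row] for row in df["text"].str.split(" ").to_list()]
df["text"] = df["text"].apply(lambda x: " ".join(x))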

Output (as you can see, the accuracy still needs improving):

id                       text
0   1  a food jumped or the gate
1   2           the car was blue
2   3             why so serious

Answered on 2023-01-28
