ValueError: [E1041] Expected a string, Doc, or bytes as input, but got: <class 'pandas.core.series.Series'>

kqhtkvqz  posted on 2022-12-02  in Other
import pandas
df['findings'] = df['findings'].astype(str)
#df['findings'] = df['findings'].astype('string')
df["new_column"] = GPT2_model(df['findings'], min_length=60)

After running this, I get the following error, even though I converted the DataFrame column to strings.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-37-1225bf7a7a14> in <module>
----> 1 df["new_column"] = GPT2_model(df['findings'], min_length=60)

5 frames
/usr/local/lib/python3.7/dist-packages/spacy/language.py in _ensure_doc(self, doc_like)
  1106         if isinstance(doc_like, bytes):
  1107             return Doc(self.vocab).from_bytes(doc_like)
-> 1108         raise ValueError(Errors.E1041.format(type=type(doc_like)))
  1109 
  1110     def _ensure_doc_with_context(

ValueError: [E1041] Expected a string, Doc, or bytes as input, but got: <class 'pandas.core.series.Series'>

nfs0ujit1#

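Your function/model GPT2_model does not accept a pandas Series object; that is what the error is complaining about. Converting the column with astype(str) changes the type of each cell, but df['findings'] as a whole is still a Series. You can instead use the apply method on the findings column, which calls the model once per cell: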

df['new_column'] = df['findings'].apply(GPT2_model, min_length=60)
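
To make the difference concrete, here is a minimal sketch that reproduces the behaviour without loading the real model. `summarize_stub` is a hypothetical stand-in for GPT2_model (it only truncates text), and the sample rows are made up for illustration; the type check merely imitates the one visible in the traceback.

```python
import pandas as pd


def summarize_stub(text, min_length=60):
    """Hypothetical stand-in for GPT2_model: accepts one string at a time."""
    if not isinstance(text, str):
        # Imitates the kind of type check that raises E1041 in the traceback.
        raise ValueError(f"Expected a string, but got: {type(text)}")
    # Placeholder "summary": keep at most min_length characters.
    return text[:min_length]


# Made-up sample data; the second row is deliberately not a string.
df = pd.DataFrame({"findings": ["No acute abnormality.", 12345]})
df["findings"] = df["findings"].astype(str)

# Passing the whole column fails: the argument is a Series, not a str,
# even though every cell inside it is now a string.
try:
    summarize_stub(df["findings"], min_length=60)
    whole_column_failed = False
except ValueError:
    whole_column_failed = True

# apply calls the function once per cell and forwards min_length as a keyword.
df["new_column"] = df["findings"].apply(summarize_stub, min_length=60)
```

With the real model the per-cell call may be slow on a large DataFrame, so it can be worth trying it on a few rows first, for example `df['findings'].head().apply(...)`.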
