SentenceTransformers throws KeyError on a Pandas Series

0g0grzrc, posted on 2023-01-24 in Other

I am using the following simplified code:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

embeddings = model.encode(sentences)

where sentences is a Pandas Series containing the sentences I want to encode.
I then get the following error traceback:

embeddings = model.encode(sentences)
File "/anaconda/envs/topics/lib/python3.8/site-packages/sentence_transformers/SentenceTransformer.py", line 157, in encode
sentences_sorted = [sentences[idx] for idx in length_sorted_idx]
File "/anaconda/envs/topics/lib/python3.8/site-packages/sentence_transformers/SentenceTransformer.py", line 157, in <listcomp>
sentences_sorted = [sentences[idx] for idx in length_sorted_idx]
File "/anaconda/envs/topics/lib/python3.8/site-packages/pandas/core/series.py", line 942, in __getitem__
return self._get_value(key)
File "/anaconda/envs/topics/lib/python3.8/site-packages/pandas/core/series.py", line 1051, in _get_value
loc = self.index.get_loc(label)
File "/anaconda/envs/topics/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
raise KeyError(key) from err
KeyError: 144
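
The traceback shows that encode sorts the inputs by length and then reads them back with sentences[idx], where idx is a position (0, 1, 2, ...). On a Series with an integer index, sentences[idx] is a label lookup, so it fails as soon as a position is not also an index label, which is typical after rows have been filtered out. A minimal sketch of that failure, using pandas only; the sample data and the helper lookup_fails are hypothetical, and the sorting line only mimics what the traceback suggests the library does:

```python
import numpy as np
import pandas as pd

# Hypothetical data: a Series whose integer index has gaps,
# as happens after filtering rows out of a DataFrame.
sentences = pd.Series(
    ["a short one", "a somewhat longer sentence", "mid length text"],
    index=[0, 2, 5],
)

# Mimics the step seen in the traceback: positions sorted by descending length.
length_sorted_idx = np.argsort([-len(s) for s in sentences])


def lookup_fails(series, positions):
    """Return True if series[idx] raises KeyError for any of the positions."""
    try:
        [series[idx] for idx in positions]
    except KeyError:
        return True
    return False


# Position 1 is not a label in the index [0, 2, 5], so the lookup fails.
failed = lookup_fails(sentences, length_sorted_idx)
print(failed)
```

With a default 0..n-1 index the same lookup succeeds, which is why the error only shows up on some Series.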

Answer 1 (wsewodh2)

The practical solution is to convert the Pandas Series to a numpy array:

sentences_array = sentences.to_numpy()
embeddings = model.encode(sentences_array)
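
Converting to an array works because the array is indexed by position, not by the Series labels. A short sketch comparing three conversions that all give positional access; the sample data and the positions list are hypothetical, and no model is loaded here:

```python
import pandas as pd

# Hypothetical Series with non-default integer labels.
sentences = pd.Series(["first", "second sentence", "third"], index=[10, 20, 30])

as_array = sentences.to_numpy()               # NumPy array, positional indexing
as_list = sentences.tolist()                  # plain Python list of str
reindexed = sentences.reset_index(drop=True)  # Series relabelled 0..n-1

# Any positional order now works, unlike sentences[0] on the original Series.
positions = [2, 0, 1]
from_array = [as_array[i] for i in positions]
from_list = [as_list[i] for i in positions]
from_reindexed = [reindexed[i] for i in positions]

print(from_array, from_list, from_reindexed)
```

Any of the three can then be passed to model.encode in place of the original Series; a plain list of strings is the input form the library documents.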

Answer 2 (omqzjyyz)

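The embeddings are returned either as a NumPy array or as a Tensor, so use the following code to solve this problem:
embeddings = model.encode(sentences, convert_to_tensor=True)
[or]
embeddings = model.encode(sentences, convert_to_numpy=True)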
