In Python, reading and parsing a CSV then printing: why is it behaving differently than I expected?

Posted by ny6fqffe on 2023-05-04 in Python


Here is some code:

import spacy

trained_nlp = spacy.load("models/output/model-best")

with open('Dog_Breed.csv') as f:
    s = f.read() # + '\n'  add trailing new line character

print('Here is the csv as string')
print(s)

doc = trained_nlp(s)

for ent in doc.ents:
    if ent.label_ is not None and ent.label_=='BREED' :
        sentence_breed = ' is a '
        print(ent.text + sentence_breed  + ent.label_ + ' and ')
    if ent.text is not None and ent.label_ == 'ORIGIN' :
        sentence_origin = ' has an '
        print ( sentence_origin + ent.label_ + ' of ' + ent.text)

This is what gets printed:

Here is the csv as string
,Dog Breed,Origin
0,Boykin Spaniel,South Carolina
1,German Shepherd,Germany
2,Border Collie,Scotland

 has an ORIGIN of ,Dog
Breed, is a BREED and 
 has an ORIGIN of Origin

0,Boykin Spaniel is a BREED and 
 has an ORIGIN of ,South
1,German Shepherd is a BREED and 
 has an ORIGIN of ,
 has an ORIGIN of Germany
2,Border Collie is a BREED and 
 has an ORIGIN of ,
 has an ORIGIN of Scotland

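Some anomalies:

1. The spaCy model labels seem to always be present, but I want to ignore the first line, ",Dog Breed,Origin", because there are no matches there. You can see I have been playing with the None keyword to address this.
2. Note that "South Carolina" is truncated but "Boykin Spaniel" is not?!? Even so, the first output is well-formed: I get a sentence like I want, "X is a BREED and has an ORIGIN of Y", with no repetition.
3. For rows 1 and 2, however, I get dummy lines of " has an ORIGIN of ,".

I'm better at understanding language models than tricky inner loops in Python ;-)

I was expecting this: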

0,Boykin Spaniel is a BREED and has an ORIGIN of South Carolina
1,German Shepherd is a BREED and has an ORIGIN of Germany
2,Border Collie is a BREED and has an ORIGIN of Scotland


Source: https://stackoverflow.com/questions/76157053/in-python-reading-parsing-a-csv-then-printing-why-is-it-behaving-differentl

1 answer


qyyhg6bp1#

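Most of the problems come from not understanding how trained_nlp() parses the CSV string. It clearly segments the string differently than you expect: the entities in your output include commas and row indices, and some span a line break.
You should debug trained_nlp() to see exactly how it parses the CSV string.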
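One way to see how the model segmented the string is to print each entity with repr(), which makes embedded commas and newlines visible. The following is only a minimal sketch: describe_entities and make_span are hypothetical helper names, and fake_doc is a stand-in that mimics just the ents, text, label_, start_char and end_char attributes of a spaCy Doc and its spans so the snippet runs without the trained model. The spans in it are illustrative, not actual model output. With the real pipeline you would call describe_entities(trained_nlp(s)) instead.

```python
from types import SimpleNamespace


def describe_entities(doc):
    """Return one line per entity; repr() exposes commas and newlines in the text."""
    lines = []
    for ent in doc.ents:
        lines.append(f"{ent.label_:<6} [{ent.start_char}:{ent.end_char}] {ent.text!r}")
    return lines


text = ",Dog Breed,Origin\n0,Boykin Spaniel,South Carolina\n"


def make_span(start, end, label):
    """Build a stand-in entity (assumption: mimics a spaCy Span's attributes)."""
    return SimpleNamespace(
        text=text[start:end], label_=label, start_char=start, end_char=end
    )


# Stand-in for the Doc returned by trained_nlp(text); spans chosen for illustration.
fake_doc = SimpleNamespace(
    ents=[make_span(0, 4, "ORIGIN"), make_span(18, 34, "BREED")]
)

for line in describe_entities(fake_doc):
    print(line)
```

If the real output shows entities that start with a comma or a row index, the model is treating the raw CSV text as prose, which would explain the stray " has an ORIGIN of ," lines.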
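As for the output you were expecting:
print("message") appends a newline through a default argument, i.e. by default it behaves like print("message", end="\n"). That is why each fragment lands on its own line.
To change this: print("message", end='')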
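To illustrate the end parameter, here is a small sketch that prints the two sentence fragments from the question on a single line. The helper name print_row and the hard-coded rows are assumptions for illustration; the values are copied from the CSV shown in the question rather than produced by the spaCy model.

```python
def print_row(index, breed, origin):
    """Print one row as a single sentence by suppressing the first newline."""
    # end="" stops print() from appending "\n" after the first fragment.
    print(f"{index},{breed} is a BREED and", end="")
    # The second print() keeps the default end="\n", which finishes the line.
    print(f" has an ORIGIN of {origin}")


# Rows copied by hand from the question's CSV (assumption: not model output).
rows = [
    (0, "Boykin Spaniel", "South Carolina"),
    (1, "German Shepherd", "Germany"),
    (2, "Border Collie", "Scotland"),
]

for row in rows:
    print_row(*row)
```

This only fixes the line breaks added by print(). A newline that is part of ent.text itself would still show up, so the entity boundaries need to be checked separately.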

2023-05-04
