Python Pandas: read_csv with a strange format

Posted by z9ju0rcb on 2023-09-28 in Python

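Is it possible to read a CSV file in which values can contain double quotes, and in which every field is wrapped in double quotes so that commas inside a value are not treated as separators? An example file looks like this: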

"fie,ld1","fi"e,ld2","field3"
"test","testing","meow"

The desired output is:

fie,ld1 fi"e,ld2 field3
test    testing  meow

I have tried various read_csv options, ChatGPT, and web searches.

3hvapo4f #1

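I think you are saying that "," (quote, comma, quote) is the delimiter in this document, and that you want to clean the data.
The following code will print: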

fie,ld1 fi"e,ld2 field3
test testing meow

You can then append the output to a "clean" CSV file.

import re

# Read the file and split it into a list of lines
with open("strangeFile.csv", "r") as f:
    lines = f.read().splitlines()

for line in lines:
    items = line.split('","')  # split on the 'strange' quote-comma-quote delimiter
    line = " ".join(items)  # join with a space, comma or tab
    line = re.sub(r'^"|"$', "", line)  # remove the opening and closing quote marks

    print(line)
    # append line to a new CSV file
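
Joining the fields with a space or a comma loses the field boundaries, because the values themselves contain commas. As a rough sketch of the "clean CSV" step, the fields could instead be written back with the standard csv module, which escapes embedded quotes by doubling them. The helper names parse_strange_line and to_clean_csv are made up for this illustration, and the sketch assumes that no value contains the three-character sequence "," itself.

```python
import csv
import io


def parse_strange_line(line):
    """Split one line on the quote-comma-quote delimiter and drop the outer quotes."""
    line = line.rstrip("\r\n")
    # Remove exactly one opening and one closing quote, if present.
    if line.startswith('"'):
        line = line[1:]
    if line.endswith('"'):
        line = line[:-1]
    # Assumption: the sequence '","' never occurs inside a value.
    return line.split('","')


def to_clean_csv(lines):
    """Return standard CSV text in which embedded quotes are escaped by doubling."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for line in lines:
        if line.strip():  # skip blank lines
            writer.writerow(parse_strange_line(line))
    return buffer.getvalue()


sample = [
    '"fie,ld1","fi"e,ld2","field3"',
    '"test","testing","meow"',
]
clean_text = to_clean_csv(sample)
print(clean_text)
```

The resulting text uses standard CSV quoting, so, assuming pandas is installed, pd.read_csv(io.StringIO(clean_text)) should be able to load it with its default settings and give the columns fie,ld1, fi"e,ld2 and field3.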
