How to handle CSV strings containing double quotes in Pandas

bmvo0sr5  posted on 2023-09-27  in Other

I am running into a problem when processing a CSV file in a format like this:
Input .csv file:

1,abc,65.0,en-GB,"reverted,Knowledge Alert,ab00998978,1,Y,Y,default,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,zolhOgdmpwAQjfaUONdTD7,15.0,en-GB,"New & Dropped Routes,Article,KM100015050,2,N,Y,default,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,zolhOgdmpwAQjfaUONdTD7,4.0,en-GB,"New & Dropped Routes,Article,KM100015050,3,N,Y,default,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

Metadata .csv file:

tablename,silver_tablename,fileformat,fileformat_historical,silver_overwrite,preprocessing,cdc,bronzeload,silverload,softdelete,harddelete,harddeleteonlykeys,deletecolumn,deletevalue,executionset,readeroptions,autoloaderoptions,autoloaderoptions_historical,enabled,encoding_format,bronze_overwrite,sensitive_data,data_protection
agent,agent,csv,csv,N,Y,N,Y,Y,Y,N,N,OPERATION,D,set1,"{'header': 'true', 'sep': 'chr(1)', 'quoting':'csv.QUOTE_NONE','readerCaseSensitive': 'false'}",,,Y,UTF-8,N,N,N

But after running my code, all the data after the double quote ends up in a single column.
My code:

import pandas as pd

entity_df = pd.read_csv(entity_control_source_path, header=0, sep=",", quotechar='"', dtype=str)

mhd8tkvw1#

You can make pandas ignore quote characters entirely by passing quoting=3 (the value of csv.QUOTE_NONE) to pd.read_csv():

df = pd.read_csv('input.csv', quoting=3, header=None)
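To illustrate the effect, here is a minimal sketch that runs the same call on an in-memory copy of two rows from the question. The rows are shortened (the trailing empty columns are dropped) and `io.StringIO` stands in for the real file, so treat the variable names and the sample as assumptions rather than the asker's actual setup. With quoting disabled, the stray `"` is kept as an ordinary character in the field, so it may need stripping afterwards.

```python
import csv
import io

import pandas as pd

# Shortened sample rows from the question (trailing empty columns omitted).
sample = (
    '1,abc,65.0,en-GB,"reverted,Knowledge Alert,ab00998978,1,Y,Y,default\n'
    '2,zolhOgdmpwAQjfaUONdTD7,15.0,en-GB,"New & Dropped Routes,Article,KM100015050,2,N,Y,default\n'
)

# csv.QUOTE_NONE == 3: the quote character gets no special treatment,
# so every comma acts as a field separator.
df = pd.read_csv(io.StringIO(sample), quoting=csv.QUOTE_NONE, header=None, dtype=str)

print(df.shape)        # number of rows and columns after splitting on every comma
print(df[4].tolist())  # the fifth column still carries the leading quote character

# Optionally remove the leftover quote from the affected column.
df[4] = df[4].str.lstrip('"')
print(df[4].tolist())
```

Using the named constant `csv.QUOTE_NONE` instead of the literal `3` is only a readability choice; both select the same behavior.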

See the official documentation.
Alternatively, you can clean the data manually beforehand and then read it as usual.
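The manual-cleaning route could look roughly like the sketch below. It assumes the double quotes in the file are all stray and that no field legitimately contains a quoted comma; if that does not hold, removing every quote would split such fields. The sample text and the names `raw`, `cleaned`, and `clean_df` are illustrative only; with a real file you would read its contents first, for example with `open(path).read()`.

```python
import io

import pandas as pd

# Shortened sample rows from the question (trailing empty columns omitted).
raw = (
    '1,abc,65.0,en-GB,"reverted,Knowledge Alert,ab00998978,1,Y,Y,default\n'
    '2,zolhOgdmpwAQjfaUONdTD7,15.0,en-GB,"New & Dropped Routes,Article,KM100015050,2,N,Y,default\n'
)

# Drop every double quote before parsing (assumption: none of them is meaningful).
cleaned = raw.replace('"', "")

# With the quotes gone, the default parser settings split the rows as expected.
clean_df = pd.read_csv(io.StringIO(cleaned), header=None, dtype=str)

print(clean_df.shape)
print(clean_df.iloc[0].tolist())
```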
