Pandas: adding a header to the output file after merging multiple CSV files

ioekq8ef  posted on 2022-12-06  in Other
import pandas as pd
import os

file1 = 'https://public.fyers.in/sym_details/NSE_CM.csv'
file2 = 'https://public.fyers.in/sym_details/NSE_FO.csv'
file3 = 'https://public.fyers.in/sym_details/BSE_CM.csv'
CHUNK_SIZE = 10 ** 6
csv_file_list = [file1, file2, file3]
output_file = "/content/output.csv"

for csv_file_name in csv_file_list:
  skipRows = [2022,92805]
  chunk_container = pd.read_csv(csv_file_name, chunksize=CHUNK_SIZE, skiprows=skipRows)
  for chunk in chunk_container:
    headerList =["fytoken", "symbol", "instrumentType","lotSize","tickSize","ISIN","tradingSession","lastUpdate","expiryDate","symbolTicker","exchange","segment","scripCode","scripName","scripToken","strikePrice","optionType"]
    chunk.to_csv(output_file,header=headerList, mode="a", index=False)

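I want to merge these three CSV files and add a header to the output file. However, the output file I get has the header repeated at the start of each CSV's data instead of appearing only once at the top.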

1zmg4dgp  #1

You are reading the content in chunks and writing the header again for every chunk you append.
Try the following instead:

import os

import pandas as pd

file1 = 'https://public.fyers.in/sym_details/NSE_CM.csv'
file2 = 'https://public.fyers.in/sym_details/NSE_FO.csv'
file3 = 'https://public.fyers.in/sym_details/BSE_CM.csv'
CHUNK_SIZE = 10 ** 6
csv_file_list = [file1, file2, file3]
output_file = "./content/output.csv"

headerList = ["fytoken", "symbol", "instrumentType", "lotSize", "tickSize", "ISIN", "tradingSession",
              "lastUpdate", "expiryDate", "symbolTicker", "exchange", "segment", "scripCode", "scripName",
              "scripToken", "strikePrice", "optionType"]

# Make sure the output directory exists; to_csv does not create it.
os.makedirs(os.path.dirname(output_file), exist_ok=True)

# Write the header row once by saving an empty DataFrame that has only column names.
df = pd.DataFrame(columns=headerList)
df.to_csv(output_file, index=False)

for csv_file_name in csv_file_list:
    skipRows = [2022, 92805]
    # Note: read_csv takes the first line of each file as column names by default,
    # so that line is not appended as data.
    with pd.read_csv(csv_file_name, chunksize=CHUNK_SIZE, skiprows=skipRows) as chunk_container:
        for chunk in chunk_container:
            # Append data only; header=False keeps the header from being repeated.
            chunk.to_csv(output_file, header=False, mode="a", index=False)

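Here we first create a CSV file that contains only the header row, then append the data read from the URLs above to that same file without writing the header again.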
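A variation on the same idea, as a minimal sketch: instead of pre-writing an empty header-only file, keep a flag that is true only for the first chunk written. The function name `merge_csvs` and the sample data are hypothetical, and the sketch assumes the input files have no header row of their own (hence `header=None` and `names=` in `read_csv`); if your inputs do start with a header line, drop those two arguments. It runs against small local files so it can be tried without network access.

```python
import os
import tempfile

import pandas as pd


def merge_csvs(sources, output_file, header_list, chunk_size=10 ** 6):
    """Write all chunks of all sources to output_file, emitting the header once."""
    write_header = True  # True only until the first chunk has been written
    for source in sources:
        # Assumption: inputs have no header row, so every line is read as data
        # and the column names come from header_list.
        with pd.read_csv(source, header=None, names=header_list,
                         chunksize=chunk_size) as reader:
            for chunk in reader:
                chunk.to_csv(
                    output_file,
                    mode="w" if write_header else "a",  # overwrite first, then append
                    header=write_header,                # header on the first chunk only
                    index=False,
                )
                write_header = False


# Demo with two small header-less files in a temporary directory (made-up data).
tmp_dir = tempfile.mkdtemp()
part_a = os.path.join(tmp_dir, "a.csv")
part_b = os.path.join(tmp_dir, "b.csv")
with open(part_a, "w") as f:
    f.write("1,AAA\n2,BBB\n")
with open(part_b, "w") as f:
    f.write("3,CCC\n")

merged_path = os.path.join(tmp_dir, "merged.csv")
# chunk_size=1 forces several chunks so the "header once" logic is exercised.
merge_csvs([part_a, part_b], merged_path, ["id", "symbol"], chunk_size=1)

merged = pd.read_csv(merged_path)
print(merged)
```

Because the first write uses mode "w", rerunning the merge replaces any earlier output rather than appending to it, which the pre-written header file in the answer above also achieves.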
