Creating a .csv from a combination of strings and pandas DataFrames

drkbr07n  posted on 2023-04-09  in  Other

My .csv has multiple blocks that need to follow the format below (sample of one block):

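So I am trying to build it in pandas and then write it to csv. The problem is the comment line above each of the two sections, which sits outside the DataFrame. Here is the sample code: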

import numpy as np
import pandas as pd
h_comment = pd.DataFrame(['#(H) Header'], columns=['name'])

df1 = pd.DataFrame({'name': 'Donald Trump',
                    'state':'FL',
                    'value':'0'},
                   index=[0])

data_comment =  pd.DataFrame(['#(S) Schedule'], columns=['A'])
df2 = pd.DataFrame(np.random.rand(3, 4), columns=list('ABCD'))

to_csv1 = pd.concat([h_comment,df1])
to_csv2 = pd.concat([data_comment,df2])

The problem is that those "comments" end up inside my df columns, for example:

to_csv2
Out[116]: 
               A         B         C         D
0  #(S) Schedule       NaN       NaN       NaN
0       0.521739  0.622079  0.322372  0.687531
1       0.991336  0.297848  0.635697  0.025620
2       0.068900  0.898806  0.562971  0.567817

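A solution that first creates a .csv containing the comment and then appends the dfs to it is not ideal: there are many blocks like the one above, so repeated appends would hurt performance. I would rather write the csv once at the end.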

Answer #1 (r1zk6ea1)

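The image you shared looks more like an Excel spreadsheet than a csv file.
To create a csv that matches the shape you describe, one option is to use open together with to_csv, writing the comment lines directly to the file handle and letting pandas write only the tables: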

N = 2 # number of empty lines between both dfs

with open("output.csv", mode="w", newline="") as file:
    file.write("#(H) Header\n")
    df1.to_csv(file, index=False)
    file.write("\n"*N)
    file.write('#(S) Schedule\n')
    df2.to_csv(file, index=False)

Output (the .csv opened in Excel):
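Since the question mentions many such blocks and a preference for writing the file once at the end, the same idea can be sketched as a loop that renders every block into an in-memory buffer and writes it to disk in a single call. This is only a sketch under assumptions: `blocks_to_csv_text`, `write_blocks`, and the `blank_lines` parameter are hypothetical names introduced here, not part of pandas, and the sample DataFrames are placeholders.

```python
import io

import pandas as pd


def blocks_to_csv_text(blocks, blank_lines=2):
    """Render (comment, DataFrame) pairs as one CSV-like string.

    Hypothetical helper for illustration; `blank_lines` is the number of
    empty lines placed between consecutive blocks.
    """
    buffer = io.StringIO()
    for i, (comment, df) in enumerate(blocks):
        if i > 0:
            # Separate this block from the previous one.
            buffer.write("\n" * blank_lines)
        # The comment goes in as plain text, so it never enters a df column.
        buffer.write(comment + "\n")
        df.to_csv(buffer, index=False)
    return buffer.getvalue()


def write_blocks(path, blocks, blank_lines=2):
    """Write all blocks to `path` with a single file write (hypothetical helper)."""
    text = blocks_to_csv_text(blocks, blank_lines=blank_lines)
    with open(path, mode="w", newline="") as file:
        file.write(text)


# Placeholder data shaped like the question's df1 and df2.
header_df = pd.DataFrame({"name": ["x"], "state": ["FL"], "value": ["0"]})
schedule_df = pd.DataFrame([[1, 2], [3, 4]], columns=list("AB"))

text = blocks_to_csv_text(
    [("#(H) Header", header_df), ("#(S) Schedule", schedule_df)]
)
print(text)
```

Calling `write_blocks("output.csv", blocks)` with the full list of (comment, DataFrame) pairs would then touch the file only once, however many blocks there are.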

If needed, you can instead use ExcelWriter to produce a spreadsheet, which lets you control worksheet and cell formatting:

with pd.ExcelWriter("output.xlsx", engine="xlsxwriter") as writer:
    worksheet = writer.book.add_worksheet()
    
    header_format = writer.book.add_format({"border": None})
    title_format = writer.book.add_format({"bold": True,
                                           "italic": True,
                                           "font_size": 11})

    worksheet.write(0, 0, "#(H) Header", title_format)
    df1.to_excel(writer, index=False, startrow=1)
     
    worksheet.write(len(df1)+2, 0, "")
    worksheet.write(len(df1)+3, 0, "")
    
    worksheet.write(len(df1)+4, 0, "#(S) Schedule", title_format)
    df2.to_excel(writer, index=False, startrow=len(df1)+5)
    
    for col_num, value in enumerate(df2.columns):
        worksheet.write(len(df1)+5, col_num, value, header_format)
    
    for col_num, value in enumerate(df1.columns):
        worksheet.write(1, col_num, value, header_format)
        
    worksheet.autofit()

Output (the .xlsx opened in Excel):
