pandas: how to extract data based on a condition and store it in multiple files [duplicate]

x4shl7ld  posted on 2023-02-27  in Other
This question already has answers here:

Split pandas dataframe into multiple dataframes with equal numbers of rows (2 answers)
Closed 6 hours ago.
test.csv

name,age,n1,n2,n3
a,21,1,2,3
b,22,4,9,0
c,25,4,5,6
d,25,41,5,6
e,25,4,66,6
f,25,4,5,66
g,25,4,55,6
h,25,4,5,56
i,25,41,5,61
j,25,4,51,60
k,20,40,50,60
l,21,40,51,60

My code so far, which reads the file and stores each row in a dict:

import pandas as pd

input_file = pd.read_csv("test.csv")
for i in range(0, len(input_file['name'])):   
    dict1 = {}
    dict1["name"] = str(input_file['name'][i])
    dict1["age"] = str(input_file['age'][i])
    dict1["n1"] = str(input_file['n1'][i])
    dict1["n2"] = str(input_file['n2'][i])
    dict1["n3"] = str(input_file['n3'][i])

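I want to write the output to multiple files, one file for every 5 rows of data. This has to use Python's writelines function, because I need to do a lot of processing at the point where each line is written. The file names should be generated dynamically, and the input is dynamic as well (meaning more rows may arrive).
Example of the expected output (here the file name must be dynamic):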

out_file = open('File1.xml', 'w')
out_file.writelines(I will process with dictionary data row by row)
out_file.writelines("\n")

File1

a,21,1,2,3
b,22,4,9,0
c,25,4,5,6
d,25,41,5,6
e,25,4,66,6

File2

f,25,4,5,66
g,25,4,55,6
h,25,4,5,56
i,25,41,5,61
j,25,4,51,60

File3

k,20,40,50,60
l,21,40,51,60
kuuvgm7e  #1

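If the DataFrame has the default RangeIndex, you can loop over a groupby whose key is the index floor-divided by N, the number of rows per file: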

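import numpy as np
import pandas as pd

input_file = pd.read_csv("test.csv")

N = 5  # rows per output file
for name, g in input_file.groupby(input_file.index // N):
    # to_csv has no ignore_index parameter; index=False drops the index column
    g.to_csv(f'file_{name}.csv', index=False, header=False)

# For any index (not only the default RangeIndex), group by row position instead:
N = 5
for name, g in input_file.groupby(np.arange(len(input_file)) // N):
    g.to_csv(f'file_{name}.csv', index=False, header=False)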
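
To see why floor division produces the groups, here is a small sketch of the key that groupby receives. It assumes the 12 data rows of the question's test.csv and N = 5; the names `n_rows` and `keys` are only illustrative.

```python
import numpy as np

N = 5          # rows per output file
n_rows = 12    # number of data rows in the question's test.csv

# Row positions 0..11 floor-divided by N give one label per row;
# rows that share a label fall into the same group (and the same file).
keys = np.arange(n_rows) // N
print(keys.tolist())  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2]
```

The last label covers only two rows, which is why the third file is shorter than the others.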

Edit: if you really need to write row by row, use:

N = 5
for name, g in input_file.groupby(input_file.index // N): 
    with open(f'File{name+1}.xml', 'w') as out_file:
        for data in g.to_numpy():
            out_file.write(','.join(str(x) for x in data))
            out_file.write('\n')
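
If each row should be available as a dict, as in the question's own loop, the same idea could be sketched roughly as follows. This is only an illustration under assumptions: `write_chunks` and its parameters are hypothetical names, and the comma join stands in for whatever per-row processing is actually needed.

```python
import pandas as pd


def write_chunks(df, rows_per_file=5, prefix="File", ext=".xml"):
    """Write df into numbered files of rows_per_file rows each; return the file names."""
    written = []
    for start in range(0, len(df), rows_per_file):
        # Slice by position, so this works for any index
        chunk = df.iloc[start:start + rows_per_file]
        # File number is derived from the chunk position: File1, File2, ...
        file_name = f"{prefix}{start // rows_per_file + 1}{ext}"
        with open(file_name, "w") as out_file:
            # to_dict("records") yields one {column: value} dict per row
            for row in chunk.to_dict("records"):
                # Placeholder for the real per-row processing
                line = ",".join(str(value) for value in row.values())
                out_file.writelines([line, "\n"])
        written.append(file_name)
    return written
```

Calling `write_chunks(pd.read_csv("test.csv"))` on the 12-row sample would then be expected to create File1.xml, File2.xml and File3.xml, and the number of files grows automatically as more rows arrive.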
