csv: Using Python to process multiple files in a folder and then compute the average

0ve6wy6x  posted on 2023-06-19  in  Python

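I have multiple .csv files in one folder, each containing several columns of integer values. I want to read one column (e.g. the 1st) and then find the sum of that column. Finally, I want the average over all files, meaning "sum over all files / number of files". I started as follows: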

import os

folder_path = r'folder\path'
total_sum = 0
file_count = 0
for file_name in os.listdir(folder_path):
    if file_name.endswith('.csv'):
        file_path = os.path.join(folder_path, file_name)
        with open(file_path, 'r') as file:
            df=file.readlines()   #not sure
            df.columns=['A', 'B', 'C']
            df1 = df['A'].sum()   #not sure
            total_sum += df1
            file_count += 1
            print(f'Sum for {file_name}: {df1}')

average = total_sum / file_count
print(f'Average of all accuracy: {average}')

However, I am unable to read the .csv files, so the rest of the code never runs. A small hint or any help would be appreciated.


7cjasjjr1#

You are almost there; you just need to use pandas to simplify handling the CSV files. Here is a code snippet to get you started.

import os
import pandas as pd

folder_path = ... # Enter your absolute/relative folder path here
total_sum = 0
file_count = 0

# Loop through all files in the directory
for file_name in os.listdir(folder_path):
    # Skip anything in the folder that is not a CSV file
    if not file_name.endswith('.csv'):
        continue

    # Read the file into a pandas DataFrame
    df = pd.read_csv(os.path.join(folder_path, file_name))
    
    # NOTE: you should add an error check for column_name in the df
    column_name = ... # Enter the column you want to sum over
    file_sum = df[column_name].sum()

    total_sum += file_sum
    file_count += 1

    print(f"Sum for {file_name}: {file_sum}")

if file_count == 0:
    average = 0.
else:
    average = total_sum / file_count
print(f"Average of all accuracy: {average}")
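
If the files have no header row (the question assigns the names A, B, C itself, which suggests this may be the case), the column can be picked by position instead of by name. A minimal sketch under that assumption; the function name `average_column_sum` and the `col_index` parameter are made up for illustration:

```python
import os
import pandas as pd


def average_column_sum(folder_path, col_index=0):
    """Sum one column (chosen by position) in every CSV file of a folder.

    Returns a dict of per-file sums and the average of those sums.
    Assumes the files have no header row.
    """
    sums = {}
    for file_name in sorted(os.listdir(folder_path)):
        if not file_name.endswith(".csv"):
            continue  # ignore non-CSV files
        # header=None treats the first row as data;
        # usecols loads only the requested column.
        df = pd.read_csv(
            os.path.join(folder_path, file_name),
            header=None,
            usecols=[col_index],
        )
        # Only one column was loaded, so it sits at position 0.
        sums[file_name] = df.iloc[:, 0].sum()

    # Avoid dividing by zero when the folder holds no CSV files.
    average = sum(sums.values()) / len(sums) if sums else 0.0
    return sums, average
```

Calling `average_column_sum(folder_path, 0)` would then return the per-file sums of the first column together with their average. If the files do have a header row, drop `header=None`.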

dddzy1tm2#

I would suggest using the pandas library. It reads .csv files more "elegantly" and makes them easier to manipulate.
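
As a rough illustration of what that could look like, here is a sketch that assumes the files have no header row and three columns, named A, B, C as in the question. The helper name `column_sums` is hypothetical:

```python
from pathlib import Path

import pandas as pd


def column_sums(folder_path, column="A", names=("A", "B", "C")):
    """Return a Series of per-file sums of `column`, indexed by file name.

    Assumes every CSV has no header row and exactly len(names) columns.
    """
    sums = {
        path.name: pd.read_csv(path, header=None, names=list(names))[column].sum()
        for path in sorted(Path(folder_path).glob("*.csv"))
    }
    # A float dtype keeps the result well-defined even when no files match.
    return pd.Series(sums, dtype="float64")
```

With the sums in a Series, `column_sums(folder_path).mean()` gives "sum over all files / number of files". For a folder without any .csv files it yields NaN rather than raising an error.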
