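I have an existing Python script that currently ingests one Excel file. I want to adapt the script so that it ingests four Excel files and performs the same operations as the original script, but at the end creates four separate data frames, each named so that it bears some resemblance to its original input file.
So far it is not working. I get an error in the part that deals with the file names: file_names = ["file1.xlsx", "file2.xlsx", "file3.xlsx", "file4.xlsx"].
I'm sure it isn't right, but I don't know what to change.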
import numpy as np
import pandas as pd
import os
import re
from itertools import product
import glob
import xlwings as xw
import openpyxl as op
data1 = {'Column A': [3, 17, 12, 142],
         'Column B': [20, 21, 19, 18],
         'Column C': ['Year1', 'Year1', 'Year1', 'Year1']}
data2 = {'Column A': [192, 14, 11, 984],
         'Column B': [200, 221, 119, 158],
         'Column C': ['Year2', 'Year2', 'Year2', 'Year2']}
data3 = {'Column A': [2, 99, 41, 67],
         'Column B': [20, 25, 190, 187],
         'Column C': ['Year3', 'Year3', 'Year3', 'Year3']}
data4 = {'Column 4A': [25, 919, 441, 672],
         'Column 4B': [21, 45, 100, 107],
         'Column 4C': ['Year4', 'Year4', 'Year4', 'Year4']}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df3 = pd.DataFrame(data3)
df4 = pd.DataFrame(data4)
df4.rename(columns={'Column 4A': 'Column A', 'Column 4B': 'Column B','Column 4C':'Column C'}, inplace=True)
file_names = [df1, df2, df3, df4]
def ingest_excel_files(file_names):
    """Ingests four Excel files and returns a dictionary of data frames."""
    data_frames = {}
    for file_name in file_names:
        data_frames[file_name] = pd.read_excel(file_name)
    return data_frames

def perform_operation(data_frames):
    for file_name, data_frame in data_frames.items():
        numbers = data_frame
        numbers_copy_new = numbers.copy()
        numbers["New Column"] = numbers["Column A"] + numbers["Column B"]

def create_dataframes(data_frames):
    """Creates data frames for each of the original files."""
    for file_name, data_frame in data_frames.items():
        df = pd.DataFrame(data_frame)
        df.to_excel(f"{file_name}.xlsx")

if __name__ == "__main__":
    file_names = ["file1.xlsx", "file2.xlsx", "file3.xlsx", "file4.xlsx"]
    data_frames = ingest_excel_files(file_names)
    perform_operation(data_frames)
    create_dataframes(data_frames)
I'm expecting/looking for four new DataFrames, each carrying the name of the original DataFrame plus "_new" (I know my script doesn't have any reference to this yet):
df1["New Column"] = df1["Column A"] + df1["Column B"]
print(df1)
df1_new = df1
df2["New Column"] = df2["Column A"] + df2["Column B"]
print(df2)
df2_new = df2
df3["New Column"] = df3["Column A"] + df3["Column B"]
print(df3)
df3_new = df3
df4["New Column"] = df4["Column A"] + df4["Column B"]
print(df4)
df4_new = df4
1 Answer
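Explanation
The code is changed to use a separate perform_operation function. This function takes a single data frame, adds the new column, and returns the modified data frame.
create_dataframes iterates over the original data frames, applies the operation to each one, and stores the new data frames in a keyed dictionary.
data_frames, which holds the original data frames under appropriate keys, is passed to the create_dataframes function, which returns a dictionary containing the new data frames.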
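The approach described above could look roughly like the following. This is a minimal sketch rather than the answer's original code: the "_new" key suffix follows the naming the question asks for, while ingest_excel_files and its use of the file stem as the dictionary key are assumptions added for illustration. The demo uses in-memory data frames so that it runs without any Excel files on disk.

```python
from pathlib import Path

import pandas as pd


def perform_operation(data_frame):
    """Return a copy of one data frame with "New Column" = Column A + Column B."""
    result = data_frame.copy()  # leave the caller's frame untouched
    result["New Column"] = result["Column A"] + result["Column B"]
    return result


def create_dataframes(data_frames):
    """Apply perform_operation to every frame and key each result as "<name>_new"."""
    # The "_new" suffix is an assumption taken from the naming in the question.
    return {f"{name}_new": perform_operation(df) for name, df in data_frames.items()}


def ingest_excel_files(file_names):
    """Read each Excel file into a data frame keyed by its file stem.

    Hypothetical helper: "file1.xlsx" becomes the key "file1".
    Reading .xlsx files requires an Excel engine such as openpyxl.
    """
    return {Path(name).stem: pd.read_excel(name) for name in file_names}


# In-memory stand-ins for the input files (values taken from the question).
data_frames = {
    "df1": pd.DataFrame({"Column A": [3, 17, 12, 142], "Column B": [20, 21, 19, 18]}),
    "df2": pd.DataFrame({"Column A": [192, 14, 11, 984], "Column B": [200, 221, 119, 158]}),
}

new_frames = create_dataframes(data_frames)
print(new_frames["df1_new"])
```

With real files, data_frames = ingest_excel_files(file_names) would replace the in-memory dictionary, and each result could then be pulled out by key, for example new_frames["file1_new"]. Keeping the frames in a dictionary avoids having to create variable names such as df1_new dynamically.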
Assumptions: