我尝试将14个Excel文件中的数据合并到一个数据框中,并将数据框保存为CSV文件。我正在遍历Excel文件,但没有任何数据被合并到单个数据框中。我认为问题出在代码动态查找每个Excel文件中的最后一行。所有我想要合并的数据都在CB:DL列中,从第6行开始,向下大约100k行,所有Excel文件都在不同的行号结束。
下面是我正在测试的代码。
#import modules
import pandas as pd
import glob
from openpyxl import Workbook
from openpyxl import load_workbook as xw
from openpyxl.utils import get_column_letter
# path of the folder
path = r'C:\\All Raw Data\\'
# reading all the excel files
filenames = glob.glob(path + "\\*.xlsx")
# to iterate excel file one by one
# inside the folder
for file in filenames:
print(file)
#print('File names:', filenames)
# initializing empty data frame
finalexcelsheet = pd.DataFrame()
wb = Workbook(file)
print(wb)
for sheet in wb:
ws = wb.sheet["Speech"]
print(ws)
for col in range(1, ws.max_column + 1):
col_letter = get_column_letter(col)
max_col_row = len([cell for cell in ws[col_letter] if cell.value])
print("Column: {}, Row numbers: {}".format(col_letter, max_col_row))
# combining multiple excel worksheets into single data frames
df = pd.concat(pd.read_excel(file, sheet_name=None, header=6, usecols='CB'+max_col_row+':DL'+max_col_row), ignore_index=True, sort=False)
print(df.shape)
# appending excel files one by one
merged= finalexcelsheet.append(df, ignore_index=True)
# to print the combined data
print(merged.shape)
merged.to_csv('C:\\All Raw Data\\merged.csv')
2条答案
按热度按时间n3schb8v1#
我对你的代码做了一点修改,但我想你可以试试这个:
emeijp432#
下面是最终为我工作的代码。感谢巨大的帮助!!没有你的指导和支持,arngrim 280!!