Pandas concat vs. append, and combining columns ("state", "state:", "State")

iezvtpos posted on 2023-02-06 in Other

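I concatenated 437 tables and ended up with 3 state columns, because my colleagues give the column a different name every day ("state", "state:" and "State"). Is there a way to combine these 3 columns into a single column named "state"?
Also, my code uses append, which I just saw is deprecated. Will it work the same with concat? Is there any way to make concat give the same results as append?
I tried: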

excl_merged.rename(columns={"state:": "state", "State": "state"})

but it does nothing.
The code I use:

# importing the required modules
import glob
import pandas as pd
 
# specifying the path to the Excel files
path = "X:/.../Admission_merge"
 
# Excel files in the path
file_list = glob.glob(path + "/*.xlsx")
 
# list of excel files we want to merge.
# pd.read_excel(file_path) reads the excel
# data into pandas dataframe.
excl_list = []
 
for file in file_list:
    excl_list.append(pd.read_excel(file)) #use .concat will it give the columns in the same order? 
 
# create a new dataframe to store the
# merged excel file.
excl_merged = pd.DataFrame()
 
for excl_file in excl_list:
     
    # appends the data into the excl_merged
    # dataframe.
    excl_merged = excl_merged.append(
      excl_file, ignore_index=True)

# exports the dataframe into excel file with
# specified name.
excl_merged.to_excel('X:/.../Admission_MERGED/total_admission_2021-2023.xlsx', index=False)
print("Merge finished")

Any suggestions on how I can improve it? Also, is there a way to remove the unnamed empty columns?
Thanks a lot.

p5cysglq #1

You can use pd.concat:

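import pandas as pd

excl_list = ['state1.xlsx', 'state2.xlsx', 'state3.xlsx']
# Map every header variant to the canonical name
state_map = {'state:': 'state', 'State': 'state'}

data = []
for excl_file in excl_list:
    df = pd.read_excel(excl_file)

    # Case where first row is empty: the real header sits in the first data row
    # (str() guards against a non-string first column label)
    if str(df.columns[0]).startswith('Unnamed'):
        df.columns = df.iloc[0]
        df = df.iloc[1:]

    # rename returns a new DataFrame, so assign it back
    df = df.rename(columns=state_map)
    data.append(df)
# Concatenate once, after the loop
excl_merged = pd.concat(data, ignore_index=True)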

# Output
  ID state
0  A     a
1  B     b
2  C     c
3  D     d
4  E     e
5  F     f
6  G     g
7  H     h
8  I     i
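
On the append question: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, and pd.concat on a list of frames is the usual replacement. As far as I know, with the default settings (join='outer', sort=False) the columns come out in order of first appearance, which matches what the append loop produced. A minimal sketch with two made-up frames (the names first, second and extra are only for illustration):

```python
import pandas as pd

# Two hypothetical frames with partly different columns
first = pd.DataFrame({"ID": ["A", "B"], "state": ["a", "b"]})
second = pd.DataFrame({"ID": ["C"], "extra": [1]})

# One concat call on the whole list replaces the repeated
# excl_merged.append(...) calls inside the loop
merged = pd.concat([first, second], ignore_index=True)

# Columns keep their order of first appearance; cells that a frame
# does not provide are filled with NaN
print(list(merged.columns))
print(merged)
```

Calling concat once on the full list also avoids copying the growing frame on every iteration, which matters with 437 files.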

state1.xlsx:

  ID State
0  A     a
1  B     b
2  C     c

state2.xlsx:

  ID state
0  D     d
1  E     e
2  F     f

state3.xlsx:

  ID state:
0  G      g
1  H      h
2  I      i
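
Two notes on the rename attempt from the question. First, rename returns a new DataFrame, so without assigning the result back the call appears to do nothing. Second, renaming after the merge would only produce several columns with the same label rather than one combined column. If the frame has already been merged, the variants can be coalesced instead. The following is a minimal sketch, assuming each row has a value in at most one of the three columns; the sample data is made up:

```python
import pandas as pd

# Hypothetical already-merged frame containing all three header variants
excl_merged = pd.DataFrame({
    "ID": ["A", "B", "C"],
    "State": ["a", None, None],
    "state": [None, "b", None],
    "state:": [None, None, "c"],
})

variants = ["state", "State", "state:"]
present = [c for c in variants if c in excl_merged.columns]

# Back-fill across the columns, then keep the first one: for each row
# this yields the first non-missing value among the variants
combined = excl_merged[present].bfill(axis=1).iloc[:, 0]

# drop() also returns a new frame, so assign the result back
excl_merged = excl_merged.drop(columns=present)
excl_merged["state"] = combined

print(excl_merged)
```

Renaming each file before concatenating, as in the loop above, remains the simpler option when the files are read from scratch.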

If there are empty columns, drop them as you append to the data list: use data.append(df.dropna(how='all', axis=1)) instead of data.append(df).
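
To illustrate what how='all' does, here is a small sketch on a made-up frame. The column name "Unnamed: 2" imitates the label read_excel gives to a header-less column; the data itself is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["A", "B"],
    "Unnamed: 2": [None, None],  # completely empty column
    "state": ["a", None],        # partly filled column
})

# axis=1 works on columns; how='all' drops a column only when every
# value in it is missing, so the partly filled "state" column is kept
cleaned = df.dropna(how="all", axis=1)

print(list(cleaned.columns))
```

Note that a column which is empty in one file but filled in another will come back after pd.concat, so a final excl_merged.dropna(how='all', axis=1) on the merged frame may also be worth trying.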
