pandas Python - How to remove duplicate columns and add them as rows in CSV format

mwyxok5s  posted on 2023-02-02  in Python

I have the following CSV:

Name     Date     Qty   Date     Qty   Date     Qty
---------------------------------------------------
ABC       Jan 2023   10    Feb 2023    11    Mar 2023    12
XYZ       Jan 2023   20    Feb 2023    21    Mar 2023    22

I want the output, as a CSV or DataFrame, to look like this:

Name     Date     Qty
---------------------
ABC       Jan 2023   10
ABC       Feb 2023   11
ABC       Mar 2023   12
XYZ       Jan 2023   20
XYZ       Feb 2023   21
XYZ       Mar 2023   22

How can I achieve this result?

x4shl7ld1#

It's a bit convoluted, but it does the job. You can run it one step at a time to see each transformation:

>>> (df.melt('Name').assign(row=lambda x: x.groupby('variable').cumcount())
       .pivot(index=['row', 'Name'], columns='variable', values='value')
       .reset_index('Name').rename_axis(index=None, columns=None))

  Name      Date Qty
0  ABC  Jan 2023  10
1  XYZ  Jan 2023  20
2  ABC  Feb 2023  11
3  XYZ  Feb 2023  21
4  ABC  Mar 2023  12
5  XYZ  Mar 2023  22
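
To follow the "step by step" suggestion, here is a minimal sketch that splits the chain above into named intermediate frames. The names `step1` to `step3` and `result` are only illustrative. It also assumes the duplicate column labels are set by hand, since `pd.read_csv` would otherwise rename them to `Date.1`, `Qty.1`, and so on.

```python
import pandas as pd

# Rebuild the question's table. The duplicate labels are assigned manually
# (assumption: the frame really has repeated 'Date'/'Qty' column names).
df = pd.DataFrame(
    [["ABC", "Jan 2023", 10, "Feb 2023", 11, "Mar 2023", 12],
     ["XYZ", "Jan 2023", 20, "Feb 2023", 21, "Mar 2023", 22]]
)
df.columns = ["Name", "Date", "Qty", "Date", "Qty", "Date", "Qty"]

# Step 1: wide to long. Every Date/Qty cell becomes its own row,
# labelled by its original column name in 'variable'.
step1 = df.melt("Name")

# Step 2: number the rows within each variable, so the k-th Date row
# and the k-th Qty row share the same 'row' value.
step2 = step1.assign(row=lambda x: x.groupby("variable").cumcount())

# Step 3: pivot back, turning 'Date' and 'Qty' into columns again.
step3 = step2.pivot(index=["row", "Name"], columns="variable", values="value")

# Step 4: move 'Name' back to a column and drop the leftover axis names.
result = step3.reset_index("Name").rename_axis(index=None, columns=None)
print(result)
```

Printing `step1`, `step2`, and `step3` in turn shows how the table changes at each stage.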

tct7dpnv2#

A less streamlined take on @Corralian's solution. It also uses melt and pivot.

import pandas as pd
import io

#-----------------------------------------------#
#Recreate OP's table with duplicate column names#
#-----------------------------------------------#
df = pd.read_csv(io.StringIO("""
ABC       Jan-2023   10    Feb-2023    11    Mar-2023    12
XYZ       Jan-2023   20    Feb-2023    21    Mar-2023    22
"""), header=None, sep=r'\s+')  # sep=r'\s+' replaces the deprecated delim_whitespace=True

df.columns = ['Name','Date','Qty','Date','Qty','Date','Qty']

#-----------------#
#Start of solution#
#-----------------#
#melt from wide to long (maintains order)
melted_df = df.melt(
    id_vars='Name',
    var_name='col',
    value_name='val',
)

#add a number for Date1/Date2/Date3 to keep track of Qty1/Qty2/Qty3 etc
melted_df['col_number'] = melted_df.groupby(['Name','col']).cumcount()

#pivot back to wide form
wide_df = melted_df.pivot(
    index=['Name','col_number'],
    columns='col',
    values='val',
).reset_index().drop(columns=['col_number'])

wide_df.columns.name = None #remove column index name

#Final output
print(wide_df)

Output:

  Name      Date Qty
0  ABC  Jan-2023  10
1  ABC  Feb-2023  11
2  ABC  Mar-2023  12
3  XYZ  Jan-2023  20
4  XYZ  Feb-2023  21
5  XYZ  Mar-2023  22
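
Both answers assume the frame already has repeated `Date`/`Qty` labels. When the file is loaded directly with `pd.read_csv`, the repeats are renamed (`Date.1`, `Qty.1`, ...), so a position-based variant may be easier. The sketch below is not taken from either answer: it swaps melt/pivot for `iloc` slicing plus `pd.concat`. It assumes a comma-separated file with `Name` in the first column, followed by (Date, Qty) pairs; `csv_text`, `pairs`, and `long_df` are illustrative names.

```python
import io
import pandas as pd

# Hypothetical comma-separated version of the question's data.
csv_text = """Name,Date,Qty,Date,Qty,Date,Qty
ABC,Jan 2023,10,Feb 2023,11,Mar 2023,12
XYZ,Jan 2023,20,Feb 2023,21,Mar 2023,22
"""

# read_csv renames the repeated headers, so select columns by position.
df = pd.read_csv(io.StringIO(csv_text))

# Slice out each (Date, Qty) pair together with Name and relabel it.
pairs = []
for start in range(1, df.shape[1], 2):
    block = df.iloc[:, [0, start, start + 1]].copy()
    block.columns = ["Name", "Date", "Qty"]
    pairs.append(block)

# Stack the blocks; a stable sort on the original row index groups each
# Name's rows together while keeping the Jan, Feb, Mar order.
long_df = (
    pd.concat(pairs)
      .sort_index(kind="mergesort")
      .reset_index(drop=True)
)

# Serialize back to CSV text (pass a file path instead to write a file).
csv_out = long_df.to_csv(index=False)
print(csv_out)
```

Unlike the pivot-based versions, this keeps `Qty` as an integer column, because the values are never mixed with the date strings in a single column.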
