python-3.x: Merging two DataFrames with a one-to-many relationship

Posted by 11dmarpk on 2023-04-22 in Python

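I want to merge two DataFrames as shown below, but I can't figure out which function to use. I tried breaking the problem down, but couldn't arrive at a correct solution: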

import pandas as pd

person_data = [{'Id': 3763058, 'Name': 'Andi', 'description': 'abc'},
               {'Id': 3763077, 'Name': 'Mark', 'description': 'xyz'}]

person_df1 = pd.DataFrame(person_data)
display(person_df1)

| Id | Name | description |
| --- | --- | --- |
| 3763058 | Andi | abc |
| 3763077 | Mark | xyz |

object_data = [{'Id': 3763058, 'object_name': 'PlayStation', 'object_count': 2},
               {'Id': 3763077, 'object_name': 'MathsBook',   'object_count': 1},
               {'Id': 3763058, 'object_name': 'MusicSystem', 'object_count': 3},
              ]

object_df2 = pd.DataFrame(object_data)
display(object_df2)

| Id | object_name | object_count |
| --- | --- | --- |
| 3763058 | PlayStation | 2 |
| 3763077 | MathsBook | 1 |
| 3763058 | MusicSystem | 3 |

Expected result DataFrame:

| Id | Name | description | PlayStation | MathsBook | MusicSystem |
| --- | --- | --- | --- | --- | --- |
| 3763058 | Andi | abc | 2 | 0 | 3 |
| 3763077 | Mark | xyz | 0 | 1 | 0 |

I tried breaking the problem into parts:

Part 1: get the unique values of object_name

# unique_object_name = object_df2['object_name'].unique().tolist()

new_cols = ['PlayStation', 'MathsBook', 'MusicSystem']  # hard-coded for now
new_vals = [0,0,0]

Part 2: create one column per unique object_name and initialize it to zero

person_df1 = person_df1.reindex(columns=person_df1.columns.tolist() + new_cols)
person_df1[new_cols] = new_vals
print(person_df1)

Part 3: group by Id and store the object_count values in the matching object_name columns. I'm stuck here: I can't work out which function to use to create columns from the other DataFrame and fill them with values from its columns.

person_df1['id'][object_name] = object_df2.groupby('id')['object_name'].apply(', '.join).reset_index()

56lgkhnf1#

First apply pivot_table, then merge:

out = person_df1.merge(object_df2.pivot_table(index='Id', columns='object_name',
                                              values='object_count', fill_value=0
                                             ).reset_index())

Or, more explicitly, specifying the merge keys and the join type:

out = person_df1.merge(object_df2.pivot_table(index='Id', columns='object_name',
                                              values='object_count', fill_value=0
                                             ),
                      left_on='Id', right_index=True, how='left'
                      )

Output:

        Id  Name description  MathsBook  MusicSystem  PlayStation
0  3763058  Andi         abc          0            3            2
1  3763077  Mark         xyz          1            0            0
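
To make the two steps easier to follow, here is a minimal self-contained sketch that rebuilds the frames from the question and keeps the pivoted table in an intermediate variable. The name `wide` is my own and does not appear in the original answer; everything else follows the snippets above.

```python
import pandas as pd

# Frames rebuilt from the question.
person_df1 = pd.DataFrame([
    {'Id': 3763058, 'Name': 'Andi', 'description': 'abc'},
    {'Id': 3763077, 'Name': 'Mark', 'description': 'xyz'},
])
object_df2 = pd.DataFrame([
    {'Id': 3763058, 'object_name': 'PlayStation', 'object_count': 2},
    {'Id': 3763077, 'object_name': 'MathsBook',   'object_count': 1},
    {'Id': 3763058, 'object_name': 'MusicSystem', 'object_count': 3},
])

# Step 1: reshape long -> wide, one row per Id and one column per object_name.
# Missing (Id, object_name) combinations are filled with 0.
# `wide` is an illustrative name, not part of the original answer.
wide = object_df2.pivot_table(index='Id', columns='object_name',
                              values='object_count', fill_value=0)

# Step 2: attach the wide columns to the person rows, matching the 'Id'
# column on the left against the index of `wide` on the right.
out = person_df1.merge(wide, left_on='Id', right_index=True, how='left')
print(out)
```

Note that pivot_table sorts the new columns alphabetically by default, which is why MathsBook comes first in the output above.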

If the order of the objects matters:

out = person_df1.merge(object_df2.pivot_table(index='Id', columns='object_name',
                                              values='object_count', fill_value=0
                                             )[object_df2['object_name'].unique()],
                      left_on='Id', right_index=True, how='left'
                      )

Output:

        Id  Name description  PlayStation  MathsBook  MusicSystem
0  3763058  Andi         abc            2          0            3
1  3763077  Mark         xyz            0          1            0
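
Two situations the sample data does not exercise may be worth checking on real data. First, the same (Id, object_name) pair may occur more than once: pivot_table averages duplicates by default, so `aggfunc='sum'` is probably what you want for counts. Second, a person may have no objects at all: the left merge then leaves NaN in the new columns. The sketch below uses hypothetical data (the person "Zoe" and the duplicated PlayStation row are invented for illustration) and is one possible way to handle both, not part of the original answer.

```python
import pandas as pd

# Hypothetical data, not from the question: Andi has two PlayStation rows
# and Zoe owns nothing.
person_df = pd.DataFrame({'Id': [1, 2, 3], 'Name': ['Andi', 'Mark', 'Zoe']})
object_df = pd.DataFrame({
    'Id': [1, 1, 2],
    'object_name': ['PlayStation', 'PlayStation', 'MathsBook'],
    'object_count': [2, 3, 1],
})

# aggfunc='sum' adds up duplicate (Id, object_name) pairs;
# the default aggfunc ('mean') would average them instead.
wide = object_df.pivot_table(index='Id', columns='object_name',
                             values='object_count', aggfunc='sum',
                             fill_value=0)

out = person_df.merge(wide, left_on='Id', right_index=True, how='left')

# Zoe has no row in `wide`, so the left join leaves NaN in the object
# columns; replace those with 0 and cast back to integers.
obj_cols = list(wide.columns)
out[obj_cols] = out[obj_cols].fillna(0).astype(int)
print(out)
```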
