How do we match two pandas DataFrames by mapping on two columns (.map)?

Posted by yks3o0rb on 2023-04-28 in Other


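How do we match two pandas DataFrames by mapping on two columns (.map)? I tried mapping on a single column, but that produces duplicates. To fix this, the mapping should use two columns so that each row is uniquely identified.
Here is an example with the expected output: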

df1

idx a1 b1   c1 d1 e1
12  4  x10  2  2  5
13  5  x2   3  5  4
14  6  x4   6  9  6
15  7  x13  7  9  2

df2

idx a2  b2 c2 d2 e2
1   x2  x1 2  2  4
2   x8  x2 6  9  8
3   x6  x4 7  9  5
4   x4  x7 6  8  9

By mapping c1 to c2 and d1 to d2, the values of e2 should be updated with the values of e1, so df2 should look like this:

df2

idx a2  b2 c2 d2 e2
1   x2  x1 2  2  5 (updated)
2   x8  x2 6  9  6 (updated)
3   x6  x4 7  9  2 (updated) 
4   x4  x7 6  8  9

pandas

Source: https://stackoverflow.com/questions/76090665/how-do-we-match-between-two-pandas-dataframes-by-mapping-two-columns-map

2 Answers

yhuiod9q1#

You can use merge/combine_first:

lcols, rcols = ["c2", "d2"], ["c1", "d1"]  # key columns to match: c2 <-> c1, d2 <-> d1

df2["e2"] = (df2.merge(df1, left_on=lcols, right_on=rcols, how="left")
                 ["e1"].combine_first(df2["e2"])
            )

Output:

print(df2)

   idx  a2  b2  c2  d2   e2
0    1  x2  x1   2   2  5.0 # <- updated
1    2  x8  x2   6   9  6.0 # <- updated
2    3  x6  x4   7   9  2.0 # <- updated
3    4  x4  x7   6   8  9.0
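
For reference, here is a self-contained sketch of the same merge/combine_first approach, with the two frames rebuilt from the tables in the question. The final `astype("int64")` is an addition not in the answer: the left join introduces NaN for the unmatched row, which is why e2 prints as float above, and the cast assumes every NaN has been filled.

```python
import pandas as pd

# Frames rebuilt from the tables in the question
df1 = pd.DataFrame({
    "idx": [12, 13, 14, 15],
    "a1": [4, 5, 6, 7],
    "b1": ["x10", "x2", "x4", "x13"],
    "c1": [2, 3, 6, 7],
    "d1": [2, 5, 9, 9],
    "e1": [5, 4, 6, 2],
})
df2 = pd.DataFrame({
    "idx": [1, 2, 3, 4],
    "a2": ["x2", "x8", "x6", "x4"],
    "b2": ["x1", "x2", "x4", "x7"],
    "c2": [2, 6, 7, 6],
    "d2": [2, 9, 9, 8],
    "e2": [4, 8, 5, 9],
})

# Left join keeps every df2 row; rows without a (c1, d1) match get NaN in e1
merged = df2.merge(df1, left_on=["c2", "d2"], right_on=["c1", "d1"], how="left")

# Take e1 where a match exists, otherwise keep the original e2.
# This relies on df2 having the default RangeIndex, because merge resets the index.
df2["e2"] = merged["e1"].combine_first(df2["e2"]).astype("int64")

print(df2)
```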

Posted 2023-04-28

z9zf31ra2#

Use a left join of the two DataFrames, then use Series.fillna to replace the unmatched values with the original column:

df2['e2'] = (df2.reset_index()
                .merge(df1, left_on=['c2','d2'], right_on=['c1','d1'], how='left')
                .set_index('index')['e1'].fillna(df2['e2']))

If df2 has the default index (a RangeIndex), this can be simplified:

df2['e2'] = (df2.merge(df1, left_on=['c2','d2'], right_on=['c1','d1'], how='left')['e1']
                .fillna(df2['e2']))
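
Both merge variants assume that each (c1, d1) pair occurs at most once in df1. If a pair repeats, the left join returns extra rows and the result no longer lines up with df2. The sketch below uses small hypothetical frames (not the data from the question) to show one way to guard against that with the `validate` argument of `merge`. Keeping the last duplicate is an arbitrary choice made for illustration.

```python
import pandas as pd

# Hypothetical data: the pair (6, 9) appears twice in df1
df1 = pd.DataFrame({"c1": [2, 6, 6], "d1": [2, 9, 9], "e1": [5, 6, 7]})
df2 = pd.DataFrame({"c2": [2, 6, 6], "d2": [2, 9, 8], "e2": [4, 8, 9]})

keys = dict(left_on=["c2", "d2"], right_on=["c1", "d1"], how="left")

# validate="many_to_one" makes merge raise if the right-hand keys are not unique
try:
    df2.merge(df1, validate="many_to_one", **keys)
    duplicate_keys = False
except pd.errors.MergeError:
    duplicate_keys = True

# One possible resolution: keep only the last row for each (c1, d1) pair
deduped = df1.drop_duplicates(subset=["c1", "d1"], keep="last")
merged = df2.merge(deduped, validate="many_to_one", **keys)
df2["e2"] = merged["e1"].fillna(df2["e2"]).astype("int64")

print(duplicate_keys)
print(df2)
```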

Or use Index.map with a MultiIndex:

d = df1.set_index(['c1','d1'])['e1'].to_dict()

df2['e2'] = (pd.Series(df2.set_index(['c2','d2']).index.map(d).to_numpy(), index=df2.index)
               .fillna(df2['e2']))
print(df2)
   idx  a2  b2  c2  d2   e2
0    1  x2  x1   2   2  5.0
1    2  x8  x2   6   9  6.0
2    3  x6  x4   7   9  2.0
3    4  x4  x7   6   8  9.0
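
A self-contained sketch of the Index.map variant follows, using only the key and value columns from the question. The custom index on df2 (10, 20, 30, 40) is an assumption added here to illustrate that this variant does not depend on df2 having the default index. The final integer cast is also an addition.

```python
import pandas as pd

# Key and value columns taken from the tables in the question
df1 = pd.DataFrame({"c1": [2, 3, 6, 7], "d1": [2, 5, 9, 9], "e1": [5, 4, 6, 2]})
df2 = pd.DataFrame(
    {"c2": [2, 6, 7, 6], "d2": [2, 9, 9, 8], "e2": [4, 8, 5, 9]},
    index=[10, 20, 30, 40],  # assumed non-default index, for illustration only
)

# Dictionary keyed by (c1, d1) tuples, e.g. {(2, 2): 5, ...}
lookup = df1.set_index(["c1", "d1"])["e1"].to_dict()

# Build (c2, d2) tuples as a MultiIndex and map them through the dictionary;
# pairs that are missing from the dictionary become NaN
pairs = df2.set_index(["c2", "d2"]).index
mapped = pd.Series(pairs.map(lookup).to_numpy(), index=df2.index)

# Fall back to the original e2 where no pair matched
df2["e2"] = mapped.fillna(df2["e2"]).astype("int64")

print(df2)
```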

Posted 2023-04-28
