pandas: if a list in DF2 is contained in a list in DF1, populate a new column in DF1; a DF1 row may match several DF2 rows, so the result needs to be a list

whhtz7ly, posted 2023-04-18 in Other
DF1:

NAME   LIST_1
BILL   ['ZF1', 'ZF2', 'ZF3', 'ZF9', 'ZF11']
PAUL   ['ZF1', 'ZF4', 'ZF5', 'ZF2', 'ZF3']
JOHN   ['ZF1', 'ZF2', 'ZF5', 'ZF6']

DF2:

ID     LIST_2
ZB1    ['ZF1', 'ZF2', 'ZF3']
ZB2    ['ZF1', 'ZF4', 'ZF5']
ZB3    ['ZF2', 'ZF5', 'ZF6']

NEEDED RESULT:

DF1 (Can also be a new DF):

NAME   LIST_1                                 MATCH 
BILL   ['ZF1', 'ZF2', 'ZF3', 'ZF9', 'ZF11']   ['ZB1']
PAUL   ['ZF1', 'ZF4', 'ZF5', 'ZF2', 'ZF3']    ['ZB1', 'ZB2']
JOHN   ['ZF1', 'ZF2', 'ZF5', 'ZF6']           ['ZB3']

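I haven't really tried much yet, because comparing lists confuses me. I suspect I need to explode DF1 and DF2, then compare them and use a merge? Any help would be greatly appreciated.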

yduiuuwa #1

Try:

df1['MATCH'] = df1.apply(lambda x: [i for i, l in zip(df2.ID, df2.LIST_2) if all(v in x['LIST_1'] for v in l)] , axis=1)

print(df1)

Prints:

   NAME                      LIST_1       MATCH
0  BILL  [ZF1, ZF2, ZF3, ZF9, ZF11]       [ZB1]
1  PAUL   [ZF1, ZF4, ZF5, ZF2, ZF3]  [ZB1, ZB2]
2  JOHN        [ZF1, ZF2, ZF5, ZF6]       [ZB3]
Optional: if the values in the LIST_1/LIST_2 columns are strings rather than lists, convert them to lists first:
from ast import literal_eval

df1.LIST_1 = df1.LIST_1.apply(literal_eval)
df2.LIST_2 = df2.LIST_2.apply(literal_eval)
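
To make the snippets above reproducible, here is a minimal end-to-end sketch. It assumes the list columns arrive as strings (as they often do after reading a CSV); the sample frames are rebuilt from the question's data, and the variable names `row`, `id_` and `lst` are only illustrative.

```python
from ast import literal_eval

import pandas as pd

# Sample data from the question. The list columns are stored as strings here,
# which is an assumption about how the data was loaded.
df1 = pd.DataFrame({
    "NAME": ["BILL", "PAUL", "JOHN"],
    "LIST_1": [
        "['ZF1', 'ZF2', 'ZF3', 'ZF9', 'ZF11']",
        "['ZF1', 'ZF4', 'ZF5', 'ZF2', 'ZF3']",
        "['ZF1', 'ZF2', 'ZF5', 'ZF6']",
    ],
})
df2 = pd.DataFrame({
    "ID": ["ZB1", "ZB2", "ZB3"],
    "LIST_2": [
        "['ZF1', 'ZF2', 'ZF3']",
        "['ZF1', 'ZF4', 'ZF5']",
        "['ZF2', 'ZF5', 'ZF6']",
    ],
})

# Parse the string representations into real Python lists.
df1["LIST_1"] = df1["LIST_1"].apply(literal_eval)
df2["LIST_2"] = df2["LIST_2"].apply(literal_eval)

# For each row of df1, keep every ID whose LIST_2 is fully contained in LIST_1.
df1["MATCH"] = df1.apply(
    lambda row: [
        id_
        for id_, lst in zip(df2["ID"], df2["LIST_2"])
        if all(v in row["LIST_1"] for v in lst)
    ],
    axis=1,
)

print(df1)
```

If the columns already hold lists, skip the two `literal_eval` lines; calling `literal_eval` on a value that is already a list raises an error.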
ergxz8rk #2

Working with lists stored in a pandas Series is always tricky.
The most reasonable approach is probably to use set operations:

lst = list(zip(df2['LIST_2'].apply(frozenset), df2['ID']))

df1['MATCH'] = [[ID for s, ID in lst if s.issubset(l)]
                for l in df1['LIST_1']]

Output:

   NAME                      LIST_1       MATCH
0  BILL  [ZF1, ZF2, ZF3, ZF9, ZF11]       [ZB1]
1  PAUL   [ZF1, ZF4, ZF5, ZF2, ZF3]  [ZB1, ZB2]
2  JOHN        [ZF1, ZF2, ZF5, ZF6]       [ZB3]
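
For comparison, the explode-and-merge idea raised in the question can also be made to work. The following is only a sketch, not either answerer's method: it assumes the list columns already hold Python lists and that `NAME` and `ID` are unique, and the helper names `left`, `right`, `needed`, `shared`, `full`, `ITEM` and `N` are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "NAME": ["BILL", "PAUL", "JOHN"],
    "LIST_1": [
        ["ZF1", "ZF2", "ZF3", "ZF9", "ZF11"],
        ["ZF1", "ZF4", "ZF5", "ZF2", "ZF3"],
        ["ZF1", "ZF2", "ZF5", "ZF6"],
    ],
})
df2 = pd.DataFrame({
    "ID": ["ZB1", "ZB2", "ZB3"],
    "LIST_2": [
        ["ZF1", "ZF2", "ZF3"],
        ["ZF1", "ZF4", "ZF5"],
        ["ZF2", "ZF5", "ZF6"],
    ],
})

# One row per (NAME, item) and per (ID, item); repeated items in a list are dropped.
left = (df1[["NAME", "LIST_1"]].explode("LIST_1")
        .drop_duplicates().rename(columns={"LIST_1": "ITEM"}))
right = (df2[["ID", "LIST_2"]].explode("LIST_2")
         .drop_duplicates().rename(columns={"LIST_2": "ITEM"}))

# Number of distinct items each ID requires.
needed = right.groupby("ID")["ITEM"].size()

# Inner merge on the item, then count the shared items per (NAME, ID) pair.
shared = (left.merge(right, on="ITEM")
          .groupby(["NAME", "ID"]).size().reset_index(name="N"))

# A pair is a match only when every item required by the ID was found.
full = shared[shared["N"] == shared["ID"].map(needed)]

# Collect the matching IDs per NAME and map them back onto df1.
df1["MATCH"] = df1["NAME"].map(full.groupby("NAME")["ID"].agg(list))

print(df1)
```

In this sketch a name with no matching ID ends up with NaN rather than an empty list, so the set-based list comprehension above is likely the simpler choice unless the frames are large enough for the merge to pay off.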
