Python: how to match more than 2 numbers in a row?

eyh26e7m  posted on 2022-12-21  in  Python

I have 2 DataFrames, and I want to find the rows of the second one in which more than 2 numbers match the values in the first.

import pandas as pd 

cols = ['Num1','Num2','Num3','Num4','Num5','Num6']
df1 = pd.DataFrame([[2,4,6,8,9,10]], columns=cols)

df2 = pd.DataFrame([[1,1,2,4,5,6,8],
               [2,5,6,20,22,23,34],
               [3,8,12,13,34,45,46],
               [4,9,10,14,29,32,33],
               [5,1,22,13,23,33,35],
               [6,1,6,7,8,9,10],
               [7,0,2,3,5,6,8]], 
               columns = ['Id','Num1','Num2','Num3','Num4','Num5','Num6'])

I have this code for matching, but I would like to improve it so that it returns the rows in which more than 2 numbers match.

# Convert the values in the first DataFrame to a list
vals_to_find = df1.iloc[0].tolist()

# Print the values to find
print("Vals to find:", vals_to_find)

# Create an empty list to hold the matching IDs
matching_ids = []

# Iterate through the big DataFrame
for index, row in df2.iterrows():
    rowlist = row.tolist()  # convert the row to a list

    # Keep the ID for later, and extract the other values for evaluation
    row_id = rowlist[0]
    vals = rowlist[1:]

    # Count how many values in this row also appear in vals_to_find
    counter = sum(elem in vals_to_find for elem in vals)

    # If the number of matches is greater than 2, then grab the ID
    if counter > 2:
        matching_ids.append({'ID': row_id})

# Print the matching IDs
print('Matching IDS:', matching_ids)

I would like my result to look like this.

df3 = pd.DataFrame([[6,1,6,7,8,9,10],
               [7,0,2,3,5,6,8]], 
               columns = ['Id', 'Num1','Num2','Num3','Num4','Num5','Num6'])

zzlelutf  #1

I hope I've understood your question correctly. You can construct a boolean mask (using set.intersection) and then use that mask on df2:

vals_to_find = set(df1.iloc[0])
mask = df2.loc[:, "Num1":].apply(
    lambda x: len(vals_to_find.intersection(x)) > 2, axis=1
)
print(df2[mask])

Prints:

   Id  Num1  Num2  Num3  Num4  Num5  Num6
0   1     1     2     4     5     6     8
5   6     1     6     7     8     9    10
6   7     0     2     3     5     6     8
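
As a possible alternative to the row-wise apply, the same filter can be sketched with DataFrame.isin, which avoids a Python-level lambda per row. This is only a sketch under the question's own data. The names match_counts and result are illustrative and not from the original post. Note that isin counts matching cells, so a value repeated within a row would be counted more than once. That mirrors the counter in the question's loop but can differ from set.intersection, which counts distinct values only.

```python
import pandas as pd

cols = ['Num1', 'Num2', 'Num3', 'Num4', 'Num5', 'Num6']
df1 = pd.DataFrame([[2, 4, 6, 8, 9, 10]], columns=cols)

df2 = pd.DataFrame([[1, 1, 2, 4, 5, 6, 8],
                    [2, 5, 6, 20, 22, 23, 34],
                    [3, 8, 12, 13, 34, 45, 46],
                    [4, 9, 10, 14, 29, 32, 33],
                    [5, 1, 22, 13, 23, 33, 35],
                    [6, 1, 6, 7, 8, 9, 10],
                    [7, 0, 2, 3, 5, 6, 8]],
                   columns=['Id'] + cols)

# Values to look for, taken from the single row of df1
vals_to_find = df1.iloc[0].tolist()

# For every row of df2, count how many Num cells hold one of those values
# (match_counts is an illustrative name, not from the original post)
match_counts = df2[cols].isin(vals_to_find).sum(axis=1)

# Keep only the rows with more than 2 matching cells
result = df2[match_counts > 2]
print(result)
```

Printing match_counts alongside df2 shows how many cells matched in each row, which makes it easy to check why a given Id was kept or dropped.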
