numpy: How do I iterate over the lists in a pandas column to find matches?

ftf50wuq posted on 2023-04-12 in Other

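I have a column containing lists of terms, and I want to check whether any term in each list matches one of a set of specific words.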
| meta|
| --------------|
| ['Home',' grocery','cake']|
| ['Home',' grocery','Biscuit','Oreo']|
I am looking for matches against this list: term_list = ['cake','biscuit']
Expected output:
| meta|Column B|
| --------------|--------------|
| ['Home',' grocery','cake']|True|
| ['Home',' grocery','Biscuit','Oreo']|True|

e3bfsja2 #1

You can use a set intersection:

terms = {'cake', 'biscuit'}

df['Column B'] = [bool(set(x)&terms) for x in df['meta']]

If case should not matter (e.g. 'Biscuit'/'biscuit'), lowercase the strings with str.lower (or str.casefold) before comparing:

df['Column B'] = [bool(set(map(str.lower, x))&terms) for x in df['meta']]

Output:

                             meta  Column B
0           [Home, grocery, cake]      True
1  [Home, grocery, Biscuit, Oreo]      True
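
For anyone who wants to reproduce this, here is a minimal self-contained sketch of both variants above. The DataFrame is rebuilt from the question's sample data, and the column name `exact` is a hypothetical name added only to show the case-sensitive result side by side.

```python
import pandas as pd

# Sample data copied from the question (note the leading space in ' grocery').
df = pd.DataFrame({
    'meta': [
        ['Home', ' grocery', 'cake'],
        ['Home', ' grocery', 'Biscuit', 'Oreo'],
    ]
})

terms = {'cake', 'biscuit'}

# Case-sensitive: 'Biscuit' is not in terms, so the second row is False.
# 'exact' is a hypothetical column name used only for this comparison.
df['exact'] = [bool(set(x) & terms) for x in df['meta']]

# Case-insensitive: lowercase every item before intersecting with terms.
df['Column B'] = [bool(set(map(str.lower, x)) & terms) for x in df['meta']]

print(df[['exact', 'Column B']])
```

The intersection `set(x) & terms` is an empty set when nothing matches, and `bool()` of an empty set is `False`, which is what turns the result into a boolean flag.
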
mftmpeh8 #2

Convert the term list to a set and use set.isdisjoint in a list comprehension, converting the values of each list to lowercase:

terms = ['cake', 'biscuit']
S = set(terms)

df['Column B'] = [not set(y.lower() for y in x).isdisjoint(S) for x in df['meta']]
# Equivalent alternative using map:
df['Column B'] = [not set(map(str.lower, x)).isdisjoint(S) for x in df['meta']]
print(df)
                             meta  Column B
0           [Home, grocery, cake]      True
1  [Home, grocery, Biscuit, Oreo]      True

Without the lowercase conversion, the second row is False because 'Biscuit' does not match 'biscuit':

df['Column B'] = [not set(x).isdisjoint(S) for x in df['meta']]
print(df)
                             meta  Column B
0           [Home, grocery, cake]      True
1  [Home, grocery, Biscuit, Oreo]     False
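
The isdisjoint approach can also be wrapped in a small helper, sketched below under a few assumptions. The function name `has_term`, the whitespace stripping, the use of str.casefold instead of str.lower, and the third sample row are all additions for illustration and are not part of the original answer.

```python
import pandas as pd

def has_term(items, terms):
    """Return True if any item, stripped and casefolded, is in terms.

    `terms` is expected to be a set of already casefolded strings.
    """
    # set.isdisjoint accepts any iterable, so no intermediate set is needed.
    normalized = (s.strip().casefold() for s in items)
    return not terms.isdisjoint(normalized)

terms = {t.casefold() for t in ['cake', 'biscuit']}

df = pd.DataFrame({
    'meta': [
        ['Home', ' grocery', 'cake'],
        ['Home', ' grocery', 'Biscuit', 'Oreo'],
        ['Home', ' grocery'],  # hypothetical extra row with no match
    ]
})

df['Column B'] = [has_term(x, terms) for x in df['meta']]
print(df)
```

Stripping whitespace is optional here, but the sample data contains ' grocery' with a leading space, so a term such as 'grocery' would otherwise never match.
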
