我有以下格式的 Dataframe (实际 Dataframe 包含10000多行)
Occupation Education
Engineer High School
Neurosurgeon Masters
Electrical Engineer Masters
Mechanical Engineer Masters
Software Engineer Masters
Engineer Masters
Business Executive Masters
Sales Executive Bachelors
Neurosurgeon Masters
Electrical Engineer
Accountant Bachelors
Sales Executive Masters
我想添加基于选择性筛选的列
我需要我的结果是这样的
Occupation Education Welfare_Cost
Engineer High School 50
Neurosurgeon Masters 50
Electrical Engineer Masters 100
Mechanical Engineer Masters 100
Software Engineer Masters 100
Engineer Masters 100
Business Executive Masters 100
Sales Executive Bachelors 50
Neurosurgeon Masters 50
Electrical Engineer 50
Accountant Bachelors 50
Sales Executive Masters 100
我只想处理职业包含列表中的字符串和Education is Masters的行,我尝试使用下面的代码来实现这一点,但一直得到错误。
lis=['Engineer','Executive','Teacher']
df['Welfare_Cost']=np.where(((df['Education']=='Masters')&
(df['Occupation'].str.contains(i for i in lis))),
100,50)
我知道我也可以通过运行迭代循环来为每一行创建一个列表,并将该列表添加为一列,但我有许多列表组合,因此我正在寻找一种不使用迭代循环就能完成此操作的方法。
2条答案
按热度按时间332nm8kg1#
将
join
与\b\b
一起用于单词边界,将|
用于正则表达式or
:或者:
可能存在差异,参见变更数据-
\b\b
匹配字符串,无子字符串:lymnna712#
在您的示例中,过滤器列表只包含职业名称的基本部分(语义上),因此检查
str.endswith
就足够了: