Apply an if/else function to two string columns in Python/pandas

ltskdhd1, posted on 2021-07-13 in Java

In my DataFrame, I have:

Name    Sex    Height
Jackie   F       Small
John     M       Tall

I used the following function to create a new column based on the combination of the two columns:

def genderfunc(x,y):
    if x =='Tall' & y=='M':
        return 'T Male'
    elif x =='Medium' & y=='M':
        return 'Male'
    elif x =='Small' & y=='M':
        return 'Male'
    elif x =='Tall' & y=='F':
        return 'T Female'
    elif x =='Medium' & y=='F':
        return 'Female'
    elif x =='Small' & y=='F':
        return 'Female'
    else:
        return y

The line of code that applies this function:

df['GenderDetails'] = df.apply(genderfunc(df['Height'],df['Sex']))

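I get the following error:
TypeError: cannot perform 'rand_' with a dtyped [object] array and scalar of type [bool]
Do you know what I'm doing wrong? This is my first attempt at using functions.
Thanks!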


b4lqfgs4 #1

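You're close. You need a lambda function with axis=1 so the function is called once per row, and because each call then receives scalar values, use and instead of &: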

def genderfunc(x,y):
    if x =='Tall' and y=='M':
        return 'T Male'
    elif x =='Medium' and y=='M':
        return 'Male'
    elif x =='Small' and y=='M':
        return 'Male'
    elif x =='Tall' and y=='F':
        return 'T Female'
    elif x =='Medium' and y=='F':
        return 'Female'
    elif x =='Small' and y=='F':
        return 'Female'
    else:
        return y

df['GenderDetails'] = df.apply(lambda x: genderfunc(x['Height'],x['Sex']), axis=1)
print (df)
     Name Sex Height GenderDetails
0  Jackie   F  Small        Female
1    John   M   Tall        T Male

A non-loop solution is also possible, using a helper DataFrame and a left join:


import pandas as pd

# list each (Height, Sex) combination and the label it should receive
L = [('Tall','M','T Male'), ('Small','F','Female')]
df1 = pd.DataFrame(L, columns=['Height','Sex','GenderDetails'])

df = df.merge(df1, how='left')
print (df)
     Name Sex Height GenderDetails
0  Jackie   F  Small        Female
1    John   M   Tall        T Male
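
The helper table above lists only the two combinations that occur in the sample data. As a minimal sketch, assuming you want all six rules from the question plus its final fallback (return the Sex value unchanged), the same left join could be extended as follows. The third row ("Sam", with the made-up height "Huge") and the names rules and result are hypothetical and exist only to exercise the fallback:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jackie", "John", "Sam"],
    "Sex": ["F", "M", "M"],
    "Height": ["Small", "Tall", "Huge"],  # "Huge" is invented: no rule matches it
})

# One row per (Height, Sex) rule taken from the question's function.
rules = pd.DataFrame(
    [
        ("Tall", "M", "T Male"),
        ("Medium", "M", "Male"),
        ("Small", "M", "Male"),
        ("Tall", "F", "T Female"),
        ("Medium", "F", "Female"),
        ("Small", "F", "Female"),
    ],
    columns=["Height", "Sex", "GenderDetails"],
)

# A left join keeps every row of df; rows without a matching rule get NaN.
result = df.merge(rules, on=["Height", "Sex"], how="left")

# Mirror the function's final `else: return y` by falling back to Sex.
result["GenderDetails"] = result["GenderDetails"].fillna(result["Sex"])
print(result)
```

Without the fillna step, any row whose combination is missing from the helper table would be left as NaN.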

xsuvu9jc #2

Here is another approach, using map:

map_ = {"TallM": "T Male", "SmallF": "Female"}

df['GenderDetails'] = (df['Height'] + df['Sex']).str.strip().map(map_)
print (df)
     Name Sex Height GenderDetails
0  Jackie   F  Small        Female
1    John   M   Tall        T Male
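
Note that map returns NaN for any key missing from the dictionary, and the keys here are built by string concatenation. A possible variation, sketched under the assumption that unmatched rows should fall back to the Sex value as in the question's function, keys the dictionary on (Height, Sex) tuples instead. The name lookup and the third sample row are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jackie", "John", "Sam"],
    "Sex": ["F", "M", "M"],
    "Height": ["Small", "Tall", "Huge"],  # "Huge" is invented: no rule matches it
})

# Tuple keys keep the two columns separate instead of gluing strings together.
lookup = {
    ("Tall", "M"): "T Male",
    ("Medium", "M"): "Male",
    ("Small", "M"): "Male",
    ("Tall", "F"): "T Female",
    ("Medium", "F"): "Female",
    ("Small", "F"): "Female",
}

# dict.get returns the second argument (the Sex value) when no rule matches.
df["GenderDetails"] = [
    lookup.get((height, sex), sex)
    for height, sex in zip(df["Height"], df["Sex"])
]
print(df)
```

This still loops in Python, so it is a readability choice rather than a performance one.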

edqdpe6u #3

If performance is a concern, you can also use np.select:

import numpy as np

# one boolean mask per (Height, Sex) combination, in the same order as choicelist
condlist = [(df['Height'] == 'Tall') & (df['Sex'] == 'M'),
            (df['Height'] == 'Medium') & (df['Sex'] == 'M'),
            (df['Height'] == 'Small') & (df['Sex'] == 'M'),
            (df['Height'] == 'Tall') & (df['Sex'] == 'F'),
            (df['Height'] == 'Medium') & (df['Sex'] == 'F'),
            (df['Height'] == 'Small') & (df['Sex'] == 'F')]
choicelist = [
    'T Male',
    'Male',
    'Male',
    'T Female',
    'Female',
    'Female'
]

# rows that match no condition fall back to the value in Sex
df['GenderDetails'] = np.select(condlist, choicelist, df['Sex'])
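
Since the six rules differ only in whether a "T " prefix is added, the condition list can probably be shortened. The following is a rough sketch rather than a drop-in replacement: it assumes the only Sex values that should be expanded are "M" and "F", and the names word, known_sex, is_tall and is_shorter, as well as the third sample row, are hypothetical:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jackie", "John", "Sam"],
    "Sex": ["F", "M", "M"],
    "Height": ["Small", "Tall", "Huge"],  # "Huge" is invented: no rule matches it
})

# Spell out the sex once; values other than M/F become NaN.
word = df["Sex"].map({"M": "Male", "F": "Female"})
known_sex = word.notna()

is_tall = df["Height"] == "Tall"
is_shorter = df["Height"].isin(["Medium", "Small"])

# Two masks instead of six; each choice is a whole column, picked row by row.
df["GenderDetails"] = np.select(
    [is_tall & known_sex, is_shorter & known_sex],
    ["T " + word, word],
    default=df["Sex"],  # same fallback as the function's final else branch
)
print(df)
```

Each mask is wrapped in its own comparison before being combined with &, which is the element-wise form that works on whole columns.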

lrpiutwd #4

You need to replace & with and.
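
To illustrate why the swap matters for scalar values, here is a small plain-Python sketch (no pandas involved; the variable names are only for illustration). In Python, & binds more tightly than ==, so the unparenthesized form tries to evaluate "Tall" & sex before any comparison happens:

```python
height, sex = "Tall", "M"

# `and` combines the results of the two comparisons.
with_and = height == "Tall" and sex == "M"
print(with_and)  # True

# Without parentheses, Python evaluates "Tall" & sex first,
# and & is not defined between two strings.
try:
    height == "Tall" & sex == "M"
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
    print(error_name, exc)

# With each comparison parenthesized, & operates on two booleans.
with_amp = (height == "Tall") & (sex == "M")
print(with_amp)  # True
```

For scalars inside a function, and is the idiomatic choice; the parenthesized & form is the one to use when comparing whole columns.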
