pandas TypeError: Cannot perform 'rand_' with a dtyped [float64] array and scalar of type [bool]

apeeds0o  posted on 2022-12-31  in Other

I ran the following command in Python with pandas:

q1_fisher_r[(q1_fisher_r['TP53']==1) & q1_fisher_r[(q1_fisher_r['TumorST'].str.contains(':1:'))]]

It raised the following error:

TypeError: Cannot perform 'rand_' with a dtyped [float64] array and scalar of type [bool]

I tried the solution from: error link.
Following it, I changed the code to:

q1_fisher_r[(q1_fisher_r['TumorST'].str.contains(':1:')) & (q1_fisher_r[(q1_fisher_r['TP53']==1)])]

But I still get the same error: TypeError: Cannot perform 'rand_' with a dtyped [float64] array and scalar of type [bool]

0md85ypi1#

To filter by multiple conditions, chain the conditions with & and then filter with boolean indexing:

q1_fisher_r[(q1_fisher_r['TP53']==1) & q1_fisher_r['TumorST'].str.contains(':1:')]
               ^^^^                        ^^^^
           first condition            second condition

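The problem is that this code returns the filtered data (a DataFrame) rather than a boolean mask, so it cannot be chained as a condition: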

q1_fisher_r[(q1_fisher_r['TumorST'].str.contains(':1:'))]

The same problem applies to:

q1_fisher_r[(q1_fisher_r['TP53']==1)]
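
To make the difference concrete, here is a minimal sketch using the sample data from this answer. The names mask and filtered are only illustrative. It shows that the condition by itself is a boolean Series, while indexing with it yields a DataFrame, which is not something & can use as a row condition:

```python
import pandas as pd

# Sample data taken from this answer.
q1_fisher_r = pd.DataFrame({'TP53': [1, 1, 2, 1],
                            'TumorST': ['5:1:', '9:1:', '5:1:', '6:1']})

# The condition alone: a boolean Series with one value per row.
mask = q1_fisher_r['TumorST'].str.contains(':1:')

# Indexing with the condition: a DataFrame holding only the matching rows.
filtered = q1_fisher_r[mask]

print(type(mask).__name__, mask.tolist())
print(type(filtered).__name__, filtered.shape)
```

Only objects like mask should appear on either side of &; the outer q1_fisher_r[...] is applied once, after the conditions have been combined.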

Sample:

import pandas as pd

q1_fisher_r = pd.DataFrame({'TP53':[1,1,2,1], 'TumorST':['5:1:','9:1:','5:1:','6:1']})
print (q1_fisher_r)
   TP53 TumorST
0     1    5:1:
1     1    9:1:
2     2    5:1:
3     1     6:1

df = q1_fisher_r[(q1_fisher_r['TP53']==1) & q1_fisher_r['TumorST'].str.contains(':1:')]
print (df)
   TP53 TumorST
0     1    5:1:
1     1    9:1:

1zmg4dgp2#

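I had a similar problem with the setup below, which produced the same error message. The very simple fix for me was to put brackets around each individual condition. I should have known this, but I am highlighting it in case anyone else runs into the same issue.
Incorrect code: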

conditions = [
    (df['A'] == '15min' & df['B'].dt.minute == 15),  # Note brackets only surrounding both conditions together, not each individual condition
    df['A'] == '30min' & df['B'].dt.minute == 30,  # Note no brackets at all 
]

output = [
    df['Time'] + dt.timedelta(minutes = 45),
    df['Time'] + dt.timedelta(minutes = 30),
]

df['TimeAdjusted'] = np.select(conditions, output, default = np.datetime64('NaT'))

Correct code:

conditions = [
    (df['A'] == '15min') & (df['B'].dt.minute == 15),  # Note brackets surrounding each condition
    (df['A'] == '30min') & (df['B'].dt.minute == 30),  # Note brackets surrounding each condition
]

output = [
    df['Time'] + dt.timedelta(minutes = 45),
    df['Time'] + dt.timedelta(minutes = 30),
]

df['TimeAdjusted'] = np.select(conditions, output, default = np.datetime64('NaT'))

zpf6vheq3#

Simple solution: wrap every condition in brackets.
This raises the error:

df[df['Customer Id']==999 & df['month_year']=='10/22']

This does not:

df[(df['Customer Id']==999) & (df['month_year']=='10/22')]
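
The reason the brackets matter is operator precedence: in Python, & binds more tightly than ==. The sketch below shows this without pandas, using the standard ast module and placeholder names a and b (they stand in for the two columns and are not part of the original code):

```python
import ast

# Parse the unbracketed form; a and b are placeholders for the two columns.
tree = ast.parse("a == 999 & b == '10/22'", mode="eval").body

# The whole expression is one chained comparison whose middle operand is
# the bitwise AND of 999 and b, so & is evaluated before either ==.
middle = tree.comparators[0]
print(type(tree).__name__)   # Compare
print(ast.unparse(middle))   # 999 & b

# The same effect with plain integers:
unbracketed = 1 == 1 & 0 == 0    # read as 1 == (1 & 0) == 0
bracketed = (1 == 1) & (0 == 0)  # True & True
print(unbracketed, bracketed)    # False True
```

With the pandas code, this means the unbracketed version first tries to evaluate 999 & df['month_year'], that is, a bitwise AND between a scalar and a non-boolean column, which is where the TypeError comes from.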
