Using df.loc in Pandas to set column values from an arithmetic sequence

Posted by tjjdgumg on 2023-02-02 in Other

Suppose I have the following DataFrame:

import pandas as pd

data = {"age": [2, 3, 2, 5, 9, 12, 20, 43, 55, 60], "alpha": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}
df = pd.DataFrame(data)

I want to use df.loc with an arithmetic sequence to change the values of the alpha column based on the age column, but I get a syntax error:

df.loc[((df.age <=4)) , "alpha"] = ".4"          
df.loc[((df.age >= 5)) & ((df.age <= 20)), "alpha"] = 0.4 + (1 - 0.4)*((df$age - 4)/(20 - 4))
df.loc[((df.age > 20)) , "alpha"] = "1"

Thanks in advance.

tv6aics1 #1

Use . instead of $ to reference age; df$age is R syntax and is not valid Python:

df.loc[((df.age >= 5)) & ((df.age <= 20)), "alpha"] = 0.4 + (1 - 0.4)*((df.age - 4)/(20 - 4))
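
For reference, here is a minimal end-to-end sketch of the corrected .loc approach. It assumes numeric values (0.4 and 1.0) are wanted rather than the strings ".4" and "1" from the question, and it starts alpha as a float column so the assignments do not change its dtype. The mask name mid is only illustrative.

```python
import pandas as pd

data = {"age": [2, 3, 2, 5, 9, 12, 20, 43, 55, 60], "alpha": [0.0] * 10}
df = pd.DataFrame(data)

# Ages up to 4 get the lower bound.
df.loc[df.age <= 4, "alpha"] = 0.4

# Ages 5..20 are interpolated linearly between 0.4 (age 4) and 1.0 (age 20).
# The right-hand side is computed for every row; .loc aligns it by index
# and writes only the rows selected by the mask.
mid = (df.age >= 5) & (df.age <= 20)
df.loc[mid, "alpha"] = 0.4 + (1 - 0.4) * ((df.age - 4) / (20 - 4))

# Ages above 20 get the upper bound.
df.loc[df.age > 20, "alpha"] = 1.0

print(df)
```
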

nmpmafwu #2

Instead of several .loc assignments, you can combine all the conditions in one step with chained np.where calls:

df['alpha'] = np.where(df.age <= 4, ".4", np.where((df.age >= 5) & (df.age <= 20),
                                         0.4 + (1 - 0.4) *((df.age - 4)/(20 - 4)),
                                         np.where(df.age > 20, "1", df.alpha)))
print(df)
   age   alpha
0    2      .4
1    3      .4
2    2      .4
3    5  0.4375
4    9  0.5875
5   12     0.7
6   20     1.0
7   43       1
8   55       1
9   60       1
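
The output above prints .4 and 1 without a decimal part, which suggests the column ends up holding strings because the branches mix ".4" and "1" with floats. If a numeric column is wanted, a sketch of the same chained np.where with float values could look like this (the name ramp is illustrative, and the frame is rebuilt only to keep the example self-contained):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [2, 3, 2, 5, 9, 12, 20, 43, 55, 60]})

# Linear ramp from 0.4 at age 4 to 1.0 at age 20, computed for every row.
ramp = 0.4 + (1 - 0.4) * ((df.age - 4) / (20 - 4))

# Every branch is a float, so the result is a float array.
# Rows with age <= 4 take 0.4, rows up to 20 take the ramp, the rest take 1.0.
df["alpha"] = np.where(df.age <= 4, 0.4, np.where(df.age <= 20, ramp, 1.0))

print(df.dtypes)
```
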

b4qexyjb #3

Apart from the syntax error (caused by the $), I would go with numpy.select to reduce the visual noise:

import numpy as np

conditions = [df["age"].le(4),
              df["age"].gt(4) & df["age"].le(20),
              df["age"].gt(20)]

values = [".4", 0.4 + (1 - 0.4) * ((df["age"] - 4) / (20 - 4)), 1]

df["alpha"] = np.select(condlist=conditions, choicelist=values)

Output:

print(df)

   age   alpha
0    2      .4
1    3      .4
2    2      .4
3    5  0.4375
4    9  0.5875
5   12     0.7
6   20     1.0
7   43       1
8   55       1
9   60       1
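
As with the np.where answer, the ".4" string in values appears to turn the whole column into strings. A hedged variant that keeps everything numeric and uses the default argument of np.select for the last case might look like this; handling age > 20 through default is a design choice made here, not something the answer above does:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [2, 3, 2, 5, 9, 12, 20, 43, 55, 60]})

conditions = [df["age"].le(4),
              df["age"].gt(4) & df["age"].le(20)]

# One choice per condition: a scalar floor and the per-row linear ramp.
values = [0.4, 0.4 + (1 - 0.4) * ((df["age"] - 4) / (20 - 4))]

# Rows that match neither condition (age > 20) fall through to `default`.
df["alpha"] = np.select(conditions, values, default=1.0)

print(df)
```

np.select takes the first matching condition for each row, so the order of conditions matters when they overlap; here they are mutually exclusive.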
