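I'm trying to write two for loops that each return a score for different inputs and then create a new field holding the new score. The first loop works fine, but the second loop never returns the correct score.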
import pandas as pd

d = {'a':['foo','bar'], 'b':[1,3]}
df = pd.DataFrame(d)

score1 = df.loc[df['a'] == 'foo']
score2 = df.loc[df['a'] == 'bar']

for i in score1['b']:
    if i < 3:
        score1['c'] = 0
    elif i <= 3 and i < 4:
        score1['c'] = 1
    elif i >= 4 and i < 5:
        score1['c'] = 2
    elif i >= 5 and i < 8:
        score1['c'] = 3
    elif i == 8:
        score1['c'] = 4

for j in score2['b']:
    if j < 2:
        score2['c'] = 0
    elif j <= 2 and i < 4:
        score2['c'] = 1
    elif j >= 4 and i < 6:
        score2['c'] = 2
    elif j >= 6 and i < 8:
        score2['c'] = 3
    elif j == 8:
        score2['c'] = 4

print(score1)
print(score2)
When I run the script, it returns the following:
print(score1)
     a  b  c
0  foo  1  0
print(score2)
     a  b
1  bar  3
Why doesn't score2 create the new field 'c' or a score?
2 Answers
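nimxete2 #1
Avoid using for loops to conditionally update DataFrame columns, which are not Python lists. Use the vectorized methods of Pandas and NumPy instead, such as numpy.select, which scale to millions of rows! Keep in mind that these data science tools compute things very differently from general-purpose Python.
vsnjm48y #2
In the first iteration of the second for loop, j is 3, so none of your conditions is met and score2['c'] is never assigned.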
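For reference, here is a minimal sketch of the numpy.select approach recommended in the first answer. It is only an illustration, not the answerer's original code. It assumes that each `<=` in the question's second branch was meant to be `>=`, that the `i` in the second loop was meant to be `j`, and that rows matching no rule should get a placeholder score of -1 (the `default=-1` is my own choice, not something from the question).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['foo', 'bar'], 'b': [1, 3]})

is_foo = df['a'] == 'foo'
is_bar = df['a'] == 'bar'
b = df['b']

# Assumed thresholds, reconstructed from the question's if/elif chains
# with the suspected typos (<= instead of >=, i instead of j) corrected.
conditions = [
    is_foo & (b < 3),
    is_foo & (b >= 3) & (b < 4),
    is_foo & (b >= 4) & (b < 5),
    is_foo & (b >= 5) & (b < 8),
    is_bar & (b < 2),
    is_bar & (b >= 2) & (b < 4),
    is_bar & (b >= 4) & (b < 6),
    is_bar & (b >= 6) & (b < 8),
    b == 8,
]
choices = [0, 1, 2, 3, 0, 1, 2, 3, 4]

# The first matching condition wins; rows matching none get the default.
# default=-1 is a hypothetical "no score" marker chosen for this sketch.
df['c'] = np.select(conditions, choices, default=-1)

print(df)
```

Because the scores are written to the original df in a single assignment, every row gets a value in 'c', and there is no need to assign to the filtered copies score1 and score2.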