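This is related to another question (pandas loop through a list of dataframes (df) to do calculation, results in previous df to be referenced and used for next df), but with more specific details about looping over a list of DataFrames (dfs) where the calculation depends on two consecutive DataFrames.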
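Calculation requirements: each DataFrame has the columns i_f, r_f1, r_f2 and r_f3. i_f is compared against a threshold of 0.95 (keep the IDs that are greater than or equal to it), and the three r_f columns are compared against 0.3 (keep the IDs that are less than or equal to it). The four comparison results together determine the qualified IDs in the first df. In the next df the same thing is done: i_f is compared against the same 0.95 threshold, but for the three r_f columns, the IDs that qualified in the previous df are compared against 0.3333 and all other IDs against 0.3. Together these again determine the qualified IDs in that df, and so on for the remaining dfs in the list.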
Below is a sample list of DataFrames. The expected output is 1, 0, 1 for the IDs in df1 and 1, 0, 1, 1, 1 for the IDs in df2.
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3],
                    'i_f': [0.967385562, 0.869575345, 1],
                    'r_f1': [0.18878187, 0.327355797, 0.100753051],
                    'r_f2': [0.047237449, 0.056038276, 0.189434048],
                    'r_f3': [0.095283998, 0.2554309, 0.138240321]})
df2 = pd.DataFrame({'ID': [1, 2, 3, 4, 5],
                    'i_f': [0.985, 1, 0.993297332, 1, 1],
                    'r_f1': [0.300009355, 0.331788473, 0.146077926, 0.167329833, 0.245227094],
                    'r_f2': [0.152293038, 0.06668, 0.196683885, 0.101269411, 0.02493159],
                    'r_f3': [0.111617815, 0.042016, 0.175285158, 0.085330897, 0.238370325]})
df_lst = [df1, df2]
1 Answer
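The logic is the same as in your previous question. The threshold for i_f is constant, so it is not a problem, unlike the threshold for r_f*, which must be computed on each iteration. However, for a given ID the same threshold applies to all r_f* columns.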
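The code that originally followed is not preserved here, so the following is only a minimal sketch of one way the loop described above might look, not necessarily the answer's exact code. The function name `flag_qualified`, its parameter names, and the output column `qualified` are assumptions made for illustration. It builds a per-row r_f* threshold (0.3333 for IDs that qualified in the previous df, 0.3 otherwise) and combines it with the constant i_f check.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3],
                    'i_f': [0.967385562, 0.869575345, 1],
                    'r_f1': [0.18878187, 0.327355797, 0.100753051],
                    'r_f2': [0.047237449, 0.056038276, 0.189434048],
                    'r_f3': [0.095283998, 0.2554309, 0.138240321]})
df2 = pd.DataFrame({'ID': [1, 2, 3, 4, 5],
                    'i_f': [0.985, 1, 0.993297332, 1, 1],
                    'r_f1': [0.300009355, 0.331788473, 0.146077926, 0.167329833, 0.245227094],
                    'r_f2': [0.152293038, 0.06668, 0.196683885, 0.101269411, 0.02493159],
                    'r_f3': [0.111617815, 0.042016, 0.175285158, 0.085330897, 0.238370325]})
df_lst = [df1, df2]


def flag_qualified(dfs, i_thresh=0.95, r_base=0.3, r_relaxed=0.3333):
    """Add a 0/1 'qualified' column (hypothetical name) to each df, in list order."""
    prev_ids = []  # IDs that qualified in the previous df; empty for the first one
    for df in dfs:
        # Per-row r_f* threshold: relaxed for previously qualified IDs, base otherwise.
        r_thresh = pd.Series(
            np.where(df['ID'].isin(prev_ids), r_relaxed, r_base),
            index=df.index,
        )
        ok_i = df['i_f'].ge(i_thresh)  # constant threshold for i_f
        # Compare every r_f* column row-wise against that row's threshold.
        ok_r = df.filter(like='r_f').le(r_thresh, axis=0).all(axis=1)
        df['qualified'] = (ok_i & ok_r).astype(int)
        # Carry this df's qualified IDs over to the next iteration.
        prev_ids = df.loc[df['qualified'].eq(1), 'ID'].tolist()
    return dfs


flag_qualified(df_lst)
print(df1['qualified'].tolist())  # [1, 0, 1]
print(df2['qualified'].tolist())  # [1, 0, 1, 1, 1]
```

With the sample data this reproduces the expected output from the question: ID 1 in df2 passes only because it qualified in df1, so its r_f1 value of 0.300009355 is checked against 0.3333 rather than 0.3. Note that the sketch modifies the DataFrames in place.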