I'm trying to set new DataFrame columns to simple calculations on existing DataFrames, but when I run the script I get warnings from pandas.
data_join['Ele_total'] = data_ele.sum(axis=1)
data_join['PV_total'] = data_pv.sum(axis=1)
data_join['SC'] = np.where(data_join['PV_total']>data_join['Ele_total'], data_join['Ele_total'], data_join['PV_total'])
data_join['SC%'] = np.where(data_join['PV_total']!= 0,round((data_join['SC']/data_join['PV_total'])*100,0),0)
data_join['SS%'] = np.where(data_join['Ele_total']!= 0,round((data_join['SC']/data_join['Ele_total'])*100,0),0)
data_join['LOLP'] = data_join['Ele_total']>data_join['PV_total']
data_join['E_tg'] = data_join['PV_total']-data_join['SC']
data_join['E_fg'] = data_join['Ele_total']-data_join['SC']
data_join['Ei'] = data_join['E_tg']-data_join['E_fg']
data_join['NGIP'] = data_join['Ei'].abs()<(GRID_LIM*n_build)
data_join['PAL'] = data_join['Ei'].abs()>(PEAK_LIM*n_build)
data_join['CO2'] = data_CO2['GWP']
data_join['CO2_net'] = data_CO2['GWP']*data_join['SC']
data_join['CO2_tot'] = data_CO2['GWP']*(data_join['E_tg']+data_join['SC'])
cash_flow = 0
npv = []
data_join_npv = pd.DataFrame()
for i in range(0, 25):
    if i == 0:
        data_join_npv['PV_total_res_{}'.format(i)] = data_join_res['PV_total']
        data_join_npv['PV_total_ind_{}'.format(i)] = data_join_ind['PV_total']
    else:
        data_join_npv['PV_total_res_{}'.format(i)] = data_join_npv['PV_total_res_{}'.format(i-1)]*(1-d)
        data_join_npv['PV_total_ind_{}'.format(i)] = data_join_npv['PV_total_ind_{}'.format(i-1)]*(1-d)
    data_join_npv['SC_res_{}'.format(i)] = np.where(data_join_npv['PV_total_res_{}'.format(i)]>data_join_res['Ele_total'], data_join_res['Ele_total'], data_join_npv['PV_total_res_{}'.format(i)])
    data_join_npv['SC_ind_{}'.format(i)] = np.where(data_join_npv['PV_total_ind_{}'.format(i)]>data_join_ind['Ele_total'], data_join_ind['Ele_total'], data_join_npv['PV_total_ind_{}'.format(i)])
    data_join_npv['E_tg_res_{}'.format(i)] = data_join_npv['PV_total_res_{}'.format(i)]-data_join_npv['SC_res_{}'.format(i)]
    data_join_npv['E_tg_ind_{}'.format(i)] = data_join_npv['PV_total_ind_{}'.format(i)]-data_join_npv['SC_ind_{}'.format(i)]
    data_join_npv['E_fg_res_{}'.format(i)] = data_join_res['Ele_total']-data_join_npv['SC_res_{}'.format(i)]
    data_join_npv['E_fg_ind_{}'.format(i)] = data_join_ind['Ele_total']-data_join_npv['SC_ind_{}'.format(i)]
    cash = float(data_join_npv['SC_res_{}'.format(i)].sum())*COST_OF_ENERGY_RES + float(data_join_npv['E_tg_res_{}'.format(i)].sum())*VALUE_OF_ENERGY - float(data_join_npv['E_fg_res_{}'.format(i)].sum())*COST_OF_ENERGY_RES + float(data_join_npv['SC_ind_{}'.format(i)].sum())*COST_OF_ENERGY_IND + float(data_join_npv['E_tg_ind_{}'.format(i)].sum())*VALUE_OF_ENERGY - float(data_join_npv['E_fg_ind_{}'.format(i)].sum())*COST_OF_ENERGY_IND - OM_COST*total_pv
    cash_flow += cash/((1+DISC_RATE)**(i+1))
npv.append(-in_inv+cash_flow)
These are the warnings I get:
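C:\Users\Giacomo\Desktop\150\insert_data.py:342: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  data_join_npv['E_tg_res_{}'.format(i)] = data_join_npv['PV_total_res_{}'.format(i)]-data_join_npv['SC_res_{}'.format(i)]
C:\Users\Giacomo\Desktop\150\insert_data.py:343: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  data_join_npv['E_tg_ind_{}'.format(i)] = data_join_npv['PV_total_ind_{}'.format(i)]-data_join_npv['SC_ind_{}'.format(i)]
C:\Users\Giacomo\Desktop\150\insert_data.py:344: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  data_join_npv['E_fg_res_{}'.format(i)] = data_join_res['Ele_total']-data_join_npv['SC_res_{}'.format(i)]
C:\Users\Giacomo\Desktop\150\insert_data.py:345: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  data_join_npv['E_fg_ind_{}'.format(i)] = data_join_ind['Ele_total']-data_join_npv['SC_ind_{}'.format(i)]
C:\Users\Giacomo\Desktop\150\insert_data.py:337: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  data_join_npv['PV_total_res_{}'.format(i)] = data_join_npv['PV_total_res_{}'.format(i-1)]*(1-d)
C:\Users\Giacomo\Desktop\150\insert_data.py:338: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  data_join_npv['PV_total_ind_{}'.format(i)] = data_join_npv['PV_total_ind_{}'.format(i-1)]*(1-d)
C:\Users\Giacomo\Desktop\150\insert_data.py:340: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  data_join_npv['SC_res_{}'.format(i)] = np.where(data_join_npv['PV_total_res_{}'.format(i)]>data_join_res['Ele_total'], data_join_res['Ele_total'], data_join_npv['PV_total_res_{}'.format(i)])
C:\Users\Giacomo\Desktop\150\insert_data.py:341: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  data_join_npv['SC_ind_{}'.format(i)] = np.where(data_join_npv['PV_total_ind_{}'.format(i)]>data_join_ind['Ele_total'], data_join_ind['Ele_total'], data_join_npv['PV_total_ind_{}'.format(i)])
C:\Users\Giacomo\Desktop\150\insert_data.py:342: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`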
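I am not calling frame.insert() as the warning suggests, so I don't understand why I get this fragmentation warning. The results are correct, but I have to run this code many times inside an optimizer, and I think the sheer number of warnings stops the optimizer at some point during the analysis, so I would like to get rid of them.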
1 Answer
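You get these repeated warnings because you keep inserting columns into your DataFrame data_join_npv one at a time, instead of concatenating them once, after and outside the for loop, which is more memory-efficient. Assigning a new column with frame[name] = values inserts it the same way frame.insert does, which is why the warning appears even though you never call frame.insert() yourself. For example, run the following toy code: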
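A minimal sketch of such a toy script, reconstructed from the source line quoted in the warning. The input frame `df`, its 200 columns, and its values are assumptions made for illustration:

```python
import pandas as pd

# Hypothetical input: 200 numeric columns named col0 ... col199.
df = pd.DataFrame({f"col{i}": [1, 2, 3] for i in range(200)})

new_df = pd.DataFrame()
for i in range(200):
    # Every assignment to a new column name inserts one more column
    # into new_df, so the frame grows piece by piece.
    new_df[f"new_df_col{i}"] = df[f"col{i}"] + i
```

The loop produces the expected values; only the way the frame is built is inefficient.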
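You will get the following output:
PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  new_df[f"new_df_col{i}"] = df[f"col{i}"]+i
However, if you initialize an empty dictionary instead of a DataFrame and use pandas concat: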
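A sketch of the dictionary-based variant, using the same assumed toy frame `df` (200 columns of made-up values) and a hypothetical dictionary name `new_cols`:

```python
import pandas as pd

# Hypothetical input: 200 numeric columns named col0 ... col199.
df = pd.DataFrame({f"col{i}": [1, 2, 3] for i in range(200)})

new_cols = {}  # column name -> Series; no DataFrame is modified in the loop
for i in range(200):
    new_cols[f"new_df_col{i}"] = df[f"col{i}"] + i

# A single concat along axis=1 builds the frame at once;
# the dictionary keys become the column names.
new_df = pd.concat(new_cols, axis=1)
```

Collecting the Series in a plain dictionary keeps the loop cheap, and the one `pd.concat` call is what the warning message itself recommends.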
You get the same DataFrame, without any warnings.
So, try refactoring your code like this:
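One possible shape for that refactor, as a sketch rather than a drop-in replacement. The two small input frames and the value of `d` are made-up stand-ins for the question's data, the dictionary `cols` and the inner loop over `res`/`ind` are illustrative choices, and `Series.mask` is swapped in for `np.where` so each result stays a Series that `pd.concat` accepts:

```python
import pandas as pd

# Hypothetical stand-ins for the question's inputs; all values are assumptions.
data_join_res = pd.DataFrame({"PV_total": [5.0, 0.0, 3.0], "Ele_total": [2.0, 1.0, 4.0]})
data_join_ind = pd.DataFrame({"PV_total": [1.0, 6.0, 2.0], "Ele_total": [3.0, 2.0, 2.0]})
d = 0.005  # assumed yearly PV degradation rate

cols = {}  # column name -> Series; nothing is inserted into a DataFrame here
for i in range(25):
    for tag, src in (("res", data_join_res), ("ind", data_join_ind)):
        if i == 0:
            pv = src["PV_total"]
        else:
            pv = cols[f"PV_total_{tag}_{i - 1}"] * (1 - d)
        ele = src["Ele_total"]
        # Same values as np.where(pv > ele, ele, pv), but returned as a Series.
        sc = pv.mask(pv > ele, ele)
        cols[f"PV_total_{tag}_{i}"] = pv
        cols[f"SC_{tag}_{i}"] = sc
        cols[f"E_tg_{tag}_{i}"] = pv - sc
        cols[f"E_fg_{tag}_{i}"] = ele - sc
        # The yearly cash term can be computed here from sc.sum() and the
        # two differences, exactly as in the original expression.

# One concatenation after the loop builds the whole frame at once.
data_join_npv = pd.concat(cols, axis=1)
```

The resulting frame holds the same 200 columns as the original loop; only their order differs, because the columns for `res` and `ind` are generated per group within each year.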