Creating a new column in a pandas DataFrame

jjhzyzn0  posted on 2023-08-01  in Other

I have a pandas DataFrame named cica_df:

Fecha     |Administradora_descrip|Fondo_descrip|PUC_Codigo|PUC_Descrip|Saldo_Cuenta|CRNCY|
------------------------------------------------------------------------------------------
2007-01-31|Colfondos             |Largo Plazo  |100000    |ACTIVO     |4.517769e+11|COP  |
2007-01-31|Colfondos             |Largo Plazo  |162800    |DEPOSITO   |6.386133e+12|COP  |
2007-01-31|Skandia               |Largo Plazo  |100000    |ACTIVO     |3.517769e+11|COP  |
2007-01-31|Skandia               |Largo Plazo  |162800    |DEPOSITO   |7.386133e+12|COP  |
2007-02-28|Colfondos             |Largo Plazo  |100000    |ACTIVO     |4.897769e+11|COP  |
2007-02-28|Colfondos             |Largo Plazo  |162800    |DEPOSITO   |6.986133e+12|COP  |
2007-02-28|Skandia               |Largo Plazo  |100000    |ACTIVO     |4.907769e+11|COP  |
2007-02-28|Skandia               |Largo Plazo  |162800    |DEPOSITO   |6.766133e+12|COP  |

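I need to create a new column, "Nav". For every row, its value should be the "Saldo_Cuenta" of the row whose "PUC_Codigo" is 100000 and which has the same values in the columns "Fecha", "Administradora_descrip", "Fondo_descrip" and "CRNCY". The resulting DataFrame should look like this: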

Fecha     |Administradora_descrip|Fondo_descrip|PUC_Codigo|PUC_Descrip|Saldo_Cuenta|CRNCY|Nav         |
-------------------------------------------------------------------------------------------------------
2007-01-31|Colfondos             |Largo Plazo  |100000    |ACTIVO     |4.517769e+11|COP  |4.517769e+11|
2007-01-31|Colfondos             |Largo Plazo  |162800    |DEPOSITO   |6.386133e+12|COP  |4.517769e+11|
2007-01-31|Skandia               |Largo Plazo  |100000    |ACTIVO     |3.517769e+11|COP  |3.517769e+11|
2007-01-31|Skandia               |Largo Plazo  |162800    |DEPOSITO   |7.386133e+12|COP  |3.517769e+11|
2007-02-28|Colfondos             |Largo Plazo  |100000    |ACTIVO     |4.897769e+11|COP  |4.897769e+11|
2007-02-28|Colfondos             |Largo Plazo  |162800    |DEPOSITO   |6.986133e+12|COP  |4.897769e+11|
2007-02-28|Skandia               |Largo Plazo  |100000    |ACTIVO     |4.907769e+11|COP  |4.907769e+11|
2007-02-28|Skandia               |Largo Plazo  |162800    |DEPOSITO   |6.766133e+12|COP  |4.907769e+11|


I tried this code:

cica_df['Nav'] = np.where(cica_df['PUC_Codigo'] == 100000, cica_df['Saldo_Cuenta'], cica_df['Saldo_Cuenta'])


But it just duplicates the "Saldo_Cuenta" column. What should I do?

pb3skfrl  #1

You can do this with pandas' groupby function:

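import pandas as pd

# Columns that identify one fund on one date
keys = ['Fecha', 'Administradora_descrip', 'Fondo_descrip', 'CRNCY']

# Keep Saldo_Cuenta only on rows where PUC_Codigo == 100000 (NaN elsewhere)
nav_source = cica_df['Saldo_Cuenta'].where(cica_df['PUC_Codigo'] == 100000)

# Broadcast the first non-null value of each group to every row in that group
cica_df['Nav'] = nav_source.groupby([cica_df[k] for k in keys]).transform('first')

print(cica_df)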
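
If you prefer something more explicit than groupby, the same result can be had with a merge. This is only a minimal sketch: the sample rows are copied from the question's table, the names `keys`, `nav` and `result` are mine, and it assumes PUC_Codigo is stored as an integer and that each group has at most one row with PUC_Codigo == 100000.

```python
import pandas as pd

# Sample rows copied from the question's table
cica_df = pd.DataFrame({
    'Fecha': ['2007-01-31'] * 4 + ['2007-02-28'] * 4,
    'Administradora_descrip': ['Colfondos', 'Colfondos', 'Skandia', 'Skandia'] * 2,
    'Fondo_descrip': ['Largo Plazo'] * 8,
    'PUC_Codigo': [100000, 162800] * 4,
    'PUC_Descrip': ['ACTIVO', 'DEPOSITO'] * 4,
    'Saldo_Cuenta': [4.517769e+11, 6.386133e+12, 3.517769e+11, 7.386133e+12,
                     4.897769e+11, 6.986133e+12, 4.907769e+11, 6.766133e+12],
    'CRNCY': ['COP'] * 8,
})

# Columns that identify one fund on one date
keys = ['Fecha', 'Administradora_descrip', 'Fondo_descrip', 'CRNCY']

# One row per group: the balance of the 100000 account, renamed to Nav
nav = (
    cica_df.loc[cica_df['PUC_Codigo'] == 100000, keys + ['Saldo_Cuenta']]
    .rename(columns={'Saldo_Cuenta': 'Nav'})
)

# Left merge keeps every original row and attaches its group's Nav;
# validate raises if a group has more than one 100000 row
result = cica_df.merge(nav, on=keys, how='left', validate='many_to_one')

print(result[['Fecha', 'Administradora_descrip', 'PUC_Codigo', 'Saldo_Cuenta', 'Nav']])
```

Groups without a 100000 row end up with NaN in Nav, which is also what the groupby version gives you.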
