Is there a way to make pandas cumsum() treat NaN as zero?

d7v8vwbk  posted on 2023-04-28  in Other

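I have a pandas DataFrame df with some NaN values in one column, and I am trying to compute that column's cumulative sum. Instead of pre-filling the NaNs with 0, is there a way to make cumsum ignore NaN?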

bjg7j2ky  #1

from functools import reduce

import numpy as np
import pandas as pd

def notnull_index(col: str) -> list:
    '''
    Returns the list of index labels where the column value is not NaN
    '''
    return list(df[df[col].notnull()].index)

def cumsum(col: str) -> pd.DataFrame:
    '''
    Adds a cumsum_<col> column to df holding the cumulative sum of the
    non-NaN values, and returns df
    '''
    # List of indices where the rows are not null
    idx = notnull_index(col)
    # Filters DataFrame on non-null rows
    df_notnull = df[df.index.isin(idx)]
    # Computes the cumulative sum of the column as a plain list
    result = reduce(lambda x, y: x + [x[-1] + y] if x else [y], df_notnull[col], [])
    # Creates a new column filled with NaN values
    df[f'cumsum_{col}'] = np.nan
    # Fill the cumsum column with the cumsum values at the right indices
    df.loc[df.index.isin(idx), f'cumsum_{col}'] = result
    # Forward-fill the rows that were NaN in the original column
    # (assigned back rather than using inplace=True on a column selection)
    df[f'cumsum_{col}'] = df[f'cumsum_{col}'].ffill()
    return df

Assuming your DataFrame is named df, this should work, but calling fillna(0) before cumsum is much better ;)
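For comparison, here is a minimal sketch of the built-in options, assuming a toy DataFrame with a single column named x (both the data and the column name are made up for illustration). By default Series.cumsum uses skipna=True, so the running total already ignores NaN and only leaves NaN in those rows of the result. Chaining ffill() after it should give the same values as the function above, and fillna(0) differs only in that leading NaNs become 0 instead of staying NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({"x": [1.0, np.nan, 2.0, np.nan, 3.0]})

# Default behaviour (skipna=True): NaN rows stay NaN, the total keeps going
skipped = df["x"].cumsum()            # 1.0, NaN, 3.0, NaN, 6.0

# Treat NaN as zero before summing
as_zero = df["x"].fillna(0).cumsum()  # 1.0, 1.0, 3.0, 3.0, 6.0

# Ignore NaN while summing, then carry the last total forward
carried = df["x"].cumsum().ffill()    # 1.0, 1.0, 3.0, 3.0, 6.0

print(pd.DataFrame({"x": df["x"], "skipped": skipped,
                    "as_zero": as_zero, "carried": carried}))
```

Which one to pick depends on what you want in the rows where x is NaN: a gap, or the running total so far.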
