Pandas: remove NaN only at the beginning and end of a DataFrame

Posted by ddhy6vgd on 2023-02-06 in Other


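I have a pandas DataFrame that looks like this:

I want to remove the NaN values only at the beginning and end (that is, keep only the rows from 1950 to 1954, including the NaN in between). I have tried .isnull() and dropna(), but somehow I could not find a suitable solution. Can anyone help?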

pandas

Source: https://stackoverflow.com/questions/31510379/pandas-remove-nan-only-at-beginning-and-end-of-dataframe

4 Answers


50pmv0ei1#

Use the built-in first_valid_index and last_valid_index, which are designed for exactly this, and slice df with the result:

In [5]:

first_idx = df.first_valid_index()
last_idx = df.last_valid_index()
print(first_idx, last_idx)
df.loc[first_idx:last_idx]
1950 1954
Out[5]:
      sum
1950    5
1951    3
1952  NaN
1953    4
1954    8
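
For readers who want to run this end to end, here is a minimal sketch of the same idea. The sample frame is reconstructed from the data shown in this thread, and the helper name trim_nan_edges is hypothetical, not part of pandas. It assumes a sorted index with unique labels, and it adds a guard for the case where the frame has no valid values at all, since first_valid_index then returns None.

```python
import numpy as np
import pandas as pd


def trim_nan_edges(df: pd.DataFrame) -> pd.DataFrame:
    """Drop leading and trailing all-NaN rows, keeping interior NaN.

    Hypothetical helper; assumes a sorted index with unique labels.
    """
    first_idx = df.first_valid_index()
    last_idx = df.last_valid_index()
    if first_idx is None:
        # No valid value anywhere: return an empty frame with the same columns.
        return df.iloc[0:0]
    # Label-based slicing with .loc includes both endpoints.
    return df.loc[first_idx:last_idx]


# Sample data assumed from the thread (years 1948 to 1955).
df = pd.DataFrame(
    {"sum": [np.nan, np.nan, 5, 3, np.nan, 4, 8, np.nan]},
    index=range(1948, 1956),
)
trimmed = trim_nan_edges(df)
print(trimmed)
```

The interior NaN at 1952 survives because only the first and last valid labels are used as slice bounds.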


ffx8fchx2#

Here is one way to do it.

import pandas as pd

# your data
# ==============================
df

      sum
1948  NaN
1949  NaN
1950    5
1951    3
1952  NaN
1953    4
1954    8
1955  NaN

# processing
# ===============================
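# forward-fill leaves only the leading NaN rows, which dropna() removes
idx = df.ffill().dropna().index
# backward-fill on the remainder leaves only the trailing NaN rows
res_idx = df.loc[idx].bfill().dropna().index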
df.loc[res_idx]

      sum
1950    5
1951    3
1952  NaN
1953    4
1954    8


olmpazwi3#

Here is one way to do it with NumPy:

import numpy as np
import pandas as pd

x    = np.logical_not(pd.isnull(df))
mask = np.logical_and(np.cumsum(x)!=0, np.cumsum(x[::-1])[::-1]!=0)

In [313]: df.loc[mask['sum'].tolist()]

Out[313]:
      sum
1950    5
1951    3
1952  NaN
1953    4
1954    8
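
To see why the two cumulative sums work, it may help to look at the same trick on a plain 1-D array. This is only an illustrative sketch; the array values are assumed from the sample data in this thread, and the variable names are made up for the example.

```python
import numpy as np

# Values assumed from the thread's sample column.
values = np.array([np.nan, np.nan, 5, 3, np.nan, 4, 8, np.nan])

valid = ~np.isnan(values)

# Running count of valid entries from the left: nonzero from the
# first valid value onward.
seen_from_left = np.cumsum(valid) != 0

# Same count from the right (reverse, accumulate, reverse back):
# nonzero up to and including the last valid value.
seen_from_right = np.cumsum(valid[::-1])[::-1] != 0

# A position is kept only if a valid value exists at or before it
# and at or after it.
mask = seen_from_left & seen_from_right
trimmed = values[mask]
print(mask)
print(trimmed)
```

Positions before the first valid value fail the left-hand test, positions after the last valid value fail the right-hand test, and everything in between, including interior NaN, passes both.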


zwghvu4y4#

A one-liner:

df.query('~@df.ffill().isna().any(axis=1)&~@df.bfill().isna().any(axis=1)')
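
The same condition can also be written as a plain boolean mask, which some readers may find easier to follow. The sketch below uses a hypothetical two-column frame (not from the question) to show a point worth knowing: with more than one column, this condition keeps only the rows where every column has already started and none has yet ended, whereas first_valid_index and last_valid_index bound the slice by the first and last row in which any column is valid. For the single-column frame in the question the two approaches agree.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame: "b" starts later and ends earlier than "a".
df = pd.DataFrame(
    {
        "a": [np.nan, 1.0, 2.0, np.nan, 4.0, np.nan],
        "b": [np.nan, np.nan, 2.0, 3.0, np.nan, np.nan],
    },
    index=range(2000, 2006),
)

# Equivalent to ~df.ffill().isna().any(axis=1): every column has a value
# at or before this row.
started = df.ffill().notna().all(axis=1)
# Equivalent to ~df.bfill().isna().any(axis=1): every column has a value
# at or after this row.
not_ended = df.bfill().notna().all(axis=1)
strict = df[started & not_ended]

# For comparison: bounds taken from rows where any column is valid.
loose = df.loc[df.first_valid_index():df.last_valid_index()]

print(strict)
print(loose)
```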
