pandas: fetch the column names per row in a DataFrame that are not NaN values (Python)

Posted by gcmastyq on 2022-12-09 in Python


I have a DataFrame with several features, and any feature can contain NaN values.

feature1    feature2    feature3   feature4
  10           NaN          5          2
  2            1            3          1
  NaN          2            4          NaN

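Note: the columns can also contain strings.
How can we get, for each row, a list/array of the names of the columns that hold non-NaN values?
So the result array for my example would be: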

res = array([feature1, feature3, feature4], [feature1, feature2, feature3, feature4], 
[feature2, feature3])

pandas

Source: https://stackoverflow.com/questions/74685052/fetch-the-column-names-per-row-in-a-dataframe-that-are-not-nan-values-python

2 Answers


ffx8fchx1#

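For better performance, use a list comprehension and convert the values to a NumPy array. (The code from this answer is missing from this copy.)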
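A minimal sketch of what a list-comprehension approach over NumPy arrays might look like. This is not the answerer's original code. The names `cols`, `mask`, and `res` are chosen here for illustration, and the DataFrame is rebuilt from the sample in the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question (reconstructed here for illustration).
df = pd.DataFrame({
    "feature1": [10, 2, np.nan],
    "feature2": [np.nan, 1, 2],
    "feature3": [5, 3, 4],
    "feature4": [2, 1, np.nan],
})

# Column names as a NumPy array so they can be indexed with a boolean mask.
cols = df.columns.to_numpy()

# Boolean matrix: True where the cell is not NaN.
mask = df.notna().to_numpy()

# For each row, keep only the column names whose cell is not NaN.
res = [cols[row].tolist() for row in mask]
print(res)
```

Since `notna()` only tests for missing values, this should also work when some columns hold strings, as the question notes.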

2022-12-09

2w2cym1i2#

You can use stack to keep only the non-NaN values, then use groupby.agg to aggregate them into lists:

out = df.stack().reset_index().groupby('level_0')['level_1'].agg(list)

Output as a Series:

level_0
0              [feature1, feature3, feature4]
1    [feature1, feature2, feature3, feature4]
2                        [feature2, feature3]
Name: level_1, dtype: object

As a list:

out = (df.stack().reset_index().groupby('level_0')['level_1']
         .agg(list).to_numpy().tolist()
       )

Output:

[['feature1', 'feature3', 'feature4'],
 ['feature1', 'feature2', 'feature3', 'feature4'],
 ['feature2', 'feature3']]
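For reference, a self-contained sketch of the stack approach on the question's sample data. Whether `stack` drops NaN by default depends on the pandas version (an assumption worth checking against your installation), so this variant calls `dropna()` explicitly. The name `stacked` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question (reconstructed here for illustration).
df = pd.DataFrame({
    "feature1": [10, 2, np.nan],
    "feature2": [np.nan, 1, 2],
    "feature3": [5, 3, 4],
    "feature4": [2, 1, np.nan],
})

# Long format: one entry per (row, column) pair; drop NaN explicitly
# rather than relying on stack's version-dependent default.
stacked = df.stack().dropna()

# reset_index names the two unnamed index levels level_0 (row) and level_1 (column).
out = (
    stacked.reset_index()
    .groupby("level_0")["level_1"]
    .agg(list)
    .tolist()
)
print(out)
```

Note that a row consisting entirely of NaN has no entries left after stacking, so it would be absent from `out` rather than appearing as an empty list.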

2022-12-09
