pandas: Replace missing values with the values from the row with the minimum sum of differences

Posted by shyt4zoc on 2023-03-11 in Other

I have the following DataFrame.

import numpy as np
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({'Age': [np.nan, 31, 29, 43, np.nan],
                   'Weight': [np.nan, 100, 60, 75, np.nan],
                   'Height': [1.65, 1.64, 1.75, 1.70, 1.68],
                   'BMI': [19, 15, 10, 25, 30]})

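And the columns whose missing values I want to replace:
case_columns = ['Age', 'Weight']
I want an algorithm, in Python, that replaces the missing values in a row with the values from another row: the one with the minimum sum of differences from the row that contains the missing values.
In my example, in row 0 Age should be 31 and Weight should be 100, because row 1 has the smallest sum of differences ((1.65 - 1.64) + (19 - 15)). In row 4, Age should be 43 and Weight should be 75.
How can I implement this in Python?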

pandas

Source: https://stackoverflow.com/questions/75688214/replace-missing-values-with-the-value-of-the-column-with-the-minimum-sum-of-diff

1 Answer

fnx2tebb1#

You can try writing a function and applying it row by row with df.apply():

def fill_missing(x):
    # work on a copy so the row handed in by apply() is not mutated
    x = x.copy()
    # if any value other than Height is missing in this row
    if x.drop('Height').isna().any():
        # absolute Height difference to every other row (current row excluded);
        # note that only Height is compared here, BMI is not part of the distance
        height_diff = (df.drop(x.name)['Height'] - x['Height']).abs()
        # index label of the row with the closest Height
        row_idx = height_diff.idxmin()
        # fill in whatever is missing from that row
        for feature in x.index:
            if pd.isna(x[feature]):
                x[feature] = df.loc[row_idx, feature]
    return x

# returns a new DataFrame; df itself is unchanged
df.apply(fill_missing, axis=1)

# if you want to keep the result, assign it back to df
df = df.apply(fill_missing, axis=1)
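
The question asks for the minimum sum of differences across the other columns (Height and BMI), not Height alone. The following is a minimal sketch of that criterion, not a definitive implementation. The function name fill_from_nearest_row is hypothetical. It assumes absolute differences are intended, that every column outside case_columns is numeric and complete, and that at least one row has no missing case values to act as a donor.

```python
import numpy as np
import pandas as pd


def fill_from_nearest_row(df, case_columns):
    """Fill NaNs in case_columns from the complete row whose remaining
    columns have the smallest sum of absolute differences (hypothetical helper)."""
    # Columns used to measure how close two rows are (here: Height and BMI).
    feature_columns = [c for c in df.columns if c not in case_columns]
    # Only rows with no missing case values may donate values.
    donors = df.dropna(subset=case_columns)
    result = df.copy()
    # Visit every row that has at least one missing case value.
    for idx in df.index[df[case_columns].isna().any(axis=1)]:
        # Sum of absolute differences between this row and each donor row.
        diffs = donors[feature_columns] - df.loc[idx, feature_columns]
        nearest = diffs.abs().sum(axis=1).idxmin()
        # Copy only the values that are actually missing.
        for col in case_columns:
            if pd.isna(result.at[idx, col]):
                result.at[idx, col] = donors.at[nearest, col]
    return result


df = pd.DataFrame({'Age': [np.nan, 31, 29, 43, np.nan],
                   'Weight': [np.nan, 100, 60, 75, np.nan],
                   'Height': [1.65, 1.64, 1.75, 1.70, 1.68],
                   'BMI': [19, 15, 10, 25, 30]})
filled = fill_from_nearest_row(df, ['Age', 'Weight'])
print(filled)
```

On the sample data this fills row 0 from row 1 and row 4 from row 3, matching the result described in the question. Because Height and BMI are on different scales, BMI dominates the sum; standardizing the columns before computing the differences may be worth considering if that is not intended.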

Answered 2023-03-11
