Has the result of pandas DataFrame.sum() changed?

Posted by 6l7fqoea on 2023-04-28 in Other

I have the following code, which I last used a few months ago:

opendf = pd.DataFrame(
    rdbin[0],
    columns=[
        "highpoint_sum",
        "highpoint_mean",
        "highpoint diff_sum",
        "highpoint diff_mean",
        "name",
        "bin",
    ],
)
opendf.index = opendf["bin"]
opendf.drop(
    columns=["highpoint_sum", "highpoint_mean", "highpoint diff_mean", "name", "bin"],
    inplace=True,
)
opendf["Bresser_sum"] = brbinarr[:, 2]
opendf["open gauge_sum"] = rdbin[21][:, 2]
opendf["br_open_sum"] = opendf[["Bresser_sum", "open gauge_sum"]].sum(
    axis=1, min_count=1
)

When the columns 'Bresser_sum' and 'open gauge_sum' were both NaN, this used to leave NaN in the column 'br_open_sum'.

                    highpoint diff_sum Bresser_sum open gauge_sum br_open_sum
bin
2021-07-19 00:00:00                0.0         NaN            NaN        None
2021-07-19 01:00:00                0.0         NaN            NaN        None
2021-07-19 02:00:00                0.0         NaN            NaN        None
2021-07-19 03:00:00                0.0         NaN            NaN        None
2021-07-19 04:00:00                0.0         NaN            NaN        None
2021-07-19 05:00:00                0.0         NaN            NaN        None
2021-07-19 06:00:00                0.0         NaN            NaN        None
2021-07-19 07:00:00                0.0         NaN            NaN        None
2021-07-19 08:00:00                0.0         NaN            NaN        None
2021-07-19 09:00:00                0.0         NaN            NaN        None
2021-07-19 10:00:00                0.0         NaN            NaN        None
2021-07-19 11:00:00                0.0           0            NaN           0
2021-07-19 12:00:00                0.0         0.0            NaN         0.0
2021-07-19 13:00:00                0.0         0.0            NaN         0.0
2021-07-19 14:00:00                0.0         0.0            NaN         0.0
2021-07-19 15:00:00                0.0         0.0            NaN         0.0
2021-07-19 16:00:00                0.0         0.0            NaN         0.0
2021-07-19 17:00:00                0.0         0.0            NaN         0.0
2021-07-19 18:00:00                0.0         0.0            NaN         0.0
2021-07-19 19:00:00                0.0         0.0            NaN         0.0
2021-07-19 20:00:00                0.0         0.0            NaN         0.0
2021-07-19 21:00:00                0.0         0.0            NaN         0.0
2021-07-19 22:00:00                0.0         0.0            NaN         0.0
2021-07-19 23:00:00                0.0         0.0            NaN         0.0

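Now, when I run the code on unchanged data, but after upgrading from Python 3.9 to 3.11, I get 'None' (a NoneType object) in the column (as shown above).
How do I get the .sum() statement to return to its previous behaviour? I have read the documentation for pandas dataframe.sum() and tried every combination of skipna, numeric_only and min_count.
Replacing 'None' with NaN is another option; I tried using .fillna(), but it threw a lot of errors!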

Answer #1 (k10s72fa)

Using the following toy dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "col1": [0.0, 0.0, 0.0],
        "col2": [np.nan, np.nan, np.nan],
        "col3": [None, None, "0.0"],  # object dtype: None plus a numeric string
    }
)

print(df)
# Output

   col1  col2  col3
0   0.0   NaN  None
1   0.0   NaN  None
2   0.0   NaN   0.0

You can convert the None values with the pandas to_numeric method:

df["col3"] = pd.to_numeric(df["col3"])

print(df)
# Output

   col1  col2  col3
0   0.0   NaN   NaN
1   0.0   NaN   NaN
2   0.0   NaN   0.0
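
Applied to the frame in the question, the same idea might look like the sketch below. It assumes that 'Bresser_sum' and 'open gauge_sum' are stored as object dtype (the mix of "0" and "0.0" in the printed output hints at this, but the question does not confirm it). The values here are made up for illustration, and errors="coerce" is an extra choice not used in the answer above: it turns anything that cannot be parsed into NaN instead of raising.

```python
import numpy as np
import pandas as pd

# Illustrative data only: object-dtype columns mixing None, strings and floats.
df = pd.DataFrame(
    {
        "Bresser_sum": pd.Series([None, "0", 0.5], dtype=object),
        "open gauge_sum": pd.Series([None, None, 1.0], dtype=object),
    }
)

cols = ["Bresser_sum", "open gauge_sum"]

# Convert each column to a numeric dtype before summing; None becomes NaN.
for col in cols:
    df[col] = pd.to_numeric(df[col], errors="coerce")

# With float columns, min_count=1 yields NaN for rows where every value is NaN.
df["br_open_sum"] = df[cols].sum(axis=1, min_count=1)

print(df.dtypes)
print(df)
```

Checking df.dtypes before and after the conversion is a quick way to see whether object columns are what is causing the None results.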
