我们需要求出Pandas列的平均值

lhcgjxsq  于 2023-02-17  发布在  其他
关注(0)|答案(2)|浏览(125)
time1         x     y        z         GMT- 5             key     time2        a          b          c           GMT                   cut_off   time_diff new_column
1   1.674841e+09    -1.10   64.11   -1.33   2023-01-27 12:43:22 PM  0   1.674841e+09    2.96    606.270614  2.80    2023-01-27 12:43:24 PM  1.674841e+09    2.308100    NaN
2   1.674841e+09    -1.10   64.11   -1.33   2023-01-27 12:43:22 PM  0   1.674841e+09    2.96    584.696883  2.80    2023-01-27 12:43:26 PM  1.674841e+09    4.303636    NaN
3   1.674841e+09    -1.10   64.11   -1.33   2023-01-27 12:43:22 PM  0   1.674841e+09    2.96    615.295633  2.80    2023-01-27 12:43:28 PM  1.674841e+09    6.298568    NaN
4   1.674841e+09    -1.10   64.11   -1.33   2023-01-27 12:43:22 PM  0   1.674841e+09    2.96    587.050575  2.80    2023-01-27 12:43:30 PM  1.674841e+09    8.293623    NaN
5   1.674841e+09    -2.24   93.51   -2.36   2023-01-27 12:43:46 PM  0   1.674841e+09    2.96    584.700016  2.80    2023-01-27 12:43:46 PM  1.674841e+09    0.007554    0.007554
100 1.674842e+09    -1.24   84.73   -2.44   2023-01-27 12:49:07 PM  0   1.674843e+09    2.30    1024.363758 2.64    2023-01-27 01:13:11 PM  1.674843e+09    1444.068500 NaN
101 1.674842e+09    -1.24   84.73   -2.44   2023-01-27 12:49:07 PM  0   1.674843e+09    2.31    1011.438119 2.64    2023-01-27 01:13:13 PM  1.674843e+09    1446.063470 NaN
102 1.674842e+09    -1.24   84.73   -2.44   2023-01-27 12:49:07 PM  0   1.674843e+09    2.32    1005.181835 2.64    2023-01-27 01:13:15 PM  1.674843e+09    1448.058710 NaN
103 1.674842e+09    -1.24   84.73   -2.44   2023-01-27 12:49:07 PM  0   1.674843e+09    2.34    989.515657  2.64    2023-01-27 01:13:17 PM  1.674843e+09    1450.053643 NaN
104 1.674842e+09    -1.24   84.73   -2.44   2023-01-27 12:49:07 PM  0   1.674843e+09    2.34    1016.183097 2.64    2023-01-27 01:13:19 PM  1.674843e+09    1452.048679 NaN
105 1.674842e+09    -1.57   80.04   -1.96   2023-01-27 12:49:06 PM  0   1.674842e+09    2.02    1652.185708 2.88    2023-01-27 12:49:06 PM  1.674842e+09    0.001867    0.001867

我们实际上需要列中没有nan值的行:“new_column”。下面是行:5和105,但是我们需要第5行和第105行中的(行1至5)和(行100至105)的“x”、“y”、“z”的平均值
所需输出:

time1            x     y        z         GMT- 5             key     time2        a          b          c           GMT                   cut_off   time_diff new_column
    5   1.674841e+09    -1.328   69.99   -1.536   2023-01-27 12:43:46 PM  0   1.674841e+09    2.96    584.700016  2.80    2023-01-27 12:43:46 PM  1.674841e+09    0.007554    0.007554
    105 1.674842e+09    -1.295   69.82   -2.36  2023-01-27 12:49:06 PM  0   1.674842e+09    2.02    1652.185708 2.88    2023-01-27 12:49:06 PM  1.674842e+09    0.001867    0.001867
oknrviil

oknrviil1#

可能是这个。

df['new_column'] = df[['x', 'y', 'z']].mean(axis=1)
2ic8powd

2ic8powd2#

首先让我们尝试创建一个组。我们可以使用累积和在"新列"上完成此操作。只需将Nan值替换为0,将其他值替换为1,并将其下移1

df["binary"] = df["new_column"].fillna(0)
df.loc[df.binary!=0,"binary"] = 1
df["binary"] = df["binary"].shift(1,fill_value=0)
df["cumsum"] = df["binary"].cumsum()
time1     x      y     z  ...    time_diff new_column binary  cumsum
0   1.670000e+09 -1.10  64.11 -1.33  ...     2.308100        NaN    0.0     0.0
1   1.670000e+09 -1.10  64.11 -1.33  ...     4.303636        NaN    0.0     0.0
2   1.670000e+09 -1.10  64.11 -1.33  ...     6.298568        NaN    0.0     0.0
3   1.670000e+09 -1.10  64.11 -1.33  ...     8.293623        NaN    0.0     0.0
4   1.670000e+09 -2.24  93.51 -2.36  ...     0.007554   0.007554    0.0     0.0
5   1.670000e+09 -1.24  84.73 -2.44  ...  1444.068500        NaN    1.0     1.0
6   1.670000e+09 -1.24  84.73 -2.44  ...  1446.063470        NaN    0.0     1.0
7   1.670000e+09 -1.24  84.73 -2.44  ...  1448.058710        NaN    0.0     1.0
8   1.670000e+09 -1.24  84.73 -2.44  ...  1450.053643        NaN    0.0     1.0
9   1.670000e+09 -1.24  84.73 -2.44  ...  1452.048679        NaN    0.0     1.0
10  1.670000e+09 -1.57  80.04 -1.96  ...     0.001867   0.001867    0.0     1.0

在此之后,它是对累积和的简单groupby
一个二个一个一个

相关问题