Python piecewise linear interpolation across pandas DataFrames in a list

Posted by 6tr1vspr on 2023-02-11 in Python


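I am trying to apply piecewise linear interpolation. I first tried pandas' built-in interpolation, but it did not work.
The sample data looks like this: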

import pandas as pd
import numpy as np

d = {'ID':[5,5,5,5,5,5,5], 'month':[0,3,6,9,12,15,18], 'num':[7,np.nan,5,np.nan,np.nan,5,8]}
tempo = pd.DataFrame(data = d)
d2 = {'ID':[6,6,6,6,6,6,6], 'month':[0,3,6,9,12,15,18], 'num':[5,np.nan,2,np.nan,np.nan,np.nan,7]}
tempo2 = pd.DataFrame(data = d2)
this = []
this.append(tempo)
this.append(tempo2)

The real data has more than 1000 unique IDs, so I filtered each ID into its own DataFrame and put them all in a list.
The first DataFrame in the list looks like this:

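I am trying to loop over all the DataFrames in the list and apply piecewise linear interpolation to each. I tried setting month as the index and calling .interpolate(method='index', inplace=True), but it did not work.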
The expected output is:
ID|month|num
5|0|7
5|3|6
5|6|5
5|9|5
5|12|5
5|15|5
5|18|8
This needs to be applied to every DataFrame in the list.

pandas

Source: https://stackoverflow.com/questions/75407266/python-piecewise-linear-interpolation-across-dataframes-in-a-list

1 answer

iqjalb3h1#

Assuming this is a follow-up to your previous question, change the code to:

for i, df in enumerate(this):
    this[i] = (df
        .set_index('month')
        # optional, because of the previous question
        .reindex(range(df['month'].min(), df['month'].max()+3, 3))
        .interpolate()
        .reset_index()[df.columns]
        )

Note: I simplified the code by removing the groupby. This only works if each DataFrame contains a single group, as you mentioned in the other question.
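
For comparison, here is a minimal sketch of the grouped variant the note alludes to, assuming all IDs are kept in one DataFrame instead of a list. The names `combined` and `interpolate_group` are hypothetical, and `np.interp` is swapped in for `DataFrame.interpolate`, so this is not the code from the earlier question. It assumes the months are sorted within each ID and that every ID has at least one non-missing `num`.

```python
import numpy as np
import pandas as pd

# Sample data from the question, stacked into a single DataFrame (assumption).
months = [0, 3, 6, 9, 12, 15, 18]
combined = pd.DataFrame({
    'ID': [5] * 7 + [6] * 7,
    'month': months * 2,
    'num': [7, np.nan, 5, np.nan, np.nan, 5, 8,
            5, np.nan, 2, np.nan, np.nan, np.nan, 7],
})

def interpolate_group(group):
    """Hypothetical helper: fill NaNs in 'num' linearly against 'month'."""
    known = group['num'].notna()
    filled = group.copy()
    # np.interp needs increasing x-coordinates and at least one known point.
    filled['num'] = np.interp(group['month'],
                              group.loc[known, 'month'],
                              group.loc[known, 'num'])
    return filled

# Process each ID separately, then stitch the pieces back together.
result = pd.concat([interpolate_group(g) for _, g in combined.groupby('ID')])
print(result)
```

One behavioural difference to keep in mind: `np.interp` holds the first and last known values flat beyond the known range, whereas `interpolate()` with default settings leaves leading NaNs in place.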

Output:

[   ID  month  num
0   5      0  7.0
1   5      3  6.0
2   5      6  5.0
3   5      9  5.0
4   5     12  5.0
5   5     15  5.0
6   5     18  8.0,
   ID  month   num
0   6      0  5.00
1   6      3  3.50
2   6      6  2.00
3   6      9  3.25
4   6     12  4.50
5   6     15  5.75
6   6     18  7.00]
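
The asker's original idea of interpolating on the month index can also be made to work. The sketch below rests on an assumption about what went wrong, since the failing code is not shown in full: calling `.interpolate(..., inplace=True)` on the temporary object returned by `set_index` returns `None` and writes nothing back to the list. Assigning the result to `this[i]` avoids that. `method='index'` weights by the actual month values, so the second part (with a made-up `uneven` DataFrame) shows how it differs from the default position-based interpolation when months are unevenly spaced.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
months = [0, 3, 6, 9, 12, 15, 18]
this = [
    pd.DataFrame({'ID': [5] * 7, 'month': months,
                  'num': [7, np.nan, 5, np.nan, np.nan, 5, 8]}),
    pd.DataFrame({'ID': [6] * 7, 'month': months,
                  'num': [5, np.nan, 2, np.nan, np.nan, np.nan, 7]}),
]

for i, df in enumerate(this):
    # Assign the result back instead of relying on inplace=True.
    this[i] = (df
        .set_index('month')
        .interpolate(method='index')
        .reset_index()[df.columns]
        )

# Hypothetical unevenly spaced data to show the difference between methods.
uneven = pd.DataFrame({'month': [0, 3, 12], 'num': [0.0, np.nan, 12.0]})
by_position = uneven['num'].interpolate()  # treats rows as equally spaced
by_month = uneven.set_index('month')['num'].interpolate(method='index')
print(by_position.iloc[1], by_month.loc[3])
```

With evenly spaced months, as in the question's data, both approaches give the same values as the output above.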

Answered 2023-02-11
