Pandas: interpolating a single column within a groupby

nqwrtyyt  posted on 2022-11-27  in  Other

This is similar to the question "Pandas interpolate within a groupby", but the answer there runs interpolate() on every column. How can I restrict interpolate() to a single column?
Input

    filename    val1    val2
t
1   file1.csv   5       10
2   file1.csv   NaN     NaN
3   file1.csv   15      20
6   file2.csv   NaN     NaN
7   file2.csv   10      20
8   file2.csv   12      15

Expected output

    filename    val1    val2
t
1   file1.csv   5       10
2   file1.csv   NaN     15
3   file1.csv   15      20
6   file2.csv   NaN     NaN
7   file2.csv   10      20
8   file2.csv   12      15

This attempt returns only the val2 column and drops the remaining columns.

df = df.groupby('filename').apply(lambda group: group['val2'].interpolate(method='index'))
fcg9iug3 #1

A direct approach:

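import pandas as pd

df = pd.read_clipboard()  # clipboard contains the OP's sample data
# interpolate only column "val2"
df["val2_interpolated"] = (
    df[["filename", "val2"]]
    .groupby("filename")
    .apply(lambda x: x)  # identity apply: yields a plain DataFrame again, so the
                         # interpolate() below runs over all rows, not per group
    .interpolate(method="linear")["val2"]
)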

Returns:

    filename  val1  val2  val2_interpolated
t
1  file1.csv   5.0  10.0               10.0
2  file1.csv   NaN   NaN               15.0
3  file1.csv  15.0  20.0               20.0
6  file2.csv   NaN   NaN               20.0
7  file2.csv  10.0  20.0               20.0
8  file2.csv  12.0  15.0               15.0
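
Note that row 6 comes out as 20.0 here, while the question's expected output keeps NaN at that position, since file2.csv has no earlier value to interpolate from. If each file should be interpolated on its own, one option is to select the column from the GroupBy and call transform. The following is only a minimal sketch: it assumes the sample frame is rebuilt by hand instead of read from the clipboard, and it borrows method='index' from the question's attempt.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question (assumption: "t" is the index).
df = pd.DataFrame(
    {
        "filename": ["file1.csv"] * 3 + ["file2.csv"] * 3,
        "val1": [5, np.nan, 15, np.nan, 10, 12],
        "val2": [10, np.nan, 20, np.nan, 20, 15],
    },
    index=pd.Index([1, 2, 3, 6, 7, 8], name="t"),
)

# Interpolate val2 separately inside each filename group.
# transform returns a Series aligned to df's index, so assigning it back
# leaves filename and val1 exactly as they were.
df["val2"] = df.groupby("filename")["val2"].transform(
    lambda s: s.interpolate(method="index")
)

print(df)
```

With this data, val2 at t=2 becomes 15.0, and the leading NaN at t=6 is left alone because interpolate() only fills forward by default. The val1 column is never touched.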
