I want to apply a lambda to a rolling window over a DataFrame column (this is what I want):
import numpy as np
import pandas as pd

window_size = 10
df = pd.DataFrame.from_dict({'A': np.random.rand(100), 'B': np.random.rand(100)+1})

def delta(x, d: int):
    return x - x.shift(d)  # I need shift here, not roll. That is, the beginning elements SHOULD be NaN

ck = lambda x: (x.min()-x.values[-1]) if delta(x.mean(),2*window_size) <= 0.05 * x.values[0] \
    else (-1)*delta(x,int(window_size/2.0))
a2 = df['A'].shift(1).rolling(2*window_size).apply(ck)
It returns the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1913, in apply
    return super().apply(
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1390, in apply
    return self._apply(
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 615, in _apply
    return self._apply_blockwise(homogeneous_func, name, numeric_only)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 468, in _apply_blockwise
    return self._apply_series(homogeneous_func, name)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 452, in _apply_series
    result = homogeneous_func(values)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 610, in homogeneous_func
    result = calc(values)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 607, in calc
    return func(x, start, end, min_periods, *numba_args)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1417, in apply_func
    return window_func(values, begin, end, min_periods)
  File "pandas/_libs/window/aggregations.pyx", line 1423, in pandas._libs.window.aggregations.roll_apply
  File "<stdin>", line 1, in <lambda>
  File "<stdin>", line 2, in delta
AttributeError: 'numpy.float64' object has no attribute 'shift'
Then I updated the delta function (this version raises no error, but the result is incorrect):
def delta(x, d: int):
    return x - np.roll(x,d)
It runs without any error. However, the result is incorrect for a2[:2*window_size], because delta does not return np.NaN where it should. (roll returns [3, 4, 5, 1, 2], whereas I expect shift to return [nan, nan, nan, 1, 2].)
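To make the difference concrete, here is a minimal sketch on a five-element input, assuming the example values [1, 2, 3, 4, 5] and an offset of 3 (both chosen only for illustration). np.roll wraps elements around the end of the array, while Series.shift fills the vacated positions with NaN:

```python
import numpy as np
import pandas as pd

values = [1, 2, 3, 4, 5]  # illustrative input, not taken from the real data

# np.roll is circular: elements pushed off the end re-enter at the front.
rolled = np.roll(values, 3)

# Series.shift is not circular: the vacated leading slots become NaN.
shifted = pd.Series(values).shift(3)

print(rolled.tolist())   # [3, 4, 5, 1, 2]
print(shifted.tolist())  # [nan, nan, nan, 1.0, 2.0]
```

Note that shift is a pandas method, so it is only available on a Series or DataFrame, not on a NumPy array or scalar.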
Then, inside the delta function, I manually set part of x to np.NaN:
def delta(x, d: int):
    x[-d:] = np.NaN  # also tried x.values[-d:] = np.nan, but did not work either
    return x - np.roll(x,d)
But it returned this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1913, in apply
    return super().apply(
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1390, in apply
    return self._apply(
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 615, in _apply
    return self._apply_blockwise(homogeneous_func, name, numeric_only)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 468, in _apply_blockwise
    return self._apply_series(homogeneous_func, name)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 452, in _apply_series
    result = homogeneous_func(values)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 610, in homogeneous_func
    result = calc(values)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 607, in calc
    return func(x, start, end, min_periods, *numba_args)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1417, in apply_func
    return window_func(values, begin, end, min_periods)
  File "pandas/_libs/window/aggregations.pyx", line 1423, in pandas._libs.window.aggregations.roll_apply
  File "<stdin>", line 1, in <lambda>
  File "<stdin>", line 2, in delta
TypeError: 'numpy.float64' object does not support item assignment
How can I solve this? (By the way, I learned rolling + lambda from this post.)
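Update: with the delta function, I want to get the difference between df.rolling().mean() and df.rolling().mean().shift(). However, inside the lambda ck above, x.mean() becomes a scalar, so shift/roll do not work on it. How can I fix this?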
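One way to read the update, as a minimal sketch only: compute the rolling mean as a full Series outside the lambda, where shift is still available, and take the difference there. The seeded generator and the name mean_delta are assumptions made for this illustration; the window and offset of 2*window_size follow the code above.

```python
import numpy as np
import pandas as pd

window_size = 10
rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
df = pd.DataFrame({'A': rng.random(100), 'B': rng.random(100) + 1})

def delta(s: pd.Series, d: int) -> pd.Series:
    # Works on a whole Series, so shift() exists and leading values are NaN.
    return s - s.shift(d)

# Rolling mean over the whole column (a Series, not a per-window scalar).
rolling_mean = df['A'].shift(1).rolling(2 * window_size).mean()

# Difference between the rolling mean and the same mean d rows earlier.
mean_delta = delta(rolling_mean, 2 * window_size)

print(mean_delta.isna().sum())  # 40
```

The first 20 values of rolling_mean are NaN (one from shift(1), nineteen from the incomplete windows), and shifting by another 20 rows leaves the first 40 values of mean_delta as NaN, which is the shift-style behaviour asked for.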
1 Answer
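As I understand it, you first need to compute the mean of the column. I could not find any example of how to access all columns at once inside a rolling apply. So I pass a Series to delta, and inside the function I take the last index of the window, then go directly to the DataFrame and compute the required value (the average in the window - average with a shift d). What are you trying to get with this branch:
else (-1)*delta(x,int(window_size/2.0))
It would help to see the expected result.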
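The approach described above could look roughly like the following. This is a sketch under stated assumptions, not the original answer's code: it assumes a default RangeIndex (so the last index label equals the row position), it looks values up in the shifted column rather than the full DataFrame, and the names src and prev_end are hypothetical.

```python
import numpy as np
import pandas as pd

window_size = 10
rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
df = pd.DataFrame({'A': rng.random(100), 'B': rng.random(100) + 1})
src = df['A'].shift(1)  # the series the rolling window runs over

def delta(x: pd.Series, d: int) -> float:
    """Mean of the current window minus the mean of the window d rows earlier."""
    last = x.index[-1]        # label of the last row in the window
    n = len(x)
    prev_end = last - d       # last row of the earlier window
    if prev_end - n + 1 < 0:  # earlier window would start before the data
        return np.nan
    prev = src.iloc[prev_end - n + 1 : prev_end + 1]
    # skipna=False keeps NaN when the earlier window is incomplete.
    return x.mean() - prev.mean(skipna=False)

# raw=False passes each window as a Series, so its index is available.
a2 = src.rolling(2 * window_size).apply(
    lambda x: delta(x, 2 * window_size), raw=False
)
```

Because the function runs in Python once per window, this is likely to be much slower than computing the rolling means with the built-in rolling().mean() and subtracting the shifted result.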