NumPy equivalent of pd.DataFrame.shift inside a lambda

Posted by t30tvxxf 12 months ago in Other

I want to apply a lambda to a rolling window over a DataFrame column (*this is what I want*):

import numpy as np
import pandas as pd

window_size = 10
df = pd.DataFrame.from_dict({'A': np.random.rand(100), 'B': np.random.rand(100)+1})
def delta(x, d: int):
  return x - x.shift(d)  # I need shift here, not roll. That is, the beginning elements SHOULD be NaN

ck = lambda x: (x.min()-x.values[-1]) if delta(x.mean(),2*window_size) <= 0.05 * x.values[0] \
          else (-1)*delta(x,int(window_size/2.0))
a2 = df['A'].shift(1).rolling(2*window_size).apply(ck)

It raises the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1913, in apply
    return super().apply(
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1390, in apply
    return self._apply(
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 615, in _apply
    return self._apply_blockwise(homogeneous_func, name, numeric_only)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 468, in _apply_blockwise
    return self._apply_series(homogeneous_func, name)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 452, in _apply_series
    result = homogeneous_func(values)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 610, in homogeneous_func
    result = calc(values)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 607, in calc
    return func(x, start, end, min_periods, *numba_args)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1417, in apply_func
    return window_func(values, begin, end, min_periods)
  File "pandas/_libs/window/aggregations.pyx", line 1423, in pandas._libs.window.aggregations.roll_apply
  File "<stdin>", line 1, in <lambda>
  File "<stdin>", line 2, in delta
AttributeError: 'numpy.float64' object has no attribute 'shift'


Then I updated the delta function (this version raises no error, but the result is incorrect):

def delta(x, d: int):
  return x - np.roll(x,d)


It runs without any error. However, the result is incorrect for a2[:2*window_size], because delta does not return np.NaN where it should. (roll returns [3, 4, 5, 1, 2], whereas I expect shift to return [nan, nan, nan, 1, 2].)
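The roll-versus-shift difference described here can be reproduced on a plain array. The following is only a minimal sketch: `shift_array` is a hypothetical helper name, not a NumPy function, and it assumes a non-negative shift `d` on a 1-D array.

```python
import numpy as np

def shift_array(values, d: int):
    """Return a float copy of `values` moved right by d positions (d >= 0).

    Vacated leading positions are filled with NaN instead of wrapping around.
    """
    arr = np.asarray(values, dtype=float)  # float dtype so NaN can be stored
    out = np.full(arr.shape, np.nan)
    if d == 0:
        out[:] = arr
    elif d < arr.size:
        out[d:] = arr[:-d]
    # if d >= arr.size, every position stays NaN
    return out

x = np.array([1, 2, 3, 4, 5])
rolled = np.roll(x, 3)        # wraps around: 3, 4, 5, 1, 2
shifted = shift_array(x, 3)   # pads instead: three NaN, then 1.0, 2.0
print(rolled)
print(shifted)
```

Inside `rolling(...).apply(..., raw=True)` the window arrives as a NumPy array, so a helper like this could stand in for `Series.shift` on the window itself. It does not help with `x.mean()`, which is already a scalar.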
Then I manually set part of x to np.NaN inside the delta function:

def delta(x, d: int):
  x[-d:] = np.NaN # also tried x.values[-d:] = np.nan, but did not work either
  return x - np.roll(x,d)


But it raises this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1913, in apply
    return super().apply(
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1390, in apply
    return self._apply(
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 615, in _apply
    return self._apply_blockwise(homogeneous_func, name, numeric_only)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 468, in _apply_blockwise
    return self._apply_series(homogeneous_func, name)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 452, in _apply_series
    result = homogeneous_func(values)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 610, in homogeneous_func
    result = calc(values)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 607, in calc
    return func(x, start, end, min_periods, *numba_args)
  File "/home/my_user/.local/lib/python3.9/site-packages/pandas/core/window/rolling.py", line 1417, in apply_func
    return window_func(values, begin, end, min_periods)
  File "pandas/_libs/window/aggregations.pyx", line 1423, in pandas._libs.window.aggregations.roll_apply
  File "<stdin>", line 1, in <lambda>
  File "<stdin>", line 2, in delta
TypeError: 'numpy.float64' object does not support item assignment


How can I solve this? (By the way, I learned rolling + lambda from this post.)

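Update: with the delta function, I want to get the difference between df.rolling().mean() and df.rolling().mean().shift(). However, inside the lambda ck above, x.mean() becomes a scalar, so shift/roll does not work on it. How can I fix it?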
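For illustration, the quantity described in the update can be computed outside the lambda, where the rolling mean is still a full Series and `shift` is available. This is a sketch under assumptions: the names `ma` and `delta_ma` and the seeded generator are introduced here only for the example.

```python
import numpy as np
import pandas as pd

window_size = 10
rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
df = pd.DataFrame({'A': rng.random(100), 'B': rng.random(100) + 1})

# Rolling mean as a Series: one value per row, NaN until the window is full.
ma = df['A'].shift(1).rolling(2 * window_size).mean()

# Series.shift pads with NaN, so the difference is NaN wherever either
# the current mean or the mean 2 * window_size rows earlier is missing.
delta_ma = ma - ma.shift(2 * window_size)

print(delta_ma.first_valid_index())
```

Because `ma` is first valid at row 20 (one row lost to `shift(1)`, then a 20-row window), `delta_ma` only becomes valid 20 rows after that.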

Answer #1, by 5ktev3wc

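In my view, you need to compute the rolling mean of the column first. I could not find any example showing how to access all columns at once inside rolling. So I pass a Series to delta; inside the function I take the last index in the window, then go directly to the DataFrame and compute the required value (the average in the window - average with a shift d).
What are you trying to get here: else (-1)*delta(x,int(window_size/2.0))
It would be good to see the expected result.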

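import pandas as pd
import numpy as np

window_size = 10
df = pd.DataFrame.from_dict({'A': np.random.rand(100), 'B': np.random.rand(100) + 1})

# Precompute the rolling mean once, as a column, so it can be looked up by label.
df['ma'] = df['A'].shift(1).rolling(2 * window_size).mean()

def delta(x, d: int):
    # x is the current window (a Series); its last label marks the window's end.
    # Returns the rolling mean at that row minus the rolling mean d rows earlier.
    # Assumes the default integer RangeIndex and that last - d is a valid label.
    last = x.index[-1]
    return df.loc[last, 'ma'] - df.loc[last - d, 'ma']

ck = lambda x: (x.min() - x.values[-1]) if delta(x, 2 * window_size) <= 0.05 * x.values[0] \
    else (-1) * delta(x, int(window_size / 2.0))

a2 = df['A'].shift(1).rolling(2 * window_size).apply(ck)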
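A possible alternative that avoids `apply` altogether is to build each piece of ck as a whole Series and combine them with `np.where`. This is a sketch, not part of the answer above. It assumes the else branch is meant to be the negated change in the rolling mean over `window_size // 2` rows, which the question leaves open. The names `s`, `w`, `first` and `cond` are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

window_size = 10
w = 2 * window_size
rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
df = pd.DataFrame({'A': rng.random(100), 'B': rng.random(100) + 1})

s = df['A'].shift(1)
ma = s.rolling(w).mean()        # x.mean() for every window
win_min = s.rolling(w).min()    # x.min() for every window
first = s.shift(w - 1)          # x.values[0]: first element of each window
# x.values[-1], the last element of each window, is s itself.

# Comparisons with NaN evaluate to False, so rows where the long delta is
# not yet defined fall through to the else branch, as they do with apply.
cond = (ma - ma.shift(w)) <= 0.05 * first

a2 = pd.Series(
    np.where(cond, win_min - s, -(ma - ma.shift(window_size // 2))),
    index=df.index,
)
print(a2.first_valid_index())
```

The leading rows stay NaN for the same reason as with `shift`: whenever a required earlier value does not exist, the arithmetic propagates NaN rather than wrapping around as `np.roll` would.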
