Pandas resample with origin and closed

Posted by 9avjhtql on 2022-10-23 in Other

I have an inventory DataFrame, as shown below:

import pandas as pd

tidx = pd.date_range('2022-10-01', periods=15, freq='D')
data_frame = pd.DataFrame(1, columns=['inventory'], index=tidx)
data_frame.iloc[-2:] = 0
print(data_frame)

inventory
2022-10-01          1
2022-10-02          1
2022-10-03          1
2022-10-04          1
2022-10-05          1
2022-10-06          1
2022-10-07          1
2022-10-08          1
2022-10-09          1
2022-10-10          1
2022-10-11          1
2022-10-12          1
2022-10-13          1
2022-10-14          0
2022-10-15          0

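I want to sum over 7-day windows anchored on an arbitrary day of the week (here 2020-10-15). When I do this, I don't understand why the resulting bins are anchored on 2022-10-13: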

data_frame.resample("7D", closed = 'right', origin='2020-10-15').sum()

inventory
2022-09-29  6
2022-10-06  7
2022-10-13  0

The output I expect is:

inventory
2022-10-01  1
2022-10-08  7
2022-10-15  5

**Note:** my pandas version is 1.3.5.


Answer #1 (bvn4nwqk)

data_frame.resample("7D", origin='2022-10-15', closed='right', loffset='7D').sum()

gives:

inventory
2022-10-01          1
2022-10-08          7
2022-10-15          5
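
For what it's worth, here is a quick check of why the year in `origin` matters. It assumes that fixed-frequency bin edges sit at `origin` plus whole multiples of the frequency, which is consistent with both outputs shown in this thread. `edge_offset` is a hypothetical helper written for this illustration, not a pandas function.

```python
from datetime import date

FREQ_DAYS = 7
last_sample = date(2022, 10, 15)

def edge_offset(origin: date, sample: date, freq_days: int = FREQ_DAYS) -> int:
    """Days between `sample` and the closest bin edge at or before it,
    assuming edges are placed at origin + k * freq_days."""
    return (sample - origin).days % freq_days

# origin used in the question: 730 days before the last sample
offset_question = edge_offset(date(2020, 10, 15), last_sample)
# origin used in this answer: exactly on the last sample
offset_answer = edge_offset(date(2022, 10, 15), last_sample)

print(offset_question, offset_answer)
```

2020-10-15 is 730 days before 2022-10-15, and 730 leaves a remainder of 2 when divided by 7, so the edges land on 2022-10-13 rather than 2022-10-15. With `origin='2022-10-15'` the remainder is 0, and `loffset` then only moves each label from the left edge of its bin to the right edge.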

Note, however, that loffset is deprecated (since pandas 1.1, and removed in 2.0), so you can do this instead:

from pandas.tseries.frequencies import to_offset
df = data_frame.resample("7D", origin='2022-10-15', closed='right').sum()
df.index = df.index + to_offset("7D")
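
Another option that avoids shifting the index by hand is `label='right'`, which names each bin by its right edge. This is a minimal sketch rather than part of the original answer; it rebuilds the frame from the question so it can be run on its own.

```python
import pandas as pd

# Same frame as in the question
tidx = pd.date_range('2022-10-01', periods=15, freq='D')
data_frame = pd.DataFrame(1, columns=['inventory'], index=tidx)
data_frame.iloc[-2:] = 0

# closed='right' makes each bin (start, end];
# label='right' labels the bin with `end` instead of `start`.
weekly = data_frame.resample(
    "7D", origin='2022-10-15', closed='right', label='right'
).sum()
print(weekly)
```

This should give the same three rows (2022-10-01, 2022-10-08, 2022-10-15) as the shifted index above.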

A more elegant solution is:

data_frame.resample("7D", origin='end', closed='right').sum()
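
As a rough illustration of what `origin='end'` does (it needs pandas 1.3 or later): the bins are anchored on the last timestamp in the index, so the result moves if the data is extended. The sketch below compares it with passing that last timestamp explicitly; `explicit` is just an illustrative name, and the frame is rebuilt from the question.

```python
import pandas as pd

# Same frame as in the question
tidx = pd.date_range('2022-10-01', periods=15, freq='D')
data_frame = pd.DataFrame(1, columns=['inventory'], index=tidx)
data_frame.iloc[-2:] = 0

# Anchor the 7-day bins on the last timestamp in the index
anchored_end = data_frame.resample("7D", origin='end', closed='right').sum()

# Spelled out: same anchor, with bins labelled by their right edge
explicit = data_frame.resample(
    "7D", origin=data_frame.index[-1], closed='right', label='right'
).sum()

print(anchored_end)
print(explicit)
```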

Answer #2 (yzxexxkh)

I found a reasonable solution:

data_frame.resample("W-SAT", closed = 'right').sum()

inventory
2022-10-01  1
2022-10-08  7
2022-10-15  5

The only caveat is that you have to work out the weekday name of the anchor date yourself, i.e. SAT (for 2022-10-15).
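
One possible way to avoid hard-coding the weekday is to derive the rule from the anchor date. This is only a sketch: `anchor_day` and `rule` are illustrative names, and it relies on `Timestamp.day_name()` returning English names by default, whose first three letters match the `W-SUN` ... `W-SAT` anchors.

```python
import pandas as pd

# Same frame as in the question
tidx = pd.date_range('2022-10-01', periods=15, freq='D')
data_frame = pd.DataFrame(1, columns=['inventory'], index=tidx)
data_frame.iloc[-2:] = 0

anchor_day = pd.Timestamp('2022-10-15')

# 'Saturday' -> 'SAT' -> 'W-SAT'
rule = f"W-{anchor_day.day_name()[:3].upper()}"

weekly = data_frame.resample(rule, closed='right').sum()
print(rule)
print(weekly)
```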
