Pandas:按行的标准列出的每行在过去N小时内的出现次数

oyxsuwqo  于 2022-12-16  发布在  其他
关注(0)|答案(1)|浏览(107)

给定此数据集:

ID timestamp           type
 1 2022-12-12 01:00:00 TypeA
 2 2022-12-12 05:00:00 TypeA
 3 2022-12-12 06:00:00 TypeA
 4 2022-12-12 07:00.00 TypeB
 5 2022-12-13 00:00:00 TypeA
 6 2022-12-13 02:00:00 TypeB
 7 2022-12-13 23:00:00 TypeA

对于每一行,我想计算最近N小时内相同类型的行数,在本例中,N=24h,计算结果为:

ID timestamp           type  count
 1 2022-12-12 01:00:00 TypeA 0
 2 2022-12-12 05:00:00 TypeA 1
 3 2022-12-12 06:00:00 TypeA 2
 4 2022-12-12 07:00.00 TypeB 0
 5 2022-12-13 00:00:00 TypeA 3
 6 2022-12-13 02:00:00 TypeB 1
 7 2022-12-13 23:00:00 TypeA 1
fzsnzjdm

fzsnzjdm1#

我不知道这是不是你想要的?

In [96]: df.set_index("datetime").groupby("type").rolling("24h").count()
Out[96]:
                            ID  timestamp
type  datetime
TypeA 2022-12-12 01:00:00  1.0        1.0
      2022-12-12 05:00:00  2.0        2.0
      2022-12-12 06:00:00  3.0        3.0
      2022-12-13 00:00:00  4.0        4.0
      2022-12-13 23:00:00  2.0        2.0
TypeB 2022-12-12 07:00:00  1.0        1.0
      2022-12-13 02:00:00  2.0        2.0

In [125]: df.set_index("datetime").groupby("type")['ID'].rolling("23h", min_periods=0, closed="left").agg({"count":"count"})
Out[125]:
                           count
type  datetime
TypeA 2022-12-12 01:00:00    0.0
      2022-12-12 05:00:00    1.0
      2022-12-12 06:00:00    2.0
      2022-12-13 00:00:00    3.0
      2022-12-13 23:00:00    1.0
TypeB 2022-12-12 07:00:00    0.0
      2022-12-13 02:00:00    1.0

相关问题