R - How to count IDs by hour and date between two datetimes

Posted by kokeuurv on 2023-05-20 in Other

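I have several distinct IDs, each with a start datetime and an end datetime, and I would like to count how many IDs are working in each hour of each date for the data below. I am struggling with periods shorter than 5 minutes. An ID can log in and log out more than once; see ID 2 and ID 6.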

df <- structure(list(ID = c(1, 2, 2, 3, 4, 5, 6, 6, 7), Start = structure(c(1682980601.9993, 
1682976528.9996, 1682981213.9999, 1682976550, 1682977034.9995, 
1682976647.9992, 1682974125.9996, 1682976497.9993, 1682979153.9993
), class = c("POSIXct", "POSIXt"), tzone = "UTC"), End = structure(c(1683009115.9995, 
1682980675.9992, 1683010288.9996, 1683010949.9993, 1683008750.9992, 
1683007497, 1682976313.9993, 1683008357.9993, 1683008503.9998
), class = c("POSIXct", "POSIXt"), tzone = "UTC")), row.names = c(NA, 
-9L), class = c("tbl_df", "tbl", "data.frame"))

Here is the output I am looking for:

[image: expected output table]

Answer 1

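We can first truncate the start and end times to the hour, then adjust the end hour according to your 5-minute rule: if the session ends 5 minutes or less into an hour, that hour is not counted. This lets us build an hourly sequence from the start hour to the end hour for each row, and finally get the desired output by counting the distinct IDs in each hour.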
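
To make the truncation and the 5-minute rule concrete, here is a minimal sketch for a single end time. The timestamp and the variable names (`end`, `end_hour`, `mins_into_hour`, `last_counted_hour`) are made up for illustration and are not part of the data above.

```r
# Hypothetical end time, chosen for illustration: 2.5 minutes past 07:00 UTC
end <- as.POSIXct("2023-05-02 07:02:30", tz = "UTC")

# Truncate to the start of the hour; trunc() returns POSIXlt, so convert back
end_hour <- as.POSIXct(trunc(end, units = "hours"))

# Minutes spent inside that last hour, with the units stated explicitly
mins_into_hour <- as.numeric(difftime(end, end_hour, units = "mins"))

# 5-minute rule: with 5 minutes or less in the hour, fall back to the previous hour
last_counted_hour <- if (mins_into_hour > 5) end_hour else end_hour - 3600

mins_into_hour     # 2.5
last_counted_hour  # 06:00 on 2023-05-02, so the 07:00 hour is not counted
```

Stating `units = "mins"` in `difftime()` matters because subtracting two date-times with `-` lets R pick the units automatically, so a bare comparison with `5` could mean seconds, minutes or hours depending on the data.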

library(dplyr)
# df is the data frame from the question
df %>%
  mutate(Start_hour = as.POSIXct(trunc(Start, units = "hours")), # truncate start time to the hour
         End_hour = as.POSIXct(trunc(End, units = "hours")),     # truncate end time to the hour
         # 5-minute rule: keep the last hour only if more than 5 minutes fall in it
         End_hour = case_when(difftime(End, End_hour, units = "mins") > 5 ~ End_hour,
                              TRUE ~ End_hour - 3600)) %>%
  rowwise() %>%
  do(data.frame(ID = .$ID, time = seq(.$Start_hour, .$End_hour, by = "1 hour"))) %>% # one row per ID and hour
  group_by(time) %>%
  summarise(n = n_distinct(ID)) # count distinct IDs per hour

# A tibble: 11 x 2
   time                    n
   <dttm>              <int>
 1 2023-05-01 20:00:00     1
 2 2023-05-01 21:00:00     5
 3 2023-05-01 22:00:00     7
 4 2023-05-01 23:00:00     7
 5 2023-05-02 00:00:00     7
 6 2023-05-02 01:00:00     7
 7 2023-05-02 02:00:00     7
 8 2023-05-02 03:00:00     7
 9 2023-05-02 04:00:00     7
10 2023-05-02 05:00:00     7
11 2023-05-02 06:00:00     6
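
Two points go beyond the data above: a session that starts and ends within the first 5 minutes of the same hour gets an end hour earlier than its start hour, which `seq()` is likely to reject, and the question asks for counts by date and by hour. The following is a sketch under those assumptions, using a small made-up data frame (`toy`) rather than the asker's data. The `filter()` step and the `date` and `hour` columns are additions for illustration, not part of the answer above.

```r
library(dplyr)

# Hypothetical toy data: ID 1 has a normal session,
# ID 2 logs in and out within the first few minutes of one hour
toy <- data.frame(
  ID    = c(1, 2),
  Start = as.POSIXct(c("2023-05-01 23:10:00", "2023-05-02 00:01:00"), tz = "UTC"),
  End   = as.POSIXct(c("2023-05-02 01:30:00", "2023-05-02 00:04:00"), tz = "UTC")
)

hourly <- toy %>%
  mutate(Start_hour = as.POSIXct(trunc(Start, units = "hours")),
         End_hour   = as.POSIXct(trunc(End, units = "hours")),
         # 5-minute rule with explicit units
         End_hour   = if_else(difftime(End, End_hour, units = "mins") > 5,
                              End_hour, End_hour - 3600)) %>%
  filter(End_hour >= Start_hour) %>%  # drop sessions that have no countable hour
  rowwise() %>%
  do(data.frame(ID = .$ID, time = seq(.$Start_hour, .$End_hour, by = "1 hour"))) %>%
  group_by(time) %>%
  summarise(n = n_distinct(ID)) %>%
  # split the timestamp into separate date and hour columns (UTC)
  mutate(date = as.Date(time, tz = "UTC"),
         hour = as.integer(format(time, "%H", tz = "UTC")))

hourly
```

With this toy input, ID 2 is dropped by the filter, and ID 1 contributes the hours 23:00, 00:00 and 01:00 with `n` equal to 1 in each. Whether such short sessions should be dropped or counted is a choice the question leaves open.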
