How to calculate consecutive holidays based on a pattern using the dplyr package in R

7z5jn7bk  posted on 2023-02-17  in Other

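According to domain experts, holidays affect a store's sales, and sales are higher during consecutive holidays. Consecutive holidays are defined as follows:

    • A non-holiday that falls between two holidays is treated as a holiday.

If a day is not a holiday but the days immediately before and after it are holidays, that day counts as a holiday in the HL_UP column. The HL_DOWN column counts only the actual holidays. We only have the variable IS_HOLIDAY. How can the HL_UP and HL_DOWN columns be calculated in R with the dplyr package?
The data are as follows: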

data_temp <- structure(list(date_1 = c("2021-01-01", "2021-01-02", "2021-01-03", 
"2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07", "2021-01-08", 
"2021-01-09", "2021-01-10", "2021-01-11", "2021-01-12", "2021-01-13", 
"2021-01-14", "2021-01-15", "2021-01-16", "2021-01-17", "2021-01-18", 
"2021-01-19", "2021-01-20", "2021-01-21", "2021-01-22", "2021-01-23", 
"2021-01-24", "2021-01-25", "2021-01-26", "2021-01-27", "2021-01-28"
), IS_HOLIDAY = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 
0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1)), class = "data.frame", row.names = c(NA, 
-28L))

As a first step, I tried to define a new column that marks the days lying between two holidays:

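library(dplyr)
# Mark a non-holiday as a holiday (HL = 1) when the day before and the day after are holidays.
# default = 0 treats the missing neighbours of the first and last rows as non-holidays.
data_temp2 <- data_temp %>% mutate(
  HL = if_else(IS_HOLIDAY == 0 & lag(IS_HOLIDAY, 1, default = 0) == 1 & lead(IS_HOLIDAY, 1, default = 0) == 1, 1, IS_HOLIDAY))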

I still don't know how to calculate HL_UP and HL_DOWN.
Edit:
I think I have found a solution:

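# Number each run of equal HL values, then summarise within each run.
data_temp3 <- data_temp2 %>%
  group_by(group_seq = with(rle(HL), rep(seq_along(lengths), lengths))) %>%
  mutate(HL_UP = sum(HL),                # days in the run, bridged days included
         HL_DOWN = sum(IS_HOLIDAY)) %>%  # actual holidays only
  ungroup()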

I don't understand this part of the code:

with(rle(...), rep(seq_along(lengths), lengths))

mwkjh3gx  #1

We can also use zoo::rollapply here to decide whether IS_HOLIDAY should be changed, then count the rows (for UP) and sum IS_HOLIDAY (for DOWN).

library(dplyr)  # zoo must be installed as well; it is called as zoo::rollapply()

data_temp %>%
  # holiday2: IS_HOLIDAY with a single non-holiday between two holidays filled in
  mutate(holiday2 = zoo::rollapply(
    IS_HOLIDAY, 3,
    FUN = function(z) case_when(
      length(z) <= 2 ~ z[length(z)],  # partial (2-element) window at the ends of the series
      TRUE ~ if_else((all(z[c(1,3)] > 0) || z[2] > 0), 1, z[2])
    ),
    align = "center", partial = TRUE)) %>%
  # start a new group whenever holiday2 changes
  group_by(grp = cumsum(c(FALSE, diff(holiday2) != 0))) %>%
  # HL_UP counts every day of a holiday run; HL_DOWN counts only the actual holidays
  mutate(HL_UP = if_else(first(IS_HOLIDAY) > 0, n(), 0L),
         HL_DOWN = if_else(first(IS_HOLIDAY) > 0, sum(IS_HOLIDAY > 0), 0L)) %>%
  ungroup() %>%
  select(-grp) %>%
  print(n=99)
# # A tibble: 28 × 5
#    date_1     IS_HOLIDAY holiday2 HL_UP HL_DOWN
#    <chr>           <dbl>    <dbl> <int>   <int>
#  1 2021-01-01          0        0     0       0
#  2 2021-01-02          0        0     0       0
#  3 2021-01-03          0        0     0       0
#  4 2021-01-04          0        0     0       0
#  5 2021-01-05          0        0     0       0
#  6 2021-01-06          0        0     0       0
#  7 2021-01-07          1        1     1       1
#  8 2021-01-08          0        0     0       0
#  9 2021-01-09          0        0     0       0
# 10 2021-01-10          0        0     0       0
# 11 2021-01-11          0        0     0       0
# 12 2021-01-12          1        1     3       2
# 13 2021-01-13          0        1     3       2
# 14 2021-01-14          1        1     3       2
# 15 2021-01-15          0        0     0       0
# 16 2021-01-16          0        0     0       0
# 17 2021-01-17          1        1     5       4
# 18 2021-01-18          1        1     5       4
# 19 2021-01-19          0        1     5       4
# 20 2021-01-20          1        1     5       4
# 21 2021-01-21          1        1     5       4
# 22 2021-01-22          0        0     0       0
# 23 2021-01-23          0        0     0       0
# 24 2021-01-24          0        0     0       0
# 25 2021-01-25          0        0     0       0
# 26 2021-01-26          0        0     0       0
# 27 2021-01-27          1        1     2       2
# 28 2021-01-28          1        1     2       2
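
As for the `with(rle(...), rep(seq_along(lengths), lengths))` part asked about in the question: it gives every run of equal consecutive values its own number, which is what makes it usable inside `group_by()`. The short base-R sketch below unpacks it step by step. The vector `x` and the names `r`, `run_id` and `grp` are made up for illustration and are not part of the original data.

```r
# Toy vector, made up for illustration: 1 = holiday, 0 = non-holiday.
x <- c(0, 1, 1, 1, 0, 0, 1)

# rle() compresses x into runs of equal consecutive values.
r <- rle(x)
r$lengths  # 1 3 2 1  (how long each run is)
r$values   # 0 1 0 1  (the value repeated in each run)

# Number the runs 1..4, then repeat each number as often as its run is long.
# with(r, ...) only lets us write `lengths` instead of `r$lengths`.
run_id <- with(r, rep(seq_along(lengths), lengths))
run_id     # 1 2 2 2 3 3 4

# The grouping expression used in the answer marks the same runs, numbered from 0:
# a new run starts wherever a value differs from the previous one.
grp <- cumsum(c(FALSE, diff(x) != 0))
grp        # 0 1 1 1 2 2 3
```

Both expressions yield one label per row, so either can serve as the grouping key. dplyr 1.1.0 and later also provide `consecutive_id()` for this purpose.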
