根据从索引日期开始的天数制作 Dataframe 宽格式

zc0qhyus  于 2023-04-18  发布在  其他
关注(0)|答案(1)|浏览(105)

我试图弄清楚如何为每个客户创建120天宽格式内的多个客户订单。如果客户在一年内下了订单,则有多个120天的间隔,因此每个客户可能有多个行,每行对应于从索引日期起120天的间隔。
我有一个dataframe:

customerid <- c("A1", "A1", "A1", "A1", "A1", "A2", "A2", "A2")
orderid <- c("1", "2", "3", "4", "5", "6", "7", "8")
orderdate <- c("2018-09-14", "2018-09-15", "2019-09-15", "2020-01-02", "2020-01-03", "2018-08-10", "2020-08-21","2020-08-22")
df <- data.frame(customerid, orderid, orderdate)

结果应为:

谢谢大家!
编辑:索引日期将是客户ID的下一个日期,该日期福尔斯上一个间隔的120天范围。谢谢!

tmb3ates

tmb3ates1#

library(dplyr) # v1.1.0+ for .by
df %>%
  mutate(orderdate = as.Date(orderdate),
         customer_era = cumsum(orderdate - lag(orderdate, default = as.Date("2000-01-01")) > 120),
         .by = customerid) %>%
  mutate(instance = row_number(), .by = c(customerid, customer_era)) %>%
  tidyr::pivot_wider(names_from = instance, names_glue = "{.value}_{.name}",
              values_from = c(orderid, orderdate))

或者使用早期dplyr的版本:

df %>%
  group_by(customerid) %>%
  mutate(orderdate = as.Date(orderdate),
         customer_era = cumsum(orderdate - lag(orderdate, default = as.Date("2000-01-01")) > 120)) %>%
  group_by(customerid, customer_era) %>%
  mutate(instance = row_number()) %>%
  ungroup() %>%
  tidyr::pivot_wider(names_from = instance, names_glue = "{.value}_{.name}",
              values_from = c(orderid, orderdate))

结果

# A tibble: 4 × 8
  customerid customer_era orderid_orderid_1 orderid_orderid_2 orderid_orderid_3 orderdate_orderdate_1 orderdate_orderdate_2 orderdate_orderdate_3
  <chr>             <int> <chr>             <chr>             <chr>             <date>                <date>                <date>               
1 A1                    1 1                 2                 NA                2018-09-14            2018-09-15            NA                   
2 A1                    2 3                 4                 5                 2019-09-15            2020-01-02            2020-01-03           
3 A2                    1 6                 NA                NA                2018-08-10            NA                    NA                   
4 A2                    2 7                 8                 NA                2020-08-21            2020-08-22            NA

相关问题