Using cut and breaks to group dates into weeks

xytpbqjk  posted on 2023-04-18  in  Other

I have some data that looks like this:

#   order_date quantity
# 1 2021-01-01       54
# 2 2021-01-01       32
# 3 2021-01-02       42
# 4 2021-01-01      132
# 5 2021-01-01       56
# 6 2021-01-02       88
# 7 2021-01-08       99
# 8 2021-01-10       54

When I use the code below:

df$week <- cut(as.Date(df$order_date), breaks="week")

I get the following result:

#   order_date quantity       week
# 1 2021-01-01       54 2020-12-28
# 2 2021-01-01       32 2020-12-28
# 3 2021-01-02       42 2020-12-28
# 4 2021-01-01      132 2020-12-28
# 5 2021-01-01       56 2020-12-28
# 6 2021-01-02       88 2020-12-28
# 7 2021-01-08       99 2021-01-04
# 8 2021-01-10       54 2021-01-04

Since my data starts on 1/1/21, I want the weekly grouping to start from 1/1/21 rather than from 12/28/2020 (the preceding Monday), so that my groups look like this:

#   order_date quantity       week
# 1 2021-01-01       54 2021-01-01
# 2 2021-01-01       32 2021-01-01
# 3 2021-01-02       42 2021-01-01
# 4 2021-01-01      132 2021-01-01
# 5 2021-01-01       56 2021-01-01
# 6 2021-01-02       88 2021-01-01
# 7 2021-01-08       99 2021-01-08
# 8 2021-01-10       54 2021-01-08

I am open to other libraries or syntax.
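The 2020-12-28 label is a Monday: `cut.Date()` has a `start.on.monday` argument that defaults to `TRUE`, so `breaks = "week"` aligns the bins to calendar weeks rather than to the first date in the data. A small check of that behaviour, using a subset of the dates above (the variable names are only illustrative):

```r
order_date <- as.Date(c("2021-01-01", "2021-01-02", "2021-01-08", "2021-01-10"))

# Default: weeks are aligned to Monday (start.on.monday = TRUE)
monday_weeks <- as.character(cut(order_date, breaks = "week"))

# start.on.monday = FALSE aligns to Sunday instead; neither option
# anchors the weeks at the first date in the data
sunday_weeks <- as.character(cut(order_date, breaks = "week",
                                 start.on.monday = FALSE))

# POSIXlt weekday codes: 0 = Sunday, 1 = Monday, ..., 6 = Saturday
as.POSIXlt(as.Date(monday_weeks[1]))$wday
as.POSIXlt(as.Date(sunday_weeks[1]))$wday
```

So switching `start.on.monday` alone cannot produce weeks that begin on 2021-01-01; the breaks have to be supplied explicitly or computed some other way.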


wlwcrazw  #1

You can pass cut() explicit breaks built with seq.Date over the date range extended by one week. No packages are needed.

dat |> 
  transform(week=cut(order_date,
                     breaks=seq.Date(min(order_date), max(order_date) + 7, 
                                     by='week')))
#    order_date quantity       week
# 1  2021-01-01       54 2021-01-01
# 2  2021-01-01       32 2021-01-01
# 3  2021-01-01       42 2021-01-01
# 4  2021-01-01      132 2021-01-01
# 5  2021-01-01       56 2021-01-01
# 6  2021-01-02       88 2021-01-01
# 7  2021-01-03       99 2021-01-01
# 8  2021-01-03       54 2021-01-01
# 9  2021-01-08       23 2021-01-08
# 10 2021-01-10       11 2021-01-08

**Note:** R >= 4.1 is required for the native pipe `|>`.

*Data:*
dat <- structure(list(order_date = structure(c(18628, 18628, 18628, 
18628, 18628, 18629, 18630, 18630, 18635, 18637), class = "Date"), 
    quantity = c(54, 32, 42, 132, 56, 88, 99, 54, 23, 11)), class = "data.frame", row.names = c(NA, 
-10L))
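The `+ 7` in the breaks matters because `cut()` on dates uses left-closed intervals by default (`right = FALSE`), so any date on or after the last break falls outside every bin. A minimal sketch of the difference, on a shortened copy of the dates (`short_breaks` and `long_breaks` are names made up for this illustration):

```r
order_date <- as.Date(c("2021-01-01", "2021-01-02", "2021-01-03",
                        "2021-01-08", "2021-01-10"))

# Breaks that stop at max(order_date): only 2021-01-01 and 2021-01-08
short_breaks <- seq.Date(min(order_date), max(order_date), by = "week")

# Breaks extended by one week: 2021-01-01, 2021-01-08 and 2021-01-15
long_breaks <- seq.Date(min(order_date), max(order_date) + 7, by = "week")

# Dates on or after the last break become NA with the short breaks
week_short <- cut(order_date, breaks = short_breaks)
# Every date gets a week label with the extended breaks
week_long <- cut(order_date, breaks = long_breaks)

sum(is.na(week_short))
as.character(week_long)
```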

elcex8rz  #2

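You can use lubridate::floor_date and set the first day of the week manually. Here week_start = 5 means Friday, which is the weekday of 2021-01-01.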

dat$week <- lubridate::floor_date(dat$order_date, "weeks", week_start = 5)

dat
#   order_date quantity       week
#1  2021-01-01       54 2021-01-01
#2  2021-01-01       32 2021-01-01
#3  2021-01-01       42 2021-01-01
#4  2021-01-01      132 2021-01-01
#5  2021-01-01       56 2021-01-01
#6  2021-01-02       88 2021-01-01
#7  2021-01-03       99 2021-01-01
#8  2021-01-03       54 2021-01-01
#9  2021-01-02       23 2021-01-01
#10 2021-01-10       11 2021-01-08

Data

order_date <- c("2021-01-01", "2021-01-01","2021-01-01","2021-01-01","2021-01-01","2021-01-02","2021-01-03","2021-01-03","2021-01-02","2021-01-10")
quantity <- c(54,32,42,132,56,88,99,54,23,11)
dat <- data.frame(order_date=as.Date(order_date), quantity)
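The hard-coded `week_start = 5` only fits data whose first date is a Friday. As a hedged variation on the same idea, the week start could be derived from the earliest date with `lubridate::wday()`, which returns 1 for Monday through 7 for Sunday when called with `week_start = 1`. The names `first_day` and `start_day` below are illustrative only:

```r
library(lubridate)

order_date <- as.Date(c("2021-01-01", "2021-01-02", "2021-01-03",
                        "2021-01-08", "2021-01-10"))

# Weekday of the earliest date, numbered 1 = Monday ... 7 = Sunday
first_day <- min(order_date)
start_day <- wday(first_day, week_start = 1)

# Floor every date to the most recent occurrence of that weekday
week <- floor_date(order_date, "week", week_start = start_day)
week
```

This assumes `order_date` contains no missing values, since `min()` would otherwise return `NA`.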

ppcbkaq5  #3

Here is an approach using my package timeplyr, which always builds the sequence from the start date unless told otherwise.
time_summarisev() uses findInterval() internally.

# remotes::install_github("NicChr/timeplyr")
library(timeplyr)
dat$week <- time_summarisev(dat$order_date, by = "week", 
                            unique = FALSE, sort = FALSE)
dat
#>    order_date quantity       week
#> 1  2021-01-01       54 2021-01-01
#> 2  2021-01-01       32 2021-01-01
#> 3  2021-01-01       42 2021-01-01
#> 4  2021-01-01      132 2021-01-01
#> 5  2021-01-01       56 2021-01-01
#> 6  2021-01-02       88 2021-01-01
#> 7  2021-01-03       99 2021-01-01
#> 8  2021-01-03       54 2021-01-01
#> 9  2021-01-02       23 2021-01-01
#> 10 2021-01-10       11 2021-01-08

Aggregating by multiples of a week is also supported.

dat$week2 <- time_summarisev(dat$order_date, by = "2 weeks", 
                             unique = FALSE, sort = FALSE)
# For comparison, lubridate::floor_date() rejects multi-week units:
dat$Week2 <- lubridate::floor_date(dat$order_date, "2 weeks", week_start = 5)
#> Error in validate_rounding_nunit(.Call(C_parse_unit, as.character(unit))): Rounding with week > 1 is not supported. Use aseconds for arbitrary units.
dat
#>    order_date quantity       week      week2
#> 1  2021-01-01       54 2021-01-01 2021-01-01
#> 2  2021-01-01       32 2021-01-01 2021-01-01
#> 3  2021-01-01       42 2021-01-01 2021-01-01
#> 4  2021-01-01      132 2021-01-01 2021-01-01
#> 5  2021-01-01       56 2021-01-01 2021-01-01
#> 6  2021-01-02       88 2021-01-01 2021-01-01
#> 7  2021-01-03       99 2021-01-01 2021-01-01
#> 8  2021-01-03       54 2021-01-01 2021-01-01
#> 9  2021-01-02       23 2021-01-01 2021-01-01
#> 10 2021-01-10       11 2021-01-08 2021-01-01
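To illustrate the findInterval() idea mentioned in this answer, here is a rough base R sketch. It is not timeplyr's actual implementation; `bin_dates()` and its `n_days` argument are hypothetical names introduced only for this example, and the input is assumed to contain no missing values.

```r
order_date <- as.Date(c("2021-01-01", "2021-01-02", "2021-01-03",
                        "2021-01-08", "2021-01-10"))

# Hypothetical helper (not part of timeplyr): map each date to the start
# of its bin, where bins are n_days wide and begin at the earliest date
bin_dates <- function(x, n_days = 7) {
  # Left edges of the bins: min(x), min(x) + n_days, ...
  starts <- seq(min(x), max(x), by = n_days)
  # findInterval() returns, for each date, the index of the last edge <= date
  starts[findInterval(as.numeric(x), as.numeric(starts))]
}

week  <- bin_dates(order_date)       # 1-week bins starting 2021-01-01
week2 <- bin_dates(order_date, 14)   # 2-week bins starting 2021-01-01
```

Because the left edges start at `min(x)`, every date is at or after the first edge, so findInterval() never returns 0 here.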
