Using apply to retrieve a list of date vectors in R

Posted by ulydmbyx on 2023-06-19 in Other

I have a data frame df with 2 columns and 10k rows:

df <- 
structure(list(date1 = c("2015-09-29", "2018-07-24", "2021-07-20", 
"2016-12-19"), date2 = c("2015-11-26", "2018-09-26", "2021-09-21", 
"2017-03-17")), class = "data.frame", row.names = c(NA, 4L))

| date1 | date2 |
| ---------- | ---------- |
| 2015-09-29 | 2015-11-26 |
| 2018-07-24 | 2018-09-26 |
| 2021-07-20 | 2021-09-21 |
| 2016-12-19 | 2017-03-17 |

I have a way to retrieve the dates between date1 and date2, using a loop:

library(lubridate)

p <- list()
for(i in 1:nrow(df)){
  p[[i]] <- seq(floor_date(df$date1[i], "months"), 
                      floor_date(df$date2[i], "months"), 
                      by = "months") %m+% months(1) - days(1)
}

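The resulting list has as many elements as the data frame has rows. Each element is a vector (possibly of a different length) holding the dates between date1 and date2, where each date is the last day of a month.
The point is that I am trying to make this more efficient: it currently takes about 1 minute. I understand that apply (and the other functions in its family) is more efficient than a loop, but I can't get it to work.
How can I achieve this?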

Answer 1 (tzcvj98z)

Something like this?

df[] <- lapply(df, lubridate::floor_date, "months")
Map(seq.Date, df$date1, df$date2, "months")
#> [[1]]
#> [1] "2015-09-01" "2015-10-01" "2015-11-01"
#> 
#> [[2]]
#> [1] "2018-07-01" "2018-08-01" "2018-09-01"
#> 
#> [[3]]
#> [1] "2021-07-01" "2021-08-01" "2021-09-01"
#> 
#> [[4]]
#> [1] "2016-12-01" "2017-01-01" "2017-02-01" "2017-03-01"
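
The output above holds the first day of each month, while the question asks for the last day of each month. A minimal sketch of one way to get month ends with `Map`, reusing the `%m+% months(1) - days(1)` step from the question's loop. It assumes the character columns are converted with `as.Date` first; `month_ends` is a hypothetical helper name, not part of lubridate.

```r
library(lubridate)

df <- data.frame(
  date1 = c("2015-09-29", "2018-07-24", "2021-07-20", "2016-12-19"),
  date2 = c("2015-11-26", "2018-09-26", "2021-09-21", "2017-03-17")
)

# Assumption: convert to Date explicitly, then floor to the first of the month
d1 <- floor_date(as.Date(df$date1), "month")
d2 <- floor_date(as.Date(df$date2), "month")

# Hypothetical helper: monthly sequence of month starts, shifted to month ends
month_ends <- function(from, to) {
  seq(from, to, by = "month") %m+% months(1) - days(1)
}

# One Date vector per row of df
p <- Map(month_ends, d1, d2)
p[[1]]
#> [1] "2015-09-30" "2015-10-31" "2015-11-30"
```

Whether this is noticeably faster than the original loop on 10k rows would need to be measured; the main change is that the flooring is done once on the whole column instead of once per row.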

Answer 2 (pqwbnv8z)

If execution time matters, try data.table:

library(data.table)

do.call(c,           # get a vector instead of a data.table element per entry
  split(data.table(  # construct the monthly date
    d1 = as.Date(paste0(sub("-\\d{2}$", "", df$date1), "-01")), 
    d2 = as.Date(paste0(sub("-\\d{2}$", "", df$date2), "-01")))[, 
      .(Dates = seq.Date(d1, d2, "month")), by=1:nrow(df)],
  by="nrow", keep.by=FALSE))
$`1.Dates`
[1] "2015-09-01" "2015-10-01" "2015-11-01"

$`2.Dates`
[1] "2018-07-01" "2018-08-01" "2018-09-01"

$`3.Dates`
[1] "2021-07-01" "2021-08-01" "2021-09-01"

$`4.Dates`
[1] "2016-12-01" "2017-01-01" "2017-02-01" "2017-03-01"
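
As with the first answer, this result contains month starts rather than the month ends the question asks for. A rough sketch of a month-end variant that uses data.table for the grouping and only base R for the date arithmetic. The helpers `first_of_month` and `month_ends` and the `row` column are hypothetical names introduced for illustration.

```r
library(data.table)

dt <- data.table(
  date1 = as.Date(c("2015-09-29", "2018-07-24", "2021-07-20", "2016-12-19")),
  date2 = as.Date(c("2015-11-26", "2018-09-26", "2021-09-21", "2017-03-17"))
)

# Hypothetical helper: first day of the month of each date
first_of_month <- function(x) as.Date(format(x, "%Y-%m-01"))

# Hypothetical helper: last day of every month from `from` to `to`.
# The day before the first of the following month is the month end.
month_ends <- function(from, to) {
  firsts <- seq(first_of_month(from), first_of_month(to), by = "month")
  nexts  <- seq(firsts[1], by = "month", length.out = length(firsts) + 1L)[-1L]
  nexts - 1
}

dt[, row := .I]                                   # explicit row id to group on
res <- dt[, .(Dates = month_ends(date1, date2)), by = row]

# Back to a list with one Date vector per original row
p <- split(res$Dates, res$row)
```

Keeping the long-format `res` table instead of splitting it back into a list may be more convenient if the dates are joined or aggregated afterwards.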
