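How can I calculate the time between two date columns while keeping the first (earliest) date in each group as the reference? For example, for id N04, the reference date_1 should stay 2009-07-10 until the next id. I think I am close, but I have not managed to find the correct solution.
Please find a minimal working example below: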
id <- c("N02", "N02", "N03", "N03", "N04", "N04", "N04", "N04", "N04", "N04")
date_1 <- c ("2008-03-15", "2008-04-15", "2008-06-15", "2008-07-15", "2009-07-10", "2009-07-13", "2009-07-15", "2009-07-16", "2009-07-17", "2009-07-20")
date_2 <- c ("2008-03-15", "2008-04-15", "2008-06-15", "2008-07-15", "2009-07-10", "2009-07-13", "2009-07-15", "2009-07-16", "2009-07-17", "2009-07-20")
df1 <- data.frame (id, date_1, date_2)
> df1
id date_1 date_2
1 N02 2008-03-15 2008-03-15
2 N02 2008-04-15 2008-04-15
3 N03 2008-06-15 2008-06-15
4 N03 2008-07-15 2008-07-15
5 N04 2009-07-10 2009-07-10
6 N04 2009-07-13 2009-07-13
7 N04 2009-07-15 2009-07-15
8 N04 2009-07-16 2009-07-16
9 N04 2009-07-17 2009-07-17
10 N04 2009-07-20 2009-07-20
My failed attempt:
library(dplyr)
df2 <- df1 %>% group_by (id) %>% mutate (diff = difftime (date_2, lag (date_1, default = date_1[1]), unit = "day"))
> df2
# A tibble: 10 × 4
# Groups: id [3]
id date_1 date_2 diff
<chr> <chr> <chr> <drtn>
1 N02 2008-03-15 2008-03-15 0.00000 days
2 N02 2008-04-15 2008-04-15 30.95833 days
3 N03 2008-06-15 2008-06-15 0.00000 days
4 N03 2008-07-15 2008-07-15 30.00000 days
5 N04 2009-07-10 2009-07-10 0.00000 days
6 N04 2009-07-13 2009-07-13 3.00000 days
7 N04 2009-07-15 2009-07-15 2.00000 days
8 N04 2009-07-16 2009-07-16 1.00000 days
9 N04 2009-07-17 2009-07-17 1.00000 days
10 N04 2009-07-20 2009-07-20 3.00000 days
However, I would like something like this:
id <- c("N02", "N02", "N03", "N03", "N04", "N04", "N04", "N04", "N04", "N04")
date_1 <- c ("2008-03-15", "2008-04-15", "2008-06-15", "2008-07-15", "2009-07-10", "2009-07-13", "2009-07-15", "2009-07-16", "2009-07-17", "2009-07-20")
date_2 <- c ("2008-03-15", "2008-04-15", "2008-06-15", "2008-07-15", "2009-07-10", "2009-07-13", "2009-07-15", "2009-07-16", "2009-07-17", "2009-07-20")
diff <- c("0.00000 days", "30.95833 days", "0.00000 days", "30.00000 days", "0.00000 days", "3.00000 days", "5.00000 days", "6.00000 days", "7.00000 days", "10.0000 days")
df2 <- data.frame (id, date_1, date_2, diff)
> df2
id date_1 date_2 diff
1 N02 2008-03-15 2008-03-15 0.00000 days
2 N02 2008-04-15 2008-04-15 30.95833 days
3 N03 2008-06-15 2008-06-15 0.00000 days
4 N03 2008-07-15 2008-07-15 30.00000 days
5 N04 2009-07-10 2009-07-10 0.00000 days
6 N04 2009-07-13 2009-07-13 3.00000 days
7 N04 2009-07-15 2009-07-15 5.00000 days
8 N04 2009-07-16 2009-07-16 6.00000 days
9 N04 2009-07-17 2009-07-17 7.00000 days
10 N04 2009-07-20 2009-07-20 10.0000 days
Thanks in advance for your help. Charles
1 Answer
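You are almost there: just use [[1]] (or dplyr::first()) instead of lag(), so that every row in a group is compared against that group's first date_1 rather than against the previous row.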
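The answer's code did not survive in this copy, so the following is only a minimal sketch of what the suggestion might look like, reusing the df1 from the question. The as.Date() conversion and the final ungroup() are my own additions, not something the answer states; converting to Date is assumed here so that the differences come out as whole days.

```r
library(dplyr)

# Same data as in the question
df1 <- data.frame(
  id = c("N02", "N02", "N03", "N03", "N04", "N04", "N04", "N04", "N04", "N04"),
  date_1 = c("2008-03-15", "2008-04-15", "2008-06-15", "2008-07-15", "2009-07-10",
             "2009-07-13", "2009-07-15", "2009-07-16", "2009-07-17", "2009-07-20"),
  date_2 = c("2008-03-15", "2008-04-15", "2008-06-15", "2008-07-15", "2009-07-10",
             "2009-07-13", "2009-07-15", "2009-07-16", "2009-07-17", "2009-07-20")
)

df2 <- df1 %>%
  # Assumption: parse the strings as Date so no time zone is involved
  mutate(across(c(date_1, date_2), as.Date)) %>%
  group_by(id) %>%
  # first(date_1) is the first row of each group, i.e. the fixed reference;
  # date_1[[1]] would work the same way
  mutate(diff = difftime(date_2, first(date_1), units = "days")) %>%
  ungroup()

df2
```

This relies on the rows already being sorted by date within each id, as they are in the example. If that is not guaranteed, min(date_1) in place of first(date_1) would pick the earliest date regardless of row order. The 30.95833 in the question's output is likely a side effect of difftime() parsing the character strings as date-times in a local time zone with a daylight-saving change; with Date columns that row should come out as 31 days.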