R: Merge two data frames by closest date [duplicate]

Posted by t5fffqht on 2023-04-27 in Other

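This question already has answers here:

Merge nearest date, and related variables from a another dataframe by group (4 answers)
Merge data based on nearest date R (2 answers)
Closed 7 days ago.

I have two large data frames, dfA and dfB. I have generated simple examples of them here: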

dfA = data.frame(id=c("Apple", "Banana", "Carrot", "Dates", "Egg"),
                    Answer_Date=as.Date(c("2013-12-07", "2014-12-07", "2015-12-07", "2016-12-07", "2017-12-07" )),
                    x1 = c(1,  2,  3,  4,  5),
                    x2 = c(10, 20, 30, 40, 50))

Browse[2]> dfA
      id Answer_Date x1 x2
1  Apple  2013-12-07  1 10
2 Banana  2014-12-07  2 20
3 Carrot  2015-12-07  3 30
4  Dates  2016-12-07  4 40
5    Egg  2017-12-07  5 50

dfB = data.frame(id=c("Apple", "Apple", "Banana", "Banana", "Banana"),
                    Answer_Date=as.Date(c("2013-12-05", "2014-12-07", "2015-12-10", "2018-11-07", "2019-11-07" )),
                    x3 = c(5,  4,  3,  2,  1),
                    x4 = c(50, 40, 30, 20, 10))
Browse[2]> dfB
      id Answer_Date x3 x4
1  Apple  2013-12-05  5 50
2  Apple  2014-12-07  4 40
3 Banana  2014-12-10  3 30
4 Banana  2018-11-07  2 20
5 Banana  2019-11-07  1 10

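I want to merge them by closest date, so that I get the items that exist in both dfA and dfB, matched *exactly* on id and matched *as closely as possible* on Answer_Date (i.e. the pair with the smallest absolute difference between the two dates).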

dfC
      id Answer_Date.x Answer_Date.y x1 x2 x3 x4
1  Apple    2013-12-07    2013-12-05  1 10  5 50
2 Banana    2014-12-07    2014-12-10  2 20  3 30
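
To make the matching rule concrete, here is a small base R check for a single id, using the Apple rows from the example data above. It only illustrates "smallest absolute date difference" and is not a full merge; the names `target`, `candidates`, `gap_days` and `closest` are made up for this sketch.

```r
# Apple's date in dfA and the two candidate Apple dates in dfB
target     <- as.Date("2013-12-07")
candidates <- as.Date(c("2013-12-05", "2014-12-07"))

# Absolute gap in days between the target and each candidate
gap_days <- abs(as.numeric(candidates - target))

# The closest candidate is the one with the smallest gap
closest <- candidates[which.min(gap_days)]
```

Here the gaps are 2 and 365 days, so the 2013-12-05 row is the one that should be paired with Apple, as in the first row of dfC.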

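Unfortunately, I have been struggling with merge() and have tried various solutions found on StackOverflow, but they did not solve my problem and only left me confused. Could someone kindly point me to the correct solution, ideally with a brief explanation of why it works?
Sincerely, and thanks in advance,
Thomas Philips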

r

Source: https://stackoverflow.com/questions/63751172/r-merge-two-dataframes-by-closest-date

1 Answer

fwzugrvs1#

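Left-join dfB onto dfA, compute the difference between the two dates in each row, and keep the row with the smallest difference for each id.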

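library(dplyr)

# Pair every dfA row with every dfB row that shares its id
left_join(dfA, dfB, by = "id") %>%
  mutate(date_diff = abs(Answer_Date.x - Answer_Date.y)) %>%
  group_by(id) %>%
  # Keep the pair(s) with the smallest gap; ids missing from dfB (NA gap) drop out
  filter(date_diff == min(date_diff)) %>%
  select(id, Answer_Date.x, Answer_Date.y, starts_with("x"), date_diff)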

The output is:

# A tibble: 2 x 8
# Groups:   id [2]
  id     Answer_Date.x Answer_Date.y    x1    x2    x3    x4 date_diff
  <fct>  <date>        <date>        <dbl> <dbl> <dbl> <dbl> <drtn>   
1 Apple  2013-12-07    2013-12-05        1    10     5    50 2 days   
2 Banana 2014-12-07    2014-12-10        2    20     3    30 3 days
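
One caveat with this approach: `filter(date_diff == min(date_diff))` keeps every row that ties for the minimum, and grouping by `id` alone returns one match per id rather than one per row of dfA. The following is a sketch of a variant, assuming dplyr 1.0.0 or later (for `slice_min()`), that groups by both `id` and `Answer_Date.x` and keeps a single row on ties. It uses "2014-12-10" as the third dfB date, and `inner_join()` stands in for `left_join()` so that ids absent from dfB are dropped explicitly. It is an alternative under those assumptions, not the answerer's original code.

```r
library(dplyr)

dfA <- data.frame(id = c("Apple", "Banana", "Carrot", "Dates", "Egg"),
                  Answer_Date = as.Date(c("2013-12-07", "2014-12-07", "2015-12-07",
                                          "2016-12-07", "2017-12-07")),
                  x1 = c(1, 2, 3, 4, 5),
                  x2 = c(10, 20, 30, 40, 50))

# Third date set to 2014-12-10 (an assumption about the intended data)
dfB <- data.frame(id = c("Apple", "Apple", "Banana", "Banana", "Banana"),
                  Answer_Date = as.Date(c("2013-12-05", "2014-12-07", "2014-12-10",
                                          "2018-11-07", "2019-11-07")),
                  x3 = c(5, 4, 3, 2, 1),
                  x4 = c(50, 40, 30, 20, 10))

dfC <- inner_join(dfA, dfB, by = "id") %>%
  # Gap in days, stored as a plain number so it sorts like any numeric column
  mutate(date_diff = abs(as.numeric(Answer_Date.x - Answer_Date.y))) %>%
  # One group per dfA row, so repeated ids in dfA each get their own match
  group_by(id, Answer_Date.x) %>%
  # Keep exactly one closest row per group, even when two dates tie
  slice_min(date_diff, n = 1, with_ties = FALSE) %>%
  ungroup()
```

With `with_ties = FALSE`, which of two equally close dates is kept depends on row order, so sort dfB first if a particular tie-break (for example, the earlier date) matters.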

By the way, in the sample code, the third Answer_Date in the definition of dfB should be "2014-12-10", not "2015-12-10".
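
Since the question mentions struggling with `merge()`, the same idea can also be sketched in base R: merge on `id` to get every candidate pair, compute the absolute gap, then keep the row with the smallest gap within each id. This is an illustration rather than part of the original answer. It uses "2014-12-10" as the third dfB date and, like the dplyr answer, assumes each id appears only once in dfA; `pairs` and `min_gap` are names invented for the sketch.

```r
dfA <- data.frame(id = c("Apple", "Banana", "Carrot", "Dates", "Egg"),
                  Answer_Date = as.Date(c("2013-12-07", "2014-12-07", "2015-12-07",
                                          "2016-12-07", "2017-12-07")),
                  x1 = c(1, 2, 3, 4, 5),
                  x2 = c(10, 20, 30, 40, 50))

# Third date set to 2014-12-10 (an assumption about the intended data)
dfB <- data.frame(id = c("Apple", "Apple", "Banana", "Banana", "Banana"),
                  Answer_Date = as.Date(c("2013-12-05", "2014-12-07", "2014-12-10",
                                          "2018-11-07", "2019-11-07")),
                  x3 = c(5, 4, 3, 2, 1),
                  x4 = c(50, 40, 30, 20, 10))

# Inner join on id: one row per (dfA row, dfB row) pair sharing an id.
# The clashing Answer_Date columns get the suffixes .x (dfA) and .y (dfB).
pairs <- merge(dfA, dfB, by = "id")

# Absolute gap in days between the two dates of each pair
pairs$date_diff <- abs(as.numeric(pairs$Answer_Date.x - pairs$Answer_Date.y))

# Per-id minimum gap, repeated so it lines up with the rows of pairs
min_gap <- ave(pairs$date_diff, pairs$id, FUN = min)

# Keep only the rows that attain their id's minimum
dfC <- pairs[pairs$date_diff == min_gap, ]
rownames(dfC) <- NULL
```

The reason `merge()` alone is not enough is that it only matches on equality; the "closest" part has to be a separate filtering step applied to the merged pairs.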
