R: Join two datasets, keeping only the rows dated immediately before a given date

szqfcxe2  posted on 2023-05-20  in Other

I have a dataset that records the date on which each patient had surgery:

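library(dplyr)

data1 <- tribble(
  ~MRN, ~date_in_room,
  1, "2015-01-02"
) %>% mutate(date_in_room = as.Date(date_in_room))

and a second dataset with many blood pressure measurements taken in the days before surgery. There are three kinds of blood pressure measurement: SBP, DBP, and mean: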

data2<-tribble(
  ~MRN, ~event, ~date, ~value,
  1, "sbp", "2014-12-31", 120,
  1, "sbp", "2015-01-01", 119,
  1, "sbp", "2015-01-02", 125,
  1, "dbp", "2015-01-01", 80,
  1, "dbp", "2015-01-02", 84,
  1, "mean", "2015-01-01", 100
)%>% mutate(date = as.Date(date))

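I want to join these two datasets but keep, for each type of blood pressure measurement, only the value from the closest date *before* the surgery date (usually the day before, sometimes earlier).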

desired <- tribble(
  ~MRN, ~event, ~date, ~value, ~date_in_room,
  1, "sbp", "2015-01-01", 119, "2015-01-02",
  1, "dbp", "2015-01-01", 80, "2015-01-02",
  1, "mean", "2015-01-01", 100, "2015-01-02"
) %>% mutate(
  date_in_room = as.Date(date_in_room),
  date = as.Date(date)
)

I imagine it is something like

data1 |> 
  group_by(MRN, event) |> 
  left_join(data2, by=MRN) |>
...?

Thanks for any ideas!

q8l4jmvw  #1

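You can use the closest() helper inside join_by() (available in dplyr 1.1.0 and later). With date_in_room > date it keeps, for each surgery row, only the measurements on the latest date strictly before the surgery date. See the documentation for details.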

library(dplyr)

left_join(data1, data2, join_by(MRN, closest(date_in_room > date)))

# A tibble: 3 × 5
    MRN date_in_room event date       value
  <dbl> <date>       <chr> <date>     <dbl>
1     1 2015-01-02   sbp   2015-01-01   119
2     1 2015-01-02   dbp   2015-01-01    80
3     1 2015-01-02   mean  2015-01-01   100
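
One caveat worth checking on real data: because event is not part of the join key, closest() looks for the nearest earlier date per MRN across all events. That works above because every event has a reading on 2015-01-01. If the events were last measured on different days, a variant along these lines might be closer to what the question asks for. This is a sketch, not part of the original answer: it swaps closest() for a plain inequality join followed by slice_max() per event, and the names surgery, bp, and per_event, along with the altered "mean" date, are made up for illustration.

```r
library(dplyr)

# Hypothetical data: here "mean" was last measured two days before surgery,
# while "sbp" and "dbp" were measured the day before.
surgery <- tibble(MRN = 1, date_in_room = as.Date("2015-01-02"))

bp <- tribble(
  ~MRN, ~event, ~date, ~value,
  1, "sbp",  "2014-12-31", 120,
  1, "sbp",  "2015-01-01", 119,
  1, "sbp",  "2015-01-02", 125,
  1, "dbp",  "2015-01-01", 80,
  1, "mean", "2014-12-31", 100
) %>% mutate(date = as.Date(date))

per_event <- surgery %>%
  # keep every measurement taken strictly before the surgery date
  inner_join(bp, join_by(MRN, date_in_room > date)) %>%
  # then keep the latest such measurement for each patient, surgery, and event
  group_by(MRN, date_in_room, event) %>%
  slice_max(date, n = 1) %>%
  ungroup()

per_event
```

With this data, the closest() join from the answer would drop the "mean" reading, since its date is not the nearest one for that MRN; the grouped version keeps one row per event. Note that slice_max() keeps ties by default, so two readings of the same event on the same day would both be returned.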

Data

I modified your data so that the date columns are of class Date.

data1<-tribble(
  ~MRN, ~date_in_room,
  1, "2015-01-02"
) %>% mutate(date_in_room = as.Date(date_in_room))

data2<-tribble(
  ~MRN, ~event, ~date, ~value,
  1, "sbp", "2014-12-31", 120,
  1, "sbp", "2015-01-01", 119,
  1, "sbp", "2015-01-02", 125,
  1, "dbp", "2015-01-01", 80,
  1, "dbp", "2015-01-02", 84,
  1, "mean", "2015-01-01", 100
) %>% mutate(date = as.Date(date))
