Creating a new variable in R based on the results of other variables within a group

dxpyg8gm  posted on 2023-02-20 in Other

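This is a similar/follow-up question to "R: How to code new variable based on grouped variable and conditioned on earlier row", but it differs in that there may be two match runs within a donor.
I have a data file of organ donors. I am looking at donated lungs, of which each donor has two.
If the lungs are split (left and right) and offered for donation, each is matched to recipients separately ("matchrun") and offered down the list of eligible recipients in order ("sequence") until there is a match.
If a lung matches a recipient, it is sent to that recipient ("organ_placed").
If a lung is not matched, it keeps moving through the sequence, and organ_placed stays NA through the highest sequence number.
I want to create a new variable holding the outcome of each match run, so that if one lung was placed and the other was not, it tells you that the other lung was discarded. For example, see donor 2 in the data: the left lung was placed, but the right lung found no match.
For donor 3, the first match run found no match, but the other lung's match run did.
I think it should be something like group_by(donorid, matchrun), but how do I create a condition based on the match run?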

library(tibble)  # tribble() comes from the tibble package
library(dplyr)

data <- tribble(
  ~donorid, ~matchrun, ~sequence, ~organ_placed,
  2, 3, 1, NA,
  2, 3, 2, NA,
  2, 3, 3, "L",
  2, 4, 1, NA,
  2, 4, 2, NA,
  2, 4, 3, NA,
  3, 5, 1, NA,
  3, 5, 1, NA,
  3, 5, 1, NA,
  3, 6, 1, NA,
  3, 6, 2, NA,
  3, 6, 3, "L"
)

desired_outcome <- tribble(
  ~donorid, ~matchrun, ~sequence, ~organ_placed, ~organ,
  2, 3, 1, NA, NA, 
  2, 3, 2, NA, NA, 
  2, 3, 3, "L", "Left Single",
  2, 4, 1, NA, NA,
  2, 4, 2, NA, NA, 
  2, 4, 3, NA, "Right Discarded",
  3, 5, 1, NA, NA,
  3, 5, 1, NA, NA,
  3, 5, 1, NA, "Right Discarded",
  3, 6, 1, NA, NA,
  3, 6, 2, NA, NA,
  3, 6, 3, "L", "Left Single")

zyfwsgd6 #1

You can try this:

data %>% 
  # donor level: which lung was placed ("B" unless exactly one distinct value)
  group_by(donorid) %>% 
  mutate(temp = ifelse(n_distinct(organ_placed, na.rm = TRUE) == 1, unique(na.omit(organ_placed)), "B")) %>% 
  # match-run level: label placements and flag the last row of unplaced runs
  group_by(matchrun, .add = TRUE) %>% 
  mutate(organ = case_when(organ_placed == "L" ~ "Left Single",
                           organ_placed == "R" ~ "Right Single",
                           # n() marks the last row of the run even when sequence values repeat
                           all(is.na(organ_placed)) & row_number() == n() & temp == "L" ~ "Right Discarded", 
                           all(is.na(organ_placed)) & row_number() == n() & temp == "R" ~ "Left Discarded")) %>%
  ungroup()

Output

   donorid matchrun sequence organ_placed temp  organ       
 1       1        1        1 NA           B     NA          
 2       1        1        2 NA           B     NA          
 3       1        1        3 L            B     Left Single 
 4       1        2        1 NA           B     NA          
 5       1        2        2 NA           B     NA          
 6       1        2        3 R            B     Right Single
 7       2        3        1 NA           L     NA          
 8       2        3        2 NA           L     NA          
 9       2        3        3 L            L     Left Single 
10       2        4        1 NA           L     NA          
11       2        4        2 NA           L     NA          
12       2        4        3 NA           L     Right Discarded

aurhwmvo #2

Update: we have to add matchrun to the grouping. The previous solution has been removed:

data %>% 
  group_by(donorid, matchrun) %>% 
  mutate(outcome = case_when(organ_placed == "L" ~ "Left Single",
                             organ_placed == "R" ~ "Right Single",
                             organ_placed == "B" ~ "Bilateral",
                             (is.na(organ_placed) & 
                                row_number() == max(row_number())) & 
                               "L" %in% organ_placed ~ "Right Discarded",
                             (is.na(organ_placed) & 
                                row_number() == max(row_number())) & 
                               "R" %in% organ_placed ~ "Left Discarded",
                             TRUE ~ NA_character_))

# Groups:   donorid, matchrun [4]
   donorid matchrun sequence organ_placed outcome    
     <dbl>    <dbl>    <dbl> <chr>        <chr>      
 1       2        3        1 NA           NA         
 2       2        3        2 NA           NA         
 3       2        3        3 L            Left Single
 4       2        4        1 NA           NA         
 5       2        4        2 NA           NA         
 6       2        4        3 NA           NA         
 7       3        5        1 NA           NA         
 8       3        5        1 NA           NA         
 9       3        5        1 NA           NA         
10       3        6        1 NA           NA         
11       3        6        2 NA           NA         
12       3        6        3 L            Left Single
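
In the output above, the unplaced match runs stay NA: `"L" %in% organ_placed` is evaluated inside each donorid/matchrun group, and a run with no placement never contains "L". Below is a minimal sketch of one way around this, assuming the donor-level placement is computed before the match runs are split out. The helper columns `left_placed` and `right_placed` are names made up for this illustration, and the sample data is repeated so the block runs on its own.

```r
library(dplyr)
library(tibble)

data <- tribble(
  ~donorid, ~matchrun, ~sequence, ~organ_placed,
  2, 3, 1, NA,
  2, 3, 2, NA,
  2, 3, 3, "L",
  2, 4, 1, NA,
  2, 4, 2, NA,
  2, 4, 3, NA,
  3, 5, 1, NA,
  3, 5, 1, NA,
  3, 5, 1, NA,
  3, 6, 1, NA,
  3, 6, 2, NA,
  3, 6, 3, "L"
)

result <- data %>%
  # Donor level: was a left / right lung placed in any match run?
  # (left_placed / right_placed are hypothetical helper columns)
  group_by(donorid) %>%
  mutate(left_placed  = any(organ_placed == "L", na.rm = TRUE),
         right_placed = any(organ_placed == "R", na.rm = TRUE)) %>%
  # Match-run level: label placements, then flag the last row of a run
  # in which nothing was placed
  group_by(matchrun, .add = TRUE) %>%
  mutate(outcome = case_when(
    organ_placed == "L" ~ "Left Single",
    organ_placed == "R" ~ "Right Single",
    organ_placed == "B" ~ "Bilateral",
    all(is.na(organ_placed)) & row_number() == n() & left_placed  ~ "Right Discarded",
    all(is.na(organ_placed)) & row_number() == n() & right_placed ~ "Left Discarded",
    TRUE ~ NA_character_
  )) %>%
  ungroup() %>%
  select(-left_placed, -right_placed)

result
```

On the sample data this should fill `outcome` the same way as the `organ` column of `desired_outcome`. If a donor had both lungs placed and a further run with no placement, the first matching branch ("Right Discarded") would win, so that case would need its own rule.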

rdlzhqv9 #3

We can use

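library(data.table)
library(stringr)
# row counter within each donor/match run
setDT(data)[, seq2 := rowid(donorid, matchrun)]
# label the placed lungs
data[, organ := str_replace_all(organ_placed,
                                setNames(c("Left Single", "Right Single"), c("L", "R")))]
# on rows where the counter equals its overall maximum (the last row of each
# run here), keep an existing label; otherwise mark the donor's other lung as discarded
data[seq2 == max(seq2),
     organ := fcase(!is.na(organ), organ,
                    default = str_replace_all(
                      setdiff(c("Left Single", "Right Single"), organ),
                      setNames(c("Left Discarded", "Right Discarded"),
                               c("Left Single", "Right Single")))),
     donorid][, seq2 := NULL][]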
Output
> data
    donorid matchrun sequence organ_placed           organ
 1:       2        3        1         <NA>            <NA>
 2:       2        3        2         <NA>            <NA>
 3:       2        3        3            L     Left Single
 4:       2        4        1         <NA>            <NA>
 5:       2        4        2         <NA>            <NA>
 6:       2        4        3         <NA> Right Discarded
 7:       3        5        1         <NA>            <NA>
 8:       3        5        1         <NA>            <NA>
 9:       3        5        1         <NA> Right Discarded
10:       3        6        1         <NA>            <NA>
11:       3        6        2         <NA>            <NA>
12:       3        6        3            L     Left Single
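
The filter `seq2 == max(seq2)` compares against the maximum over the whole table, so it relies on every match run having the same number of rows. Here is a minimal sketch of a variant that picks the last row of each unplaced run with `.I[.N]` instead. The table `dt`, the helper column `placed_side`, and the index vector `idx` are names assumed for this illustration, and the sample data is rebuilt so the block runs on its own.

```r
library(data.table)

dt <- data.table(
  donorid      = c(2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3),
  matchrun     = c(3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6),
  sequence     = c(1, 2, 3, 1, 2, 3, 1, 1, 1, 1, 2, 3),
  organ_placed = c(NA, NA, "L", NA, NA, NA, NA, NA, NA, NA, NA, "L")
)

# Label the placed lungs; rows without a placement stay NA
dt[, organ := fcase(organ_placed == "L", "Left Single",
                    organ_placed == "R", "Right Single")]

# Donor level: which side was placed (NA if none); hypothetical helper column
dt[, placed_side := na.omit(organ_placed)[1], by = donorid]

# Row index of the last row of every match run in which nothing was placed
idx <- dt[, if (all(is.na(organ_placed))) .I[.N], by = .(donorid, matchrun)]$V1

# On those rows, mark the donor's other lung as discarded
dt[idx, organ := fcase(placed_side == "L", "Right Discarded",
                       placed_side == "R", "Left Discarded")]
dt[, placed_side := NULL]

dt[]
```

Because the last row is found per donorid/matchrun group, this sketch does not depend on runs having equal length. It only looks at the first placed side per donor, so a donor with both lungs placed would need extra handling.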
