R: Filter groups by NA cases in two columns

Posted by dluptydi on 2023-05-11 in Other

df <- data.frame(x=c("s1","s1","s1","s1","s2","s2","s2","s2","s3","s3","s3","s3"), y=c("g1","g2","g3","g4","g1","g2","g3","g4","g1","g2","g3","g4"), z1=c(1,2,3,2,4,5,6,6,3,2,4,NA), z2=c(NA,1,2,3,1,2,3,1,1,2,1,NA))

I want to keep only the groups that contain no NA values. I thought this would work:

columns <- c("z1", "z2")
df %>% group_by(x) %>% filter(all(!is.na(!!columns)))

But it does not seem to filter anything.

Answer 1 (4uqofj5v)

One option is to use across:

library(dplyr, warn.conflicts = FALSE)

columns <- c("z1", "z2")

df %>%
  group_by(x) %>%
  filter(all(across(all_of(columns), ~ !is.na(.x))))

#> # A tibble: 4 × 4
#> # Groups:   x [1]
#>   x     y        z1    z2
#>   <chr> <chr> <dbl> <dbl>
#> 1 s2    g1        4     1
#> 2 s2    g2        5     2
#> 3 s2    g3        6     3
#> 4 s2    g4        6     1
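
For context on why the attempt in the question keeps every row: `columns` is a character vector, so `!!columns` injects the strings "z1" and "z2" themselves rather than the data columns of those names. A small base R sketch of what the condition effectively evaluates to in each group (the names `na_check` and `condition` are only illustrative):

```r
columns <- c("z1", "z2")

# is.na() is applied to the two strings, not to the z1/z2 data columns
na_check <- is.na(columns)   # neither string is NA

# so the condition is TRUE for every group and no rows are dropped
condition <- all(!na_check)
print(condition)
```

Wrapping the names in a selection helper such as `all_of()` inside `across()` is what makes dplyr look up the actual columns.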

Answer 2 (fkvaft9z)

An approach using if_all inside filter, with the columns vector:

library(dplyr)

columns <- c("z1", "z2")

df %>% 
  group_by(x) %>% 
  filter(if_all(!!columns, ~ all(!is.na(.x))))
# A tibble: 4 × 4
# Groups:   x [1]
  x     y        z1    z2
  <chr> <chr> <dbl> <dbl>
1 s2    g1        4     1
2 s2    g2        5     2
3 s2    g3        6     3
4 s2    g4        6     1

Or using the tidyselect helper matches:

library(dplyr)

df %>% 
  group_by(x) %>% 
  filter(if_all(matches("z[12]"), ~ all(!is.na(.x))))
# A tibble: 4 × 4
# Groups:   x [1]
  x     y        z1    z2
  <chr> <chr> <dbl> <dbl>
1 s2    g1        4     1
2 s2    g2        5     2
3 s2    g3        6     3
4 s2    g4        6     1
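
If the column names live in a character vector, `all_of()` states that intent explicitly. Below is a minimal sketch that wraps the approach above in a helper. The function name `keep_complete_groups` and its arguments are hypothetical (not from the answer), and it assumes dplyr 1.0.4 or later, where `if_all()` is available:

```r
library(dplyr)

df <- data.frame(
  x  = rep(c("s1", "s2", "s3"), each = 4),
  y  = rep(c("g1", "g2", "g3", "g4"), times = 3),
  z1 = c(1, 2, 3, 2, 4, 5, 6, 6, 3, 2, 4, NA),
  z2 = c(NA, 1, 2, 3, 1, 2, 3, 1, 1, 2, 1, NA)
)

# Hypothetical helper: keep only groups in which none of `cols` contains NA
keep_complete_groups <- function(data, group_col, cols) {
  data %>%
    group_by(across(all_of(group_col))) %>%
    # each column yields one TRUE/FALSE per group; if_all() combines them with &
    filter(if_all(all_of(cols), ~ !anyNA(.x))) %>%
    ungroup()
}

result <- keep_complete_groups(df, "x", c("z1", "z2"))
print(result)
```

The final `ungroup()` is a design choice so that later steps do not silently operate per group; drop it if the grouping should be kept.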

Answer 3 (owfi6suc)

Using the sjmisc package:

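library(dplyr)   # group_by(), add_count(), filter()
library(sjmisc)  # row_count()
df %>%
  group_by(x) %>%
  row_count(., count = NA) %>%  # count NA values in each row (new column: rowcount)
  add_count(wt = rowcount) %>%  # sum the row counts within each group (new column: n)
  filter(n == 0)                # keep only groups whose NA total is zero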
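
As a dependency-free cross-check, the same filtering can be sketched in base R. This is not one of the posted answers, and the names `has_na`, `bad_groups`, and `result` are only illustrative:

```r
df <- data.frame(
  x  = rep(c("s1", "s2", "s3"), each = 4),
  y  = rep(c("g1", "g2", "g3", "g4"), times = 3),
  z1 = c(1, 2, 3, 2, 4, 5, 6, 6, 3, 2, 4, NA),
  z2 = c(NA, 1, 2, 3, 1, 2, 3, 1, 1, 2, 1, NA)
)
columns <- c("z1", "z2")

# rows with at least one NA in the columns of interest
has_na <- !complete.cases(df[columns])

# groups that contain any such row
bad_groups <- unique(df$x[has_na])

# keep only the rows that belong to groups without NA
result <- df[!df$x %in% bad_groups, ]
print(result)
```

Unlike the dplyr versions, this returns a plain data.frame that keeps the original row names.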
