R: Create a new column by group based on values in two different columns

tv6aics1  posted on 2023-03-05  in Other

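I want to create a new column, grouped by "sp.name", that holds the value of "number" when both "young" and "adult" are present in "stage" for that group; otherwise the new column should be 0.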

df <- data.frame(sp.name= c('a','a', 'b', 'b' ,'c', 'd' ),
                 number=c(2,2,3, 3,4,4),
                 stage= c('adult', 'young', 'young','adult', 'adult', 'young'))

This is what I tried.

df %>%
  group_by(sp.name) %>%
  mutate(new_column = ifelse('young' %in% stage & 'adult' %in% stage, 
                                             number[stage == 'adult'], 0))

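But my code also copies the value onto the "young" rows of the new column; I only need it on the rows where stage is "adult".
Expected output:
| sp.name | number | stage | new_column |
| --- | --- | --- | --- |
| a | 2 | adult | 2 |
| a | 2 | young | 0 |
| b | 3 | young | 0 |
| b | 3 | adult | 3 |
| c | 4 | adult | 0 |
| d | 4 | young | 0 |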
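
The reason the attempt also fills the "young" rows can be shown in base R. `ifelse()` returns a vector shaped like its test, and the test in the attempt is a single TRUE/FALSE per group, so only one value comes back and `mutate()` recycles it across every row of the group. A minimal sketch for group "a" only (the names `stage_a`, `number_a`, `group_test`, and `row_test` are made up for illustration):

```r
# Values for group "a" only, copied from df
stage_a  <- c("adult", "young")
number_a <- c(2, 2)

# Group-level test: a single logical value for the whole group
group_test <- "young" %in% stage_a & "adult" %in% stage_a
length(group_test)      # 1

# ifelse() returns something as long as its test, so only one value comes back
scalar_result <- ifelse(group_test, number_a[stage_a == "adult"], 0)
length(scalar_result)   # 1; mutate() would recycle it to both rows

# Adding a row-level condition makes the test as long as the group
row_test <- group_test & stage_a == "adult"
vector_result <- ifelse(row_test, number_a, 0)
vector_result           # 2 0
```

So the fix is to combine the group-level check with a row-level `stage == "adult"` check, so that the test has one element per row.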

1bqhqjot1#

df %>% 
  group_by(sp.name) %>% 
  mutate(new = (any(stage == 'adult') & any(stage == 'young') & stage == 'adult') * number)

#> # A tibble: 6 x 4
#> # Groups:   sp.name [4]
#>   sp.name number stage   new
#>   <chr>    <dbl> <chr> <dbl>
#> 1 a            2 adult     2
#> 2 a            2 young     0
#> 3 b            3 young     0
#> 4 b            3 adult     3
#> 5 c            4 adult     0
#> 6 d            4 young     0
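
The multiplication works because R coerces TRUE to 1 and FALSE to 0 in arithmetic, so the logical mask either keeps `number` or zeroes it. As a rough illustration only, the same mask can be built in base R without dplyr; `both_by_group` and `has_both` are names invented for this sketch and are not part of the answer above.

```r
df <- data.frame(sp.name = c("a", "a", "b", "b", "c", "d"),
                 number  = c(2, 2, 3, 3, 4, 4),
                 stage   = c("adult", "young", "young", "adult", "adult", "young"))

# One TRUE/FALSE per group: does the group contain both stages?
both_by_group <- tapply(df$stage, df$sp.name,
                        function(s) all(c("young", "adult") %in% s))

# Expand the per-group flag back to one value per row by looking up the group name
has_both <- both_by_group[as.character(df$sp.name)]

# Row-level mask times number: TRUE keeps the value, FALSE gives 0
df$new <- as.numeric(has_both & df$stage == "adult") * df$number
df$new   # 2 0 0 3 0 0
```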

e5njpo682#

Using an ifelse condition:

df %>% 
mutate(new_column = ifelse( sp.name == "a" & stage %in% c("young", "adult"), number , 0))

  sp.name number stage new_column
1       a      2 adult          2
2       a      2 young          2
3       b      3 young          0
4       b      3 adult          0
5       c      4 adult          0
6       d      4 young          0

krcsximq3#

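You need the groups that contain both stage == "young" and stage == "adult" (a group-level condition, checked with any()), and within those groups only the rows where stage == "adult" (a row-level condition):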

df %>%
  group_by(sp.name) %>%
  mutate(new_column = ifelse(any(stage == "young") & any(stage == "adult") & stage == "adult", 
                             number[stage == 'adult'], 0))

  sp.name number stage new_column
1 a            2 adult          2
2 a            2 young          0
3 b            3 young          0
4 b            3 adult          3
5 c            4 adult          0
6 d            4 young          0
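
One caveat worth checking: passing `number[stage == 'adult']` as the `yes` argument gives `ifelse()` a vector shorter than the test, which it recycles by position. That is harmless for this df, where each group has at most one adult row, but it may misalign if a group has several adult rows with different numbers. A small sketch with a hypothetical group (these values are not in the original data):

```r
# Hypothetical group with two adult rows and different numbers
stage  <- c("adult", "young", "adult")
number <- c(1, 2, 3)
test   <- any(stage == "young") & any(stage == "adult") & stage == "adult"

# yes is c(1, 3); ifelse() recycles it to c(1, 3, 1) and then takes
# positions 1 and 3, so the last row gets 1 instead of 3
subset_yes <- ifelse(test, number[stage == "adult"], 0)
subset_yes   # 1 0 1

# Passing the whole column keeps yes aligned with the rows
full_yes <- ifelse(test, number, 0)
full_yes     # 1 0 3
```

If that case can occur in your data, using plain `number` as the `yes` argument is probably the safer choice.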

hs1rzwqc4#

Using data.table:

library(data.table)
setDT(df)[, new_column := number *(all(c("young", "adult") %chin% stage) & 
    stage == "adult"), sp.name]
Output:
> df
   sp.name number stage new_column
1:       a      2 adult          2
2:       a      2 young          0
3:       b      3 young          0
4:       b      3 adult          3
5:       c      4 adult          0
6:       d      4 young          0
