In R, how to construct a variable that indicates when a value first occurs

4zcjmb1e  posted on 2023-05-04  in R

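I want to create the variable "qualification_first" shown in "df2" from a panel dataset like "df1". It should be 1 only in the first year in which qualification equals 1, and 0 otherwise.
How can I do that?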

df1 <- data.frame(id = c("Tony"),
                  year = 2015:2020,
                  qualification = c(0, 0, 0, 1, 1, 1))

df2 <- data.frame(id = c("Tony"),
                  year = 2015:2020,
                  qualification_first = c(0, 0, 0, 1, 0, 0))
mftmpeh8  #1

Here is another approach, assuming qualification only ever goes from 0 to 1 and never higher.

library(dplyr)

df1 %>% 
  mutate(qualification_first = if_else(lag(qualification, default = 0) != qualification, 1, 0),
         .by = id)

#>     id year qualification qualification_first
#> 1 Tony 2015             0                   0
#> 2 Tony 2016             0                   0
#> 3 Tony 2017             0                   0
#> 4 Tony 2018             1                   1
#> 5 Tony 2019             1                   0
#> 6 Tony 2020             1                   0

Data from the OP:

df1 <- data.frame(id = c("Tony"),
                  year = 2015:2020,
                  qualification = c(0, 0, 0, 1, 1, 1))

Created on 2023-04-27 with reprex v2.0.2
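
The lag comparison above flags every change from the previous year, so it matches "first occurrence" only while qualification never drops back to 0. Here is a minimal sketch of that difference, not a definitive fix. The panel, the id "Ann", and the column name changed are made up for illustration, and rows are assumed to be sorted by year within each id.

```r
library(dplyr)

# Hypothetical panel: "Ann" loses the qualification in 2018 and regains it in 2019
panel <- data.frame(
  id = rep(c("Tony", "Ann"), each = 5),
  year = rep(2015:2019, times = 2),
  qualification = c(0, 0, 1, 1, 1,
                    0, 1, 1, 0, 1)
)

result <- panel %>%
  group_by(id) %>%
  mutate(
    # lag-based flag: 1 whenever the value differs from the previous year
    changed = if_else(lag(qualification, default = 0) != qualification, 1, 0),
    # first-occurrence flag: 1 only on the first row where qualification == 1
    qualification_first = as.integer(qualification == 1 &
                                       cumsum(qualification == 1) == 1)
  ) %>%
  ungroup()

print(result)
```

For Tony both columns agree. For Ann, changed is also 1 in 2018 and 2019, while qualification_first is 1 only in 2016.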

ix0qys7i  #2

Try this:

library(dplyr)
df1 %>% 
  group_by(id) %>%
  mutate(qualification_first = +(row_number() == match(1, qualification))) %>%
  ungroup()
Output:
# A tibble: 6 × 4
  id     year qualification qualification_first
  <chr> <int>         <dbl>               <int>
1 Tony   2015             0                   0
2 Tony   2016             0                   0
3 Tony   2017             0                   0
4 Tony   2018             1                   1
5 Tony   2019             1                   0
6 Tony   2020             1                   0

Or use duplicated():

df1 %>%
   mutate(qualification_first = +(!duplicated(pick(id, 
      qualification)) & qualification == 1))
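
One thing to watch with the match() version: for an id that never reaches qualification == 1, match() returns NA and the whole group comes out as NA rather than 0. A possible way around this, sketched here in base R with ave(), is to pass nomatch = 0L. The panel and the id "Ann" are invented for illustration, and rows are assumed to be sorted by year within each id.

```r
# Hypothetical panel: "Ann" never obtains the qualification
panel <- data.frame(
  id = rep(c("Tony", "Ann"), each = 4),
  year = rep(2015:2018, times = 2),
  qualification = c(0, 1, 1, 1,
                    0, 0, 0, 0)
)

# Within each id, flag the row whose position equals the first position of a 1.
# nomatch = 0L makes match() return 0 when there is no 1, so no position matches
# and the group gets all zeros instead of NA.
panel$qualification_first <- ave(
  panel$qualification, panel$id,
  FUN = function(q) as.numeric(seq_along(q) == match(1, q, nomatch = 0L))
)

print(panel)
```

The same nomatch = 0L argument should also work inside the dplyr call above, since match() is a base R function.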
91zkwejq  #3

You can also use row numbers (.I) in data.table.

library(data.table)

first <- setDT(df1)[qualification==1,.I[1L], by=id]$V1
df1[,qualification_first := ifelse(.I %in% first, 1,0)][]

Output:

Index: <qualification>
        id  year qualification qualification_first
    <char> <int>         <num>               <num>
 1:   Tony  2015             0                   0
 2:   Tony  2016             0                   0
 3:   Tony  2017             0                   0
 4:   Tony  2018             1                   1
 5:   Tony  2019             1                   0
 6:   Tony  2020             1                   0

Created on 2023-04-27 with reprex v2.0.2
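
If the intermediate vector first is not needed, the same result can probably be had in a single grouped := step. This is only a sketch under the assumption that rows are sorted by year within each id. The object name dt is mine, and it is built from scratch so the snippet runs on its own.

```r
library(data.table)

# Same data as the OP's df1, built directly as a data.table
dt <- data.table(id = "Tony",
                 year = 2015:2020,
                 qualification = c(0, 0, 0, 1, 1, 1))

# Per id, compare each within-group position with the position of the first 1.
# nomatch = 0L keeps ids that never qualify at 0 instead of NA.
dt[, qualification_first := as.integer(
       seq_len(.N) == match(1, qualification, nomatch = 0L)),
   by = id]

print(dt)
```

This gives an integer column rather than the numeric one produced by ifelse() above.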
