How to pivot a longitudinal dataset longer in R

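I have a longitudinal dataset that I created by merging several datasets on a person identifier column. The columns are ordered as: person identifier, a_sex, a_countryofbirth, a_health, a_educationstatus, b_sex, b_countryofbirth, b_health, b_educationstatus, c_sex, c_countryofbirth, c_health, c_educationstatus, and so on up to l. All variables starting with a_ belong to the first wave, those starting with b_ to the second wave, and so on.
I tried to use pivot_longer to create a new variable called Wave, so that my table looks like this: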

Table: InPreg_transformed

Person ID    Wave    Sex    CountryofBirth    Health

I tried this code, among others, but it did not work.

InPreg_transformed<- InPregDF %>%
  pivot_longer(cols = contains("_"),
               names_to = c("_value", "Wave"),
               names_pattern = "(_+)"

Other code I tried:

InPreg_transformed<- InPreg %>%

          pivot_longer(cols = contains("."), names_to = c(".value", 
          "Wave"), names_pattern = "(.+).(.+)")

 summary(InPreg_transformed)

Please assist.

To make sure I understand the question correctly, I created a random, meaningless example with n "individuals".
First, load the libraries:

library(tibble)
library(dplyr)
library(tidyr)

Then create the dataset:

sex <- c("Male", "Female")
europe <- c("Belarus", "Belgium", "Bulgaria",
            "Croatia", "CzechRepublic", "Estonia", "France", 
            "Germany", "Hungary", "Ireland", "Italia", "Latvia", "Lithuania", 
            "Luxembourg", "Netherlands", "Poland", "Portugal", "Romania", 
            "Slovakia", "Slovenia", "Spain")
health <- c("Excellent", "Good", "Fair", "Poor")
education <- c("High School", "Bachelor's", "Master's", "PhD")
n <- 10

wdat <- tibble(
  ID = sprintf("Ind%02i", 1:n), # IDs
  a_sex = sample(sex, n, replace = TRUE),
  a_countryofbirth = sample(europe, n, replace = TRUE),
  a_health = sample(health, n, replace = TRUE),
  a_educationstatus = sample(education, n, replace = TRUE),
  b_sex = sample(sex, n, replace = TRUE),
  b_countryofbirth = sample(europe, n, replace = TRUE),
  b_health = sample(health, n, replace = TRUE),
  b_educationstatus = sample(education, n, replace = TRUE),
  c_sex = sample(sex, n, replace = TRUE),
  c_countryofbirth = sample(europe, n, replace = TRUE),
  c_health = sample(health, n, replace = TRUE),
  c_educationstatus = sample(education, n, replace = TRUE))

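The data wdat contains a unique ID for each individual, followed by three blocks of columns, one per wave.
These data can be converted to "long" format with the pivot_longer function, as follows: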

wdat %>% 
  pivot_longer(
    -ID,
    names_to = c("wave", ".value"),
    names_pattern = "(.)_(.*)"
    )

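where

names_pattern = "(.)_(.*)" means that each column name carries two pieces of information, separated by _: first a single character, then a string,
names_to = c("wave", ".value") means that the single character goes into a column called wave, and the values from the wide columns go into columns named after the shared part of the name. For example, all values in a_sex, b_sex and c_sex go into a column called sex.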
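To see what the two capture groups pick up, here is a minimal sketch that applies the same regular expression to a few column names with base R only. The vector cols and the matrix captured are names I made up for illustration; pivot_longer does its own matching internally, so this only mirrors the idea rather than reproducing what tidyr does.

```r
# Column names following the wave_variable convention from the question
cols <- c("a_sex", "b_countryofbirth", "c_educationstatus")

# regexec() finds the full match plus one entry per capture group;
# regmatches() extracts the matched text for each column name
parts <- regmatches(cols, regexec("(.)_(.*)", cols))

# Stack the results: column 1 = full match, 2 = first group, 3 = second group
captured <- do.call(rbind, parts)
colnames(captured) <- c("name", "wave", ".value")
captured
```

The second column is what ends up in wave, and the third column gives the names of the new value columns.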

Edit: using names_sep is much easier in this case

wdat %>% 
  pivot_longer(
    -ID,
    names_sep = "_",
    names_to = c("wave", ".value")
  )
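As a quick sanity check, the sketch below runs both variants on a tiny hand-made data set and compares the results. The tibble toy and the extra column wave_number are hypothetical and only serve the illustration. Recoding the wave letter as a number is an optional extra step that was not asked for in the question.

```r
library(tibble)
library(tidyr)

# Hypothetical two-person, two-wave data set using the same naming scheme
toy <- tibble(
  ID = c("Ind01", "Ind02"),
  a_sex = c("Male", "Female"),
  a_health = c("Good", "Fair"),
  b_sex = c("Male", "Female"),
  b_health = c("Poor", "Good")
)

# Variant 1: split the column names with a regular expression
by_pattern <- pivot_longer(toy, -ID,
                           names_to = c("wave", ".value"),
                           names_pattern = "(.)_(.*)")

# Variant 2: split the column names on the underscore
by_sep <- pivot_longer(toy, -ID,
                       names_to = c("wave", ".value"),
                       names_sep = "_")

# Compare the two results and inspect the shape (one row per person per wave)
all.equal(by_pattern, by_sep)
dim(by_sep)

# Optional: recode the wave letter as a number (a = 1, b = 2, ...)
by_sep$wave_number <- match(by_sep$wave, letters)
by_sep
```

One thing to keep in mind: names_sep = "_" splits on every underscore, so if a variable name itself contained an underscore (say a_country_of_birth), the names_pattern version is likely the safer choice.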
