R: How to format a dataset of value labels so it can be used to recode variables in another dataset

Posted by uxh89sit on 2023-01-22 in Other

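I have a dataset that needs to be recoded using labels stored in another dataset. When I create the label data frame in R, the recoding works, but when I read the same data from a csv file, it does not.
Data: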

df <- data.frame(
  gender=c(1,2,1,2),
  condition=c(1,1,2,2)
)

Codes created in R (this works):

codes <- data.frame(
  gender_values= c("1", "2"),
  gender_labels= c("male gender","female gender"),
  condition_values = c("1", "2"),
  condition_labels = c("exp condition 1", "exp condition 2")
)

This works:

df$gender <- dplyr::recode(df$gender, !!!codes$gender_labels, .default = "nothing")
> df
         gender condition
1   male gender         1
2 female gender         1
3   male gender         2
4 female gender         2

With the codes read from the csv file, the same call does not work:

> dput(codes_csv)
structure(list(gender_values = "\"1\",\"2\"", gender_labels = "\"male gender\", \"female gender\"", 
    condition_values = "\"1\",\"2\"", condition_labels = "\"exp condition 1\", \"exp condition 2\""), class = "data.frame", row.names = c(NA, 
-1L))

df$gender <- dplyr::recode(df$gender, !!!codes_csv$gender_labels, .default = "nothing")

> df
                          gender condition
1 "male gender", "female gender"         1
2                        nothing         1
3 "male gender", "female gender"         2
4                        nothing         2

How should I format the cells in the csv file so that the recoding works?

Answer 1 (zwghvu4y)

codes_csv <- structure(list(gender_values = "\"1\",\"2\"", gender_labels = "\"male gender\", \"female gender\"", 
    condition_values = "\"1\",\"2\"", condition_labels = "\"exp condition 1\", \"exp condition 2\""), class = "data.frame", row.names = c(NA, 
-1L))
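As a quick check (a sketch, not part of the original answer), comparing lengths shows why the call in the question misbehaves. Each csv cell is a single string that contains both quoted labels, so `!!!` splices in one replacement instead of two. Value 1 is then mapped to the whole string and value 2 falls through to `.default`. The names `labels_r` and `labels_csv` are hypothetical and used only for illustration.

```r
# Label vector as built directly in R: one element per code
labels_r <- c("male gender", "female gender")

# Label cell as read from the csv (copied from the dput output above):
# a single string that merely looks like two quoted values
labels_csv <- "\"male gender\", \"female gender\""

length(labels_r)    # 2
length(labels_csv)  # 1
```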

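# Each cell holds one string such as "\"1\",\"2\""; parse it as a one-line
# comma-separated record, then flatten the resulting columns into a vector.
codes_csv2 <- as.data.frame(lapply(codes_csv, function(x) 
      unlist(read.delim(textConnection(x), quote = "\"", sep = ",", 
      strip.white = TRUE, header = FALSE))))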

codes_csv2
#>    gender_values gender_labels condition_values condition_labels
#> V1             1   male gender                1  exp condition 1
#> V2             2 female gender                2  exp condition 2
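With the cells split into one row per code, the parsed table can be used for the recoding itself. The following is a minimal sketch, assuming the `<name>_values` / `<name>_labels` column naming from the question. It rebuilds `codes_csv2` by hand from the printed output above so that it runs on its own. The loop and the `lookup` variable are illustrative additions, not part of the original answer.

```r
# Data from the question
df <- data.frame(
  gender = c(1, 2, 1, 2),
  condition = c(1, 1, 2, 2)
)

# Parsed lookup table, typed in from the printed output above
codes_csv2 <- data.frame(
  gender_values = c(1L, 2L),
  gender_labels = c("male gender", "female gender"),
  condition_values = c(1L, 2L),
  condition_labels = c("exp condition 1", "exp condition 2")
)

# For each variable, build a named vector (names = codes, values = labels)
# and splice it into dplyr::recode(), as in the question
for (v in c("gender", "condition")) {
  lookup <- setNames(
    codes_csv2[[paste0(v, "_labels")]],
    codes_csv2[[paste0(v, "_values")]]
  )
  df[[v]] <- dplyr::recode(df[[v]], !!!lookup, .default = "nothing")
}

df
```

Naming the replacements by their codes means the match no longer depends on the order of rows in the lookup table. If the csv file can be changed, storing one code per row would probably avoid the parsing step altogether.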
