Assigning column labels to data frames with purrr::pmap()

tpxzln5u, posted 2023-01-15 in Other

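I am trying to assign labels to the columns of several data frames. I have more than 10 data frames to work with, but here are a few examples: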

library(tibble)  # tribble()
library(purrr)   # pmap()

df1 = tribble(
  ~a_age, ~a01edu, ~other_vars,
  35, 17, 1,
  41, 14, 2,
  28, 12, 3,
  68, 99, 4
)

df2 = tribble(
  ~b_age, ~b01edu, ~some_vars,
  25, 10, 2,
  52, 8, 1,
  31, 20, 5
)

df3 = tribble(
  ~c_age, ~c01edu,
  55, 16,
  47, 11,
  68, 16,
  36, 6, 
  29, 16
)

Every data frame has some similarly named columns, such as a...some_name, b...some_name, and so on. I tried creating column labels for one data frame with labelled::set_variable_labels(), and that worked fine.

df1 = df1 |> labelled::set_variable_labels(
  .labels = list("a_age" = "Age",
                 "a01edu" = "Highest education completed")
)

I then tried to use purrr::pmap() to assign column labels to all the data frames at once, but it gave me an error.

df_list = list(df1, df2, df3) |> setNames(c("a", "b", "c"))

params = tribble(
  ~x, ~y, ~z,
  "a", "a_age", "a01edu",
  "b", "b_age", "b01edu",
  "c", "c_age", "c01edu"
)

pmap(params,
     function(x, y, z) {
       df_list[[x]] |> labelled::set_variable_labels(
         .labels = list(y = "Age",
                        z = "Highest education completed")
         )
       }
     )

Error message:

<error/rlang_error>
Error in `pmap()`:
ℹ In index: 1.
Caused by error in `var_label<-.data.frame`:
! some variables not found in x:y, z
---
Backtrace:
 1. purrr::pmap(...)
 2. purrr:::pmap_("list", .l, .f, ..., .progress = .progress)
 5. global .f(x = .l[[1L]][[i]], y = .l[[2L]][[i]], z = .l[[3L]][[i]], ...)
 6. labelled::set_variable_labels(...)
 8. labelled:::`var_label<-.data.frame`(`*tmp*`, value = .labels)
 9. base::stop("some variables not found in x:", missing_names)

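Why am I getting this error? I thought I had set up the params object correctly, so that the column names in df_list match the ones I pass to function(x, y, z). I'm sure there is a better way to do what I'm trying to do. Any help would be greatly appreciated. Thanks!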

oyxsuwqo

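The problem is just that = does not evaluate its left-hand side, so the list elements end up literally named y and z, which is why the error reports those variables as not found. We can use := together with dplyr::lst() and unquote the names with !! instead.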
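The naming behaviour can be checked in isolation. The following is a minimal base R sketch, not part of the original answer; the variable name labels is only illustrative.

```r
y <- "a_age"

# The left-hand side of `=` is taken as a literal name;
# the variable y is never looked up
labels <- list(y = "Age")
names(labels)
# "y"

# The name the question intended was the value stored in y
identical(names(labels), y)
# FALSE
```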

library(dplyr)  # lst() supports !! and :=
library(purrr)

# ..1, ..2, ..3 are the x, y, z values of params for the current row
df_list2 <- pmap(params, ~ df_list[[..1]] |>
    labelled::set_variable_labels(
        .labels = lst(!!..2 := "Age",
                      !!..3 := "Highest education completed")
    )
)
Output:
[[1]]
# A tibble: 4 × 3
  a_age a01edu other_vars
  <dbl>  <dbl>      <dbl>
1    35     17          1
2    41     14          2
3    28     12          3
4    68     99          4

[[2]]
# A tibble: 3 × 3
  b_age b01edu some_vars
  <dbl>  <dbl>     <dbl>
1    25     10         2
2    52      8         1
3    31     20         5

[[3]]
# A tibble: 5 × 2
  c_age c01edu
  <dbl>  <dbl>
1    55     16
2    47     11
3    68     16
4    36      6
5    29     16

> str(df_list2)
List of 3
 $ : tibble [4 × 3] (S3: tbl_df/tbl/data.frame)
  ..$ a_age     : num [1:4] 35 41 28 68
  .. ..- attr(*, "label")= chr "Age"
  ..$ a01edu    : num [1:4] 17 14 12 99
  .. ..- attr(*, "label")= chr "Highest education completed"
  ..$ other_vars: num [1:4] 1 2 3 4
 $ : tibble [3 × 3] (S3: tbl_df/tbl/data.frame)
  ..$ b_age    : num [1:3] 25 52 31
  .. ..- attr(*, "label")= chr "Age"
  ..$ b01edu   : num [1:3] 10 8 20
  .. ..- attr(*, "label")= chr "Highest education completed"
  ..$ some_vars: num [1:3] 2 1 5
 $ : tibble [5 × 2] (S3: tbl_df/tbl/data.frame)
  ..$ c_age : num [1:5] 55 47 68 36 29
  .. ..- attr(*, "label")= chr "Age"
  ..$ c01edu: num [1:5] 16 11 16 6 16
  .. ..- attr(*, "label")= chr "Highest education completed"
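
Since the question also asks whether there is a simpler way, here is a hedged sketch of one alternative that avoids the params table. It assumes every data frame follows the <prefix>_age / <prefix>01edu naming pattern seen in the examples, uses plain data.frame stand-ins instead of tibbles, and sets the "label" attribute shown in the str() output directly with base R rather than through labelled. The helper name label_by_prefix is hypothetical.

```r
# Small stand-ins for the data frames in the question
df_list <- list(
  a = data.frame(a_age = c(35, 41), a01edu = c(17, 14), other_vars = c(1, 2)),
  b = data.frame(b_age = c(25, 52), b01edu = c(10, 8), some_vars = c(2, 1))
)

# Hypothetical helper: label the two columns that share a prefix
label_by_prefix <- function(df, prefix) {
  # setNames() receives the names as ordinary values, so they are evaluated
  labels <- setNames(
    c("Age", "Highest education completed"),
    paste0(prefix, c("_age", "01edu"))
  )
  for (col in names(labels)) {
    attr(df[[col]], "label") <- labels[[col]]
  }
  df
}

# Pair each data frame with its list name, which doubles as the prefix
df_labelled <- Map(label_by_prefix, df_list, names(df_list))
str(df_labelled)
```

The same setNames() idea should also work for building the list passed as .labels to labelled::set_variable_labels(), if keeping the labelled package in the pipeline is preferred.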
