Loop that outputs multiple crosstabs, but combining the tables column-wise into a single table in R

Posted by vktxenjb on 2023-04-03 in Other

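I have a loop that creates several crosstabs against one column. Instead of printing them one after another, I need to join them column-wise, the way cbind() would, since the number of rows is the same (each crosstab is one column of the df against the other columns).
Here is sample data from an existing question: How to apply for loop function into crosstab making in R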

test1 <- structure(list(weight = c(0.2158, 0.799, 0.611, 0.4969, 0.3469, 
1.0107, 0.6946, 0.9415, 1.4008, 0.6192), Q2_1 = structure(c(4, 
4, 2, 2, 3, 3, 3, 2, 3, 2), label = "How worried, if at all, are you about each of the following? - You or someone in your family will get sick with COVID-19", format.spss = "F40.0", display_width = 5L, labels = c(Skipped = -1, 
`Very worried` = 1, `Somewhat worried` = 2, `Not too worried` = 3, 
`Not at all worried` = 4), class = c("haven_labelled", "vctrs_vctr", 
"double")), Q2_2 = structure(c(3, 4, 2, 4, 3, 3, 4, 2, 3, 4), label = "How worried, if at all, are you about each of the following? - You might experience serious side effects from the COVID-19 vaccine", format.spss = "F40.0", display_width = 5L, labels = c(Skipped = -1, 
`Very worried` = 1, `Somewhat worried` = 2, `Not too worried` = 3, 
`Not at all worried` = 4), class = c("haven_labelled", "vctrs_vctr", 
"double")), group = c("E", "E", "E", "D", "E", "E", "D", "E", 
"D", "E")), row.names = c(NA, -10L), class = "data.frame")

Here is the loop that prints the crosstabs one after another:

library(pollster)
library(rlang)

for (i in colnames(test1)[2:3]) {
  table <- crosstab(
    df = test1, 
    x = !!sym(i), 
    y = group, 
    weight = weight, 
    pct_type = "column")
  print(table)
}

# A tibble: 4 × 3
  Q2_1                   D     E
  <chr>              <dbl> <dbl>
1 Somewhat worried   19.2  47.8 
2 Not too worried    80.8  29.9 
3 Not at all worried  0    22.3 
4 n                   2.59  4.54
# A tibble: 4 × 3
  Q2_2                   D     E
  <chr>              <dbl> <dbl>
1 Somewhat worried    0    34.2 
2 Not too worried    54.0  34.6 
3 Not at all worried 46.0  31.2 
4 n                   2.59  4.54

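Instead of printing them sequentially, I want them joined column-wise in a single table, as cbind() would do, so that I end up with one table that has the same rows and all the columns from both crosstabs: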

                         D     E       D     E
  <chr>              <dbl> <dbl>   <dbl> <dbl>
1 Somewhat worried   19.2  47.8     0    34.2
2 Not too worried    80.8  29.9    54.0  34.6
3 Not at all worried  0    22.3    46.0  31.2
4 n                   2.59  4.54   2.59  4.54
6kkfgxo0  #1

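Your expected output uses the same name for different columns. That is usually a bad idea, so you should probably adjust the column names.
In your case, you could use one of the following approaches: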

my_list <- list()

for (i in colnames(test1)[2:3]) {
  my_list[[i]] <- crosstab(
    df = test1, 
    x = !!sym(i), 
    y = group, 
    weight = weight, 
    pct_type = "column")
  
  colnames(my_list[[i]])[1] <- "index"
  
}

First, we store the crosstab results in a list and give the first column a constant name.
Now you can use

library(dplyr)

bind_cols(my_list)

to get

# A tibble: 4 × 6
  index...1          D...2 E...3 index...4          D...5 E...6
  <chr>              <dbl> <dbl> <chr>              <dbl> <dbl>
1 Somewhat worried   19.2  47.8  Somewhat worried    0    34.2 
2 Not too worried    80.8  29.9  Not too worried    54.0  34.6 
3 Not at all worried  0    22.3  Not at all worried 46.0  31.2 
4 n                   2.59  4.54 n                   2.59  4.54

As you can see, the column names were changed to make them unique. You can skip this by passing the .name_repair = "minimal" argument.

# > bind_cols(my_list, .name_repair = "minimal")
# A tibble: 4 × 6
  index                  D     E index                  D     E
  <chr>              <dbl> <dbl> <chr>              <dbl> <dbl>
1 Somewhat worried   19.2  47.8  Somewhat worried    0    34.2 
2 Not too worried    80.8  29.9  Not too worried    54.0  34.6 
3 Not at all worried  0    22.3  Not at all worried 46.0  31.2 
4 n                   2.59  4.54 n                   2.59  4.54
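If unique, readable names are preferred over the repaired `...1` style or duplicated names, one possible variation is to prefix each value column with the question it came from and keep only one copy of the label column. The sketch below is not part of the original answer: it rebuilds `my_list` by hand from the values printed above so that it runs without pollster, and the names `labels`, `value_cols` and `combined` are made up for illustration.

```r
library(dplyr)
library(purrr)

# Hand-built stand-ins for the crosstab() results (values copied from the output above)
labels <- c("Somewhat worried", "Not too worried", "Not at all worried", "n")
my_list <- list(
  Q2_1 = tibble(index = labels, D = c(19.2, 80.8, 0, 2.59), E = c(47.8, 29.9, 22.3, 4.54)),
  Q2_2 = tibble(index = labels, D = c(0, 54.0, 46.0, 2.59), E = c(34.2, 34.6, 31.2, 4.54))
)

# bind_cols() matches rows by position, so confirm the row labels line up first
stopifnot(all(map_lgl(my_list, function(tbl) identical(tbl$index, my_list[[1]]$index))))

# Drop the label column and prefix each value column with its question name
value_cols <- imap(my_list, function(tbl, nm) {
  rename_with(select(tbl, -index), function(col) paste(nm, col, sep = "_"))
})

# One label column from the first table, followed by all renamed value columns
combined <- bind_cols(c(list(select(my_list[[1]], index)), unname(value_cols)))
names(combined)
```

The resulting columns are index, Q2_1_D, Q2_1_E, Q2_2_D and Q2_2_E. The `stopifnot()` check matters because `bind_cols()` does not look at the labels at all; if one crosstab were missing a category, the rows would be silently misaligned.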

Another option is to combine the purrr package with left_join:

library(purrr)  # provides reduce()

my_list %>% 
  reduce(left_join, by = "index")

This returns

# A tibble: 4 × 5
  index                D.x   E.x   D.y   E.y
  <chr>              <dbl> <dbl> <dbl> <dbl>
1 Somewhat worried   19.2  47.8   0    34.2 
2 Not too worried    80.8  29.9  54.0  34.6 
3 Not at all worried  0    22.3  46.0  31.2 
4 n                   2.59  4.54  2.59  4.54

You can adjust the column names with the suffix argument of left_join. The default is c(".x", ".y"), which gives D.x and D.y.
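As a small illustration of the `suffix` argument, here is a sketch that joins the two tables directly and labels the clashing columns by question. The tibbles `q2_1` and `q2_2` are hand-built stand-ins for the crosstab results (values copied from the output above), and the suffix strings are an arbitrary choice, not something the original answer prescribes.

```r
library(dplyr)

# Hand-built stand-ins for the two crosstab() results
labels <- c("Somewhat worried", "Not too worried", "Not at all worried", "n")
q2_1 <- tibble(index = labels, D = c(19.2, 80.8, 0, 2.59), E = c(47.8, 29.9, 22.3, 4.54))
q2_2 <- tibble(index = labels, D = c(0, 54.0, 46.0, 2.59), E = c(34.2, 34.6, 31.2, 4.54))

# suffix replaces the default c(".x", ".y") on column names present in both tables
joined <- left_join(q2_1, q2_2, by = "index", suffix = c("_Q2_1", "_Q2_2"))
names(joined)
```

This gives index, D_Q2_1, E_Q2_1, D_Q2_2 and E_Q2_2. Note that a suffix is only applied to names that clash in a given join, so when reducing over more than two tables it is probably simpler to rename the columns before joining.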
