How do I perform chi-square tests between many variables in R and create a data frame of the results?

Posted by 7vux5j2d on 2023-01-03 in Other

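I'm new to R and to data analysis in general. I have a data set with two parts:

20 questions (answered on a 5-point Likert scale)
8 sociodemographic variables

Here is a scaled-down sample of the data set (only 3 of the 20 questions and 3 of the sociodemographic variables), in case it is needed: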

df <- data.frame(Q1 = c(1, 2, 2, 1, 3, 4, 3, 5, 2, 2),
                 Q2 = c(2, 3, 5, 5, 4, 5, 1, 1, 5, 3),
                 Q3 = c(4, 4, 2, 3, 2, 1, 1, 1, 5, 5),
                 ageRange = c(2, 3, 1, 1, 3, 4, 4, 2, 1, 1),
                 education = c(1, 1, 3, 4, 6, 5, 3, 2, 1, 4),
                 maritalStatus = c(1, 0, 0, 0, 0, 1, 1, 0, 0, 1))

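1. I need to run a chi-square test of each question against every sociodemographic variable, which gives 9 chi-square results in total: Q1 - age range, Q1 - education, Q1 - marital status, Q2 - age range, Q2 - education, Q2 - marital status, Q3 - age range, Q3 - education, Q3 - marital status.
2. I want to arrange the results of these pairwise tests in a data frame or matrix whose columns are the 3 sociodemographic factors and whose rows are the 3 questions. It should look like this (with each 0 replaced by the p-value for that row-column pair):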

data.frame(Age = c(0, 0, 0),
           Education = c(0, 0, 0), 
           Married = c(0, 0, 0), row.names = c("Q1", "Q2", "Q3"))

I tried using some of the apply functions, but I couldn't get them to work.

Source: https://stackoverflow.com/questions/74985011/how-do-i-perform-chi-square-tests-between-many-variables-and-create-a-data-frame

2 Answers

wwtsj6pe #1

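We could do it like this. It is quite verbose, but it may be helpful as a starting point.
In principle, for each Q column we build a new data frame from that column plus the sociodemographic columns and run the tests on it. We repeat the same steps for the other Q columns and bind the results at the end.
The tidy function from the broom package is very handy here: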

library(dplyr)
library(tidyr)
library(purrr)   # provides map()
library(broom)

# Note: with only 10 rows, chisq.test() warns that the
# chi-squared approximation may be incorrect.

# Q1 against each sociodemographic variable
Q1 <- df %>% 
  select(-Q2, -Q3) %>% 
  pivot_longer(-Q1) %>%               # one row per (Q1, variable, value)
  nest(data = -name) %>%              # one nested data frame per variable
  mutate(stats = map(data, ~ tidy(chisq.test(.x$Q1, .x$value)))) %>% 
  select(-data) %>% 
  unnest(stats)

# Same steps for Q2
Q2 <- df %>% 
  select(-Q1, -Q3) %>% 
  pivot_longer(-Q2) %>% 
  nest(data = -name) %>% 
  mutate(stats = map(data, ~ tidy(chisq.test(.x$Q2, .x$value)))) %>% 
  select(-data) %>% 
  unnest(stats)

# Same steps for Q3
Q3 <- df %>% 
  select(-Q1, -Q2) %>% 
  pivot_longer(-Q3) %>% 
  nest(data = -name) %>% 
  mutate(stats = map(data, ~ tidy(chisq.test(.x$Q3, .x$value)))) %>% 
  select(-data) %>% 
  unnest(stats)

# Stack the three results; .id numbers them 1-3, which becomes Q1-Q3
bind_rows(Q1, Q2, Q3, .id = "Q") %>% 
  mutate(ID = paste0("Q", Q), .before = 1, .keep = "unused")

  ID    name          statistic p.value parameter method                    
  <chr> <chr>             <dbl>   <dbl>     <int> <chr>                     
1 Q1    ageRange          15.6    0.209        12 Pearson's Chi-squared test
2 Q1    education         27.5    0.122        20 Pearson's Chi-squared test
3 Q1    maritalStatus      2.71   0.608         4 Pearson's Chi-squared test
4 Q2    ageRange          15.6    0.209        12 Pearson's Chi-squared test
5 Q2    education         20.8    0.407        20 Pearson's Chi-squared test
6 Q2    maritalStatus      2.71   0.608         4 Pearson's Chi-squared test
7 Q3    ageRange          14.6    0.265        12 Pearson's Chi-squared test
8 Q3    education         21.7    0.359        20 Pearson's Chi-squared test
9 Q3    maritalStatus      3.06   0.549         4 Pearson's Chi-squared test
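
The table above is in long form, with one row per pair, whereas the question asks for questions as rows and sociodemographic variables as columns. As a rough base R sketch of that layout (not part of the original answer), two nested sapply() calls can build the p-value matrix directly. The names questions, demographics, chisq_p, p_values and p_table are illustrative, and the warnings are suppressed only to keep the output tidy.

```r
# Sample data from the question
df <- data.frame(Q1 = c(1, 2, 2, 1, 3, 4, 3, 5, 2, 2),
                 Q2 = c(2, 3, 5, 5, 4, 5, 1, 1, 5, 3),
                 Q3 = c(4, 4, 2, 3, 2, 1, 1, 1, 5, 5),
                 ageRange = c(2, 3, 1, 1, 3, 4, 4, 2, 1, 1),
                 education = c(1, 1, 3, 4, 6, 5, 3, 2, 1, 4),
                 maritalStatus = c(1, 0, 0, 0, 0, 1, 1, 0, 0, 1))

# Illustrative names: which columns are questions, which are demographics
questions    <- c("Q1", "Q2", "Q3")
demographics <- c("ageRange", "education", "maritalStatus")

# p-value of a single chi-square test between two columns of df.
# suppressWarnings() hides the "approximation may be incorrect" warning
# that chisq.test() raises here because the expected counts are small.
chisq_p <- function(q, d) {
  suppressWarnings(chisq.test(df[[q]], df[[d]])$p.value)
}

# Outer loop: one column per demographic; inner loop: one row per question
p_values <- sapply(demographics, function(d) {
  sapply(questions, function(q) chisq_p(q, d))
})

# Matrix -> data frame with Q1..Q3 as row names
p_table <- as.data.frame(p_values)
p_table
```

With the full data set, questions and demographics would list all 20 question columns and all 8 sociodemographic columns; the rest of the sketch should not need to change.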

Answered 2023-01-03

kx1ctssn #2

We could also loop over the question columns:

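library(purrr)
library(broom)
library(tidyr)
library(stringr)
library(dplyr)

# Loop over every column whose name is Q followed by digits
str_subset(names(df), "^Q\\d+$") %>%
  map(~ df %>% 
    select(all_of(.x), ageRange:maritalStatus) %>%
    # Stack the demographic columns; the question column (column 1) is kept
    pivot_longer(cols = -1) %>% 
    group_by(ID = .x, name) %>% 
    # .data[[.x]] is the current question column
    # (used here in place of the deprecated cur_data()[[1]])
    summarise(stats = tidy(chisq.test(.data[[.x]], value)),
              .groups = "drop")) %>% 
  list_rbind() %>%
  # Expand the data frame column returned by tidy() into regular columns
  unnest(stats)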

Output:

# A tibble: 9 × 6
  ID    name          statistic p.value parameter method                    
  <chr> <chr>             <dbl>   <dbl>     <int> <chr>                     
1 Q1    ageRange          15.6    0.209        12 Pearson's Chi-squared test
2 Q1    education         27.5    0.122        20 Pearson's Chi-squared test
3 Q1    maritalStatus      2.71   0.608         4 Pearson's Chi-squared test
4 Q2    ageRange          15.6    0.209        12 Pearson's Chi-squared test
5 Q2    education         20.8    0.407        20 Pearson's Chi-squared test
6 Q2    maritalStatus      2.71   0.608         4 Pearson's Chi-squared test
7 Q3    ageRange          14.6    0.265        12 Pearson's Chi-squared test
8 Q3    education         21.7    0.359        20 Pearson's Chi-squared test
9 Q3    maritalStatus      3.06   0.549         4 Pearson's Chi-squared test
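
To turn a long result like the one above into the layout the question asks for, the p.value column can be spread across columns with tidyr's pivot_wider(). The following is a minimal sketch, assuming dplyr, tidyr (1.0 or later, for expand_grid()), purrr and broom are installed. The names pairs, long and wide are illustrative, and the long table is rebuilt here only so that the example runs on its own.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Sample data from the question
df <- data.frame(Q1 = c(1, 2, 2, 1, 3, 4, 3, 5, 2, 2),
                 Q2 = c(2, 3, 5, 5, 4, 5, 1, 1, 5, 3),
                 Q3 = c(4, 4, 2, 3, 2, 1, 1, 1, 5, 5),
                 ageRange = c(2, 3, 1, 1, 3, 4, 4, 2, 1, 1),
                 education = c(1, 1, 3, 4, 6, 5, 3, 2, 1, 4),
                 maritalStatus = c(1, 0, 0, 0, 0, 1, 1, 0, 0, 1))

# Every question/demographic combination, one row per pair
pairs <- expand_grid(
  ID   = c("Q1", "Q2", "Q3"),
  name = c("ageRange", "education", "maritalStatus")
)

# Long table: one chi-square test per row.
# suppressWarnings() only hides the small-sample approximation warning.
long <- pairs %>%
  mutate(stats = map2(ID, name,
                      ~ tidy(suppressWarnings(chisq.test(df[[.x]], df[[.y]]))))) %>%
  unnest(stats)

# Wide table: questions as rows, demographics as columns, p-values as cells
wide <- long %>%
  select(ID, name, p.value) %>%
  pivot_wider(names_from = name, values_from = p.value)

wide
```

If row names are preferred over an ID column, tibble::column_to_rownames(wide, "ID") is one way to convert the result, again assuming the tibble package is available.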

Answered 2023-01-03
