My goal is to run chi-squared tests in less time.
data <- data.frame(
  sex = factor(c("M", "F", "M")),
  ageid = factor(c(8, 6, 7)),
  married = factor(c(2, 1, 2)),
  cagv_typ = factor(c("non-primary", "primary", "non-primary")),
  sq5_1 = factor(c(1, 1, 1)),
  sq5_2 = factor(c(0, 1, 0))
)
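Here, sex and married are the subgroup variables and the rest are outcomes. In my real data I have more than 10 outcome variables and 5 subgroup variables.
First, I wrote the following code based on the example shown here: https://epirhandbook.com/en/simple-statistical-tests.html#chi-squared-test-1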
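library(dplyr)    # %>% and select()
library(janitor)  # tabyl()
library(tidyr)    # expand_grid(), used further down
library(rstatix)  # chisq_test()
# Cross-tabulate two columns and run a chi-squared test on the counts
chis_test <- function(data, var1, var2){
  result <- data %>%
    tabyl({{var1}}, {{var2}}) %>%
    select(-1) %>%   # drop the column holding var1's labels, keep the counts
    chisq_test()
  return(result)
}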
Next, I tried to get all possible combinations with expand_grid():
combo <- expand_grid(x = names(data)[c(1, 3)], y = names(data)[-c(1, 3)])
The result looks like this (it also shows my other real variables):
x y
1 cagv_typ ageid
2 sex ageid
3 cset_typ_bi ageid
4 lv_eas_bi ageid
5 und_con_bi ageid
6 sup_ard_bi ageid
7 job_inf_bi ageid
8 cagv_typ married
9 sex married
10 cset_typ_bi married
11 lv_eas_bi married
12 und_con_bi married
13 sup_ard_bi married
14 job_inf_bi married
I also tried a single combination, sex and cagv_typ:
chis_test(sq_catvar, sex, cagv_typ)
It returned the result I wanted:
n statistic p df method p.signif
267 55.8 7.87e-14 1 Chi-square test ****
But when I used apply(), it failed:
apply(combo, 1, function(x) chis_test(data, x[1], x[2]))
I would like to know what went wrong. Thanks in advance!
Best regards
1 Answer
dl5txlt9 #1
In addition to @Onyambu's comment, here is a tidyverse approach (which may be easier to follow):
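A minimal sketch of such an approach, under a few assumptions. One likely reason the apply() call fails is that chis_test() uses {{ }}, which is meant for bare column names, while apply() hands it character strings. The helper below, chis_test_chr(), is a hypothetical variant that takes column names as strings and swaps tabyl() for base table(). The toy data frame is made up so that every variable has two levels; it is not the asker's data.

```r
library(tidyr)    # expand_grid()
library(purrr)    # map2()
library(rstatix)  # chisq_test()

# Made-up example data (assumption): two subgroup and two outcome variables
set.seed(1)
n <- 200
toy <- data.frame(
  sex     = factor(sample(c("M", "F"), n, replace = TRUE)),
  married = factor(sample(1:2, n, replace = TRUE)),
  sq5_1   = factor(sample(0:1, n, replace = TRUE)),
  sq5_2   = factor(sample(0:1, n, replace = TRUE))
)

# Hypothetical string-based variant of chis_test():
# table() cross-tabulates the two columns, chisq_test() tests the counts
chis_test_chr <- function(data, var1, var2) {
  rstatix::chisq_test(table(data[[var1]], data[[var2]]))
}

# Every subgroup variable paired with every outcome variable
combo <- expand_grid(x = c("sex", "married"), y = c("sq5_1", "sq5_2"))

# Walk over both columns of combo in parallel, one test per row
results <- map2(combo$x, combo$y, ~ chis_test_chr(toy, .x, .y))
names(results) <- paste(combo$x, combo$y, sep = " x ")
results
```

map2() iterates over combo$x and combo$y in step, so each row of combo becomes one call, and the result is a named list with one small tibble per pair.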
Edit: You asked for a neat way to extract the p-values. To do that, we can save the result of map2() and then use sapply() or map_dbl(), which gives:
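As a rough illustration of that extraction step, the sketch below stores the map2() output and pulls the p column out of each result. It assumes each test returns a one-row tibble with a column named p, as in the chis_test() output shown in the question. The toy data and the helper chis_test_chr() are made up for this example.

```r
library(tidyr)    # expand_grid()
library(purrr)    # map2(), map_dbl()
library(rstatix)  # chisq_test()

# Made-up example data (assumption), not the asker's data
set.seed(1)
n <- 200
toy <- data.frame(
  sex     = factor(sample(c("M", "F"), n, replace = TRUE)),
  married = factor(sample(1:2, n, replace = TRUE)),
  sq5_1   = factor(sample(0:1, n, replace = TRUE)),
  sq5_2   = factor(sample(0:1, n, replace = TRUE))
)

# Hypothetical helper that takes column names as strings
chis_test_chr <- function(data, var1, var2) {
  rstatix::chisq_test(table(data[[var1]], data[[var2]]))
}

combo <- expand_grid(x = c("sex", "married"), y = c("sq5_1", "sq5_2"))

# Save the list of test results instead of only printing it
results <- map2(combo$x, combo$y, ~ chis_test_chr(toy, .x, .y))

# Either call returns one numeric p-value per combination
p_values     <- map_dbl(results, ~ .x$p)
p_values_alt <- sapply(results, function(res) res$p)

# Attach the p-values to the grid of combinations for easy reading
combo$p <- p_values
combo
```

map_dbl() fails loudly if any element is not a single number, which makes it a slightly safer choice than sapply() here.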