R: Finding the percentage of a categorical variable (good and bad) for each subject ID

pgky5nke  posted on 2023-02-10  in  Other

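When I try to compute the percentages, I get incorrect numbers. For example, in the data frame below, subject ID 1 should be 100% good and 0% bad, but my code does not give me that result.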

library(tibble)
performance <- tribble(
~SubjectID, ~Outcome,
1, 'good',
1, 'good',
1, 'good',
2, 'good',
2, 'bad',
2, 'good',
3, 'bad',
4, 'good',
4, 'good'
)

# finding the frequency
freq_id <- with(performance, table(SubjectID, Outcome))
view(freq_id)
# finding the percentage and rounding it to 2 decimal places
round(prop.table(freq_id)*100,2)

3z6pesqy  #1

The group_by() function from the dplyr package makes this very straightforward.

library(tibble)
library(dplyr)
performance <- tribble(
   ~SubjectID, ~Outcome,
   1, 'good',
   1, 'good',
   1, 'good',
   2, 'good',
   2, 'bad',
   2, 'good',
   3, 'bad',
   4, 'good',
   4, 'good'
)

# share of "good" outcomes per subject; note `==` (comparison), not `=`
performance %>% group_by(SubjectID) %>% 
            summarize(perGood = sum(Outcome == "good") / n())

# A tibble: 4 × 2
  SubjectID perGood
      <dbl>   <dbl>
1         1   1    
2         2   0.667
3         3   0    
4         4   1
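
If both percentages are wanted in one table, the same group_by() approach can be extended. The following is only a minimal sketch under the assumption that Outcome takes just the two values 'good' and 'bad'; the column names pctGood and pctBad are made up for illustration and do not come from the original answer.

```r
library(tibble)
library(dplyr)

performance <- tribble(
  ~SubjectID, ~Outcome,
  1, 'good',
  1, 'good',
  1, 'good',
  2, 'good',
  2, 'bad',
  2, 'good',
  3, 'bad',
  4, 'good',
  4, 'good'
)

# mean() of a logical vector is the proportion of TRUE values,
# so multiplying by 100 gives the percentage within each subject
pct_by_subject <- performance %>%
  group_by(SubjectID) %>%
  summarize(
    pctGood = round(100 * mean(Outcome == "good"), 2),  # hypothetical column name
    pctBad  = round(100 * mean(Outcome == "bad"), 2)    # hypothetical column name
  )

pct_by_subject
```

Using mean() on the comparison avoids a separate division by n(), and subjects with no 'bad' rows still get an explicit 0.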

ltskdhd1  #2

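By default margin is NULL, so prop.table() divides every cell by the grand total of the whole table rather than by the total of its row. We need row-wise proportions, i.e. margin = 1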

round(proportions(freq_id, 1) * 100, 2)
            Outcome
SubjectID    bad   good
        1   0.00 100.00
        2  33.33  66.67
        3 100.00   0.00
        4   0.00 100.00
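
To see why the code in the question reports the wrong number for subject 1, it may help to compare the two calls side by side. This is a small sketch in base R only; the vectors subject and outcome and the names overall and by_row are introduced here for illustration and simply restate the data from the question.

```r
# same data as in the question, as plain vectors
subject <- c(1, 1, 1, 2, 2, 2, 3, 4, 4)
outcome <- c("good", "good", "good", "good", "bad", "good", "bad", "good", "good")

freq <- table(SubjectID = subject, Outcome = outcome)

# no margin: each cell is divided by the grand total (9 rows),
# so all cells together sum to 1
overall <- prop.table(freq)

# margin = 1: each cell is divided by its row total,
# so each subject's row sums to 1
by_row <- prop.table(freq, margin = 1)

round(overall * 100, 2)
round(by_row * 100, 2)
```

Without a margin, subject 1's three 'good' rows are counted as 3 out of all 9 observations, which is where a value near 33% comes from instead of the expected 100%.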

It can also be written with the base R pipe |>

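# the `_` placeholder requires R >= 4.2
performance |> 
   with(data = _, table(SubjectID, Outcome)) |>
   proportions(margin = 1) |>
   # anonymous function used because operators such as `*` cannot be
   # called directly on the right-hand side of |>
   (\(p) round(p * 100, 2))()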
      Outcome
SubjectID    bad   good
        1   0.00 100.00
        2  33.33  66.67
        3 100.00   0.00
        4   0.00 100.00

Alternatively, we can use the adorn_ functions from janitor to get the desired formatted output

library(janitor)
library(dplyr)
performance %>%
    tabyl(SubjectID, Outcome) %>%
    adorn_percentages() %>%
    adorn_pct_formatting(digits = 2)
 SubjectID     bad    good
         1   0.00% 100.00%
         2  33.33%  66.67%
         3 100.00%   0.00%
         4   0.00% 100.00%
