R: Display calculated data in a table, omitting the value of the respective column

yi0zb3m4  posted on 2023-02-27 in Other

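Apologies for the awkward title; I hope it will become clear shortly.
I have data like this: people are assigned to specific locations, and for each person it is recorded whether an event was successful or whether the data is missing.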

df <- data.frame(PersonID = c(1:20),
                 Location = c("B","A","D","C","A","D","C","D","A","D","B","A","D","C","A","D","C","D","A","D"),
                 Success = c("yes","no","yes",NA,"yes","no","no","yes",NA,"yes","no","yes",NA,"yes","no","no","yes",NA,"yes","no"))

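I would like to know how each location "performs" relative to the others, i.e. how many of the valid attempts were successful, and how that location's success rate compares with the other locations.
So in my example, location "A" had 5 valid attempts (plus 1 "NA"), of which 3 were successful (60%). The other locations have success rates of 50%, 66.7% and 50%, which average 55.6%. Location A is therefore 4.4 percentage points above the average of the other locations. I would like to display all of this information in a table like the following: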

I have no particular preference regarding packages, but I like and know a bit of gt and flextable.
Thanks in advance, everyone!
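
For reference, the figures quoted in the question can be reproduced with a few lines of base R. This is only a minimal sketch to make the arithmetic concrete; the helper names `valid`, `rate` and `others` are illustrative and do not appear in the original post.

```r
# Data from the question
df <- data.frame(
  PersonID = 1:20,
  Location = c("B","A","D","C","A","D","C","D","A","D",
               "B","A","D","C","A","D","C","D","A","D"),
  Success  = c("yes","no","yes",NA,"yes","no","no","yes",NA,"yes",
               "no","yes",NA,"yes","no","no","yes",NA,"yes","no")
)

# Keep only valid attempts (Success is not NA)
valid <- df[!is.na(df$Success), ]

# Success rate per location, in percent
rate <- tapply(valid$Success == "yes", valid$Location, mean) * 100

# For each location, the mean rate of all *other* locations
others <- sapply(seq_along(rate), function(i) mean(rate[-i]))
names(others) <- names(rate)

round(rate, 1)           # location A: 60, as stated in the question
round(rate - others, 1)  # location A: 4.4 points above the others
```

Filtering out the `NA` rows first means `mean()` of the logical vector is directly "successes divided by valid attempts".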

41zrol4v1#

This is by no means trivial:

library(dplyr)
library(tidyr)
library(purrr)
library(tibble)
library(janitor)
library(gt)

df1 <- df %>% 
  group_by(Location) %>% 
  mutate(attempts = sum(!is.na(Success)),
         yeses = sum(Success == "yes", na.rm = TRUE),
         Success_rate = (yeses/attempts)*100) 
  
# Per-location rate, then for each location the mean rate of all other locations
df2 <- df1 %>% 
  summarise(avgother = mean(Success_rate)) %>% 
  mutate(avgother = map_dbl(row_number(), ~mean(avgother[-.x])))

df %>% 
  group_by(Location) %>% 
  summarise(attempts = sum(!is.na(Success)),
         yeses = sum(Success == "yes", na.rm = TRUE),
         Success_rate = (yeses/attempts)*100) %>% 
  bind_cols(avgother= round(df2$avgother, 1)) %>% 
  mutate(comp.avg = Success_rate - avgother) %>% 
  mutate(`attempts` = paste0("(N=", attempts, ")"),
         Success_rate = paste0(round(Success_rate, 1), "%")) %>% 
  select(-yeses) %>% 
  mutate(comp.avg = ifelse(comp.avg >0, paste0("(+",round(comp.avg, 1),")"), paste0("(",round(comp.avg,1),")"))) %>% 
  t() %>% 
  as.data.frame() %>% 
  rownames_to_column("Location") %>% 
  row_to_names(row_number = 1) %>% 
  gt()

oyxsuwqo2#

The answer from @TarJae is great; below is a similar approach using flextable.

library(dplyr)
library(flextable)

temp <- df |> 
  group_by(Location) |> 
  summarise(Attempts = sum(!is.na(Success)), 
            Successful = sum(ifelse(Success=="yes", 1,0), na.rm = TRUE)) |> 
  ungroup() |> 
  mutate(success_rate = round(Successful / Attempts, 3)*100) |> 
  mutate(excl.mean = (sum(success_rate) - success_rate)/(n()-1)) |> 
  mutate(comp.avg = round(success_rate - excl.mean,1)) |> 
  mutate(success_rate = paste0(formatC(signif(success_rate,digits=3),
                                digits=3,format="fg", flag="#"), "%"),
         comp.avg = ifelse(comp.avg >=0,
                           paste0("(+", comp.avg, ")"),
                           paste0("(", comp.avg, ")"))
         ) |> 
  select(-c(Successful, excl.mean)) |> 
  t() |> 
  as.data.frame()

temp <- cbind(Col1 = c("Location", "# of attempts", "Success rate", "comp. avg."),
              temp)

ft <- flextable(temp)
ft <- theme_vanilla(ft)
ft <- delete_part(ft, part = "header")
ft <- hline_top(ft)
ft <- border_inner_v(ft)
ft <- border_outer(ft)
ft
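
The two answers compute the "average of the other locations" differently: the first drops the current row with negative indexing, while this one uses the closed form `(sum - own value) / (n - 1)`. The sketch below, which is an illustration and not part of either answer, checks that the two agree on the success rates quoted in the question; the names `loo_index` and `loo_closed` are hypothetical.

```r
# Success rates (in percent) for locations A to D, as quoted in the question
rate <- c(A = 60, B = 50, C = 66.7, D = 50)

# Approach 1: drop element i with negative indexing and average the rest
loo_index <- vapply(seq_along(rate), function(i) mean(rate[-i]), numeric(1))

# Approach 2: closed form, (total - own value) / (n - 1)
loo_closed <- (sum(rate) - rate) / (length(rate) - 1)

# Compare the two results, ignoring names and floating-point noise
all.equal(loo_index, unname(loo_closed))
```

The closed form is vectorised and avoids the purrr dependency, which is presumably why it fits neatly into a single `mutate()` call here.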