R: Summarizing a data frame when the grouping variable has repeated values

n1bvdmb6, posted on 2023-10-13 in Other

I have the following data frame:

df1<- structure(list(Type.of.Goal = c("Nutrition/Hydration Goal", "Nutrition/Hydration Goal", 
"Fitness Goal", "Fitness Goal", "Lifestyle Goal", "Fitness Goal", 
"Lifestyle Goal", "Fitness Goal", "Nutrition/Hydration Goal", 
"Nutrition/Hydration Goal", "Nutrition/Hydration Goal", "Lifestyle Goal", 
"Lifestyle Goal", "Lifestyle Goal", "Nutrition/Hydration Goal", 
"Lifestyle Goal", "Fitness Goal", "Fitness Goal", "Lifestyle Goal", 
"Lifestyle Goal", "Fitness Goal", "Lifestyle Goal", "Lifestyle Goal"
), progress_made = c(1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 
1, 1, 0, 1, 1, 1, 1, 1, 1), id = c("a", "a", "a", "b", "b", "b", 
"c", "c", "c", "c", "d", "d", "d", "e", "e", "e", "e", "f", "f", 
"f", "g", "g", "g")), row.names = c(734L, 736L, 737L, 964L, 965L, 
966L, 1446L, 1447L, 1448L, 1449L, 1485L, 1486L, 1487L, 1553L, 
1554L, 1555L, 1556L, 1918L, 1919L, 1920L, 1952L, 1953L, 1954L
), class = "data.frame")

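I am trying to summarize the data frame to show which people (identified by the id column) 1) made progress on all of the goals they set, 2) made progress on some goals but not others, or 3) made no progress on any of the goals they set.
If a person made progress on a given goal, progress_made = 1; if they did not, progress_made = 0.
I was able to do this without trouble for people who set exactly one goal of each type (fitness, nutrition/hydration, and lifestyle), but I keep running into problems with people who, for example, set three goals that fall under only two goal types.
Essentially, I am looking for a final data frame with a structure like this: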

df_results<- data.frame(id= c("a", "b", "c", "d", "e", "f", "g"),
                     results= c("All goals saw progress", "No goals saw progress", 
                                 "Some goals saw progress, but not all", 
                                 "Some goals saw progress, but not all", 
                                 "Some goals saw progress, but not all", 
                                 "All goals saw progress", "All goals saw progress"))

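It does not have to be this exact structure; this is just the information I need the final result to convey in some form.
My initial strategy was to pivot the df wider so that id and Type.of.Goal were the columns and the progress_made values were the cell values. After that, I used a combination of rowSums and mutate to evaluate the value for each outcome category, then used ifelse to create a new column that maps the values to the text categories listed in df_results. However, this pivot approach does not work when any given individual has set more than one goal under the same type.
Any ideas or help would be greatly appreciated.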
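The pivot problem described above can be made visible by counting rows per id and goal type. The following is only a minimal sketch in base R, assuming a hypothetical df_small that holds just the rows for ids "a" and "b" from df1; any count above 1 marks a cell that a wide pivot would have to fill with more than one value.

```r
# Hypothetical subset of df1 from the question (ids "a" and "b" only)
df_small <- data.frame(
  Type.of.Goal = c("Nutrition/Hydration Goal", "Nutrition/Hydration Goal",
                   "Fitness Goal", "Fitness Goal", "Lifestyle Goal",
                   "Fitness Goal"),
  progress_made = c(1, 1, 1, 0, 0, 0),
  id = c("a", "a", "a", "b", "b", "b")
)

# Count how many goals each id set under each goal type
counts <- table(df_small$id, df_small$Type.of.Goal)
print(counts)

# TRUE when at least one id has several goals of the same type,
# i.e. when an id-by-type pivot cannot hold a single value per cell
has_duplicates <- any(counts > 1)
print(has_duplicates)
```

Because the classification only depends on the progress_made values within each id, grouping by id alone avoids the issue without any pivot.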


2admgd59 #1

We can use dplyr: summarise with case_when, grouped by id. (The .by and .default arguments require dplyr 1.1.0 or later.)

library(dplyr)

df1 |> 
    summarise(results = case_when(all(progress_made ==1) ~ "All goals saw progress",
                                  all(progress_made ==0) ~ "No goals saw progress",
                                  .default = "Some goals saw progress, but not all"
                                  ),
              .by = id
              )
  id                              results
1  a               All goals saw progress
2  b                No goals saw progress
3  c Some goals saw progress, but not all
4  d Some goals saw progress, but not all
5  e Some goals saw progress, but not all
6  f               All goals saw progress
7  g               All goals saw progress

83qze16e #2

A base R option:

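# Classify each id by whether all, some, or none of its goals saw progress
setNames(
  aggregate(
    progress_made ~ id, data = df1,
    FUN = \(x) {
      ifelse(
        all(x==1), 
        "All goals saw progress", 
        ifelse(
          any(x==1), 
          "Some goals saw progress, but not all",
          "No goals saw progress"
        )
      )
    } 
  ),
  c("id", "results")  # rename the aggregated column to match the output below
)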

#   id                              results
# 1  a               All goals saw progress
# 2  b                No goals saw progress
# 3  c Some goals saw progress, but not all
# 4  d Some goals saw progress, but not all
# 5  e Some goals saw progress, but not all
# 6  f               All goals saw progress
# 7  g               All goals saw progress
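Since progress_made is coded 0/1, the same classification can also be read off the per-id mean. This is only a sketch of that variant, not part of either answer; it assumes a hypothetical df_small holding the rows for ids "a", "b", and "c" from df1, and it assumes progress_made contains no missing values.

```r
# Hypothetical subset of df1 from the question (ids "a", "b", "c")
df_small <- data.frame(
  progress_made = c(1, 1, 1, 0, 0, 0, 1, 1, 0, 1),
  id = c("a", "a", "a", "b", "b", "b", "c", "c", "c", "c")
)

# Share of goals with progress per id (valid because progress_made is 0/1)
share <- tapply(df_small$progress_made, df_small$id, mean)

# Map the share to the three text categories
labels <- ifelse(
  share == 1, "All goals saw progress",
  ifelse(share == 0, "No goals saw progress",
         "Some goals saw progress, but not all")
)

# Assemble the result; as.vector() drops the array attributes left by tapply
df_results <- data.frame(id = names(share), results = as.vector(labels))
print(df_results)
```

Keeping the share as an intermediate value also makes it easy to report how much progress each person made, should that be needed later.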
