R: Adding a calculated column based on conditions in other columns

nnt7mjpx  posted on 2023-04-27  in Other

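I am new to R and am looking for a way to sum one column across the rows that share the same entry in another column.
I am working with this (simplified) dataset. It has 78 rows in total.
I used the following call
group_by(type_of_pastry, allergen) %>% dplyr::summarize(n = n())
to produce this table summary.
The next task is to fill a new column, "Total", with the sum of the values in the "Volume" column for each type of pastry. For example, rows 1 and 2 (Croissant) should each show 43, and rows 3 and 4 (Cheese brioche) should each show 35.
I spent a long time searching for a solution, and most of what I found pointed me to aggregate, which changes the structure of the dataset, whereas the goal is to keep the structure shown above. Any guidance would be much appreciated. Thanks!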

ajsxfq5m  #1

Here is a dplyr solution using count():

library(dplyr) #>= 1.1.0

df %>% 
  count(type_of_pastry, allergen, name="count") %>% 
  mutate(Total = sum(count), .by=type_of_pastry)

  type_of_pastry allergen count Total
1 Cheese brioche     None     1     5
2 Cheese brioche      Nut     4     5
3      Croissant     None     4     6
4      Croissant      Nut     2     6
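The `.by` argument used above requires dplyr 1.1.0 or later. On an older version, the same result should be reachable with `group_by()` followed by `ungroup()`. The following is a minimal sketch under that assumption; it rebuilds only the two columns of the sample data that the pipeline uses, and `res` is a name introduced here for illustration.

```r
library(dplyr)

# Sample data from this answer, trimmed to the two columns used below
df <- data.frame(
  type_of_pastry = c("Croissant", "Croissant", "Cheese brioche", "Cheese brioche",
                     "Croissant", "Croissant", "Croissant", "Cheese brioche",
                     "Cheese brioche", "Cheese brioche", "Croissant"),
  allergen = c("Nut", "None", "Nut", "None", "None", "None", "None",
               "Nut", "Nut", "Nut", "Nut")
)

res <- df %>%
  count(type_of_pastry, allergen, name = "count") %>%  # one row per pastry/allergen pair
  group_by(type_of_pastry) %>%                         # totals are computed per pastry type
  mutate(Total = sum(count)) %>%                       # mutate keeps every row
  ungroup()                                            # drop the grouping afterwards

print(res)
```

The explicit `ungroup()` keeps later steps from silently operating per group, which is the bookkeeping that `.by` handles automatically.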

Data:

df <- structure(list(order = c("C-23232", "D-35253", "F-43953", "D-48205", 
"D-12659", "C-82645", "A-58344", "F-83759", "C-73213", "A-98732", 
"F-42842"), type_of_pastry = c("Croissant", "Croissant", "Cheese brioche", 
"Cheese brioche", "Croissant", "Croissant", "Croissant", "Cheese brioche", 
"Cheese brioche", "Cheese brioche", "Croissant"), allergen = c("Nut", 
"None", "Nut", "None", "None", "None", "None", "Nut", "Nut", 
"Nut", "Nut")), class = "data.frame", row.names = c(NA, -11L))
w9apscun  #2

Try the following, although it drops the other columns:

df %>%
  group_by(type_of_pastry) %>%
  summarise(Total = sum(count))
# A tibble: 2 × 2
  type_of_pastry Total
  <chr>          <dbl>
1 Cheese brioche    35
2 Croissant         43

Or this, which keeps the other columns (but repeats Total on every row of a group):

df %>%
  group_by(type_of_pastry) %>%
  mutate(Total = sum(count))
# A tibble: 4 × 4
# Groups:   type_of_pastry [2]
  type_of_pastry allergen count Total
  <chr>          <chr>    <dbl> <dbl>
1 Croissant      Nut         23    43
2 Croissant      None        20    43
3 Cheese brioche Nut         18    35
4 Cheese brioche Milk        17    35
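The question mentions that `aggregate` changes the shape of the data. For readers who prefer base R, `ave()` returns one value per input row, so it can add the group total without collapsing anything. This is only a sketch: the data frame `summary_df` is a hypothetical name, rebuilt here from the printed output above because the answer does not define its `df`.

```r
# Summary table reconstructed from the printed output above (assumed values)
summary_df <- data.frame(
  type_of_pastry = c("Croissant", "Croissant", "Cheese brioche", "Cheese brioche"),
  allergen       = c("Nut", "None", "Nut", "Milk"),
  count          = c(23, 20, 18, 17)
)

# ave() applies FUN within each pastry type and returns a vector
# the same length as the input, so every row keeps its place
summary_df$Total <- ave(summary_df$count, summary_df$type_of_pastry, FUN = sum)

print(summary_df)
```

Because nothing is grouped or summarised, the row order and the other columns are left untouched.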

Or this, which gives you everything in a compact form:

Edit 1

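library(stringr)  # provides str_c()

df %>%
  group_by(type_of_pastry) %>%
  summarise(
    Total = sum(count),                           # numeric total per pastry type
    allergen = str_c(allergen, collapse = ", "),  # all allergens in one string
    count = str_c(count, collapse = ", "))        # original counts in one string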
# A tibble: 2 × 4
  type_of_pastry Total allergen  count 
  <chr>          <dbl> <chr>     <chr> 
1 Cheese brioche    35 Nut, Milk 18, 17
2 Croissant         43 Nut, None 23, 20

Edit 2

df %>%
  #group_by(type_of_pastry) %>%
  summarise(
    Total = sum(count),
    allergen = str_c(allergen, collapse = ", "),
    count = str_c(count, collapse = ", ")
   )
  Total             allergen          count
1    78 Nut, None, Nut, Milk 23, 20, 18, 17
