R: Add a sum to a specific row's value based on a filter over other rows, while grouping by month

Posted by jjhzyzn0 on 2023-03-15 in Other

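If I add a group_by to the code below, I get an error saying that column 3 does not exist. I have worked out that this happens because group_by(person, month, revenue) returns nothing to summarise.
Sample data: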

library(dplyr)  # provides %>%, filter(), summarise(), across() and rows_update()

df <- structure(list(person = c("a", "a", 
                             "a", "a", "a", "b", 
                             "b", "b", "b", "b", 
                             "c", "c", "c", "c", 
                             "c", "c", "d", "d", 
                             "d", "d", "d"), report_month = c("01/01/2023", "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022", "01/01/2023", 
                                                              "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022", "03/01/2023", 
                                                              "01/01/2023", "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022", 
                                                              "01/01/2023", "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022"
                             ), revenue = c(-100, -805, -70, -26, 
                                                 -144, 77, 129, 10, 638, 898, -178.3, 
                                                 -133, -162, 2365, -480, -24, -393, 
                                                 -2088, -73, -41940, -25)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -21L))

# What I can do so far: this adds the sum over a, b, c and d across ALL months to person 'b'.
# What I need: person 'b' should get, in each of their rows, the sum for that row's month only.
# Running group_by after the filter does not help, because I have to group by every attribute,
# so each row ends up as its own group.

df %>% 
  rows_update(
    data.frame(person = "b",
               df %>% 
                 filter(grepl('a|b|c|d', person, ignore.case = TRUE)) %>%
                 summarise(across(c(3), sum))
    ),
    by = "person")

How can I adapt the code above so that my output looks like this?

person month total
a 1 original value
a 2 original value
a 3 original value
a 4 original value
a 5 original value
b 1 sum of all for this month
b 2 sum of all for this month
b 3 sum of all for this month
b 4 sum of all for this month
b 5 sum of all for this month

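The approach above works fine when only a single month is involved. But when each person appears in several months and I want b to get the total across all persons for that particular month, my grouping breaks. If you run the code, the current output shows the problem: person 'b' gets the sum over all months. Thanks!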

Answer 1, by 1l5u6lss

Do you want to sum all revenue for each person and each month?

library(tidyverse)
library(lubridate)

df <- structure(list(person = c("a", "a", 
                                "a", "a", "a", "b", 
                                "b", "b", "b", "b", 
                                "c", "c", "c", "c", 
                                "c", "c", "d", "d", 
                                "d", "d", "d"), report_month = c("01/01/2023", "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022", "01/01/2023", 
                                                                 "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022", "03/01/2023", 
                                                                 "01/01/2023", "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022", 
                                                                 "01/01/2023", "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022"
                                ), revenue = c(-100, -805, -70, -26, 
                                               -144, 77, 129, 10, 638, 898, -178.3, 
                                               -133, -162, 2365, -480, -24, -393, 
                                               -2088, -73, -41940, -25)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -21L))

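df |> 
  # report_month is month/day/year text ("11/01/2022" is November 2022), so it is
  # parsed with mdy(); dmy() would read the day field as the month and put every
  # row in month 1, leaving only one row per person
  mutate(month = month(mdy(report_month))) |> 
  summarize(revenue = sum(revenue),
            .by = c(person, month))
# Result: one row per person and month (20 rows). Only person c has two rows in
# the same month ("03/01/2023"), which are combined: -178.3 + 2365 = 2186.7.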

Created on 2023-03-13 with reprex v2.0.2
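As I read the question, it also asks for person b's rows to be overwritten with each month's total across all persons, while everyone else keeps their original values. Below is a minimal sketch of one way to do that, not necessarily what the asker ended up using. It assumes dplyr 1.1.0 or later (for `.by`) and that report_month is month/day/year text. The column names month_total and total are made up for illustration, and the data is the question's sample rebuilt more compactly.

```r
library(dplyr)
library(lubridate)

# The question's 21 rows, rebuilt compactly (same values, same order)
m <- c("01/01/2023", "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022")
df <- tibble(
  person       = rep(c("a", "b", "c", "d"), times = c(5, 5, 6, 5)),
  report_month = c(m, m, "03/01/2023", m, m),
  revenue      = c(-100, -805, -70, -26, -144,
                   77, 129, 10, 638, 898,
                   -178.3, -133, -162, 2365, -480, -24,
                   -393, -2088, -73, -41940, -25)
)

result <- df |>
  # Assumption: the text is month/day/year, hence mdy()
  mutate(month = month(mdy(report_month))) |>
  # Total over every person within each month, repeated on each row of that month
  mutate(month_total = sum(revenue), .by = month) |>
  # Only person "b" takes the monthly total; all other rows keep their own value
  mutate(total = if_else(person == "b", month_total, revenue)) |>
  select(person, month, total)

result
```

With this data, b's rows come out as -549, -2926, 2053.7, -41808 and 705 for months 1, 2, 3, 11 and 12, and a, c and d are unchanged. Note that the grouping uses the month number only; if the data ever contained the same month from two different years, grouping on the parsed date instead would likely be safer.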
