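If I use group by on the output below, I get an error saying that column 3 does not exist. I have determined that this happens because when I group_by(person,month,revenue), no results are returned.
Sample: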
df <- structure(list(person = c("a", "a",
"a", "a", "a", "b",
"b", "b", "b", "b",
"c", "c", "c", "c",
"c", "c", "d", "d",
"d", "d", "d"), report_month = c("01/01/2023", "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022", "01/01/2023",
"02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022", "03/01/2023",
"01/01/2023", "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022",
"01/01/2023", "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022"
), revenue = c(-100, -805, -70, -26,
-144, 77, 129, 10, 638, 898, -178.3,
-133, -162, 2365, -480, -24, -393,
-2088, -73, -41940, -25)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -21L))
# What I can do, but this gives person 'b' the sum of a, b, c, d over all months; I need person 'b' to get, on each row, the sum for that row's month
# If I run group_by after the filter I cannot accomplish this, because I have to group by all attributes and every row becomes its own group
df %>%
rows_update(
data.frame(person = "b",
df %>%
filter(grepl('a|b|c|d',person,ignore.case = T)) %>%
summarise(across(c(3), sum))
),
by = "person")
How can I take
df %>%
rows_update(
data.frame(person = "b",
df %>%
filter(grepl('a|b|c|d',person,ignore.case = T)) %>%
summarise(across(c(3), sum))
),
by = "person")
and set it up so that my output is:
person month total
a 1 original value
a 2 original value
a 3 original value
a 4 original value
a 5 original value
b 1 sum of all for this month
b 2 sum of all for this month
b 3 sum of all for this month
b 4 sum of all for this month
b 5 sum of all for this month
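The approach above works fine when only one month of values is considered. But when each variable appears in multiple months and I want b to get the total of all variables for that specific month, my grouping breaks. If you run the code above you will see the current output: user 'b' gets the sum over all months. Thanks!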
1 Answer
Do you want to sum all revenue for each person and each month?
Created on 2023-03-13 with reprex v2.0.2
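The code that belonged to this answer is not preserved here, so the following is only a minimal sketch of one way to get the output described in the question. It is not necessarily what the answerer posted. It assumes the goal is: every row of person 'b' gets the sum of revenue over all persons for that row's report_month, and every other row keeps its original value. It groups by report_month only and overwrites revenue conditionally, instead of using rows_update. The name `result` and the compact rebuild of `df` are illustrative. The `grepl('a|b|c|d', ...)` filter is dropped because it keeps every row of the sample. As for the original error: `across()` never selects grouping columns, so after grouping by all three columns there is nothing left for `across(c(3), sum)` to pick.

```r
library(dplyr)

# Compact rebuild of the sample data from the question (same values, same row order)
months <- c("01/01/2023", "02/01/2023", "03/01/2023", "11/01/2022", "12/01/2022")
df <- tibble(
  person = rep(c("a", "b", "c", "d"), times = c(5, 5, 6, 5)),
  report_month = c(months, months, "03/01/2023", months, months),
  revenue = c(-100, -805, -70, -26, -144,
              77, 129, 10, 638, 898,
              -178.3, -133, -162, 2365, -480, -24,
              -393, -2088, -73, -41940, -25)
)

# Group by month only, so sum(revenue) is the total over all persons in that month.
# Rows of person "b" receive that monthly total; all other rows keep their own value.
result <- df %>%
  group_by(report_month) %>%
  mutate(revenue = if_else(person == "b", sum(revenue), revenue)) %>%
  ungroup()

print(result, n = Inf)
```

Because the grouping is on report_month alone, the row count and row order of df are unchanged, and the duplicate (c, 03/01/2023) row in the sample does not cause a problem.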