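For each year, I want to create two new columns, temp_count and rh_count, that count the number of occurrences of each category in the temp_catog and humidity_catog columns, respectively. "How to count how many values per level in a given factor?" answers this when you group by a single variable, but I want to use group_by(year, humidity_catog, temp_catog).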
I can use the following code to create a column, humidity_count, that counts the occurrences of each category in the humidity_catog column.
df <- df %>%
  group_by(year, humidity_catog) %>%
  summarize(humidity_count = n())
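That works for humidity, but I would like to create another column, temp_count, in the same data frame that counts the occurrences of each category in the temp_catog column. How can I do that? Below is a reproducible example of my data, created with dput().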
df <- structure(
list(
year = structure(
c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L),
.Label = c(
"2006",
"2007",
"2012",
"2013",
"2014",
"2014_c",
"2015_a",
"2015_b",
"2016",
"2017",
"2020"
),
class = "factor"
),
min_rh = c(47.9, 49, 44.7, 40.2, 50, 52.3, 51.5, 82.8, 73.8,
47.1),
min_temp = c(12.4, 14.3, 15.1, 16.1, 12.7, 16.1, 14.4,
15.1, 11.8, 9.5),
temp_catog = structure(
c(2L, 2L, 3L, 3L,
2L, 3L, 2L, 3L, 2L, 2L),
.Label = c("T1(<=8)", "T2(>8, <=15)",
"T3(>15)"),
class = "factor"
),
humidity_catog = structure(
c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L),
.Label = c("RH1(<=65)",
"RH2(>65)"),
class = "factor"
)
),
class = c("grouped_df",
"tbl_df", "tbl", "data.frame"),
row.names = c(NA,-10L),
groups = structure(
list(
year = structure(
1L,
.Label = c(
"2006",
"2007",
"2012",
"2013",
"2014",
"2014_c",
"2015_a",
"2015_b",
"2016",
"2017",
"2020"
),
class = "factor"
),
.rows = structure(
list(1:10),
ptype = integer(0),
class = c("vctrs_list_of",
"vctrs_vctr", "list")
)
),
class = c("tbl_df", "tbl", "data.frame"),
row.names = c(NA,-1L),
.drop = TRUE
)
)
Note: I do not need the unique matches. I only need to count how many times each category was recorded.
1 Answer
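I'm not sure how the OP combined the two summaries, but we can call mutate twice in succession instead of summarise, passing the grouping variables to the .by argument. The toy data frame is grouped by year, so I ungrouped it first.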
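A minimal sketch of that approach, assuming dplyr 1.1.0 or later (the release that introduced the .by argument). The data frame toy and the name result are illustrative stand-ins, not from the original post; toy reproduces only the year, temp_catog and humidity_catog columns of the dput() data.

```r
library(dplyr)  # assumption: dplyr >= 1.1.0, needed for the .by argument

# Illustrative stand-in for the question's data (same category values, same row order).
toy <- data.frame(
  year = factor(rep("2006", 10)),
  temp_catog = factor(c("T2(>8, <=15)", "T2(>8, <=15)", "T3(>15)", "T3(>15)",
                        "T2(>8, <=15)", "T3(>15)", "T2(>8, <=15)", "T3(>15)",
                        "T2(>8, <=15)", "T2(>8, <=15)")),
  humidity_catog = factor(c(rep("RH1(<=65)", 7), "RH2(>65)", "RH2(>65)", "RH1(<=65)"))
)

result <- toy %>%
  ungroup() %>%  # .by cannot be combined with an already grouped data frame
  # mutate() keeps every row and adds the per-group count next to it
  mutate(temp_count = n(), .by = c(year, temp_catog)) %>%
  mutate(rh_count = n(), .by = c(year, humidity_catog))

print(result)
```

Because mutate keeps all rows, each row carries the count for its own temperature category and its own humidity category, which is what the question asks for. On older dplyr versions, add_count(year, temp_catog, name = "temp_count") followed by add_count(year, humidity_catog, name = "rh_count") should give the same two columns.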