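I have a data frame df1 with different sites (df1$Site) that belong to different regions (df1$Region), and for each site I have data on evidence of herbivory and its type (df1$Herbivory_type). When there is no herbivory, df1$Herbivory_type is NA. Below is an example of my data frame: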
df1 <- data.frame(Region=c("ALI1","ALI1","ALI1","ALI1","ALI2","ALI2","ALI2","ALI3","ALI3","ALI3","ALI3","ALI5","ALI5"),
Site=c("ALI1_A","ALI1_B","ALI1_C","ALI1_D","ALI2_A","ALI2_B","ALI2_C","ALI3_A","ALI3_B","ALI3_C","ALI3_D","ALI5_A","ALI5_B"),
Herbivory_type=c(NA,"S",NA,NA,NA,NA,NA,NA,"S","S",NA,NA,"S"))
df1$Herbivory_type <- as.factor(df1$Herbivory_type)
df1
Region Site Herbivory_type
1 ALI1 ALI1_A <NA>
2 ALI1 ALI1_B S
3 ALI1 ALI1_C <NA>
4 ALI1 ALI1_D <NA>
5 ALI2 ALI2_A <NA>
6 ALI2 ALI2_B <NA>
7 ALI2 ALI2_C <NA>
8 ALI3 ALI3_A <NA>
9 ALI3 ALI3_B S
10 ALI3 ALI3_C S
11 ALI3 ALI3_D <NA>
12 ALI5 ALI5_A <NA>
13 ALI5 ALI5_B S
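I need the number of herbivory events per region, where sites with NA are still taken into account in the count of df1$Site (so a region whose sites are all NA gets 0). I would like to get the following result: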
df2
Region N_Hervivory_S
1 ALI1 1
2 ALI2 0 # All sites have `NA`, thus herbivory is 0 in this region.
3 ALI3 2
4 ALI5 1
I tried this:
as.data.frame(df1 %>% group_by(Region,Herbivory_type) %>% summarise(N = n()))
But the output is not what I expected:
Region Herbivory_type N
1 ALI1 S 1
2 ALI1 <NA> 3
3 ALI2 <NA> 3
4 ALI3 S 2
5 ALI3 <NA> 2
6 ALI5 S 1
7 ALI5 <NA> 1
Does anyone know how to do this?
Thanks in advance.
3 Answers
You can use count() to sum !is.na(Herbivory_type) by group, which gives the number of non-missing values for each region.
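A minimal sketch of that idea, assuming dplyr is installed and using the df1 from the question. Passing the logical expression as `wt` makes count() sum it within each region; the column name N_Hervivory_S is chosen only to match the expected output in the question.

```r
library(dplyr)

# Example data from the question
df1 <- data.frame(
  Region = c("ALI1","ALI1","ALI1","ALI1","ALI2","ALI2","ALI2",
             "ALI3","ALI3","ALI3","ALI3","ALI5","ALI5"),
  Site = c("ALI1_A","ALI1_B","ALI1_C","ALI1_D","ALI2_A","ALI2_B","ALI2_C",
           "ALI3_A","ALI3_B","ALI3_C","ALI3_D","ALI5_A","ALI5_B"),
  Herbivory_type = c(NA,"S",NA,NA,NA,NA,NA,NA,"S","S",NA,NA,"S")
)
df1$Herbivory_type <- as.factor(df1$Herbivory_type)

# wt is summed per Region: TRUE counts as 1, FALSE as 0,
# so a region with only NA sites is kept with a count of 0
df2 <- df1 %>%
  count(Region, wt = !is.na(Herbivory_type), name = "N_Hervivory_S")

df2
```

Because every row of a region contributes either 0 or 1 to the sum, regions such as ALI2 are not dropped the way they would be if the NA rows were filtered out first.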
(This assumes the real dataset may contain other categories that need to be ignored; otherwise !is.na() is simpler.)
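One way this could look, as a sketch rather than the answer's exact code: group by region and count only the rows whose type equals "S". It assumes dplyr and the df1 from the question; the column name N_Hervivory_S is taken from the expected output there.

```r
library(dplyr)

# Example data from the question
df1 <- data.frame(
  Region = c("ALI1","ALI1","ALI1","ALI1","ALI2","ALI2","ALI2",
             "ALI3","ALI3","ALI3","ALI3","ALI5","ALI5"),
  Site = c("ALI1_A","ALI1_B","ALI1_C","ALI1_D","ALI2_A","ALI2_B","ALI2_C",
           "ALI3_A","ALI3_B","ALI3_C","ALI3_D","ALI5_A","ALI5_B"),
  Herbivory_type = c(NA,"S",NA,NA,NA,NA,NA,NA,"S","S",NA,NA,"S")
)
df1$Herbivory_type <- as.factor(df1$Herbivory_type)

# Herbivory_type == "S" is NA for missing rows; na.rm = TRUE treats them as
# "not S", so any other category (and NA) contributes 0 to the count
df2 <- df1 %>%
  group_by(Region) %>%
  summarise(N_Hervivory_S = sum(Herbivory_type == "S", na.rm = TRUE))

as.data.frame(df2)
```

With the example data this should give the same counts as the !is.na() approach, since "S" is the only category present.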
You can count the non-NA values, i.e.
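A hedged sketch of counting the non-NA values per region in base R, using the df1 from the question. This is one possible way to do it, not necessarily the code this answer had in mind; the column name N_Hervivory_S is taken from the expected output in the question.

```r
# Example data from the question
df1 <- data.frame(
  Region = c("ALI1","ALI1","ALI1","ALI1","ALI2","ALI2","ALI2",
             "ALI3","ALI3","ALI3","ALI3","ALI5","ALI5"),
  Site = c("ALI1_A","ALI1_B","ALI1_C","ALI1_D","ALI2_A","ALI2_B","ALI2_C",
           "ALI3_A","ALI3_B","ALI3_C","ALI3_D","ALI5_A","ALI5_B"),
  Herbivory_type = c(NA,"S",NA,NA,NA,NA,NA,NA,"S","S",NA,NA,"S")
)
df1$Herbivory_type <- as.factor(df1$Herbivory_type)

# Sum the logical "is not NA" flag within each region.
# The flag itself never contains NA, so no region is dropped.
df2 <- aggregate(
  x = list(N_Hervivory_S = !is.na(df1$Herbivory_type)),
  by = list(Region = df1$Region),
  FUN = sum
)

df2
```

The list form of aggregate() is used here so the result columns get readable names directly.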