This question already has answers here:
Filter R dataframe to n most frequent cases and order by frequency (2 answers)
Find the most frequent value in a column and take a subset of that (1 answer)
Closed 4 days ago.
I have a dataframe like this:
> dput(dt)
structure(list(ID = 1:10, City = c("New York", "New York", "LA",
"LA", "LA", "Boston", "Chicago ", "New York", "LA", "New York"
), Random_Info_To_Keep = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L)), class = "data.frame", row.names = c(NA, -10L))
I only want to keep the rows whose City is one of the 2 most frequent cities in the dataset (New York / LA). The output should look like this:
> dput(dt2)
structure(list(ID = c(1L, 2L, 3L, 4L, 5L, 8L, 9L, 10L), City = c("New York",
"New York", "LA", "LA", "LA", "New York", "LA", "New York"),
Random_Info_To_Keep = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)), class = "data.frame", row.names = c(NA,
-8L))
2 Answers
s4n0splo1#
Base R:
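One way to do this in base R, offered as a minimal sketch rather than the answerer's exact code: tabulate `City`, sort the counts in decreasing order, take the names of the first two entries, and subset with `%in%`. The name `top_cities` is introduced here for illustration only.

```r
# Sample data from the question (note that "Chicago " has a trailing space)
dt <- data.frame(
  ID = 1:10,
  City = c("New York", "New York", "LA", "LA", "LA", "Boston",
           "Chicago ", "New York", "LA", "New York"),
  Random_Info_To_Keep = 1L
)

# Count rows per city, sort by count (largest first), keep the top 2 names
top_cities <- names(sort(table(dt$City), decreasing = TRUE))[1:2]

# Keep only the rows whose City is one of those two
dt2 <- dt[dt$City %in% top_cities, ]

# Reset row names so they run 1..n, as in the expected output
rownames(dt2) <- NULL
```

If the second and third most frequent cities were tied, `[1:2]` would keep only one of them, so ties may need separate handling.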
sshcrbum2#
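You can first count how many times each city appears in the dataset, then filter the data down to the two cities with the highest counts. Here is a tidyverse solution: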
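The count-then-filter approach could look roughly like this with dplyr. This is a sketch assuming dplyr 1.0.0 or later (for `slice_head()`); the variable name `top_cities` is illustrative and not taken from the original answer.

```r
library(dplyr)

# Sample data from the question (note that "Chicago " has a trailing space)
dt <- data.frame(
  ID = 1:10,
  City = c("New York", "New York", "LA", "LA", "LA", "Boston",
           "Chicago ", "New York", "LA", "New York"),
  Random_Info_To_Keep = 1L
)

# Step 1: count rows per city, most frequent first, and keep the top 2 names
top_cities <- dt %>%
  count(City, sort = TRUE) %>%
  slice_head(n = 2) %>%
  pull(City)

# Step 2: keep only the rows whose City is one of those two
dt2 <- dt %>%
  filter(City %in% top_cities)
```

As with any fixed top-n cut, `slice_head(n = 2)` keeps exactly two cities even when several are tied for second place.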