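I have a large dataset (49000 x 118). I want to group by one column and then get a summary of several other columns. The problem with my data is that the summary for each column has a different length.
Here is a simple example of my data: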
dat <- data.frame(
  test_number = as.factor(c("test1", "test1", "test1", "test1", "test1", "test1", "test2", "test2", "test2", "test3", "test3", "test3", "test3", "test3", "test3")),
  question1_response = as.factor(c("yes", NA, "no", "not answered", "yes", "yes", NA, "no", "yes", "yes", "yes", "yes", "yes", "yes", "yes")),
  question2_response = as.factor(c("yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "no")),
  question3_response = as.factor(c("yes", NA, "no", "yes", NA, "no", "yes", NA, "no", "yes", NA, "no", "yes", NA, "no"))
)
I want to group by test_number and get a count of each response in columns 2:4.
Some code I've tried: [code not preserved]
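The result I expect looks like this (I made it in Excel).
I padded the columns of unequal length with NAs, but as long as I get the information, I'm not particular about the structure.
Thanks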
2 Answers

368yc8dk #1
You can stack all of the question responses into a single column, then convert to wide format using values_fn = length to get the counts.

mqkwyuun #2
Is this what you want?
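For reference, the stack-and-count idea from the first answer might be sketched as below. This is only one possible reading of that description, not the answerer's actual code: it assumes the dplyr and tidyr packages, and the column names question and response, the helper column n, and the choice to relabel missing values as the string "NA" are all made up for illustration.

```r
library(dplyr)
library(tidyr)

# Example data from the question
dat <- data.frame(
  test_number = as.factor(c("test1", "test1", "test1", "test1", "test1", "test1",
                            "test2", "test2", "test2",
                            "test3", "test3", "test3", "test3", "test3", "test3")),
  question1_response = as.factor(c("yes", NA, "no", "not answered", "yes", "yes",
                                   NA, "no", "yes",
                                   "yes", "yes", "yes", "yes", "yes", "yes")),
  question2_response = as.factor(c("yes", "yes", "yes", "yes", "yes", "yes",
                                   "yes", "yes", "yes",
                                   "yes", "yes", "yes", "yes", "yes", "no")),
  question3_response = as.factor(c("yes", NA, "no", "yes", NA, "no",
                                   "yes", NA, "no",
                                   "yes", NA, "no", "yes", NA, "no"))
)

counts <- dat %>%
  # Convert to character so columns with different factor levels stack cleanly
  mutate(across(starts_with("question"), as.character)) %>%
  # Stack all question columns into one "response" column (assumed names)
  pivot_longer(starts_with("question"),
               names_to = "question", values_to = "response") %>%
  # Give missing answers an explicit label; n is a placeholder column to count
  mutate(response = replace_na(response, "NA"), n = 1L) %>%
  # One column per distinct response; values_fn = length counts the rows
  pivot_wider(names_from = response, values_from = n,
              values_fn = length, values_fill = 0L)

print(counts)
```

The result has one row per test_number and question, with one count column per distinct response, so responses that never occur for a question show 0 rather than leaving columns of unequal length.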