val someDF = Seq(
  (4623874, "user1", "success"),
  (4623874, "user2", "fail"),
  (4623874, "user3", "success"),
  (1343244, "user4", "fail"),
  (4235252, "user5", "fail")
).toDF("primaryid", "user", "status")
This is the input DataFrame. Is it possible to get the count of each status for every primary id without using groupBy?
someDF.groupBy("primaryid", "status").count.show
+---------+-------+-----+
|primaryid| status|count|
+---------+-------+-----+
|  4235252|   fail|    1|
|  1343244|   fail|    1|
|  4623874|   fail|    1|
|  4623874|success|    2|
+---------+-------+-----+
Is there any way other than "groupBy" to get the result above?
1 Answer
ruarlubt #1
Use the count window function. Check the code below.
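A minimal sketch of the window approach, assuming it is pasted into spark-shell or a notebook. The names `byIdAndStatus` and `result` are illustrative, and the local `SparkSession` is only there to make the snippet self-contained (spark-shell already provides `spark`). It counts rows over a window partitioned by `primaryid` and `status`, then drops the duplicate rows that a window function leaves behind.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, count}

// Assumption: a local session for illustration; spark-shell already defines `spark`.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("window-count-sketch")
  .getOrCreate()
import spark.implicits._

val someDF = Seq(
  (4623874, "user1", "success"),
  (4623874, "user2", "fail"),
  (4623874, "user3", "success"),
  (1343244, "user4", "fail"),
  (4235252, "user5", "fail")
).toDF("primaryid", "user", "status")

// One partition per (primaryid, status) pair; no ordering is needed for a plain count.
val byIdAndStatus = Window.partitionBy("primaryid", "status")

val result = someDF
  // Attach the partition size to every row instead of collapsing rows.
  .withColumn("count", count(col("status")).over(byIdAndStatus))
  .select("primaryid", "status", "count")
  // A window function keeps one row per input row, so remove the repeats.
  .distinct()

result.show()
```

This should return the same four rows as the groupBy version, though row order is not guaranteed. Note that the window still shuffles data by the partition columns, so it is an alternative in syntax rather than a cheaper plan; it is mainly useful when the count is needed alongside the original rows, in which case the `select` and `distinct` steps can be dropped.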