Counting 1s and 0s per group in Pig

bihw5rsg  posted on 2021-06-03 in Hadoop

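How can I count how many 1s and 0s there are for each event type? I am doing this in Pig, and the second field contains only 1s and 0s. The data looks like this: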

(pageLoad,1)
(pageLoad,0)
(pageLoad,1) 
(appLaunch,1)
(appLaunch,0)
(otherEvent,1) 
(otherEvent,0)
(event,1)
(event,1)
(event,0)
(somethingelse,0)

The output should look like this:

pageLoad 1:234 0:2359
appLaunch 1:54 0:111
event 1:345 0:0

or

type 1 0 
pageLoad 21 345
appLaunch 0 123
event 234 12

Thanks, everyone.

Answer #1 (insrf1ej)

Input:

pageLoad,1
pageLoad,0
pageLoad,1 
appLaunch,1
appLaunch,0
otherEvent,1 
otherEvent,0
event,1
event,1
event,0
somethingelse,0

Pig script:

-- Load the comma-separated input as (event_type, status).
A = LOAD 'input.csv' USING PigStorage(',') AS (event_type:chararray, status:int);
-- Group the rows by event type.
B = GROUP A BY event_type;
-- Within each group, split the bag by status and count each part.
req = FOREACH B {
    event_type_1 = FILTER A BY status == 1;
    event_type_0 = FILTER A BY status == 0;
    GENERATE group AS event_type, COUNT(event_type_1) AS event_type_1_count, COUNT(event_type_0) AS event_type_0_count;
};
DUMP req;

Output:

(event,2,1)
(pageLoad,2,1)
(appLaunch,1,1)
(otherEvent,1,1)
(somethingelse,0,1)
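
Each output tuple is (event_type, count of 1s, count of 0s). To sanity-check these numbers without a Hadoop cluster, the same group-and-count logic can be sketched in plain Python. This is only an illustrative cross-check, not part of the Pig solution: the names ROWS and count_by_type are made up for this example, and the rows are copied by hand from the sample input above.

```python
from collections import defaultdict

# Sample rows copied from the input shown in the answer (hypothetical name ROWS).
ROWS = [
    ("pageLoad", 1), ("pageLoad", 0), ("pageLoad", 1),
    ("appLaunch", 1), ("appLaunch", 0),
    ("otherEvent", 1), ("otherEvent", 0),
    ("event", 1), ("event", 1), ("event", 0),
    ("somethingelse", 0),
]


def count_by_type(rows):
    """Return {event_type: (ones, zeros)}.

    Mirrors the Pig script: group by event_type, then count the rows
    with status 1 and the rows with status 0 inside each group.
    """
    counts = defaultdict(lambda: [0, 0])
    for event_type, status in rows:
        if status == 1:
            counts[event_type][0] += 1
        elif status == 0:
            counts[event_type][1] += 1
    return {event_type: tuple(pair) for event_type, pair in counts.items()}


if __name__ == "__main__":
    # Print in the "type 1:<n> 0:<n>" layout requested in the question.
    for event_type, (ones, zeros) in count_by_type(ROWS).items():
        print(f"{event_type} 1:{ones} 0:{zeros}")
```

For the sample rows this gives the same per-type counts as the DUMP output, for example 2 ones and 1 zero for pageLoad, although the order of the printed lines may differ from Pig's.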
