Pig: pivoting and summing 3 relations

n3ipq98p  posted on 2021-05-29 in Hadoop

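I have three different relations, shown below. I can get the output with a UDF, but I need to implement it in plain Pig. Other approaches are mentioned on the forums, but I could not find anything specific to this problem.
Proc: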

FN1,10
FN2,20
FN3,23
FN4,25
FN5,15
FN7,40
FN10,56

Rej:

FN1,12
FN2,13
FN3,33
FN6,60
FN8,23
FN9,44
FN10,4

All FN:

FN1
FN2
FN3
FN4
FN5
FN6
FN7
FN8
FN9
FN10

The desired output is:

FN1,10,12,22
FN2,20,13,33
FN3,23,33,56
FN4,25,0,25
FN5,15,0,15
FN6,0,60,60
FN7,40,0,40
FN8,0,23,23
FN9,0,44,44
FN10,56,4,60

zwghvu4y  #1

You can use COGROUP to achieve this.
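
A minimal sketch of what the COGROUP approach could look like, assuming the three relations are stored as comma-separated files. The file names (proc.txt, rej.txt, allfn.txt) and the field names (fn, cnt) are placeholders chosen for illustration, not taken from the question.

```pig
-- Hypothetical file and field names; adjust them to the real data.
proc  = LOAD 'proc.txt'  USING PigStorage(',') AS (fn:chararray, cnt:int);
rej   = LOAD 'rej.txt'   USING PigStorage(',') AS (fn:chararray, cnt:int);
allfn = LOAD 'allfn.txt' USING PigStorage(',') AS (fn:chararray);

-- One row per key, carrying a bag of matching tuples from each relation.
grouped = COGROUP allfn BY fn, proc BY fn, rej BY fn;

-- Keep only the keys that appear in the full FN list.
listed = FILTER grouped BY NOT IsEmpty(allfn);

-- Guard against empty bags so a missing key contributes 0.
totals = FOREACH listed GENERATE
    group AS fn,
    (IsEmpty(proc) ? 0L : SUM(proc.cnt)) AS proc_cnt,
    (IsEmpty(rej)  ? 0L : SUM(rej.cnt))  AS rej_cnt;

-- Add the row total as a fourth column.
result = FOREACH totals GENERATE fn, proc_cnt, rej_cnt, proc_cnt + rej_cnt AS total;
DUMP result;
```

Declaring the schema up front keeps the arithmetic typed, and cogrouping all three relations in one statement avoids nesting one COGROUP inside another. The row order of DUMP is not guaranteed; an ORDER BY can be added if a fixed order matters.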

r9f1avp5  #2

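Assuming your relations are in test.txt, test2.txt, and test3.txt:

-- A and B hold the two value relations; C holds the list of all FN keys
A = LOAD 'test.txt' USING PigStorage(',');
B = LOAD 'test2.txt' USING PigStorage(',');
C = LOAD 'test3.txt' USING PigStorage(',');
-- Group the two value relations by key, then line them up against the full key list
D = COGROUP A BY $0, B BY $0;
E = COGROUP C BY $0, D BY $0;
-- Pull out the value bags for each key
F = FOREACH E GENERATE $0, FLATTEN(D.A), FLATTEN(D.B);
G = FOREACH F GENERATE $0, $1.$1, $2.$1;
-- Turn empty bags into null so that FLATTEN does not drop the row
H = FOREACH G GENERATE $0, FLATTEN((IsEmpty($1)?null:$1)), FLATTEN((IsEmpty($2)?null:$2));
-- Substitute 0 for missing values and append the total
I = FOREACH H GENERATE $0, ($1 is null?0:$1), ($2 is null?0:$2), ($1 is null?0:$1)+($2 is null?0:$2);
DUMP I;

Output:

(FN1,10,12,22)
(FN2,20,13,33)
(FN3,23,33,56)
(FN4,25,0,25)
(FN5,15,0,15)
(FN6,0,60,60)
(FN7,40,0,40)
(FN8,0,23,23)
(FN9,0,44,44)
(FN10,56,4,60)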
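
As an alternative to nested COGROUPs, the same null-to-zero handling can be sketched with a full outer join. This is a different technique from the one in the answer, and it assumes that every FN of interest appears in at least one of the two value relations (true for the sample data, where the full FN list is exactly the union of both key sets). The file and field names are again placeholders.

```pig
-- Hypothetical file and field names; outer joins in Pig need declared schemas.
proc = LOAD 'proc.txt' USING PigStorage(',') AS (fn:chararray, cnt:int);
rej  = LOAD 'rej.txt'  USING PigStorage(',') AS (fn:chararray, cnt:int);

-- Keys missing on either side come back with null fields for that side.
joined = JOIN proc BY fn FULL OUTER, rej BY fn;

-- Take the key from whichever side has it and replace null counts with 0.
result = FOREACH joined GENERATE
    (proc::fn IS NOT NULL ? proc::fn : rej::fn) AS fn,
    (proc::cnt IS NULL ? 0 : proc::cnt) AS proc_cnt,
    (rej::cnt IS NULL ? 0 : rej::cnt) AS rej_cnt,
    (proc::cnt IS NULL ? 0 : proc::cnt) + (rej::cnt IS NULL ? 0 : rej::cnt) AS total;
DUMP result;
```

A key that exists only in the full FN list and in neither value relation would not show up here; the COGROUP against the key list covers that case.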
