Pig script logic implementation

v09wglhw posted on 2021-06-24 in Pig

I'm new to this and got stuck on the example below. Can anyone help me get the output shown below using a Pig script?
Input:

1|ABC|NC  
1|DEF|NC  
2|CFD|NY  
2|CGF|NY

Output:

1|ABC,DEF|NC  
2|CFD,CGF|NY

Script:

A = LOAD 'testfile.txt' USING PigStorage('|') AS (Id:chararray,name:chararray,state:chararray);
B = FOREACH A GENERATE Id,name;
C = FOREACH A GENERATE Id,name,state;
C = DISTINCT C;
GROUPED = GROUP B BY Id;
D = FOREACH GROUPED GENERATE group AS Id,c.name AS name_val;
E = JOIN D BY Id, C BY Id;
X = FOREACH E GENERATE D.Id as docid,D.name_val as termid,C.state;
Dump X;

2skhul33 1#

Load the data, group by columns 1 and 3, then generate the columns to get the desired output.

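A = LOAD 'input.txt' USING PigStorage('|') AS (f1:int,f2:chararray,f3:chararray);
-- Group on both the id (f1) and the state (f3); a multi-field key needs parentheses
B = GROUP A BY (f1,f3);
-- A.f2 is a bag holding every name in the group
C = FOREACH B GENERATE FLATTEN(group) AS (f1,f3),A.f2 AS f2;
D = FOREACH C GENERATE f1,f2,f3;
DUMP D;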

nzrxty8p 2#

It's now working as expected after adding the BagToString function.

A = LOAD 'testfile.txt' USING PigStorage('|') AS (f1:int,f2:chararray,f3:chararray);
B = GROUP A BY (f1,f3);
C = FOREACH B GENERATE FLATTEN(group) AS (f1,f3),A.f2 AS f2;
D = FOREACH C GENERATE f1, BagToString(f2, ','), f3;
STORE D INTO '[path]' USING PigStorage('|');
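
To sanity-check the expected result outside Pig, the grouping and joining above can be modelled in plain Python. This is only a rough sketch of the logic, not Pig code: the function name `group_and_join` and its parameters are made up for illustration, and it keeps names in input order, whereas a Pig bag is unordered, so the order of names inside each group may differ in real Pig output.

```python
from collections import defaultdict


def group_and_join(lines, sep="|", joiner=","):
    """Group rows by (id, state) and join the names, roughly like GROUP + BagToString."""
    groups = defaultdict(list)  # keys stay in first-seen order (Python 3.7+)
    for line in lines:
        # strip() drops the trailing spaces present in the sample input
        f1, f2, f3 = line.strip().split(sep)
        groups[(f1, f3)].append(f2)
    return [
        sep.join([f1, joiner.join(names), f3])
        for (f1, f3), names in groups.items()
    ]


# Sample rows copied from the question
rows = ["1|ABC|NC  ", "1|DEF|NC  ", "2|CFD|NY  ", "2|CGF|NY"]
result = group_and_join(rows)
for out_line in result:
    print(out_line)
```

For the sample input this gives one line per (id, state) pair, matching the output requested in the question.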
