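I need to know whether using FOREACH is mandatory for every relational transformation in Apache Pig. Can you help me understand which of the following approaches is better for performance? The input files are large.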
Approach 1:
A = LOAD 'input1' USING PigStorage(',') AS (id:int, name:chararray);
B = LOAD 'input2' USING PigStorage(',') AS (id:int, dept:int, dname:chararray);
C = JOIN A by id, B by id;
Approach 2:
A = LOAD 'input1' USING PigStorage(',') AS (id:int, name:chararray);
B = LOAD 'input2' USING PigStorage(',') AS (id:int, dept:int, dname:chararray);
C = FOREACH A GENERATE id, name;
D = FOREACH B GENERATE id, dname;
E = JOIN C by id, D by id;
DUMP E;