Outer join of multiple tables in Pig

txu3uszq  posted on 2021-06-24 in Pig

I need to join several tables. The command I am using is:

G = JOIN aa BY f, bb by f, cc by f, dd by f;

To turn this into a full outer join, I added the FULL keyword:

G = JOIN aa BY f FULL, bb by f, cc by f, dd by f;

But this gives me a mismatched input error. How should I do this?
Thanks!


lhcgjxsq #1

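You can use the COGROUP statement to emulate a full outer join. COGROUP keeps every key that appears in either input and leaves an empty bag where a relation has no matching rows. For example, COGROUP the following two files on their first field:
Decimal.csv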

first|1
second|2
fourth|4

Roman.csv

first|I
second|II
third|III

Pig commands:

english = LOAD 'Decimal.csv' using PigStorage('|') as (name:chararray,value:chararray);
roman = LOAD 'Roman.csv' using PigStorage('|') as (name:chararray, value:chararray);
multi = cogroup english by name, roman by name;
dump multi;

Output:

(first,{(first,1)},{(first,I)})
(third,{},{(third,III)})
(fourth,{(fourth,4)},{})
(second,{(second,2)},{(second,II)})
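
The same idea extends to the four relations in the question, since COGROUP accepts more than two inputs. The following is only a sketch: the file names and the second column of each relation (a, b, c, d) are assumptions made for illustration, and only the key f comes from the question.

```pig
-- Hypothetical inputs: file names and the non-key columns are assumptions.
aa = LOAD 'aa.txt' USING PigStorage('|') AS (f:chararray, a:chararray);
bb = LOAD 'bb.txt' USING PigStorage('|') AS (f:chararray, b:chararray);
cc = LOAD 'cc.txt' USING PigStorage('|') AS (f:chararray, c:chararray);
dd = LOAD 'dd.txt' USING PigStorage('|') AS (f:chararray, d:chararray);

-- One output tuple per distinct key: (group, {aa rows}, {bb rows}, {cc rows}, {dd rows}).
-- A relation with no row for that key contributes an empty bag.
G = COGROUP aa BY f, bb BY f, cc BY f, dd BY f;

-- Flag which relations actually contain each key.
presence = FOREACH G GENERATE group AS f,
    (IsEmpty(aa) ? 0 : 1) AS in_aa,
    (IsEmpty(bb) ? 0 : 1) AS in_bb,
    (IsEmpty(cc) ? 0 : 1) AS in_cc,
    (IsEmpty(dd) ? 0 : 1) AS in_dd;

DUMP presence;
```

Note that the result is still grouped by key rather than flattened into join rows, so it is closer to the raw material for a full outer join than to the joined output itself.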

f2uvfpb92#

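According to the Pig documentation:
Outer joins will only work for two-way joins; to perform a multi-way outer join, you will need to perform multiple two-way outer join statements.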
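
For the four relations in the question, that could look roughly like the sketch below. It assumes each relation is loaded with a schema (outer joins need one on the side that may produce nulls); the file names and the non-key columns a, b, c, d are hypothetical. After each full outer join the key can be null on either side, so the sketch rebuilds a single key column before the next join.

```pig
-- Hypothetical inputs: file names and the non-key columns are assumptions.
aa = LOAD 'aa.txt' USING PigStorage('|') AS (f:chararray, a:chararray);
bb = LOAD 'bb.txt' USING PigStorage('|') AS (f:chararray, b:chararray);
cc = LOAD 'cc.txt' USING PigStorage('|') AS (f:chararray, c:chararray);
dd = LOAD 'dd.txt' USING PigStorage('|') AS (f:chararray, d:chararray);

-- Step 1: aa FULL OUTER bb, then take the key from whichever side is non-null.
ab_raw = JOIN aa BY f FULL OUTER, bb BY f;
ab = FOREACH ab_raw GENERATE
    ((aa::f IS NOT NULL) ? aa::f : bb::f) AS f,
    aa::a AS a, bb::b AS b;

-- Step 2: join the intermediate result with cc in the same way.
abc_raw = JOIN ab BY f FULL OUTER, cc BY f;
abc = FOREACH abc_raw GENERATE
    ((ab::f IS NOT NULL) ? ab::f : cc::f) AS f,
    ab::a AS a, ab::b AS b, cc::c AS c;

-- Step 3: join with dd to get the final four-way full outer join.
abcd_raw = JOIN abc BY f FULL OUTER, dd BY f;
G = FOREACH abcd_raw GENERATE
    ((abc::f IS NOT NULL) ? abc::f : dd::f) AS f,
    abc::a AS a, abc::b AS b, abc::c AS c, dd::d AS d;

DUMP G;
```

Rows whose key f is itself null never match in a join, so they appear once per source relation rather than being merged.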
