What is the equivalent of SQL NOT IN in Cascading pipes?

bq9c1y66 posted on 2021-06-02 in Hadoop

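I have two files that share one common field, and based on that field's value I need to get the values from the second file.
How do I add a WHERE condition here?
Is there another pipe available for NOT IN?
File 1: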

tcno,date,amt
1234,3/10/2016,1000
1234,3/11/2016,400
23456,2/10/2016,1500

File 2:

cno,fname,lname,city,phone,mail
1234,first,last,city,1234556,123@123.com

Sample code:

Pipe pipe1 = new Pipe("custPipe");
Pipe pipe2 = new Pipe("tscnPipe");
Fields cJoinField = new Fields("cno");
Fields tJoinField = new Fields("tcno");
Pipe pipe = new HashJoin(pipe1, cJoinField, pipe2, tJoinField,  new OuterJoin());
//HOW TO ADD WHERE CONDITION i.e. CNO IS NULL FROM SECOND FILE
Fields outFields = new Fields("tcno","tdate", "tamt");

I expect the output to be the last line of the first file: [ 23456,2/10/2016,1500 ]

aij0ehis #1

Regarding the comment in your code:

//HOW TO ADD WHERE CONDITION i.e. CNO IS NULL FROM SECOND FILE

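Try FilterNotNull. It removes every tuple whose argument value is not null, so after the outer join only the rows with a null cno remain (FilterNull does the reverse and drops the null rows).
Add the following lines after the HashJoin step in your code:

FilterNotNull filterNotNull = new FilterNotNull();
pipe = new Each( pipe, cJoinField, filterNotNull );

For example:

import cascading.operation.filter.FilterNotNull;
import cascading.pipe.Each;
import cascading.pipe.HashJoin;
import cascading.pipe.Pipe;
import cascading.pipe.joiner.OuterJoin;
import cascading.tuple.Fields;

Pipe pipe1 = new Pipe("custPipe");
Pipe pipe2 = new Pipe("tscnPipe");
Fields cJoinField = new Fields("cno");
Fields tJoinField = new Fields("tcno");
Pipe pipe = new HashJoin(pipe1, cJoinField, pipe2, tJoinField,  new OuterJoin());

// Keep only the tuples whose cno is null, i.e. transactions with no matching customer
FilterNotNull filterNotNull = new FilterNotNull();
pipe = new Each( pipe, cJoinField, filterNotNull );

Fields outFields = new Fields("tcno","tdate", "tamt");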
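
To check the expected result against the sample data without a Hadoop cluster, here is a minimal plain-Java sketch of the same NOT IN logic. It does not use Cascading at all. The class name NotInSketch and the method notIn are made up for illustration, and the rows are the sample data from the question with the header lines left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NotInSketch {

    // Returns the transaction rows whose first column (tcno) does not
    // appear among the customer numbers (cno), i.e. SQL "NOT IN".
    public static List<String> notIn(List<String> transactions, List<String> customers) {
        // Collect every cno from the customer rows (first CSV column).
        Set<String> knownCnos = new HashSet<>();
        for (String row : customers) {
            knownCnos.add(row.split(",")[0]);
        }

        // Keep a transaction only when its tcno found no partner, which is
        // what "cno IS NULL" selects after an outer join.
        List<String> unmatched = new ArrayList<>();
        for (String row : transactions) {
            String tcno = row.split(",")[0];
            if (!knownCnos.contains(tcno)) {
                unmatched.add(row);
            }
        }
        return unmatched;
    }

    public static void main(String[] args) {
        List<String> transactions = Arrays.asList(
            "1234,3/10/2016,1000",
            "1234,3/11/2016,400",
            "23456,2/10/2016,1500");
        List<String> customers = Arrays.asList(
            "1234,first,last,city,1234556,123@123.com");

        // prints [23456,2/10/2016,1500]
        System.out.println(notIn(transactions, customers));
    }
}
```

The only row left is the one for tcno 23456, which matches the output the question asks for.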
