Hive: OR condition with LEFT OUTER JOIN

Posted by hmae6n7t on 2021-05-29 in Hadoop


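I have gone through all the questions about similar cases. Although this error may be common, I am looking for a solution to my specific case. Please do not mark the question as a duplicate unless you find an accepted solution for exactly the same scenario.
I have two tables: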

Main table:

c1  c2  c3  c4  c5
1   2   3   4   A

Other table

c1  c2  c3  c4  c5
1   8   5   6   B
8   2   8   9   C
8   7   3   9   C
8   7   9   4   C
5   6   7   8   D

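Now, from the other table, I should only be able to select the records that are unique across all columns, e.g. only the last row (5, 6, 7, 8, D).
Row 1 of the other table is rejected because its c1 value (1) is the same as the c1 value (1) in the main table; row 2 is rejected because the c2 values of the other table and the main table match; and so on.
In short, in the output of the query, no column of the other table may hold the same value as the corresponding column in the main table.
I tried the following query: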
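
To pin down the expected result independently of Hive, the rule can be sketched in a few lines of plain Python. This is only an illustration on the sample data; the helper names `clashes` and `select_unique` are made up for this sketch, and it compares c1 to c4 only, as the query in the question does.

```python
# Sample rows from the question, as (c1, c2, c3, c4, c5) tuples.
main_rows = [(1, 2, 3, 4, "A")]
other_rows = [
    (1, 8, 5, 6, "B"),
    (8, 2, 8, 9, "C"),
    (8, 7, 3, 9, "C"),
    (8, 7, 9, 4, "C"),
    (5, 6, 7, 8, "D"),
]


def clashes(other, main, n_cols=4):
    """Return True if any of the first n_cols columns is equal in both rows."""
    return any(other[i] == main[i] for i in range(n_cols))


def select_unique(others, mains):
    """Keep the rows of `others` that clash with no row of `mains`."""
    return [o for o in others if not any(clashes(o, m) for m in mains)]


result = select_unique(other_rows, main_rows)
print(result)  # [(5, 6, 7, 8, 'D')]
```

Any Hive query that solves the problem should return the same single row for this data.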

select t1.* from otherTable t1
LEFT OUTER JOIN mainTable t2
ON ( t1.c1 = t2.c1 OR t1.c2 = t2.c2 OR t1.c3 = t2.c3 OR t1.c4 = t2.c4 )
Where t2.c5 is null;

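However, Hive throws the following exception:
OR not supported in JOIN currently
I understand this limitation of Hive, and I have often worked around it by using UNION (ALL | DISTINCT) with inner joins; but I cannot apply the same strategy to this problem.
Please help.
EDIT 1: I have a Hive version restriction; I can only use version 1.2.0.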

hadoop Hive union outer-join

Source: https://stackoverflow.com/questions/45575548/hive-or-condition-with-left-outer-join

1 Answer

a6b3iqyw1#

You can do a Cartesian product join (an inner join without a condition):

select t1.* from otherTable t1
,mainTable t2
WHERE  t1.c1 != t2.c1 AND t1.c2 != t2.c2 
       AND t1.c3 != t2.c3 AND t1.c4 != t2.c4 AND t1.c5 !=  t2.c5;
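
A quick way to see how the Cartesian-product query behaves is to run it on the sample data outside Hive. The sketch below uses Python's built-in sqlite3 module purely as a stand-in engine (an assumption for illustration; SQLite is not Hive, and an ORDER BY is added only to make the output deterministic). The second mainTable row (9, 9, 9, 9, 'Z') is hypothetical and is not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mainTable  (c1 INT, c2 INT, c3 INT, c4 INT, c5 TEXT);
CREATE TABLE otherTable (c1 INT, c2 INT, c3 INT, c4 INT, c5 TEXT);
INSERT INTO mainTable VALUES (1, 2, 3, 4, 'A');
INSERT INTO otherTable VALUES
  (1, 8, 5, 6, 'B'), (8, 2, 8, 9, 'C'), (8, 7, 3, 9, 'C'),
  (8, 7, 9, 4, 'C'), (5, 6, 7, 8, 'D');
""")

# The Cartesian-product query, plus ORDER BY for a stable row order.
CROSS_JOIN_QUERY = """
SELECT t1.* FROM otherTable t1, mainTable t2
WHERE t1.c1 != t2.c1 AND t1.c2 != t2.c2
  AND t1.c3 != t2.c3 AND t1.c4 != t2.c4 AND t1.c5 != t2.c5
ORDER BY t1.c1, t1.c2, t1.c3, t1.c4
"""

# With the single mainTable row from the question, only the last row survives.
one_row = conn.execute(CROSS_JOIN_QUERY).fetchall()
print(one_row)  # [(5, 6, 7, 8, 'D')]

# Hypothetical second mainTable row: each otherTable row is now paired with
# both mainTable rows, and it is kept once per pairing that has no equal column.
conn.execute("INSERT INTO mainTable VALUES (9, 9, 9, 9, 'Z')")
two_rows = conn.execute(CROSS_JOIN_QUERY).fetchall()
print(two_rows)  # [(1, 8, 5, 6, 'B'), (5, 6, 7, 8, 'D'), (5, 6, 7, 8, 'D')]
```

With two rows in mainTable, row (1, 8, 5, 6, B) comes back even though its c1 matches the first mainTable row, and the valid row is duplicated. The query is therefore only safe when mainTable holds exactly one row.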

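Assuming mainTable has a single row, the Cartesian-product query should behave like the OUTER JOIN version. Another option is to split your proposed query into 5 separate LEFT OUTER JOIN subqueries: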

select t1.* from (
  select t1.* from (
    select t1.* from (
      select t1.* from (
        select t1.* from otherTable t1
        LEFT OUTER JOIN (select distinct c1 from mainTable) t2
        ON ( t1.c1 = t2.c1) Where t2.c1 is null ) t1
      LEFT OUTER JOIN (select distinct c2 from mainTable) t2
      ON ( t1.c2 = t2.c2) Where t2.c2 is null ) t1
    LEFT OUTER JOIN (select distinct c3 from mainTable) t2
    ON ( t1.c3 = t2.c3) Where t2.c3 is null ) t1
  LEFT OUTER JOIN (select distinct c4 from mainTable) t2
  ON ( t1.c4 = t2.c4) Where t2.c4 is null ) t1
LEFT OUTER JOIN (select distinct c5 from mainTable) t2
ON ( t1.c5 = t2.c5) Where t2.c5 is null
;

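Here, for each column, I first get the distinct values of that column from mainTable and left outer join them with what is left of otherTable. The downside is that I scan mainTable 5 times, once per column. If the values in the main table are unique, you can remove the distinct from the subqueries.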
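
The chained anti-joins can be checked in the same spirit. The sketch below again uses Python's sqlite3 module as a stand-in for Hive (an assumption, not a Hive test) and builds the nested query in a loop instead of writing out all five levels; the generated SQL has the same shape as the query above. The extra mainTable row (9, 9, 9, 9, 'Z') is hypothetical and only there to show that this version does not depend on mainTable having a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mainTable  (c1 INT, c2 INT, c3 INT, c4 INT, c5 TEXT);
CREATE TABLE otherTable (c1 INT, c2 INT, c3 INT, c4 INT, c5 TEXT);
INSERT INTO mainTable VALUES (1, 2, 3, 4, 'A'), (9, 9, 9, 9, 'Z');
INSERT INTO otherTable VALUES
  (1, 8, 5, 6, 'B'), (8, 2, 8, 9, 'C'), (8, 7, 3, 9, 'C'),
  (8, 7, 9, 4, 'C'), (5, 6, 7, 8, 'D');
""")

# Wrap the previous level in one more LEFT OUTER JOIN anti-join per column:
# rows whose value appears in that mainTable column find a match and are
# dropped by the IS NULL filter.
query = "SELECT * FROM otherTable"
for col in ("c1", "c2", "c3", "c4", "c5"):
    query = (
        f"SELECT t1.* FROM ({query}) t1 "
        f"LEFT OUTER JOIN (SELECT DISTINCT {col} FROM mainTable) t2 "
        f"ON t1.{col} = t2.{col} WHERE t2.{col} IS NULL"
    )

rows = conn.execute(query).fetchall()
print(rows)  # [(5, 6, 7, 8, 'D')]
```

This works because a row is rejected by the original OR condition exactly when at least one of its column values occurs in the corresponding mainTable column, which is what each per-column anti-join tests.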
