在where predicate 解决方案中配置多个子查询

pb3skfrl  于 2021-05-29  发布在  Hadoop
关注(0)|答案(3)|浏览(413)

我有三个表,我想查询 table3 使用条件基于 table1 以及 table2 . 以下是数据和查询的简化版本:

CREATE TABLE table1 (
  id int
);

INSERT INTO table1 VALUES(1);
INSERT INTO table1 VALUES(2);
INSERT INTO table1 VALUES(3);

+------------+--+
| table1.id  |
+------------+--+
| 1          |
| 2          |
| 3          |
+------------+--+

CREATE TABLE table2 (
  code varchar(10)
);

INSERT INTO table2 VALUES('a');
INSERT INTO table2 VALUES('b');
INSERT INTO table2 VALUES('c');

+--------------+--+
| table2.code  |
+--------------+--+
| a            |
| b            |
| c            |
+--------------+--+

CREATE TABLE table3 (
  id int,
  code varchar(10)
);

INSERT INTO table3 VALUES(1,'d');
INSERT INTO table3 VALUES(1,'a');
INSERT INTO table3 VALUES(2,'b');
INSERT INTO table3 VALUES(2,'e');
INSERT INTO table3 VALUES(4,'a');
INSERT INTO table3 VALUES(4,'d');

+------------+--------------+--+
| table3.id  | table3.code  |
+------------+--------------+--+
| 1          | d            |
| 1          | a            |
| 2          | b            |
| 2          | e            |
| 4          | a            |
| 4          | d            |
+------------+--------------+--+

基本上,我喜欢从 table3 只有 id 存在于 table1 以及 code 不存在于 table2 . 所以结果应该是公正的

1,d
2,e

以下查询不起作用:

SELECT * FROM table3 WHERE (table3.id IN (SELECT table1.id FROM
table1)) AND NOT (table3.code IN (SELECT table2.code FROM table2));

我有个错误:
错误:编译语句时出错:失败:semanticexception[error 10249]:行1:94不支持的子查询表达式“code”:仅支持1个子查询表达式(州=42000,代码=10249)
独立地,每个条件都可以正常工作:

SELECT * FROM table3 WHERE (table3.id IN (SELECT table1.id FROM table1));

+------------+--------------+--+
| table3.id  | table3.code  |
+------------+--------------+--+
| 1          | d            |
| 1          | a            |
| 2          | b            |
| 2          | e            |
+------------+--------------+--+

SELECT * FROM table3 WHERE NOT (table3.code IN (SELECT table2.code FROM table2));

+------------+--------------+--+
| table3.id  | table3.code  |
+------------+--------------+--+
| 1          | d            |
| 2          | e            |
| 4          | d            |
+------------+--------------+--+

重要提示:我做不到 JOIN 或者修改文件中的任何内容 FROM 因为这是报告系统的一部分,所以我唯一能做的就是调整where子句。

wkftcu5l

wkftcu5l1#

NOT IN 是编写查询的更简单方法:

SELECT t3.*
FROM table3 t3
WHERE t3.id IN (SELECT table1.id FROM table1) AND
      t3.code NOT IN (SELECT table2.code FROM table2);

如果您仅限于一个子查询,这会变得很棘手。这里有一种可能性,尽管我不确定hive是否会接受:

where exists (select 1
              from table1 t1
              where t1.id = t3.id and
                    not exists (select 1
                                from table2 t2
                                where t2.code = t3.code
                               )
             )

您可以在不使用双重嵌套的情况下执行此操作:

where exists (select 1
              from table1 t1 left join
                   table2 t2
                   on t2.code = t3.code
              where t1.id = t3.id
             )
2wnc66cl

2wnc66cl2#

你可以用 JOIN 学生:

SELECT DISTINCT t3.*
FROM table3 t3
JOIN table1 t1
  ON t3.id = t1.id
LEFT JOIN table2 t2
  ON t2.code = t3.code
WHERE t2.code IS NULL;

dbfiddle演示
我唯一能做的就是修改where子句。

SELECT *
FROM Table3 t
WHERE EXISTS (SELECT 1
              FROM table3 t3
              JOIN table1 t1
                ON t3.id = t1.id
              LEFT JOIN table2 t2
                ON t2.code = t3.code
             WHERE t2.code IS NULL
               AND t3.id = t.id
               AND t3.code = t.code)

D小提琴演示2

0md85ypi

0md85ypi3#

你可以使用的一个肮脏的技巧是交叉连接 table1 以及 table2 ,既然你不在乎他们的关系,就用 exists 条件:

SELECT *
FROM   table3 
WHERE  NOT EXISTS (SELECT     *
                   FROM       table1
                   CROSS JOIN table2
                   WHERE      table3.id = table1.id ON table3.code = table2.code)

编辑:
虽然上面的查询应该可以工作,但它的性能可能不会很好。一个稍微快一点的变体是 union all 在子查询中:

SELECT *
FROM   table3 
WHERE  NOT EXISTS (SELECT     *
                   FROM       table1
                   WHERE      table3.id = table1.id 
                   UNION ALL
                   SELECT     *
                   FROM       table2
                   WHERE      table3.code = table2.code)

相关问题