hive：查询表以匹配条件

d6kp6zgx 于 2021-06-26 发布在 Hive

关注(0)|答案(1)|浏览(409)

在配置单元中有一个包含以下数据的表。我正试着玩弄它：

A   B   C   D 
==============
76  5   0.6 107777
78  5   0.5 107777
79  5   0.5 107777
79  5   0.5 107777
80  5   0.5 107777
210 5   0.5 107777
211 5   0.5 107777
213 5   0.5 107777
316 5   0.5 107777
316 5   0.5 107777
76  7   0.5 102997
78  7   0.5 102997
79  8   0.5 102997
79  8   0.5 102997
80  9   0.5 108997
80  9   0.5 108997
80  9   0.5 108997

Need to count the 'B and D' when B>4 and C is not same for B and D.

预期o/p：
在这里查找值，其中，对于“a”和“b”中的相同值，“c”中的值不存在。同时显示表中存在的重复值（大于1）。

Value in 'C' is not present for the same value in 'A' and 'B':

这意味着

A B C
=====
76 5 0.6 => OK
78 5 0.5 => OK
79 5 0.5 => OK 
79 5 0.5 => NOT OK (As C=0.5 shouldn't have repetitive value for same A and B value)
80 5 0.5 => OK.....

A   B   C   D 
==============
79  5   0.5 107777
316 5   0.5 107777
79  8   0.5 102997
80  9   0.5 108997
80  9   0.5 108997

Count: 5

能够为第一部分编写查询，但对其余部分没有任何突破：

SELECT A,B,C,D FROM DB.TABLE1 WHERE B >1;

但不知道如何写最后一部分：

count the 'B and D' when B>4 and C is not same for B and D.

任何关于这方面的建议都会很有帮助。
更新\u 1:
尝试了以下操作：

SELECT A,B,C,D FROM (SELECT * FROM TABLE1 WHERE B >4) t1 GROUP BY B,D HAVING countnum>1 LIMIT 20;

但是得到错误：

FAILED: SemanticException [Error 10025]: Line 1:197 Expression not in GROUP BY key '1'
hive>

Hive hiveql

来源：https://stackoverflow.com/questions/52524926/hive-querying-the-table-to-match-the-conditions

1条答案

按热度按时间

5kgi1eie1#

`Need to count the 'B and D' when B>4 and C is not same for B and D.` 输入： `table1` ```
A B C D

76 5 0.6 107777
78 5 0.5 107777
79 5 0.5 107777
79 5 0.5 107777
80 5 0.5 107777
210 5 0.5 107777
211 5 0.5 107777
213 5 0.5 107777
316 5 0.5 107777
316 5 0.5 107777
76 7 0.5 102997
78 7 0.5 102997
79 8 0.5 102997
79 8 0.5 102997
80 9 0.5 108997
80 9 0.5 108997
80 9 0.5 108997

查询：

select count(*)
from (
select *, row_number() over (partition by B, C, D) as rn
from table1
where B>4
) as t1
where rn=1;

输出： `5` 说明： `row_number()` 根据的值给出行号 `B` ,  `C` 以及 `D` . 对于这三个变量值相同的行，行号将保持递增。

A B C D rn

76 5 0.6 107777 1
78 5 0.5 107777 1
79 5 0.5 107777 2
79 5 0.5 107777 3
80 5 0.5 107777 4
210 5 0.5 107777 5
211 5 0.5 107777 6
213 5 0.5 107777 7
316 5 0.5 107777 8
316 5 0.5 107777 9
76 7 0.5 102997 1
78 7 0.5 102997 2
79 8 0.5 102997 1
79 8 0.5 102997 2
80 9 0.5 108997 1
80 9 0.5 108997 2
80 9 0.5 108997 3

赞(0）回复(0）举报 2021-06-26

我来回答

hive：查询表以匹配条件

1条答案

`Need to count the 'B and D' when B>4 and C is not same for B and D.` 输入： `table1` ```
A B C D

A B C D rn

相关问题

热门标签

最新问答

hive：查询表以匹配条件

1条答案

Need to count the 'B and D' when B>4 and C is not same for B and D. 输入： table1 ```A B C D

A B C D rn

相关问题

热门标签

最新问答

`Need to count the 'B and D' when B>4 and C is not same for B and D.` 输入： `table1` ```
A B C D