Selecting unique rows in groups with missing values

qmb5sa22 posted on 2021-06-26 in Hive


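I have a table with two columns, one of which may have missing values. The first column is id and the second is value. I want to select one row per unique id, such that if several rows share the same id but some of them are missing a value, one of the rows that has a value is returned. If all rows for an id have an empty value, any one of those rows may be returned.
In other words, rows with the same id belong to the same group, and within each group a row with a value should be returned if one exists.
For example, given this input table: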

+--------+---------+
|    ID  |  VALUE  |
+--------+---------+
| x      | 1       |
| x      | 1       |
| y      | 2       |
| y      |         |
| z      |         |
| z      |         |
+--------+---------+

the query should return:

+------------+---------+
|    ID      |  VALUE  |
+------------+---------+
| x          | 1       |
| y          | 2       |
| z          |         |
+------------+---------+

sql Hive presto

Source: https://stackoverflow.com/questions/51345349/selecting-unique-rows-in-in-groups-with-missing-values

3 Answers


a11xaf1n1#

Based on your description, you can use max():

select id, max(value)
from t
group by id;
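As a quick check of why this works, here is a minimal sketch that rebuilds the question's sample table inline. It assumes the missing values are stored as NULL (not as empty strings) and that value is an integer column; both are assumptions, since the question does not say. max() skips NULLs and yields NULL only when every value in the group is NULL.

```sql
-- Sample rows from the question; missing values are assumed to be NULL.
WITH t AS (
    SELECT 'x' AS id, 1 AS value
    UNION ALL SELECT 'x', 1
    UNION ALL SELECT 'y', 2
    UNION ALL SELECT 'y', CAST(NULL AS INT)
    UNION ALL SELECT 'z', CAST(NULL AS INT)
    UNION ALL SELECT 'z', CAST(NULL AS INT)
)
-- max() ignores NULLs, so a group keeps its non-NULL value when it has one.
SELECT id, max(value) AS value
FROM t
GROUP BY id;
```

On this data the query should produce one row per id: 1 for x, 2 for y, and NULL for z. If a group held several different non-NULL values, max() would pick the largest one.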

If you need other columns, use row_number():

select t.*
from (select t.*,
             -- rows that have a value sort first, so they receive seqnum = 1
             row_number() over (partition by id order by (case when value is not null then 1 else 2 end)) as seqnum
      from t
     ) t
where seqnum = 1;
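The row_number() approach can be sketched on inline data as follows. The note column is hypothetical and only stands in for "other columns"; missing values are again assumed to be NULL. Because the window ordering is ascending, the sort key must be smaller for rows that have a value, so that one of them is numbered 1.

```sql
-- Hypothetical three-column table: id, value, and an extra column "note".
WITH t AS (
    SELECT 'y' AS id, CAST(NULL AS INT) AS value, 'first' AS note
    UNION ALL SELECT 'y', 2, 'second'
    UNION ALL SELECT 'z', CAST(NULL AS INT), 'third'
)
SELECT id, value, note
FROM (
    SELECT t.*,
           row_number() OVER (
               PARTITION BY id
               -- 0 for rows with a value, 1 for rows without: valued rows come first
               ORDER BY CASE WHEN value IS NOT NULL THEN 0 ELSE 1 END
           ) AS seqnum
    FROM t
) ranked
WHERE seqnum = 1;
```

For id y this keeps the row with value 2 and note 'second'; for id z, where every value is NULL, it keeps the single available row.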


p5cysglq2#

You can easily split the query into two queries:

A: the unique rows, using DISTINCT on (ID, VALUE), whose VALUE is not empty
B: the unique rows, using DISTINCT on ID, whose VALUE is empty and whose ID does not appear in A

The result is A ∪ (B - A).
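One possible way to write the two-query idea is sketched below. It is not the answerer's own code: the "ID not in A" condition is expressed with GROUP BY and HAVING count(value) = 0 rather than a NOT IN subquery, and missing values are assumed to be NULL in an integer column.

```sql
-- Sample rows from the question; missing values are assumed to be NULL.
WITH t AS (
    SELECT 'x' AS id, 1 AS value
    UNION ALL SELECT 'x', 1
    UNION ALL SELECT 'y', 2
    UNION ALL SELECT 'y', CAST(NULL AS INT)
    UNION ALL SELECT 'z', CAST(NULL AS INT)
    UNION ALL SELECT 'z', CAST(NULL AS INT)
)
-- A: distinct (id, value) pairs that have a value.
SELECT DISTINCT id, value
FROM t
WHERE value IS NOT NULL
UNION ALL
-- B: ids for which no row has a value (count(value) counts non-NULL values only).
SELECT id, CAST(NULL AS INT) AS value
FROM t
GROUP BY id
HAVING count(value) = 0;
```

Note that part A returns one row per distinct non-NULL value, so an id with several different values would still appear more than once.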


gopyfrb33#

You can use the DISTINCT keyword in Hive/SQL:

hive> select distinct id,value from <db_name>.<table_name>;

The query above returns the distinct combinations of the id and value columns.

hive> select distinct * from <db_name>.<table_name>;

The statement above returns only distinct rows, comparing all columns.
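To see what DISTINCT does with the question's data, here is a small sketch on the sample rows, assuming the missing values are NULL. DISTINCT compares whole rows, so (y, 2) and (y, NULL) count as different rows and both are kept.

```sql
-- Sample rows from the question; missing values are assumed to be NULL.
WITH t AS (
    SELECT 'x' AS id, 1 AS value
    UNION ALL SELECT 'x', 1
    UNION ALL SELECT 'y', 2
    UNION ALL SELECT 'y', CAST(NULL AS INT)
    UNION ALL SELECT 'z', CAST(NULL AS INT)
    UNION ALL SELECT 'z', CAST(NULL AS INT)
)
-- Removes exact duplicate rows only; it does not collapse rows per id.
SELECT DISTINCT id, value
FROM t;
```

On this data the result has four rows: (x, 1), (y, 2), (y, NULL), and (z, NULL), so id y still appears twice.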

