apache配置单元:从每个组取一行

dl5txlt9  于 2021-06-27  发布在  Hive
关注(0)|答案(1)|浏览(319)

我有一个apache hive表,如下所示:

id          llvc    lp
2428766324  P005    P048
2428766324  P005    P024
2428766324  P005    NULL
2429788401  P005    P024
2429788401  P005    NULL
2429788401  P005    P048
2457843473  P005    P024
2457843473  P005    P048
2457843473  P005    NULL
2457872560  P005    NULL
2457872560  P005    P048
2457872560  P005    P024

对于每个id,我有一个/多个lign,并且我希望在如下条件下为每个id取一行:
对于每组id

If number of line = 1, take this line    
if number of line > 1, take the line where llvc = lp    
if number of line > 1, and no llvc == lp, take the line where lp = null

并放弃组中的其他行。
例如:

id          llvc    lp     
2428766324  P005    P048      
2428766324  P005    P024      
2428766324  P005    NULL

我想取2428766324 p005空

nbnkbykc

nbnkbykc1#

使用 row_number() :

select *
from (select t.*,
             row_number() over (partition by id
                                order by (case when llvc = lp then 1 
                                               when lp is null then 2
                                               else 3
                                          end)
                                ) as seqnum
     ) t
where seqnum = 1;

相关问题