hive用2个因子左连接,其中一个不相等,我希望列表在a中,而不是在b中

u4vypkhs  于 2021-06-26  发布在  Hive
关注(0)|答案(2)|浏览(335)

我想在Hive中达到这个效果:

select a.* from entry_data_fxj_cl a left join exit_data b 
    on trim(a.ecardid) = trim(b.ecardid) and abs(a.entrytime-b.entrytime)>60000
    where trim(b.ecardid) IS NULL

b、 entrytime应该与a.entrytime最接近的时间匹配
如何表达不平等?
如何表达?
谢谢你的回答。

pqwbnv8z

pqwbnv8z1#

解决方法是将非相等联接条件移动到 WHERE 并添加 OR IS NULL 允许左连接。请参见sql代码中的注解:

select * 
from
(--move non-equality condition to the where + OR is null to allow left join
select a.*, b.ecardid as b_ecardid 
  from entry_data_fxj_cl a left join exit_data b 
    on trim(a.ecardid) = trim(b.ecardid)
where abs(a.entrytime-b.entrytime)>60000 or b.ecardid is NULL --allow left join
)s 
where b_ecardid IS NULL --filter only rows for which b.ecardid is not found
yv5phkfx

yv5phkfx2#

我倾向于这样写:

select edf.*
from entry_data_fxj_cl edf
where not exists (select 1
                  from exit_data ed
                  where trim(ed.ecardid) = trim(edf.ecardid) and
                        ed.entrytime > edf.entrytime - 60000 and
                        ed.entrytime < edf.entrytime + 60000
                 );

你在 hive 里用吗?

相关问题