I am running the simple Spark SQL code below on Azure Databricks.
val df2=spark.sql(
s"""
select
mbrgm.mbrgm_id as case_id,
case
when mbr_hist.meck is not null
and mbr_hist.efdt is not null
and mbr_hist.efdt <= mbr_pgm.credttm
and (
mbr_hist.exp_dt is null
or mbr_hist.exp_dt > mbrgm.creat_dttm
) then mbr_hist.meck
else mbrgm.facmbid
end as mb_fid,
.....
from
tempview1 mbrgm
left outer join tempview2 mbr_hist on (mbrgm.mrid = mbr_hist.mrid
and mbr_hist.efdt <= mbrgm.credttm
and mbr_hist.exdt > mbrgm.credttm
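Every time I run it, I get the else-branch value for the mb_req field, i.e. mbrgm.facmbid. I have checked my data against the logic, and according to the logic it should take the then branch. I suspect that the comparison mbr_hist.efdt <= mbr_pgm.credttm always evaluates to false.
mbr_hist.efdt is a string, e.g. 2017-07-22 21:58:46, while mbr_pgm.credttm is a timestamp, e.g. 2011-08-13T11:00:00.910+0000. Is the comparison failing because the two values have different lengths? What can I use to compare them correctly?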
1 Answer
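Databricks cannot directly compare a string with a timestamp. You need to convert the string to a timestamp first. By default, cast only works for strings in ISO 8601 format, so you need to use the to_timestamp function with an explicit date/time pattern to do the conversion.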
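The snippet the answer originally pointed to is not preserved here, so the following is only a minimal sketch of the idea, meant for a notebook cell or spark-shell. The view names hist_demo and pgm_demo, the sample values, and the pattern 'yyyy-MM-dd HH:mm:ss' are assumptions chosen to match the string format quoted in the question; adjust the pattern to whatever your efdt column really contains.

```scala
import org.apache.spark.sql.SparkSession

// In a Databricks notebook `spark` already exists and getOrCreate() returns it;
// the local master only matters when this is run outside Databricks.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("to_timestamp-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical one-row stand-ins: a string column (like mbr_hist.efdt)
// and a real timestamp column (like credttm).
Seq("2017-07-22 21:58:46").toDF("efdt").createOrReplaceTempView("hist_demo")
Seq("2017-08-13 11:00:00.910").toDF("ts")
  .selectExpr("cast(ts as timestamp) as credttm")
  .createOrReplaceTempView("pgm_demo")

// Parse the string with an explicit pattern so that both sides of <= are timestamps.
val compared = spark.sql(
  """
    |select
    |  h.efdt,
    |  p.credttm,
    |  to_timestamp(h.efdt, 'yyyy-MM-dd HH:mm:ss') <= p.credttm as efdt_on_or_before
    |from hist_demo h
    |cross join pgm_demo p
    |""".stripMargin)

compared.show(false)
```

In the original query, the same to_timestamp(...) expression would replace each bare mbr_hist.efdt (and any other string-typed date column) in the case expression and the join condition. If a value does not match the pattern, to_timestamp may return null, depending on your Spark settings, which makes the comparison null and sends the case expression to its else branch, so it is worth checking for nulls after the conversion.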