将格式化的日期字符串转换为配置单元中的时间戳,同时考虑到节省时间

wgeznvg7  于 2021-05-29  发布在  Hadoop
关注(0)|答案(0)|浏览(246)

我想用以下格式转换日期 '03/21/2006 10:00:00 PM' 配置单元中的时间戳数据类型。字符串来自芝加哥的时区,这意味着夏令时适用。在此过程中,我不想将时间转换为utc。
我试过以下方法,但是 unix_timestamp() 不了解时区 America/Chicago .

SELECT cast(
   from_unixtime(
     unix_timestamp('03/21/2006 10:00:00 PM America/Chicago', 'MM/dd/yyyy hh:mm:ss a zzzz')
   ) AS TIMESTAMP
);

我可以用 CDT 或者 CST 因为时区取决于日期是否在节光时间内,但这会变得混乱,因为时间变化的日期每年都不同。
有没有更好的办法?
我正在使用hive 1.1.0。
谢谢你抽出时间。
编辑
我已经能够实现我的目标使用下面的代码。这个解决方案既不简洁也不优雅,但它现在起作用了。一个udf可以以更模块化的方式完成同样的工作,但是我还没有任何编写hiveudf的经验。

,from_utc_timestamp(to_utc_timestamp(concat(
        regexp_extract(date, '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 3) -- year
        ,'-'
        ,regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 1) -- month
        ,'-'
        ,regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 2) -- day
        ,' '
        ,CASE  -- hour
            WHEN regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 6) = 'AM'   
                AND regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 4) = '12' 
                THEN '00'
            WHEN regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 6) = 'AM'   
                AND regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 4) <> '12' 
                THEN regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 4) 
            WHEN regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 6) = 'PM' 
                AND regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 4) = '12' 
                THEN '12'
            WHEN regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 6) = 'PM' 
                AND regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 4) <> '12'
                THEN cast(regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 4) AS TINYINT) + 12
            ELSE NULL -- should never get here
        END
        ,regexp_extract(date,  '(\\d{2})/(\\d{2})/(\\d{4}) (\\d{2})(:\\d{2}:\\d{2}) ([A|P]M)', 5) -- rest of time
        ) 
     , 'America/Chicago')
     , 'America/Chicago') AS cast_date

暂无答案!

目前还没有任何答案,快来回答吧!

相关问题