配置单元窗口附加输出

mv1qrgav  于 2021-06-27  发布在  Hive
关注(0)|答案(1)|浏览(246)

给出以下数据:

CREATE TABLE dat (dt STRING, uxt BIGINT, temp FLOAT, city STRING);
INSERT INTO dat VALUES
('1/1/2000 0:53', 946687980, 100, 'A'),
('1/1/2000 0:59', 946688340, 28.9, 'A'),
('1/1/2000 13:54', 946734840, -1, 'A'),
('1/1/2000 13:55', 946734900, 30.9, 'A'),
('1/1/2000 22:53', 946767180, 30.9, 'A'),
('1/1/2000 22:59', 946767540, 30, 'A'),
('1/2/2000 1:25', 946776300, 121, 'A'),
('1/2/2000 1:53', 946777980, 28.9, 'A'),
('1/2/2000 2:53', 946781580, 28.9, 'A'),
('1/3/2000 1:53', 946864380, 10, 'A'),
('1/3/2000 11:20', 946898400, 15.1, 'A'),
('1/3/2000 11:53', 946900380, 18, 'A'),
('1/3/2000 21:00', 946933200, 17.1, 'A'),
('1/3/2000 21:53', 946936380, 16, 'A');

我正在使用一些窗口功能查找每24小时的maxtemp、mintemp等:

select dt, uxt, maxtemp, mintemp, ABS(maxtemp - mintemp) as tempDiff, city
from(
select dt, uxt, max(temp) over (w) as maxtemp, min(temp) over (w) as 
mintemp, city
from dat
WINDOW w as (partition by city order by uxt range between CURRENT ROW and 
86400 FOLLOWING))t1
order by tempDiff desc;

这将提供以下输出(第一行):

dt                   uxt      maxtemp   mintemp tempdiff    city
2000-01-01 13:54    946734840   121.0   -1.0    122.0        A

我想添加'dt'的时候,maxtemp发生的输出,并努力找到一个解决方案。
输出的第一行如下所示:

dt                   uxt      maxtemp   mintemp tempdiff    city   maxdt   
2000-01-01 13:54    946698780   121        -1    122         A     '2000-01-02 01:25'

使用第一个\u值查询:

select dt
  ,uxt
  ,max(temp) over w as maxtemp
  ,min(temp) over w as mintemp
  ,abs(max(temp) over w - min(temp) over w) as tempDiff
  ,first_value(dt) over (w order by temp desc) as maxdt
  ,city
from dat
order by tempDiff desc
WINDOW w as (partition by city order by uxt 
         range between CURRENT ROW and 86400 FOLLOWING);

正在生成此输出(前两行):

dt               uxt        maxtemp mintemp tempdiff maxdt          city
1/1/2000 0:59    946688340  121.0   -1.0    122.0    1/2/2000 1:53  A
1/1/2000 0:53    946687980  121.0   -1.0    122.0    1/1/2000 0:53   A

它给出的maxtemp不在24小时日期范围内。另外,第二行1/1/2000 0:53的温度不是121.0

rqcrx0a6

rqcrx0a61#

这可以通过包括以下内容来实现 first_value 内部查询中的窗口函数。

first_value(dt) over (partition by city order by uxt,temp desc 
                      range between CURRENT ROW and 7200 FOLLOWING)

另请注意,查询可以简化为(消除子查询,因为在这种情况下不需要子查询)

select dt
      ,uxt
      ,max(temp) over w as maxtemp
      ,min(temp) over w as mintemp
      ,abs(max(temp) over w - min(temp) over w) as tempDiff
      ,first_value(dt) over (w order by temp desc) as maxdt
      ,city
from dat
WINDOW w as (partition by city order by uxt 
             range between CURRENT ROW and 7200 FOLLOWING)

相关问题