Hive: how to get the start and end time of each task?

niwlg2el  posted on 2023-04-30 in Hive

How can I get the start and end time of each task for each emp in SQL?

emp_id  task    timestamp
100 A   15/04/2023 02:01
100 A   15/04/2023 02:06
100 A   15/04/2023 02:17
100 B   15/04/2023 02:24
100 B   15/04/2023 02:34
100 A   16/04/2023 10:34
100 A   16/04/2023 10:36
100 A   16/04/2023 10:39
101 A   16/04/2023 20:34
101 A   16/04/2023 20:36

Expected output:

id  task    Start                     End
100 A   15/04/2023 02:01    15/04/2023 02:17
100 B   15/04/2023 02:24    15/04/2023 02:34
100 A   16/04/2023 10:34    16/04/2023 10:39
101 A   16/04/2023 20:34    16/04/2023 20:36

vlf7wbxs  1#

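Use the ROW_NUMBER function to get START and LAST_VALUE to get END, partitioning the rows by employee, task, and calendar date. Note that the query is written in SQL Server syntax (bracketed aliases such as [START]); Hive quotes identifiers with backticks instead.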

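-- START: the earliest row per employee, task, and calendar date (RNO = 1).
-- END:   the latest timestamp in the same employee/task/date group.
SELECT A.EMP_ID, A.TASK, A.RNO, A.TIMESTAMP AS [START],
      (SELECT LAST_VALUE(MAX(T.TIMESTAMP)) OVER (
        PARTITION BY T.EMP_ID, T.TASK, CAST(T.TIMESTAMP AS DATE)
        ORDER BY T.EMP_ID, T.TASK, CAST(T.TIMESTAMP AS DATE)
        RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
    FROM TABLENAME T WHERE T.EMP_ID = A.EMP_ID AND T.TASK = A.TASK
         AND CAST(T.TIMESTAMP AS DATE) = CAST(A.TIMESTAMP AS DATE)
    -- Use the same CAST expression here as in the window, so the grouping matches.
    GROUP BY T.EMP_ID, T.TASK, CAST(T.TIMESTAMP AS DATE)) AS [END]
    FROM (
        SELECT EMP_ID, TASK,
        -- Order by the timestamp itself; ordering only by the partition keys
        -- would leave it undefined which row receives RNO = 1.
        ROW_NUMBER() OVER (PARTITION BY EMP_ID, TASK, CAST(TIMESTAMP AS DATE)
        ORDER BY TIMESTAMP) AS RNO,
        TIMESTAMP FROM TABLENAME
    ) A
    WHERE A.RNO = 1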
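
Grouping by calendar date happens to fit the sample data, but it would merge two separate runs of the same task on the same day and split a run that crosses midnight. A possible alternative is the "gaps and islands" pattern: number the rows per employee and again per employee and task, and treat rows where the difference between the two numbers is constant as one run. The sketch below is only an illustration, not the original answer's method. It runs the query against an in-memory SQLite database from Python so it can be tried without a cluster; the table name `task_log`, the column name `ts`, and the helper `task_intervals` are hypothetical, and timestamps are assumed to be stored in a sortable `yyyy-MM-dd HH:mm` form (the `dd/MM/yyyy` strings from the question would need to be parsed first, for example with Hive's `unix_timestamp(string, pattern)`). The query itself uses only `ROW_NUMBER`, `MIN`, and `MAX`, which Hive also provides.

```python
import sqlite3

# Sample rows from the question, with timestamps rewritten as ISO-style strings
# (an assumption, so that plain string ordering matches chronological ordering).
ROWS = [
    (100, "A", "2023-04-15 02:01"),
    (100, "A", "2023-04-15 02:06"),
    (100, "A", "2023-04-15 02:17"),
    (100, "B", "2023-04-15 02:24"),
    (100, "B", "2023-04-15 02:34"),
    (100, "A", "2023-04-16 10:34"),
    (100, "A", "2023-04-16 10:36"),
    (100, "A", "2023-04-16 10:39"),
    (101, "A", "2023-04-16 20:34"),
    (101, "A", "2023-04-16 20:36"),
]

# Gaps and islands: within one employee, consecutive rows with the same task
# share a constant difference between the two row numbers computed below.
QUERY = """
SELECT emp_id, task, MIN(ts) AS start_ts, MAX(ts) AS end_ts
FROM (
    SELECT emp_id, task, ts,
           ROW_NUMBER() OVER (PARTITION BY emp_id ORDER BY ts)
         - ROW_NUMBER() OVER (PARTITION BY emp_id, task ORDER BY ts) AS grp
    FROM task_log
) t
GROUP BY emp_id, task, grp
ORDER BY emp_id, start_ts
"""


def task_intervals(rows):
    """Load rows into a temporary table and return one (emp, task, start, end) per run."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE task_log (emp_id INTEGER, task TEXT, ts TEXT)")
        conn.executemany("INSERT INTO task_log VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in task_intervals(ROWS):
        print(row)
```

Because the runs are identified by row order rather than by date, the same task performed twice on one day with another task in between comes out as two separate intervals.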

