Find total viewing time

h6my8fg2  posted on 2021-07-24  in Java

I need to get the total viewing time per user and episode for the sample data below.

user_id,episode_name,start_time,end_time
101,Movie1,1,10
101,Movie1,5,15
101,Movie1,18,23
102,Movie1,1,10
102,Movie1,5,15
102,Movie1,18,23

My approach was to treat the start time and end time as a range and then use explode, but it does not work.

Expected Solution:
user_id,episode_name,total_hours
101,Movie1,19
102,Movie1,19
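
As a sanity check on the expected figures, the range-and-explode idea described in the question can be sketched in plain Python. This is only an illustration, not the asker's actual code: the names `ROWS` and `total_by_unit_slots` are hypothetical, times are assumed to be integers, and each interval is assumed to be half-open (the end time itself is not counted).

```python
from collections import defaultdict

# Sample rows from the question: (user_id, episode_name, start_time, end_time)
ROWS = [
    (101, "Movie1", 1, 10),
    (101, "Movie1", 5, 15),
    (101, "Movie1", 18, 23),
    (102, "Movie1", 1, 10),
    (102, "Movie1", 5, 15),
    (102, "Movie1", 18, 23),
]


def total_by_unit_slots(rows):
    """Expand each interval into unit slots and count distinct slots per key."""
    slots = defaultdict(set)
    for user_id, episode, start, end in rows:
        # Assumption: half-open intervals, so (1, 10) covers slots 1..9.
        slots[(user_id, episode)].update(range(start, end))
    return {key: len(units) for key, units in slots.items()}


print(total_by_unit_slots(ROWS))
# {(101, 'Movie1'): 19, (102, 'Movie1'): 19}
```

If both endpoints were counted instead (closed ranges), the same data would give 21 rather than 19, which is one possible way an explode-based attempt can disagree with the expected output.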

d5vmydt9  #1

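This is a type of gaps-and-islands problem. I would suggest a cumulative-max approach to determine where each island starts. The idea is to identify the islands and then aggregate them: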

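-- one output row per island (a merged run of overlapping intervals)
select user_id, episode_name, min(start_time) as start_time, max(end_time) as end_time
from (select t.*,
             -- running count of island starts; a row starts a new island
             -- when no earlier row ends at or after its start_time
             sum(case when prev_end_time >= start_time then 0 else 1 end) over
                 (partition by user_id, episode_name order by start_time) as grp
      from (select t.*,
                   -- largest end_time among all earlier rows (null for the first row)
                   max(end_time) over (partition by user_id, episode_name
                                       order by start_time
                                       rows between unbounded preceding and 1 preceding
                                      ) as prev_end_time
            from t
           ) t
     ) t
group by user_id, episode_name, grp;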
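
The island detection in the query above can also be traced step by step outside SQL. The following is a minimal Python sketch for a single user/episode partition; the function name `merge_islands` is hypothetical, and `running_max_end` plays the role of `prev_end_time` in the query.

```python
def merge_islands(intervals):
    """Merge overlapping (start, end) pairs using a running maximum of end times."""
    islands = []
    running_max_end = None  # largest end seen so far, like prev_end_time
    for start, end in sorted(intervals):
        if running_max_end is None or running_max_end < start:
            # No earlier interval reaches this start: a new island begins.
            islands.append([start, end])
            running_max_end = end
        else:
            # Overlaps (or touches) the current island: extend it if needed.
            islands[-1][1] = max(islands[-1][1], end)
            running_max_end = max(running_max_end, end)
    return [tuple(island) for island in islands]


islands = merge_islands([(1, 10), (5, 15), (18, 23)])
print(islands)                                      # [(1, 15), (18, 23)]
print(sum(end - start for start, end in islands))   # 19
```

Using the running maximum rather than just the previous row's end time matters when a long interval fully contains later, shorter ones, for example (1, 20) followed by (2, 3) and (10, 12).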

To get the total, add one more level of aggregation:

with cte as (
      select user_id, episode_name, min(start_time) as start_time, max(end_time) as end_time
      from (select t.*,
                   sum(case when prev_end_time >= start_time then 0 else 1 end) over
                       (partition by user_id, episode_name order by start_time) as grp
            from (select t.*,
                         max(end_time) over (partition by user_id, episode_name
                                             order by start_time
                                             rows between unbounded preceding and 1 preceding
                                            ) as prev_end_time
                  from t
                 ) t
           ) t
      group by user_id, episode_name, grp
     )
select user_id, episode_name, sum(end_time - start_time) as total_hours
from cte
group by user_id, episode_name;
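
To check the final query against the sample data, it can be run in an in-memory SQLite database from Python. This is a sketch under assumptions: the thread does not say which SQL engine is in use, SQLite 3.25 or later is needed for window functions, and the `total_hours` alias and the `order by` are added here only to make the output easy to compare.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (user_id int, episode_name text, start_time int, end_time int)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        (101, "Movie1", 1, 10), (101, "Movie1", 5, 15), (101, "Movie1", 18, 23),
        (102, "Movie1", 1, 10), (102, "Movie1", 5, 15), (102, "Movie1", 18, 23),
    ],
)

QUERY = """
with cte as (
      select user_id, episode_name, min(start_time) as start_time, max(end_time) as end_time
      from (select t.*,
                   sum(case when prev_end_time >= start_time then 0 else 1 end) over
                       (partition by user_id, episode_name order by start_time) as grp
            from (select t.*,
                         max(end_time) over (partition by user_id, episode_name
                                             order by start_time
                                             rows between unbounded preceding and 1 preceding
                                            ) as prev_end_time
                  from t
                 ) t
           ) t
      group by user_id, episode_name, grp
     )
select user_id, episode_name, sum(end_time - start_time) as total_hours
from cte
group by user_id, episode_name
order by user_id
"""

result = conn.execute(QUERY).fetchall()
print(result)  # [(101, 'Movie1', 19), (102, 'Movie1', 19)]
```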
