sql—重新编写此查询时需要指导吗

polkgigr  于 2021-06-24  发布在  Hive
关注(0)|答案(1)|浏览(275)

我们运行这个查询来生成日历周数据,这个查询两次命中同一个视图。由于缺少join-on子句,可能会创建笛卡尔积。
是否有任何方法可以以最佳方式重新编写此查询。

SELECT cal_date,
       regexp_replace(cal_date, '-', '') AS PC_cal_date,
       year_num*100+week_num AS year_week_num,
       CASE
           WHEN year_num*100+pd_num IN (min_year_pd_num, max_year_pd_num) THEN 'A'
           ELSE 'B'
       END AS yr_pd_ind,
       year_num*100+pd_num AS yr_pd_num,
       dense_rank() OVER (ORDER BY year_num*100+week_num DESC) AS wk_index,
                         dense_rank() OVER (ORDER BY year_num*100+pd_num DESC) AS pd_index
FROM mstr_v.local_cal_date t1,

  (SELECT max(year_num*100+pd_num) max_year_pd_num,
          min(year_num*100+pd_num) min_year_pd_num
   FROM mstr_v.local_cal_date
   WHERE cal_date IN (date(date_sub(CURRENT_DATE, cast(date_format(CURRENT_DATE, 'u') AS int)+105*7+1)),
                      date(date_sub(CURRENT_DATE, cast(date_format(CURRENT_DATE, 'u') AS int)))) ) t2
WHERE cal_date BETWEEN date(date_sub(CURRENT_DATE, cast(date_format(CURRENT_DATE, 'u') AS int)+105*7)) 
AND date(date_sub(CURRENT_DATE, cast(date_format(CURRENT_DATE, 'u') AS int)+1))
wh6knrhe

wh6knrhe1#

如果将当前在t2子查询中的where子句移到case语句中,并使用over()计算min和max,则t2子查询中计算的列可以在不进行第二次表扫描的情况下进行计算(在同一子查询中):

SELECT cal_date,
       regexp_replace(cal_date, '-', '') AS PC_cal_date,
       year_num*100+week_num AS year_week_num,
       CASE
           WHEN year_num*100+pd_num IN (min_year_pd_num, max_year_pd_num) THEN 'A'
           ELSE 'B'
       END AS yr_pd_ind,
       year_num*100+pd_num AS yr_pd_num,
       dense_rank() OVER (ORDER BY year_num*100+week_num DESC) AS wk_index,
                         dense_rank() OVER (ORDER BY year_num*100+pd_num DESC) AS pd_index
FROM (select t1.*,            
             max(case when cal_date IN (date(date_sub(CURRENT_DATE, cast(date_format(CURRENT_DATE, 'u') AS int)+105*7+1)),
                                        date(date_sub(CURRENT_DATE, cast(date_format(CURRENT_DATE, 'u') AS int))))        
                      then  year_num*100+pd_num end) over() as max_year_pd_num,
             min(case when cal_date IN (date(date_sub(CURRENT_DATE, cast(date_format(CURRENT_DATE, 'u') AS int)+105*7+1)),
                                        date(date_sub(CURRENT_DATE, cast(date_format(CURRENT_DATE, 'u') AS int)))) 
                      then year_num*100+pd_num end) over() as min_year_pd_num
      from mstr_v.local_cal_date t1
)t1
WHERE cal_date BETWEEN date(date_sub(CURRENT_DATE, cast(date_format(CURRENT_DATE, 'u') AS int)+105*7)) 
AND date(date_sub(CURRENT_DATE, cast(date_format(CURRENT_DATE, 'u') AS int)+1))

相关问题