Hive TPC-DS query 30: "Only SubQuery expressions that are top level conjuncts are allowed"

e5nszbig  posted on 2021-06-24  in Hive

I get the error above when I try to run TPC-DS query 30 in Hive. From what I have read, Hive does not allow this kind of subquery, so I would like to know how to rewrite the query. I took it directly from this site: http://www.tpc.org/tpcds/default5.asp

Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Unsupported SubQuery Expression 'ctr_state': Only SubQuery expressions that are top level conjuncts are allowed

Query 30:

with customer_total_return as
 (select wr_returning_customer_sk as ctr_customer_sk
        ,ca_state as ctr_state, 
   sum(wr_return_amt) as ctr_total_return
 from web_returns
     ,date_dim
     ,customer_address
 where wr_returned_date_sk = d_date_sk 
   and d_year =2000
   and wr_returning_addr_sk = ca_address_sk 
 group by wr_returning_customer_sk
         ,ca_state)
  select  c_customer_id,c_salutation,c_first_name,c_last_name,c_preferred_cust_flag
       ,c_birth_day,c_birth_month,c_birth_year,c_birth_country,c_login,c_email_address
       ,c_last_review_date_sk,ctr_total_return
 from customer_total_return ctr1
     ,customer_address
     ,customer
 where ctr1.ctr_total_return > (select avg(ctr_total_return)*1.2
        from customer_total_return ctr2 
                      where ctr1.ctr_state = ctr2.ctr_state)
       and ca_address_sk = c_current_addr_sk
       and ca_state = 'GA'
       and ctr1.ctr_customer_sk = c_customer_sk
 order by c_customer_id,c_salutation,c_first_name,c_last_name,c_preferred_cust_flag
                  ,c_birth_day,c_birth_month,c_birth_year,c_birth_country,c_login,c_email_address
                  ,c_last_review_date_sk,ctr_total_return
limit 100;

Update

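When you generate the queries with the TPC-DS kit, query 30 may contain a typo: the column c_last_review_date_sk does not exist in the customer table, so you need to change it to c_last_review_date.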
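Which of the two column names applies depends on how the tables were created, so it can be worth checking the schema before editing the query. A minimal sketch, assuming the table is named customer in the current database:

```sql
-- Lists the columns of the customer table; look for
-- c_last_review_date or c_last_review_date_sk in the output.
describe customer;
```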


Answer #1 (vnzz0bqm)

Calculate avg(ctr_total_return) inside the customer_total_return subquery using an analytic function, and remove the correlated subquery from the WHERE clause:

with customer_total_return as
(
select ctr_customer_sk, ctr_state, ctr_total_return,
       avg(ctr_total_return) over(partition by ctr_state ) as ctr_state_avg
from
 (select wr_returning_customer_sk as ctr_customer_sk
        ,ca_state as ctr_state, 
   sum(wr_return_amt) as ctr_total_return
 from web_returns
     ,date_dim
     ,customer_address
 where wr_returned_date_sk = d_date_sk 
   and d_year =2000
   and wr_returning_addr_sk = ca_address_sk 
 group by wr_returning_customer_sk
         ,ca_state
) s
)

  select  c_customer_id,c_salutation,c_first_name,c_last_name,c_preferred_cust_flag
       ,c_birth_day,c_birth_month,c_birth_year,c_birth_country,c_login,c_email_address
       ,c_last_review_date_sk,ctr_total_return
 from customer_total_return ctr1
     ,customer_address
     ,customer
 where ctr1.ctr_total_return > ctr1.ctr_state_avg*1.2
       and ca_address_sk = c_current_addr_sk
       and ca_state = 'GA'
       and ctr1.ctr_customer_sk = c_customer_sk
 order by c_customer_id,c_salutation,c_first_name,c_last_name,c_preferred_cust_flag
                  ,c_birth_day,c_birth_month,c_birth_year,c_birth_country,c_login,c_email_address
                  ,c_last_review_date_sk,ctr_total_return
limit 100;
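Another possible rewrite, shown here only as a sketch and not part of the answer above, is to compute the per-state average in a second CTE and join to it. The names state_avg, sa and ctr_state_threshold are made up for illustration; everything else is taken from the original query.

```sql
-- Sketch: replace the correlated scalar subquery with a join
-- against a per-state aggregate (state_avg is an illustrative name).
with customer_total_return as
 (select wr_returning_customer_sk as ctr_customer_sk
        ,ca_state as ctr_state
        ,sum(wr_return_amt) as ctr_total_return
  from web_returns
  join date_dim on wr_returned_date_sk = d_date_sk
  join customer_address on wr_returning_addr_sk = ca_address_sk
  where d_year = 2000
  group by wr_returning_customer_sk
          ,ca_state),
state_avg as
 (select ctr_state
        ,avg(ctr_total_return) * 1.2 as ctr_state_threshold
  from customer_total_return
  group by ctr_state)
select c_customer_id,c_salutation,c_first_name,c_last_name,c_preferred_cust_flag
      ,c_birth_day,c_birth_month,c_birth_year,c_birth_country,c_login,c_email_address
      -- per the update in the question, some schemas name this column c_last_review_date
      ,c_last_review_date_sk,ctr_total_return
from customer_total_return ctr1
join state_avg sa on ctr1.ctr_state = sa.ctr_state
join customer on ctr1.ctr_customer_sk = c_customer_sk
join customer_address on ca_address_sk = c_current_addr_sk
where ctr1.ctr_total_return > sa.ctr_state_threshold
  and ca_state = 'GA'
order by c_customer_id,c_salutation,c_first_name,c_last_name,c_preferred_cust_flag
        ,c_birth_day,c_birth_month,c_birth_year,c_birth_country,c_login,c_email_address
        ,c_last_review_date_sk,ctr_total_return
limit 100;
```

One difference to keep in mind: if ctr_state can be NULL, the equality join drops those rows, as the original correlated subquery effectively does, whereas partition by ctr_state puts all NULL states into one partition and may keep some of them. Whether that matters depends on the generated data.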
