Need help with a Hive query

l7wslrjt  posted on 2021-06-26  in Hive
hive> describe hivecustomers;
OK
cust_id                 int                                         
cust_fname              string                                      
cust_lname              string                                      
cust_email              string                                      
cust_password           string                                      
cust_street             string                                      
cust_city               string                                      
cust_state              string                                      
cust_zipcode            string   

hive> describe hiveorders;
OK
ord_id                  int                                         
ord_dt                  string                                      
ord_cust_id             int                                         
ord_stat                string      

hive> select * from hiveorders limit 3; 

OK 
1 2013-07-25 00:00:00.0 11599 CLOSED 
2 2013-07-25 00:00:00.0 256 PENDING_PAYMENT 
3 2013-07-25 00:00:00.0 12111 COMPLETE 

hive> select * from hivecustomers limit 3; 
OK 
1 Richard Hernandez XXXXXXXXX XXXXXXXXX 6303 Heather Plaza Brownsville TX 78521 
2 Mary Barrett XXXXXXXXX XXXXXXXXX 9526 Noble Embers Ridge Littleton CO 80126 
3 Ann Smith XXXXXXXXX XXXXXXXXX 3422 Blue Pioneer Bend Caguas PR 00725

Based on the two tables above, I need output like the following. How can I write a Hive query that produces it?

+-----------+---------------+---------------+--------------+-----+ 
|Cust Name  | Cust Address  | Total Orders  | Order Status |Count|      
+-----------+---------------+---------------+--------------+-----+ 
|Andrew     |London         |15             |Complete      |8    |  
|Andrew     |London         |15             |Pending       |3    | 
|Andrew     |London         |15             |Processing    |4    | 
|Andrew     |London         |15             |On-Hold       |1    |
+-----------+---------------+---------------+--------------+-----+
holgip5t


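I used your data as CTEs (common table expressions) and tested the query against them. Customer 1 has two PENDING_PAYMENT orders, 4 orders in total; every other customer has one order per status, 3 orders in total each: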

with hivecustomers  as
(
select 1 as cust_id , 'Richard'  as cust_fname union all
select 2 as cust_id , 'Mary'     as cust_fname union all
select 3 as cust_id , 'Ann'      as cust_fname
),
hiveorders as
(
select 1 as ord_id, '2013-07-25 00:00:00.0' as ord_dt, 1 as ord_cust_id, 'CLOSED'          as ord_stat union all
select 2 as ord_id, '2013-07-25 00:00:00.0' as ord_dt, 1 as ord_cust_id, 'PENDING_PAYMENT' as ord_stat union all
select 3 as ord_id, '2013-07-25 00:00:00.0' as ord_dt, 1 as ord_cust_id, 'COMPLETE'        as ord_stat union all
select 4 as ord_id, '2013-07-25 00:00:00.0' as ord_dt, 1 as ord_cust_id, 'PENDING_PAYMENT'        as ord_stat union all
select 21 as ord_id, '2013-07-25 00:00:00.0' as ord_dt, 2 as ord_cust_id, 'CLOSED'          as ord_stat union all
select 22 as ord_id, '2013-07-25 00:00:00.0' as ord_dt, 2 as ord_cust_id, 'PENDING_PAYMENT' as ord_stat union all
select 23 as ord_id, '2013-07-25 00:00:00.0' as ord_dt, 2 as ord_cust_id, 'COMPLETE'        as ord_stat union all
select 31 as ord_id, '2013-07-25 00:00:00.0' as ord_dt, 3 as ord_cust_id, 'CLOSED'          as ord_stat union all
select 32 as ord_id, '2013-07-25 00:00:00.0' as ord_dt, 3 as ord_cust_id, 'PENDING_PAYMENT' as ord_stat union all
select 33 as ord_id, '2013-07-25 00:00:00.0' as ord_dt, 3 as ord_cust_id, 'COMPLETE'        as ord_stat
) 
select cust_fname, total_orders, order_status, count(*) Count 
from
(
select c.cust_fname, 
       count(*) over(partition by c.cust_id) as total_orders,
       o.ord_stat as Order_Status
  from hivecustomers c left join hiveorders o on c.cust_id=o.ord_cust_id
 )s
  group by cust_fname, total_orders, order_status

Output:

OK
cust_fname      total_orders    order_status    count
Ann     3       CLOSED  1
Ann     3       COMPLETE        1
Ann     3       PENDING_PAYMENT 1
Mary    3       CLOSED  1
Mary    3       COMPLETE        1
Mary    3       PENDING_PAYMENT 1
Richard 4       CLOSED  1
Richard 4       COMPLETE        1
Richard 4       PENDING_PAYMENT 2
Time taken: 40.692 seconds, Fetched: 9 row(s)

Just remove the CTEs and run the query against your normal tables.
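
A minimal sketch of what that could look like against the real `hivecustomers` and `hiveorders` tables, extended with the name and address columns from the requested output. It has not been run on the actual data. The column names `cust_name`, `cust_address` and `cnt`, and the way the address is assembled with `concat_ws`, are assumptions for illustration. Grouping additionally by `cust_id` is also an addition here, so that two customers who share a name are not merged into one row.

```sql
-- Same approach as the tested query above, without the test CTEs.
-- cust_name, cust_address and cnt are illustrative aliases (assumptions).
select cust_name, cust_address, total_orders, order_status, count(*) as cnt
from
(
select c.cust_id,
       concat_ws(' ', c.cust_fname, c.cust_lname) as cust_name,
       concat_ws(', ', c.cust_street, c.cust_city,
                       c.cust_state, c.cust_zipcode) as cust_address,
       -- total number of order rows per customer
       count(*) over(partition by c.cust_id) as total_orders,
       o.ord_stat as order_status
  from hivecustomers c
       left join hiveorders o on c.cust_id = o.ord_cust_id
) s
group by cust_id, cust_name, cust_address, total_orders, order_status;
```

Because of the `left join`, a customer with no orders still yields one row, with a NULL `order_status` and a total of 1. If such customers should not appear at all, an inner join is probably the simpler choice.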
