Hive: How do I use GROUP BY in SQL and aggregate the count of each value in a column (category or item type)?

lnvxswe2 posted on 2023-06-29 in Hive

I have a table like this:
| item |
| ------ |
| dress |
| dress |
| pants |
| undies |
| undies |
| dress |
I want a table that counts the number of transactions each customer_num made under each item type:
| dress | pants | undies |
| ------ | ------ | ------ |
| 1 | 1 | 1 |
| 2 | 0 | 0 |
| 0 | 0 | 1 |
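So far I have used GROUP BY to count the number of transactions per customer. But this only gives me the total number of transactions for each customer_num, not broken down by item type: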

SELECT customer_num, count(*) from table GROUP BY customer_num

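I also tried using multiple column names in the GROUP BY, but that just outputs a count for each customer_num and item pair. How can I produce output like the example above, where the columns are the items and the rows are the unique customer_num values?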

67up9zun  #1

I recommend conditional aggregation:

select customer_num,
       sum(case when item_type = 'dress' then 1 else 0 end) as dress,
       sum(case when item_type = 'pants' then 1 else 0 end) as pants,
       sum(case when item_type = 'undies' then 1 else 0 end) as undies
from t
group by customer_num;
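
To sanity-check the conditional aggregation above, here is a minimal sketch that runs the same query from Python against an in-memory SQLite database, used purely as a stand-in for Hive. The customer_num values in the sample rows are assumptions made for illustration (the question's table does not show them); they are chosen so that the result lines up with the desired output.

```python
import sqlite3

# Hypothetical sample data: item order follows the question's table,
# but the customer_num values are assumed for illustration only.
rows = [
    (1, "dress"),
    (2, "dress"),
    (1, "pants"),
    (3, "undies"),
    (1, "undies"),
    (2, "dress"),
]

# SQLite stands in for Hive here; the query uses only standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer_num INTEGER, item_type TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# One SUM(CASE ...) per item type turns row values into columns.
query = """
SELECT customer_num,
       SUM(CASE WHEN item_type = 'dress'  THEN 1 ELSE 0 END) AS dress,
       SUM(CASE WHEN item_type = 'pants'  THEN 1 ELSE 0 END) AS pants,
       SUM(CASE WHEN item_type = 'undies' THEN 1 ELSE 0 END) AS undies
FROM t
GROUP BY customer_num
ORDER BY customer_num
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

Each output row is (customer_num, dress, pants, undies). Customers with no purchases of a given type still get a 0 in that column, because every row of t contributes to its customer's group regardless of item_type.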

2w3kk1z5  #2

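You can simply pivot the table as follows (this relies on the PIVOT clause, which Oracle, SQL Server and Spark SQL provide; Hive itself may not accept it):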

SELECT * FROM
(
SELECT customer_num,item from table
)
PIVOT
(
COUNT(item)
FOR item in ('dress','pants','undies')
)
ORDER BY customer_num;

af7jpaap  #3

Use subqueries and join them on customer_num:

SELECT  
    customer_num,dress, pant, ...
FROM
    (SELECT 
         customer_num, COUNT(*) AS dress 
     FROM 
         table 
     WHERE 
         item_type = 'dress' 
     GROUP BY 
         customer_num) r1
JOIN 
    (SELECT 
         customer_num, COUNT(*) AS pant 
     FROM 
         table  
     WHERE 
         item_type = 'pants'
     GROUP BY 
         customer_num) r2 USING (customer_num)
JOIN ...
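
One caveat with the join approach is that a plain (inner) JOIN keeps only customers who appear in every subquery, so a customer with no purchases of some item type drops out instead of showing a 0. The sketch below illustrates this from Python with an in-memory SQLite database as a stand-in for Hive, and shows one possible workaround: start from the distinct customer_num values, LEFT JOIN each per-type count, and wrap the counts in COALESCE. The sample customer_num values and the table name t are assumptions for illustration, and only two item types are joined to keep it short.

```python
import sqlite3

# Hypothetical sample data; customer_num values are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer_num INTEGER, item_type TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "dress"), (2, "dress"), (1, "pants"),
     (3, "undies"), (1, "undies"), (2, "dress")],
)

# Per-type count subqueries, reused by both variants below.
dress = ("(SELECT customer_num, COUNT(*) AS dress FROM t "
         "WHERE item_type = 'dress' GROUP BY customer_num)")
pants = ("(SELECT customer_num, COUNT(*) AS pants FROM t "
         "WHERE item_type = 'pants' GROUP BY customer_num)")

# Variant 1: inner joins, as in the answer. Customers missing from any
# subquery are dropped from the result.
inner_rows = conn.execute(f"""
    SELECT customer_num, dress, pants
    FROM {dress} r1
    JOIN {pants} r2 USING (customer_num)
    ORDER BY customer_num
""").fetchall()

# Variant 2: left joins from the full customer list, with missing counts
# replaced by 0.
outer_rows = conn.execute(f"""
    SELECT c.customer_num,
           COALESCE(r1.dress, 0) AS dress,
           COALESCE(r2.pants, 0) AS pants
    FROM (SELECT DISTINCT customer_num FROM t) c
    LEFT JOIN {dress} r1 ON r1.customer_num = c.customer_num
    LEFT JOIN {pants} r2 ON r2.customer_num = c.customer_num
    ORDER BY c.customer_num
""").fetchall()

print(inner_rows)
print(outer_rows)
```

With this sample data, the inner-join variant returns only the customer who bought both item types, while the left-join variant returns one row per customer. Either way, this approach scans the table once per item type, which is one reason conditional aggregation is usually the simpler choice.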
