Sort an array by the order of a different column in Hive

ogsagwnx  posted on 2021-05-29  in  Hadoop

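I have two columns, one of products and one of the dates they were purchased. I can order the dates by applying the sort_array(dates) function, but I want the collected array of products to be sorted by purchase date as well. Is there a way to do this in Hive?
The table is: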

ClientID    Product    Date
100    Shampoo    2016-01-02
101    Book    2016-02-04
100    Conditioner    2015-12-31
101    Bookmark    2016-07-10
100    Cream    2016-02-12
101    Book2    2016-01-03

The following query then gives one row per client:

select
clientID,
COLLECT_LIST(Product) as Prod_List,
sort_array(COLLECT_LIST(date)) as Date_Order
from tablename
group by clientID;

which returns:

ClientID    Prod_List    Date_Order
100    ["Shampoo","Conditioner","Cream"]    ["2015-12-31","2016-01-02","2016-02-12"]
101    ["Book","Bookmark","Book2"]    ["2016-01-03","2016-02-04","2016-07-10"]

But what I want is for the products to appear in the order in which they were actually purchased.

Answer #1 (js81xvg6)

It is possible using only built-in functions, but it is not a pretty sight :-)

select      clientid
           ,split(regexp_replace(concat_ws(',',sort_array(collect_list(concat_ws(':',cast(date as string),product)))),'[^:]*:([^,]*(,|$))','$1'),',') as prod_list
           ,sort_array(collect_list(date)) as date_order

from        tablename 

group by    clientid
;
+----------+-----------------------------------+------------------------------------------+
| clientid |             prod_list             |                date_order                |
+----------+-----------------------------------+------------------------------------------+
|      100 | ["Conditioner","Shampoo","Cream"] | ["2015-12-31","2016-01-02","2016-02-12"] |
|      101 | ["Book2","Book","Bookmark"]       | ["2016-01-03","2016-02-04","2016-07-10"] |
+----------+-----------------------------------+------------------------------------------+
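
The one-liner is easier to follow when unrolled into nested subqueries. The sketch below uses exactly the same built-in functions as the query above and should return the same prod_list. The aliases tagged, joined, t and s are made up here for illustration. It assumes that product names contain no commas and that the date casts to a zero-padded yyyy-MM-dd string, so that sorting the strings also sorts them chronologically. On some Hive versions the column name date may need to be quoted with backticks.

```sql
-- Step 3: strip the "date:" prefix from every element, then split back into an array
select  clientid
       ,split(regexp_replace(joined, '[^:]*:([^,]*(,|$))', '$1'), ',') as prod_list
from   (
        -- Step 2: sort the "date:product" strings (the date prefix drives the order)
        --         and join them into one comma-separated string
        select  clientid
               ,concat_ws(',', sort_array(collect_list(tagged))) as joined
        from   (
                -- Step 1: prefix every product with its purchase date
                select  clientid
                       ,concat_ws(':', cast(date as string), product) as tagged
                from    tablename
               ) t
        group by clientid
       ) s
;
```

For client 100, step 2 yields the string 2015-12-31:Conditioner,2016-01-02:Shampoo,2016-02-12:Cream. The regular expression then removes everything up to and including the first colon of each element, leaving Conditioner,Shampoo,Cream for split to turn into an array.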
