结构列配置单元上的聚合

vmdwslir  于 2021-06-01  发布在  Hadoop
关注(0)|答案(1)|浏览(335)

我有一个struct数组,我试图找到count,sum,struct列的不同值。

create table temp (regionkey smallint, name string, comment string, nations array<struct<n_nationkey:smallint,n_name:string,n_comment:string>>) 
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY '|' 
MAP KEYS TERMINATED BY ',';

当我尝试运行查询时

select name, 
count(nations.n_nationkey) as count, 
sum(nations.n_nationkey) as sum, 
ndv(nations.n_nationkey) as distinct_val 
from temp 
group by name 
order by name;

我得到了错误

FAILED: UDFArgumentTypeException Only primitive type arguments are accepted but array<smallint> is passed.

我要做的是找到nèu nationkey的count、sum和distinct值。
任何帮助都将不胜感激。

q9rjltbz

q9rjltbz1#

select      t.name 
           ,count   (e.col.n_nationkey)             as count 
           ,sum     (e.col.n_nationkey)             as sum
           ,count   (distinct e.col.n_nationkey)    as distinct_val 

from        temp t lateral view explode (t.nations) e

group by    t.name 

order by    t.name
;

为了这次行动

使用别名的相同解决方案。 nations 不是结构。它是一个结构数组。
它没有一个 n_nationkey 属性。它的结构元素 n_nationkey 属性。
这个 explode 函数接受一个结构数组( nations )并返回每个结构( nation )在另一行。

select      t.name 
           ,count   (e.nation.n_nationkey)             as count 
           ,sum     (e.nation.n_nationkey)             as sum
           ,count   (distinct e.nation.n_nationkey)    as distinct_val 

from        temp t lateral view explode (t.nations) e as nation

group by    t.name 

order by    t.name
;

相关问题