(impala)在字段中选择最常用的值会导致“在选择列表中不支持子查询”

z0qdvdin  于 2021-06-26  发布在  Impala
关注(0)|答案(1)|浏览(515)

我正在尝试做一个聚合,它采用组中最常见的值,如下所示:

with t1 as (
    select
        id
        , colA
        , colB
    from some_Table
)
select 
    id
    , count(*) as total
    , max(colA) as maxColA
    , most_common(colB) -- this is what I'm trying to achieve
from t1
group by id

这就是我所尝试的:

with t1 as (
    select
        id
        , colA
        , colB
    from some_Table
)
select 
    id
    , count(*) as total
    , max(colA) as maxColA
    , (select colB, count(colB) as counts from t1 group by colB order by counts desc limit 1) as most_freq_colB_per_id
from t1
group by id

然而,它告诉我 AnalysisException: Subqueries are not supported in the select list . 我还能怎么做?

8xiog9wr

8xiog9wr1#

据我所知,impala并没有一个内置的聚合函数来计算模式(您试图计算的统计名称)。
您可以使用两个级别的聚合。你的cte什么都没做,所以你可以做:

select id, sum(total) as total, max(maxColA) as maxColA,
       max(case when seqnum = 1 then colB end) as mode
from (select id, colB, count(*) as total, max(colA) as maxColA,
             row_number() over (partition by id order by count(*) desc) as seqnum
      from sometable
      group by id, colb
     ) t
group by id;

相关问题