Sum a set of values in a column and collapse rows based on a condition in BigQuery

Posted by toiithl6 on 2021-08-13 in Java

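I have a query that transforms a table in BigQuery. In this table, certain rows need to be rolled up across two columns based on a specific condition.

I need to collapse the rows where Occ = 'S1' and MT = 'English' so that each combination of code, UR and gen is unique. The pct values should be summed into a single row, as shown below: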


Answer 1 (yv5phkfx)

Below is a solution for BigQuery Standard SQL:


#standardSQL
SELECT code, UR, gen, Occ, MT, SUM(pct) AS pct
FROM (
  -- Relabel every row of a flagged group as S1 / English
  SELECT * REPLACE(IF(collapse, 'S1', Occ) AS Occ, IF(collapse, 'English', MT) AS MT)
  FROM (
    -- Flag each (code, UR, gen) group that contains at least one S1 / English row
    SELECT *,
      COUNTIF(Occ = 'S1' AND MT = 'English') OVER(PARTITION BY code, UR, gen) > 0 AS collapse
    FROM `project.dataset.table`
  )
)
GROUP BY code, UR, gen, Occ, MT

When applied to the sample data from the question, the result is:

Row code    UR      gen Occ MT      pct  
1   128     Rural   F   D2  Ao      0.77     
2   128     Rural   F   D3  Ao      0.38     
3   128     Rural   M   D4  Ao      0.89     
4   128     Rural   M   D5  Ao      0.02     
5   128     Urban   F   S1  English 0.63     
6   128     Urban   M   S1  English 0.63
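
The two steps of the query above (flag the groups with a window count, then relabel and aggregate) can be traced in plain Python. This is only a minimal sketch for following the logic by hand: the input rows are hypothetical and are not the question's data, and `collapse_rows` is an illustrative name, not part of any BigQuery API.

```python
from collections import defaultdict

# Hypothetical sample rows (NOT the question's data):
# (code, UR, gen, Occ, MT, pct)
rows = [
    (128, "Rural", "F", "D2", "Ao", 0.77),
    (128, "Rural", "F", "D3", "Ao", 0.38),
    (128, "Urban", "F", "S1", "English", 0.50),
    (128, "Urban", "F", "D1", "Ao", 0.13),
    (128, "Urban", "M", "S1", "English", 0.63),
]


def collapse_rows(rows):
    """Emulate the flag-then-aggregate query on a list of tuples."""
    # Step 1 (COUNTIF ... OVER (PARTITION BY code, UR, gen) > 0):
    # collect the groups that contain at least one S1 / English row.
    flagged = {
        (code, ur, gen)
        for code, ur, gen, occ, mt, _ in rows
        if occ == "S1" and mt == "English"
    }

    # Step 2 (SELECT * REPLACE ... then GROUP BY):
    # relabel every row of a flagged group, then sum pct per key.
    totals = defaultdict(float)
    for code, ur, gen, occ, mt, pct in rows:
        if (code, ur, gen) in flagged:
            occ, mt = "S1", "English"
        totals[(code, ur, gen, occ, mt)] += pct

    # Round only to hide floating-point noise in the sums.
    return {key: round(total, 2) for key, total in totals.items()}


result = collapse_rows(rows)
for key, total in sorted(result.items()):
    print(*key, total)
```

In this sketch the Rural rows are left untouched because their group has no S1 / English row, while both Urban / F rows fold into one S1 / English row whose pct is the sum of the two.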

Answer 2 (46qrfjad)

If I understand correctly, you can combine aggregation with some conditional logic:

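select code, ur, gen,
       -- countif() returns 0 rather than null when nothing matches, so test for 0
       coalesce((case when cnt_s1 = 0 then occ end), 'S1') as occ,
       mt,
       sum(pct) as pct
from (select t.*,
             countif( occ = 'S1' ) over (partition by code, ur, gen, mt) as cnt_s1
      from t
     ) t
-- group by select-list position so the recomputed occ is used, not the source column
group by 1, 2, 3, 4, 5;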
