按bigquery中列段的最大值返回列名

gudnpqoy  于 2021-07-24  发布在  Java
关注(0)|答案(3)|浏览(559)

我的查询应该根据列的最大值返回列名。我试着用case来解决这个案子。但不知怎的,出现了这个错误,可能是因为我返回的是列的名称,而不是列本身的值:

No matching signature for operator CASE; all THEN/ELSE arguments must be coercible to a common type but found: INT64, STRING; actual argument types (WHEN THEN) ELSE: (BOOL STRING) (BOOL INT64) INT64 at [3:5]

我的代码是:

SELECT
ID,
    CASE
        WHEN col1 >= col2 AND col1 >= col3 AND col1 >= col4 AND col1 >= col5 THEN 'col1 '
        WHEN col2 >= col1 AND col2 >= col3 AND col2 >= col4 AND col2 >= col5 THEN 'col2 '
        ELSE 'col1'                                
    END AS Max_Column_Name
FROM table

示例输入为:

有没有办法让这个查询更简单?所以如果有很多列,就不需要重复when..case。另一个倡议是像本文一样使用greatest(col1,col2,col3),但我不知道如何在标准sql bigquery中使用它。
预期产量:
返回对每个段(国家/地区、产品、语言)具有最大值的列的名称,并重命名结果表中列的名称。

wkftcu5l

wkftcu5l1#

假设你没有 NULL 值,可以使用 greatest() :

select (case greatest(country_uk, country_us)
            when country_uk then 'uk' when country_us then 'us'
        end), 
       . . .

也可以使用数组:

select (select el.what
        from unnest(array['uk' as what, country_uk as val), ('us', country_us)]) el
        order by el.val desc
        limit 1
       ) as country
bweufnob

bweufnob2#

下面是bigquery标准sql
它在某种程度上是非常通用的-如果您有更多或不同的段和它们的名称-您只需要在大多数外部选择中反映它-在像这样的行中( MAX(IF(segment = 'SegmentName', winner, NULL)) AS SegmentName ),其余部分在内部查询中处理


# standardSQL

SELECT id,
  MAX(IF(segment = 'Country', winner, NULL)) AS Country,
  MAX(IF(segment = 'Product', winner, NULL)) AS Product,
  MAX(IF(segment = 'Lang', winner, NULL)) AS Lang
FROM (
  SELECT id, segment, ARRAY_AGG(item ORDER BY col_value DESC LIMIT 1)[OFFSET(0)] winner
  FROM `project.dataset.table` t, 
  UNNEST(ARRAY(
      SELECT AS STRUCT segment, item, SAFE_CAST(col_value AS INT64) AS col_value
      FROM UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(t), r'"\w+":[^,}]+')) kv,
      UNNEST([STRUCT(SPLIT(kv,':')[OFFSET(0)] AS col_name, SPLIT(kv,':')[OFFSET(1)] AS col_value)]) nv,
      UNNEST([STRUCT(SPLIT(TRIM(col_name, '"'), '_')[OFFSET(0)] AS segment, SPLIT(TRIM(col_name, '"'), '_')[OFFSET(1)] AS item)])
      WHERE TRIM(col_name, '"') != 'id'
    )) kv
  GROUP BY id, segment
)
GROUP BY id

如果要应用于样本数据,如您的问题-输出是

Row id  Country Product Lang     
1   abc US      A       EN   
2   def UK      B       PH

注意:最近引入的execute immediate将允许您进一步概括上述解决方案,因此您甚至不必担心最外层select中的行—我将把它留给您,因为它完全超出了主要问题的范围

00jrzges

00jrzges3#

SELECT TOP 1 T.Col
FROM 
(
   SELECT Country_UK AS colMax,'Country_UK' Col FROM table 
    UNION ALL              
   SELECT Country_US AS colMax,'Country_US' Col FROM table 
    UNION ALL              
   SELECT Product_A AS colMax,'Product_A' Col FROM table 
    UNION ALL             
   SELECT Product_B AS colMax,'Product_B' Col FROM table 
    UNION ALL             
   SELECT Lang_EN AS colMax,'Lang_EN' Col FROM table 
) AS T 
GROUP BY T.Col 
Order by MAX(T.colMax) DESC

相关问题