我有一张这样的table:
word nb_by_date.date nb_by_date.nb
---------------------------------------
abc 2020-01-01 17
2020-01-06 43
abc 2020-01-01 33
2020-01-05 12
2020-01-06 5
def 2020-01-02 11
2020-01-05 8
def 2020-01-02 1
您可以使用以下方法获取此示例:
WITH t AS (
SELECT "abc" AS word, [STRUCT('2020-01-01' AS date, 17 AS nb), STRUCT('2020-01-06' AS date, 43 AS nb)]
UNION ALL SELECT "abc" AS word, [STRUCT('2020-01-01' AS date, 33 AS nb), STRUCT('2020-01-05' AS date, 12 AS nb), STRUCT('2020-01-06' AS date, 5 AS nb)]
UNION ALL SELECT "def" AS word, [STRUCT('2020-01-02' AS date, 11 AS nb), STRUCT('2020-01-05' AS date, 8 AS nb)]
UNION ALL SELECT "def" AS word, [STRUCT('2020-01-02' AS date, 1 AS nb)]
)
我的目标是:
word nb_by_date.date nb_by_date.nb
---------------------------------------
abc 2020-01-01 50
2020-01-05 12
2020-01-06 55
def 2020-01-02 22
2020-01-05 8
我的尝试是:
SELECT
word,
ARRAY(
SELECT STRUCT(date, SUM(nb))
FROM UNNEST(nb_by_date)
GROUP BY date
ORDER BY date) nb_by_date
FROM (
SELECT word, ARRAY_CONCAT_AGG(nb_by_date) nb_by_date
FROM t
GROUP BY word
)
它适用于这个玩具的例子。但是,我有大量的数据和使用 ARRAY_CONCAT_AGG(nb_by_date)
创建超过100mb限制的行(无法查询大于100mb限制的行)。如何调整查询以使其即使在有大量数据的情况下也能工作?
1条答案
按热度按时间ygya80vv1#
可以使用两个聚合级别: