我删除了这篇文章的前一个版本,以代替这篇经过清理的文章,并提供了一个可复制的例子。我有一个如下格式的表格:
WITH wide_stats AS (
(
SELECT
'joe' name, 'bills' team,
struct(struct(7 as fga) as o, struct(8 as fga) as d) as t1,
struct(struct(3 as fga) as o, struct(9 as fga) as d) as t2,
struct(3 as pts, 9 as ast, 5 as reb) as t3,
7 tov, 3 blk
) UNION ALL (
SELECT 'nick' name, 'jets' team,
struct(struct(12 as fga) as o, struct(13 as fga) as d) as t1,
struct(struct(15 as fga) as o, struct(22 as fga) as d) as t2,
struct(13 as pts, 5 as ast, 15 as reb) as t3,
75 tov, 23 blk
)
)
SELECT
name, team, metric, SAFE_CAST(value AS FLOAT64) value
FROM (
SELECT
name, team,
REGEXP_REPLACE(SPLIT(pair, ':')[OFFSET(0)], r'^"|"$', '') metric,
REGEXP_REPLACE(SPLIT(pair, ':')[OFFSET(1)], r'^"|"$', '') value
FROM wide_stats,
UNNEST(SPLIT(REGEXP_REPLACE(to_json_string(wide_stats), r'{|}', ''))) pair
)
WHERE NOT LOWER(metric) IN ('name', 'team')
并且我正在努力将该表重新调整为以下输出:
name team metric value
joe bills t1_o_fga 7
joe bills t1_d_fga 8
joe bills t2_o_fga 3
joe bills t2_d_fga 9
joe bills t3_pts 3
joe bills t3_ast 9
joe bills t3_reb 5
joe bills tov 7
joe bills blk 3
nick jets t1_o_fga 12
nick jets t1_d_fga 13
nick jets t2_o_fga 15
nick jets t2_d_fga 22
nick jets t3_pts 13
nick jets t3_ast 5
nick jets t3_reb 15
nick jets tov 75
nick jets blk 23
这个任务很容易解释--从宽到长,但是在表中有struct
和嵌套的struct
。我在另一篇Stack Overflow帖子中的正则表达式工作是以错误的方式拆分列名,当前的输出与它所需要的不匹配。
行的顺序并不重要,对于变量名,不管是t1_o_fga还是t1-o-fga还是t1/o/fga,只要有一些分隔符,并且清楚地知道变量是什么就行了。
1条答案
按热度按时间rjee0c151#
下面是BigQuery标准SQL的示例
带输出
如果出于某种原因,您在手动组装
flat_stats
临时表时遇到问题-您可以使用以下小技巧步骤1-仅在传统模式下使用目标表[project:dataset.flat_stats]运行以下查询
令人惊讶的是,这将创建具有以下结构的表
[project:dataset.flat_stats]
第2步-之后,您只需运行下面的命令(现在使用标准SQL)