在大查询中从复杂JSON中提取数据

ifmq2ha2  于 2022-11-26  发布在  其他
关注(0)|答案(1)|浏览(168)

我有一个数据看起来像这样-

{aa_validation: null 
 propensity_overlap: {auc pscore overlap: 0.5993614555297898 
                      auc pscore treated: 1.000000000000001 
                      auc pscore control: 1.0000000000000004
                      auc pscore ROC: 0.7524618788923345} 
 feature_balance: {% features with post matching SMD < 0.1: 100.0 
                   % features with post matching SMD < 0.25: 100.0 
                   % features with SMD improved after matching: 84.21052631578947 
                   % features with SMD not significantly worsened: 100.0}}

我想使用Big Query为每个键创建一个列,例如,我得到的结果如下所示:

auc pscore overlap   auc pscore overlap...   % features with post matching SMD < 0.1   % features with post matching SMD < 0.25 ....

      0.32                    1                        50.0                      50.0

我一直在疯狂地使用Regex_extract,但似乎不能使它工作。有人能帮助我使用Bigquery提取这个吗?

lhcgjxsq

lhcgjxsq1#

此JSON架构不适用于BigQuery。您需要更改键以便能够正确提取它们。
这个关键字,例如,“% features with post matching SMD〈0.1”不适用于JSON_EXTRACT函数,如您所见:
invalid key on sample query
使用不同的键,然后您将能够启动如下查询:

SELECT JSON_EXTRACT(PARSE_JSON(json_field), "$.aa_validation") AS aa_validation,
JSON_EXTRACT(PARSE_JSON(json_field), "$.feature_balance") AS feature_balance,
JSON_EXTRACT(PARSE_JSON(json_field), "$.feature_balance.features_with_smd_improved_after_matching") AS smd_improved_after_matching,
FROM `qwiklabs-gcp-03-5570739e32d7.data.test2`

将PARSE_JSON和JSON_EXTRACT与JsonPath查询结合使用。

相关问题