BigQuery: flatten a table with a JSON column into a struct/array with repeated values

Posted by 5uzkadbs on 2023-03-20 in Other

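I have a table in BigQuery with a JSON string column that contains repeated and nested values. I want to convert it into a flat table while still showing the repeated values in nested form.
JSON string: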

{
    "activity": "running",
    "device": "mobile",
    "dataset": [
        {
            "date": {
                "activity_date": "2023-03-13"
                
            },
            "value": {
                "heartrate": [
                    {
                        "max": 86,
                        "min": 30,
                        "name": "Normal"
                    },
                    {
                        "max": 121,
                        "min": 86,
                        "name": "high"
                    },
                    {
                        "max": 147,
                        "min": 121,
                        "name": "average"
                    },
                    {
                        "max": 220,
                        "min": 147,
                        "name": "Inrange"
                    }
                ]
            }
        }
    ]
}

The result should look like this:

activity  device  date        heartrate_max  heartrate_min  heartrate_name
running   mobile  2023-03-13  86             30             Normal
                              121            86             high
                              147            121            average
                              220            147            Inrange

Please tell me how to do this, thanks.
UNNEST and LEFT JOIN

ej83mcc0


You can consider the following conventional approach, which uses BigQuery's JSON functions together with UNNEST.

-- sample table: a single row holding the raw JSON string
WITH sample_data AS (
  SELECT """
    -- put your *json* here
  """ json
)

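-- one output row per element of $.dataset; each heartrate field is collected
-- into its own ARRAY column (JSON_VALUE returns STRING, so these are ARRAY<STRING>)
SELECT JSON_VALUE(json, '$.activity') activity,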
       JSON_VALUE(json, '$.device') device,
       JSON_VALUE(dataset, '$.date.activity_date') date,
       (SELECT AS STRUCT
               ARRAY_AGG(JSON_VALUE(hr, '$.max') ORDER BY offset) heartrate_max,
               ARRAY_AGG(JSON_VALUE(hr, '$.min') ORDER BY offset) heartrate_min,
               ARRAY_AGG(JSON_VALUE(hr, '$.name') ORDER BY offset) heartrate_name
          FROM UNNEST(JSON_QUERY_ARRAY(dataset, '$.value.heartrate')) hr WITH offset
       ).*
  FROM sample_data, UNNEST(JSON_QUERY_ARRAY(json, '$.dataset')) dataset;
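
If a single repeated column of structs is preferred over three parallel arrays, the same JSON functions can build an ARRAY<STRUCT> instead. The following is only a sketch under assumptions: the column name `heartrate`, the alias `pos`, and the casts to INT64 and DATE are illustrative choices that do not come from the question, and the sample row is the question's JSON inlined as a string.

```sql
-- sample table: the JSON from the question, stored as a STRING column
WITH sample_data AS (
  SELECT '''
  {
    "activity": "running",
    "device": "mobile",
    "dataset": [
      {
        "date": {"activity_date": "2023-03-13"},
        "value": {
          "heartrate": [
            {"max": 86,  "min": 30,  "name": "Normal"},
            {"max": 121, "min": 86,  "name": "high"},
            {"max": 147, "min": 121, "name": "average"},
            {"max": 220, "min": 147, "name": "Inrange"}
          ]
        }
      }
    ]
  }
  ''' AS json
)

SELECT JSON_VALUE(json, '$.activity') AS activity,
       JSON_VALUE(json, '$.device') AS device,
       -- assumption: activity_date is always an ISO yyyy-mm-dd string
       DATE(JSON_VALUE(dataset, '$.date.activity_date')) AS date,
       -- one STRUCT per heartrate entry, kept in the original array order
       ARRAY(
         SELECT AS STRUCT
                CAST(JSON_VALUE(hr, '$.max') AS INT64) AS max,
                CAST(JSON_VALUE(hr, '$.min') AS INT64) AS min,
                JSON_VALUE(hr, '$.name') AS name
           FROM UNNEST(JSON_QUERY_ARRAY(dataset, '$.value.heartrate')) AS hr WITH OFFSET AS pos
          ORDER BY pos
       ) AS heartrate
  FROM sample_data, UNNEST(JSON_QUERY_ARRAY(json, '$.dataset')) AS dataset;
```

With the sample row this should return one row per `dataset` element, with `heartrate` as a repeated record of four entries. Keeping `max`, `min`, and `name` in one struct means the three values of a heartrate zone cannot drift out of alignment, which is possible in principle with separate arrays.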
