Unnesting a JSON stringified array in BigQuery

puruo6ea  posted on 2021-07-29 in Java

I have the following table in Google BigQuery:

+-------+---------+-------
| Name  | City    | items
+-------+---------+-------
| James | Dallas  | [{'text': 'pear', 'line_total_excl_vat': '24','product_id': 100}]
| John  | Chicago | [{'text': 'apple', 'line_total_excl_vat': '29','product_id': 200},{'text': 'banana', 'line_total_excl_vat': '34','product_id': 300}]
+-------+---------+-------

I am trying to get to something like this:

+-------+---------+--------+---------------------+------------+
| Name  | City    | text   | line_total_excl_vat | product_id |
+-------+---------+--------+---------------------+------------+
| James | Dallas  | pear   | 24                  | 100        |
| John  | Chicago | apple  | 29                  | 200        |
| John  | Chicago | banana | 34                  | 300        |
+-------+---------+--------+---------------------+------------+

The "items" column is actually a string. Is there a way to unnest this data format and get the view I want in BigQuery? Thanks!

lkaoscv7 #1

Below is for BigQuery Standard SQL:


# standardSQL

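-- JSON_EXTRACT_ARRAY splits the stringified array into its elements,
-- UNNEST emits one row per element, and JSON_EXTRACT_SCALAR reads each field.
SELECT Name, City, 
  JSON_EXTRACT_SCALAR(json, '$.text') AS text,
  JSON_EXTRACT_SCALAR(json, '$.line_total_excl_vat') AS line_total_excl_vat,
  JSON_EXTRACT_SCALAR(json, '$.product_id') AS product_id
FROM `project.dataset.table`,
UNNEST(JSON_EXTRACT_ARRAY(items,'$')) AS json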

You can test it against the sample data from your question, as in the example below:


# standardSQL

WITH `project.dataset.table` AS (
  SELECT 'James' AS Name, 'Dallas' AS City, "[{'text': 'pear', 'line_total_excl_vat': '24','product_id': 100}]" AS items UNION ALL
  SELECT 'John', 'Chicago', "[{'text': 'apple', 'line_total_excl_vat': '29','product_id': 200},{'text': 'banana', 'line_total_excl_vat': '34','product_id': 300}]"
)
SELECT Name, City, 
  JSON_EXTRACT_SCALAR(json, '$.text') AS text,
  JSON_EXTRACT_SCALAR(json, '$.line_total_excl_vat') AS line_total_excl_vat,
  JSON_EXTRACT_SCALAR(json, '$.product_id') AS product_id
FROM `project.dataset.table`,
UNNEST(JSON_EXTRACT_ARRAY(items,'$')) json

The output is:

Row Name    City    text    line_total_excl_vat product_id   
1   James   Dallas  pear    24                  100  
2   John    Chicago apple   29                  200  
3   John    Chicago banana  34                  300
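
Note that JSON_EXTRACT_SCALAR returns STRING values, so the extracted columns above are all strings. As a minimal sketch, assuming you want numeric types downstream, the values can be wrapped in SAFE_CAST, which yields NULL rather than an error when a value does not parse. The alias `item` and the choice of NUMERIC and INT64 as target types are illustrative assumptions, not part of the original answer.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'James' AS Name, 'Dallas' AS City, "[{'text': 'pear', 'line_total_excl_vat': '24','product_id': 100}]" AS items UNION ALL
  SELECT 'John', 'Chicago', "[{'text': 'apple', 'line_total_excl_vat': '29','product_id': 200},{'text': 'banana', 'line_total_excl_vat': '34','product_id': 300}]"
)
SELECT Name, City,
  JSON_EXTRACT_SCALAR(item, '$.text') AS text,
  -- Assumed target type: NUMERIC for the monetary amount
  SAFE_CAST(JSON_EXTRACT_SCALAR(item, '$.line_total_excl_vat') AS NUMERIC) AS line_total_excl_vat,
  -- Assumed target type: INT64 for the identifier
  SAFE_CAST(JSON_EXTRACT_SCALAR(item, '$.product_id') AS INT64) AS product_id
FROM `project.dataset.table`,
UNNEST(JSON_EXTRACT_ARRAY(items, '$')) AS item
```

The row shape is the same as in the query above; only the column types change.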
vuv7lop3 #2

This takes a bit of fiddling with JSON_EXTRACT_SCALAR and JSON_EXTRACT_ARRAY in combination with UNNEST():

WITH t AS (
  SELECT 'James' as Name, 'Dallas' AS City, "[{'text': 'pear', 'line_total_excl_vat': '24','product_id': 100}]" AS items
  UNION ALL
  SELECT 'John', 'Chicago', "[{'text': 'apple', 'line_total_excl_vat': '29','product_id': 200},{'text': 'banana', 'line_total_excl_vat': '34','product_id': 300}]"
)

SELECT 
  # we'll unnest this array in the next statement and grab its elements
  JSON_EXTRACT_ARRAY(items,'$') as arr

  # unnest() turns the array into rows - the JSON functions extract fields from each element
  ,ARRAY(SELECT AS STRUCT

      JSON_EXTRACT_SCALAR(i,'$.text') as text,
      JSON_EXTRACT_SCALAR(i,'$.line_total_excl_vat') as line_total_excl_vat,
      JSON_EXTRACT_SCALAR(i,'$.product_id') as product_id

   FROM UNNEST(JSON_EXTRACT_ARRAY(items,'$')) as i 
   ) AS unnested_items
   ,* # original fields for reference
FROM t

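This creates a nested output that you can work with later (the JSON representation of the output makes this clearer). If you want to flatten the table instead, you can cross join with the resulting array: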

WITH t AS (

# Name    |  City  | items   |

  SELECT 'James' as Name, 'Dallas' AS City, "[{'text': 'pear', 'line_total_excl_vat': '24','product_id': 100}]" AS items
  UNION ALL
  SELECT 'John', 'Chicago', "[{'text': 'apple', 'line_total_excl_vat': '29','product_id': 200},{'text': 'banana', 'line_total_excl_vat': '34','product_id': 300}]"
)

SELECT 
   * 
FROM t CROSS JOIN UNNEST(ARRAY((SELECT AS STRUCT

      JSON_EXTRACT_SCALAR(i,'$.text') as text,
      JSON_EXTRACT_SCALAR(i,'$.line_total_excl_vat') as line_total_excl_vat,
      JSON_EXTRACT_SCALAR(i,'$.product_id') as product_id

   FROM UNNEST(JSON_EXTRACT_ARRAY(items,'$')) as i 
   )))
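
If you keep the nested form instead of flattening, the array can still be consumed per row with a correlated subquery. The following is only a sketch under stated assumptions: the output column names `order_total_excl_vat` and `item_count` are hypothetical, and the amounts are assumed to parse as NUMERIC.

```sql
WITH t AS (
  SELECT 'James' as Name, 'Dallas' AS City, "[{'text': 'pear', 'line_total_excl_vat': '24','product_id': 100}]" AS items
  UNION ALL
  SELECT 'John', 'Chicago', "[{'text': 'apple', 'line_total_excl_vat': '29','product_id': 200},{'text': 'banana', 'line_total_excl_vat': '34','product_id': 300}]"
)

SELECT
  Name,
  City,
  -- Sum the line totals of each row's items without changing the row count
  (
    SELECT SUM(SAFE_CAST(JSON_EXTRACT_SCALAR(i, '$.line_total_excl_vat') AS NUMERIC))
    FROM UNNEST(JSON_EXTRACT_ARRAY(items, '$')) AS i
  ) AS order_total_excl_vat,
  -- Number of elements in the stringified array
  ARRAY_LENGTH(JSON_EXTRACT_ARRAY(items, '$')) AS item_count
FROM t
```

This keeps one row per Name and City, which can be convenient when only aggregates over the items are needed.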
