hive,如何从数组中获取元素,元素本身也是数组

cidc1ykv  于 2021-06-02  发布在  Hadoop
关注(0)|答案(1)|浏览(448)

我有一个数据库表,它有一列存储json格式的字符串。字符串本身包含像数组一样的多元组元素。每个元素包含多个键值对。一些值也可能包含多个键值对,例如下面的“address”属性。

People table:
  Col1      Col2   .....   info
  aaa       bbb           see below

对于“info”列,它包含以下json格式字符串:

[{"name":"abc", 
  "address":{"street":"str1", "city":"c1"},
  "phone":"1234567"
 },
 {"name":"def", 
  "address":{"street":"str2", "city":"c1", "county":"ct"},
  "phone":"7145895"
 }
]

我需要得到json字符串中每个字段的单个值。我可以通过调用explode()对除“address”字段以外的所有字段执行此操作,如下所示:

SELECT  
   get_json_object(person, '$.name') AS name,
   get_json_object(person, '$.phone') AS phone,
   get_json_object(person, '$.address') AS addr
 FROM people lateral view explode(split(regexp_replace(
      regexp_replace(info, '\\}\\,\\{', '\\}\\\\n\\{' ), '\\[|\\]',''), '\\\\n')) 
      p as person;

我的问题是如何得到“address”字段中的每个字段。“address”字段可以包含任意数量的键值对,我不能使用jsonserde。我正在考虑使用另一个explode()调用,但我无法让它工作。有人能帮忙吗。非常感谢。

4xrmg8kj

4xrmg8kj1#

您可以直接使用

SELECT  
  get_json_object(person, '$.name') AS name,
  get_json_object(person, '$.phone') AS phone,
  get_json_object(person, '$.address.street') AS street,
  get_json_object(person, '$.address.city') AS city,
  get_json_object(person, '$.address.county') AS county,      
FROM people lateral view explode(split(regexp_replace(
  regexp_replace(info, '\\}\\,\\{', '\\}\\\\n\\{' ), '\\[|\\]',''), '\\\\n')) 
  p as person;

相关问题