Python: Convert a JSON structure (including arrays) into a flat, denormalized structure

Posted by wz3gfoph on 2022-12-24 in Python

Given a JSON structure that contains arrays at different levels, as in the following example:

{
  "listOfHouses": [
    {
      "name": "House Lannister",
      "listOfMembers": [
        {
          "firstName": "Tywin",
          "lastName": "Lannister"
        },
        {
          "firstName": "Cersei",
          "lastName": "Lannister"
        }
      ]
    },
    {
      "name": "House Targaryen",
      "listOfMembers": [
        {
          "firstName": "Daenerys",
          "lastName": "Targaryen"
        }
      ]
    }
  ]
}

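I tried several approaches (for example, pd.json_normalize(...) and a recursive tree traversal), but none of them returned a result in which every member is its own entity in the converted structure, like this: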

[
  {
    'name': 'House Lannister',
    'firstName': 'Tywin',
    'lastName': 'Lannister'
  }, {
    'name': 'House Lannister',
    'firstName': 'Cersei',
    'lastName': 'Lannister'
  }, {
    'name': 'House Targaryen',
    'firstName': 'Daenerys',
    'lastName': 'Targaryen'
  }
]

Can this be done in Python in a generic way?
I tried the following two approaches, and neither gave me the result I wanted:

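import json

import pandas as pd


def flatten_json(y):
    """Flatten nested dicts/lists into one dict keyed by the underscore-joined path."""
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for key in x:
                flatten(x[key], name + key + '_')
        elif isinstance(x, list):
            # List elements are addressed by their index in the key path.
            for i, item in enumerate(x):
                flatten(item, name + str(i) + '_')
        else:
            # Leaf value: store it under the path, minus the trailing underscore.
            out[name[:-1]] = x

    flatten(y)
    return out


if __name__ == '__main__':
    with open('got-houses.json') as json_file:
        data = json.load(json_file)

    # Attempt 1: one row per house; the members list stays nested in a column.
    normalized_json = pd.json_normalize(data, 'listOfHouses')
    print(normalized_json.to_string())

    # Attempt 2: a single flat dict with index-based keys instead of one row per member.
    flattened_json = flatten_json(data['listOfHouses'])
    print(flattened_json)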

Output of pd.json_normalize:

0  House Lannister  [{'firstName': 'Tywin', 'lastName': 'Lannister'}, {'firstName': 'Cersei', 'lastName': 'Lannister'}]
1  House Targaryen                                                 [{'firstName': 'Daenerys', 'lastName': 'Targaryen'}]
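The members stay packed in a single column because that call names only listOfHouses as the record path. As a minimal sketch, assuming the sample data is inlined as a literal instead of being read from got-houses.json, `pd.json_normalize` can be given the list of houses, with `record_path` selecting the inner list and `meta` repeating the parent's name on each row. The variable names `df` and `records` are illustrative only.

```python
import pandas as pd

# Sample data from the question, inlined so the sketch is self-contained.
data = {
    "listOfHouses": [
        {
            "name": "House Lannister",
            "listOfMembers": [
                {"firstName": "Tywin", "lastName": "Lannister"},
                {"firstName": "Cersei", "lastName": "Lannister"},
            ],
        },
        {
            "name": "House Targaryen",
            "listOfMembers": [
                {"firstName": "Daenerys", "lastName": "Targaryen"},
            ],
        },
    ]
}

# record_path: the nested list whose elements become rows.
# meta: parent-level fields to copy onto every row produced from that parent.
df = pd.json_normalize(
    data["listOfHouses"],
    record_path="listOfMembers",
    meta=["name"],
)

# Convert the frame to a list of dicts, one per member.
records = df.to_dict(orient="records")
print(records)
```

This still requires knowing the key names in advance, so it is less generic than a recursive walk, but it avoids writing the loops by hand.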

Output of the tree traversal:

{
  '0_name': 'House Lannister',
  '0_listOfMembers_0_firstName': 'Tywin',
  '0_listOfMembers_0_lastName': 'Lannister',
  '0_listOfMembers_1_firstName': 'Cersei',
  '0_listOfMembers_1_lastName': 'Lannister',
  '1_name': 'House Targaryen',
  '1_listOfMembers_0_firstName': 'Daenerys',
  '1_listOfMembers_0_lastName': 'Targaryen'
}
i7uaboj4  #1

I believe the following is what you are looking for (it is just a nested loop).

data = {
    "listOfHouses": [
        {
            "name": "House Lannister",
            "listOfMembers": [
                {
                    "firstName": "Tywin",
                    "lastName": "Lannister"
                },
                {
                    "firstName": "Cersei",
                    "lastName": "Lannister"
                }
            ]
        },
        {
            "name": "House Targaryen",
            "listOfMembers": [
                {
                    "firstName": "Daenerys",
                    "lastName": "Targaryen"
                }
            ]
        }
    ]
}
output = []
for entry in data['listOfHouses']:
    for sub in entry['listOfMembers']:
        output.append({'name': entry['name'], 'firstName': sub['firstName'], 'lastName': sub['lastName']})
print(output)

Output

[{'name': 'House Lannister', 'firstName': 'Tywin', 'lastName': 'Lannister'}, {'name': 'House Lannister', 'firstName': 'Cersei', 'lastName': 'Lannister'}, {'name': 'House Targaryen', 'firstName': 'Daenerys', 'lastName': 'Targaryen'}]
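The nested loop hard-codes the key names, while the question asks for a generic approach. The following is a hedged sketch of one possible generalization and is not part of the original answer: the function name `denormalize` is hypothetical, and its edge-case behavior is an assumption. It walks the structure recursively, carries scalar fields of each parent down to its children, and yields one flat dict per innermost object. In this sketch a child key overwrites a parent key of the same name, a parent with an empty list yields no rows, and sibling lists are emitted separately and not combined as a cross product.

```python
from typing import Any, Dict, Iterator, Optional


def denormalize(node: Any,
                inherited: Optional[Dict[str, Any]] = None) -> Iterator[Dict[str, Any]]:
    """Yield one flat dict per innermost object, carrying parent scalars down."""
    inherited = dict(inherited or {})

    if isinstance(node, list):
        # Each list element becomes its own branch with the same context.
        for item in node:
            yield from denormalize(item, inherited)
        return
    if not isinstance(node, dict):
        return  # bare scalars inside lists are ignored in this sketch

    # Split this level into scalar fields (added to the row) and nested children.
    row = dict(inherited)
    nested = []
    for key, value in node.items():
        if isinstance(value, (dict, list)):
            nested.append(value)
        else:
            row[key] = value

    if not nested:
        yield row  # innermost object: emit the accumulated row
        return
    for child in nested:
        yield from denormalize(child, row)


# Sample data from the question.
data = {
    "listOfHouses": [
        {"name": "House Lannister",
         "listOfMembers": [{"firstName": "Tywin", "lastName": "Lannister"},
                           {"firstName": "Cersei", "lastName": "Lannister"}]},
        {"name": "House Targaryen",
         "listOfMembers": [{"firstName": "Daenerys", "lastName": "Targaryen"}]},
    ]
}

rows = list(denormalize(data))
print(rows)
```

For the sample data this should produce the same three rows as the nested loop, without naming listOfHouses, listOfMembers, or any field explicitly.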
