How do I properly "merge" complex Python dictionaries?

eqqqjvef  posted on 2022-12-25 in Python

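I have n very complex Python dictionaries with deep nesting (about 5 levels), and I don't know how to merge them correctly and quickly without iterating over them a million times.
It's worth mentioning that, as you will see below, the dicts have a strict structure.
I have tried solutions based on:

  • defaultdict
  • the merge operator (`|`)

Python version: 3.9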
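
For context, here is a minimal sketch of why the merge operator alone falls short for this kind of data. In Python 3.9, `|` merges only the top level of two dicts, so the right-hand "places" list replaces the left-hand one wholesale instead of being combined with it. The dicts `a` and `b` are trimmed-down stand-ins made up for this illustration, not the real data.

```python
# Trimmed-down stand-ins for two of the dictionaries (illustration only).
a = {"name": "Louis", "places": [{"code": "A", "subplaces": []}]}
b = {"name": "Louis", "places": [{"code": "B", "subplaces": []}]}

# Dict merge operator (Python 3.9+): shallow, the right operand wins per key.
shallow = a | b

# The whole "places" list from `a` is replaced by the one from `b`.
print(shallow["places"])
```

So any approach that has to combine the nested lists needs to walk the structure explicitly.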

d1 = {
  "name": "Louis",
  "places": [
    {
      "code": "A",
      "subplaces": [
        {
          "name": "Subplace name",
          "subsubplaces": [
            {
              "name": "subsub1"
            }
          ]
        },
        {
          "name": "Subplace name2",
          "subsubplaces": [
            {
              "name": "subsub1"
            }
          ]
        }
      ]
    }
  ]
}

d2 = {
  "name": "Louis",
  "places": [
    {
      "code": "B",
      "subplaces": [
        {
          "name": "Subplace name",
          "subsubplaces": [
            {
              "name": "subsub1"
            }
          ]
        },
        {
          "name": "Subplace name2",
          "subsubplaces": [
            {
              "name": "subsub1"
            }
          ]
        }
      ]
    }
  ]
}

d3 = {
  "name": "Louis",
  "places": [
    {
      "code": "A",
      "subplaces": [
        {
          "name": "Subplace name X",
          "subsubplaces": [
            {
              "name": "subsub1"
            }
          ]
        }
      ]
    }
  ]
}

In this case, the output should be:

d_merged = {
  "name": "Louis",
  "places": [
    {
      "code": "A",
      "subplaces": [
        {
          "name": "Subplace name",
          "subsubplaces": [
            {
              "name": "subsub1"
            }
          ]
        },
        {
          "name": "Subplace name2",
          "subsubplaces": [
            {
              "name": "subsub1"
            }
          ]
        },
        {
          "name": "Subplace name X",
          "subsubplaces": [
            {
              "name": "subsub1"
            }
          ]
        }
      ]
    },
    {
      "code": "B",
      "subplaces": [
        {
          "name": "Subplace name",
          "subsubplaces": [
            {
              "name": "subsub1"
            }
          ]
        },
        {
          "name": "Subplace name2",
          "subsubplaces": [
            {
              "name": "subsub1"
            }
          ]
        }
      ]
    }
  ]
}

b91juud3 #1

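Your task is very specific, so a generic solution is unlikely. I would suggest first collecting all "places", "subplaces" and "subsubplaces" into a nested dictionary, which removes any duplicates, and then reshaping that data to match the desired format.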

from itertools import groupby
from operator import itemgetter
from collections import defaultdict

def merge_places(*dicts):
    if not dicts:
        return
    
    # check all dicts have same names
    # https://docs.python.org/3/library/itertools.html#itertools-recipes
    g = groupby(dicts, itemgetter("name"))
    if next(g, True) and next(g, False):
        raise ValueError("Dictionaries names are not equal")

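    # Nested mapping: code -> subplace name -> {subsubplace name: None}.
    # A dict is used as an insertion-ordered set, so the output order is stable.
    places = defaultdict(lambda: defaultdict(dict))
    for d in dicts:
        for place in d["places"]:
            for subplace in place["subplaces"]:
                # Touch the entry first so a subplace without subsubplaces is kept
                subsubplaces = places[place["code"]][subplace["name"]]
                for subsubplace in subplace["subsubplaces"]:
                    subsubplaces[subsubplace["name"]] = None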

    return {
        "name": dicts[0]["name"],  # dicts is non-empty here and all names are equal
        "places": [
            {
                "code": code,
                "subplaces": [
                    {
                        "name": name,
                        "subsubplaces": [
                            {"name": subsubplace}
                            for subsubplace in subsubplaces
                        ]
                    }
                    for name, subsubplaces in subplaces.items()
                ]
            }
            for code, subplaces in places.items()
        ]
    }

Usage:

result = merge_places(d1, d2, d3)

Output:

{
    "name": "Louis",
    "places": [
        {
            "code": "A",
            "subplaces": [
                {
                    "name": "Subplace name",
                    "subsubplaces": [
                        {
                            "name": "subsub1"
                        }
                    ]
                },
                {
                    "name": "Subplace name2",
                    "subsubplaces": [
                        {
                            "name": "subsub1"
                        }
                    ]
                },
                {
                    "name": "Subplace name X",
                    "subsubplaces": [
                        {
                            "name": "subsub1"
                        }
                    ]
                }
            ]
        },
        {
            "code": "B",
            "subplaces": [
                {
                    "name": "Subplace name",
                    "subsubplaces": [
                        {
                            "name": "subsub1"
                        }
                    ]
                },
                {
                    "name": "Subplace name2",
                    "subsubplaces": [
                        {
                            "name": "subsub1"
                        }
                    ]
                }
            ]
        }
    ]
}

11dmarpk #2

I think your data representation carries a lot of unnecessary detail, which can be reduced with this solution:

from typing import Dict, List

dicts = [
    {
        "name": "Louis",
        "places": [
            {
                "code": "A",
                "subplaces": [
                    {
                        "name": "Subplace name",
                        "subsubplaces": [
                            {
                                "name": "subsub1"
                            }
                        ]
                    },
                    {
                        "name": "Subplace name2",
                        "subsubplaces": [
                            {
                                "name": "subsub1"
                            }
                        ]
                    }
                ]
            }
        ]
    },
    {
        "name": "Louis",
        "places": [
            {
                "code": "B",
                "subplaces": [
                    {
                        "name": "Subplace name",
                        "subsubplaces": [
                            {
                                "name": "subsub1"
                            }
                        ]
                    },
                    {
                        "name": "Subplace name2",
                        "subsubplaces": [
                            {
                                "name": "subsub1"
                            }
                        ]
                    }
                ]
            }
        ]
    },
    {
        "name": "Louis",
        "places": [
            {
                "code": "A",
                "subplaces": [
                    {
                        "name": "Subplace name X",
                        "subsubplaces": [
                            {
                                "name": "subsub1"
                            }
                        ]
                    }
                ]
            }
        ]
    }]

def merger(dicts: List[Dict]) -> Dict:
    # Collapse the input into {name: {code: [subplace, ...]}}
    result = {}
    for d in dicts:
        name = d["name"]
        if name not in result:
            result[name] = {}
        for p in d["places"]:
            code = p["code"]
            if code not in result[name]:
                result[name][code] = []
            # Subplaces are appended as-is; duplicates are not removed
            result[name][code].extend(p["subplaces"])
    return result

print(merger(dicts=dicts))

The output will be:

{
    'Louis':{
        'A':[
            {'name': 'Subplace name', 'subsubplaces': [{'name': 'subsub1'}]},
            {'name': 'Subplace name2', 'subsubplaces': [{'name': 'subsub1'}]},
            {'name': 'Subplace name X', 'subsubplaces': [{'name': 'subsub1'}]}
        ],
        'B':[
            {'name': 'Subplace name', 'subsubplaces': [{'name': 'subsub1'}]},
            {'name': 'Subplace name2', 'subsubplaces': [{'name': 'subsub1'}]}]
    }
}

If you need the exact output format from the question, this is easy to adapt, but the compact form is more readable and easier to maintain.
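
As a rough sketch of that adaptation, assuming the compact `{name: {code: [subplaces]}}` shape returned by `merger` above: the helper below (the name `to_nested` is made up for this example) rebuilds one question-style dict per name. It does not deduplicate subplaces; that would need an extra step.

```python
from typing import Dict, List


def to_nested(compact: Dict[str, Dict[str, List[dict]]]) -> List[dict]:
    """Rebuild question-style dicts from the compact {name: {code: [subplaces]}} shape."""
    return [
        {
            "name": name,
            "places": [
                {"code": code, "subplaces": subplaces}
                for code, subplaces in by_code.items()
            ],
        }
        for name, by_code in compact.items()
    ]


# Small hand-written sample in the compact shape (not the full data).
compact = {
    "Louis": {
        "A": [{"name": "Subplace name", "subsubplaces": [{"name": "subsub1"}]}],
        "B": [{"name": "Subplace name2", "subsubplaces": [{"name": "subsub1"}]}],
    }
}

nested = to_nested(compact)
print(nested[0]["name"], [p["code"] for p in nested[0]["places"]])
```

Because the compact form is keyed by name, the helper returns a list; with a single name, as in the question, the merged dict is simply the first element.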
