Remove entire objects from JSON if they do not contain certain keys/values

v8wbuo2f  posted on 2022-12-15  in  Other

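I'm trying to remove entire objects from a JSON file if they do not contain ALL of the following keys: "transaction_date", "asset_description", "asset_type", "type", and "amount".
Here is my JSON file (cut down for this example):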

{
    "first_name": {
        "0": "Thomas",
        "1": "John"
    },
    "transactions": {
       "0": [
            {
                "transaction_date": "11/29/2022",
                "asset_description": "FireEye, Inc.",
                "asset_type": "Stock",
                "type": "Sale (Partial)",
                "amount": "$1,001 - $15,000"
            }
          ],
       "1": [
            {
                "scanned_pdf": true,
                "ptr_link": "https://efdsearch.senate.gov/search/view/paper/658E53E8-7C2C",
                "date_recieved": "01/30/2013"
            }
          ]
     }
}

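I need to remove the entire "1" entry from both transactions and first_name. The original file has many more entries than these two, so the code needs to work for any number of entries rather than hardcoding [0], [1], and so on. My code below tries to find the items in "transactions" that do not include "scanned_pdf", "ptr_link", and "date_recieved", and then saves the JSON with only the updated data. (My approach is a bit backwards: instead of deleting an object if it doesn't contain x, it selects the objects that don't contain y and updates the JSON.)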

import json

with open("xxxtester.json", "r") as f_in:
    data = json.load(f_in)

to_delete = {"scanned_pdf", "ptr_link", "date_recieved"}

for k in data["transactions"]:
    data["transactions"][k] = [
        {kk: vv for kk, vv in d.items() if kk not in to_delete}
        for d in data["transactions"][k]]

open("xxxtester.json", "w").write(
    json.dumps(data, indent=4))

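However, my output still shows "1", just with empty data such as "{}". Should I use a different approach, or can code be added to my existing script to make it work?
Here is my desired output: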

{
    "first_name": {
        "0": "Thomas"
    },
    "transactions": {
       "0": [
            {
                "transaction_date": "11/29/2022",
                "asset_description": "FireEye, Inc.",
                "asset_type": "Stock",
                "type": "Sale (Partial)",
                "amount": "$1,001 - $15,000"
            }
          ]
      }
}

qxgroojn1#

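If we invert your logic (so that we select the items we want to keep, rather than the ones to drop) and add a second comprehension to filter out empty values, we end up with: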

import json

with open("xxxtester.json", "r") as f_in:
    data = json.load(f_in)

required = {"transaction_date", "asset_description", "asset_type", "type", "amount"}

# Keep only the transactions that contain every required key.
data["transactions"] = {
    k: [transaction for transaction in v if all(key in transaction for key in required)]
    for k, v in data["transactions"].items()
}

data["transactions"] = {
    k: v for k, v in data['transactions'].items() if v
}

# Update data["first_name"] so that it only contains keys that also exist
# in data["transactions"].
data["first_name"] = {k: v for k, v in data["first_name"].items() if k in data["transactions"]}

print(json.dumps(data, indent=4))

Given input like this:

{
    "first_name": {
        "0": "Thomas",
        "1": "John"
    },
    "transactions": {
       "0": [
            {
                "transaction_date": "11/29/2022",
                "asset_description": "FireEye, Inc.",
                "asset_type": "Stock",
                "type": "Sale (Partial)",
                "amount": "$1,001 - $15,000"
            },
            {
                "scanned_pdf": true,
                "ptr_link": "https://efdsearch.senate.gov/search/view/paper/658E53E8-7C2C",
                "date_recieved": "01/30/2013"
            }
          ],
       "1": [
            {
                "scanned_pdf": true,
                "ptr_link": "https://efdsearch.senate.gov/search/view/paper/658E53E8-7C2C",
                "date_recieved": "01/30/2013"
            }
          ]
     }
}

The code above produces:

{
    "first_name": {
        "0": "Thomas"
    },
    "transactions": {
        "0": [
            {
                "transaction_date": "11/29/2022",
                "asset_description": "FireEye, Inc.",
                "asset_type": "Stock",
                "type": "Sale (Partial)",
                "amount": "$1,001 - $15,000"
            }
        ]
    }
}

The first dictionary comprehension...

data["transactions"] = {
    k: [transaction for transaction in v if all(key in transaction for key in required)]
    for k, v in data["transactions"].items()
}

...produces:

...
    "transactions": {
        "0": [
            {
                "transaction_date": "11/29/2022",
                "asset_description": "FireEye, Inc.",
                "asset_type": "Stock",
                "type": "Sale (Partial)",
                "amount": "$1,001 - $15,000"
            }
        ],
        "1": []
    }
...

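The second comprehension filters out the keys whose value is an empty list.
The third comprehension removes from data["first_name"] any items that do not exist in data["transactions"].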
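The three steps described above can also be wrapped in one function, which makes them easier to test without touching a file. This is only a sketch: the function name `prune`, the `REQUIRED` constant, and the inline sample data are assumptions made for illustration, and `required.issubset(t)` is used as an equivalent of the `all(...)` check.

```python
import json

REQUIRED = {"transaction_date", "asset_description", "asset_type", "type", "amount"}


def prune(data, required=REQUIRED):
    """Return a new dict with incomplete transactions and orphaned names removed."""
    # Step 1: keep only the transactions that contain every required key.
    transactions = {
        key: [t for t in items if required.issubset(t)]
        for key, items in data["transactions"].items()
    }
    # Step 2: drop the keys whose transaction list ended up empty.
    transactions = {key: items for key, items in transactions.items() if items}
    # Step 3: keep only the first names whose key survived step 2.
    first_name = {k: v for k, v in data["first_name"].items() if k in transactions}
    # Any other top-level keys in data are carried over unchanged.
    return {**data, "first_name": first_name, "transactions": transactions}


# Hypothetical sample mirroring the input shown in this answer.
sample = {
    "first_name": {"0": "Thomas", "1": "John"},
    "transactions": {
        "0": [
            {
                "transaction_date": "11/29/2022",
                "asset_description": "FireEye, Inc.",
                "asset_type": "Stock",
                "type": "Sale (Partial)",
                "amount": "$1,001 - $15,000",
            },
            {"scanned_pdf": True, "date_recieved": "01/30/2013"},
        ],
        "1": [{"scanned_pdf": True, "date_recieved": "01/30/2013"}],
    },
}

result = prune(sample)
print(json.dumps(result, indent=4))
```

Because the function builds new dictionaries instead of modifying its argument, the loaded data stays available for comparison until the result is written back.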


eanckbw92#

With this code, you delete the whole "1" entry from transactions:

import json

with open("xxxtester.json", "r") as f_in:
    data = json.load(f_in)

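# Remove the entry before opening the file for writing, so that a missing
# key cannot leave a truncated file behind.
del data["transactions"]["1"]

with open("xxxtester.json", "w") as f:
    json.dump(data, f)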
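Since the question asks for something that works without hardcoding "1", the same `del` idea might be generalized roughly as follows. This is a sketch under assumptions: `drop_incomplete` and `REQUIRED` are hypothetical names, and the function removes a key only when none of its transactions contains all required keys (it does not strip incomplete transactions from keys that are kept).

```python
REQUIRED = {"transaction_date", "asset_description", "asset_type", "type", "amount"}


def drop_incomplete(data, required=REQUIRED):
    """Delete, in place, every key that has no complete transaction.

    Returns the list of deleted keys.
    """
    # Collect the keys first: deleting from a dict while iterating over it
    # raises RuntimeError.
    stale = [
        key
        for key, items in data["transactions"].items()
        if not any(required.issubset(t) for t in items)
    ]
    for key in stale:
        del data["transactions"][key]
        # pop() with a default tolerates keys that are absent from first_name.
        data["first_name"].pop(key, None)
    return stale


# Hypothetical sample in the same shape as the question's file.
data = {
    "first_name": {"0": "Thomas", "1": "John"},
    "transactions": {
        "0": [
            {
                "transaction_date": "11/29/2022",
                "asset_description": "FireEye, Inc.",
                "asset_type": "Stock",
                "type": "Sale (Partial)",
                "amount": "$1,001 - $15,000",
            }
        ],
        "1": [{"scanned_pdf": True, "date_recieved": "01/30/2013"}],
    },
}

removed = drop_incomplete(data)
print(removed, data["first_name"])
```

After the call, `data` can be written back with `json.dump` as in the snippet above.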
