pandas 双层嵌套数组Json Panda

qxsslcnc  于 12个月前  发布在  其他
关注(0)|答案(1)|浏览(99)

我有一个非常简单的JSON文件,我需要它的特定信息,以便我进一步处理它。我花了一个周末的时间试图弄清楚我做错了什么,但希望你们能帮助我。
下面是一个有问题的JSON的例子,下面是我的代码

{
    "resultsPerPage": 1,
    "startIndex": 0,
    "totalResults": 1,
    "format": "NVD_CVE",
    "version": "2.0",
    "timestamp": "2023-09-08T18:15:53.110",
    "vulnerabilities": [
        {
            "cve": {
                "id": "CVE-2022-34731",
                "sourceIdentifier": "[email protected]",
                "published": "2022-09-13T19:15:10.757",
                "lastModified": "2023-04-11T21:15:11.870",
                "vulnStatus": "Modified",
                "descriptions": [
                    {
                        "lang": "en",
                        "value": "Microsoft WDAC OLE DB provider for SQL Server Remote Code Execution Vulnerability"
                    },
                    {
                        "lang": "es",
                        "value": "Una vulnerabilidad de Ejecución de Código Remota de Microsoft OLE DB Provider for SQL Server. Este ID de CVE es diferente de CVE-2022-34733, CVE-2022-35834, CVE-2022-35835, CVE-2022-35836, CVE-2022-35840"
                    }
                ],
                "metrics": {
                    "cvssMetricV31": [
                        {
                            "source": "[email protected]",
                            "type": "Primary",
                            "cvssData": {
                                "version": "3.1",
                                "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",
                                "attackVector": "NETWORK",
                                "attackComplexity": "LOW",
                                "privilegesRequired": "NONE",
                                "userInteraction": "REQUIRED",
                                "scope": "UNCHANGED",
                                "confidentialityImpact": "HIGH",
                                "integrityImpact": "HIGH",
                                "availabilityImpact": "HIGH",
                                "baseScore": 8.8,
                                "baseSeverity": "HIGH"
                            },
                            "exploitabilityScore": 2.8,
                            "impactScore": 5.9
                        }
                    ]
                },
                "weaknesses": [
                    {
                        "source": "[email protected]",
                        "type": "Primary",
                        "description": [
                            {
                                "lang": "en",
                                "value": "NVD-CWE-noinfo"
                            }
                        ]
                    }
                ],
                "configurations": [
                    {
                        "nodes": [
                            {
                                "operator": "OR",
                                "negate": false,
                                "cpeMatch": [
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_10:-:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "21540673-614A-4D40-8BD7-3F07723803B0"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_10:20h2:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "9E2C378B-1507-4C81-82F6-9F599616845A"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_10:21h1:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "FAE4278F-71A7-43E9-8F79-1CBFAE71D730"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_10:21h2:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "71E65CB9-6DC2-4A90-8C6A-103BEDC99823"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_10:1607:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "E01A4CCA-4C43-46E0-90E6-3E4DBFBACD64"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_10:1809:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "6B8F3DD2-A145-4AF1-8545-CC42892DA3D1"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_11:-:*:*:*:*:*:arm64:*",
                                        "matchCriteriaId": "B9F64296-66BF-4F1D-A11C-0C44C347E2AC"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_11:-:*:*:*:*:*:x64:*",
                                        "matchCriteriaId": "5D7F7DDB-440E-42CD-82F4-B2C13F3CC462"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_7:-:sp1:*:*:*:*:*:*",
                                        "matchCriteriaId": "C2B1C231-DE19-4B8F-A4AA-5B3A65276E46"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_8.1:-:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "E93068DB-549B-45AB-8E5C-00EB5D8B5CF8"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_rt_8.1:-:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "C6CE5198-C498-4672-AF4C-77AB4BE06C5C"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_server_2008:-:sp2:*:*:*:*:*:*",
                                        "matchCriteriaId": "5F422A8C-2C4E-42C8-B420-E0728037E15C"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_server_2008:r2:sp1:*:*:*:*:x64:*",
                                        "matchCriteriaId": "AF07A81D-12E5-4B1D-BFF9-C8D08C32FF4F"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_server_2012:-:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "A7DF96F8-BA6A-4780-9CA3-F719B3F81074"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_server_2012:r2:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "DB18C4CE-5917-401E-ACF7-2747084FD36E"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_server_2016:-:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "041FF8BA-0B12-4A1F-B4BF-9C4F33B7C1E7"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_server_2019:-:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "DB79EE26-FC32-417D-A49C-A1A63165A968"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_server_2022:-:*:*:*:*:*:*:*",
                                        "matchCriteriaId": "821614DD-37DD-44E2-A8A4-FE8D23A33C3C"
                                    },
                                    {
                                        "vulnerable": true,
                                        "criteria": "cpe:2.3:o:microsoft:windows_server_2022:-:*:*:*:azure:*:*:*",
                                        "matchCriteriaId": "036D57EB-3226-438A-B3E6-0D4698D7EDCD"
                                    }
                                ]
                            }
                        ]
                    }
                ],
                "references": [
                    {
                        "url": "https://msrc.microsoft.com/update-guide/vulnerability/CVE-2022-34731",
                        "source": "[email protected]"
                    }
                ]
            }
        }
    ]
}
response = requests.request("GET", url, headers=headers, data={}).json
nistjson  = response.json()
df=pd.json_normalize(nistjson, record_path=['vulnerabilities'], meta=['descriptions', 'references'])
print(df)

当我尝试这样做时,我得到以下错误代码

Traceback (most recent call last):
  File "C:\Users\Sinpo\Desktop\Python Projects\Raw API Calls Python\Raw API Call Python Testing\Raw API Call Python Testing\Raw_API_Call_Python_Testing.py", line 18, in <module>
    data=pd.json_normalize(nistjson, record_path=['cve', ['descriptions','references']])
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Sinpo\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\io\json\_normalize.py", line 519, in json_normalize
    _recursive_extract(data, record_path, {}, level=0)
  File "C:\Users\Sinpo\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\io\json\_normalize.py", line 498, in _recursive_extract
    _recursive_extract(obj[path[0]], path[1:], seen_meta, level=level + 1)
                       ~~~^^^^^^^^^
KeyError: 'cve'
Press any key to continue . . .

我需要Description中的值,在第一个对象中,在这种情况下是“value”:“Microsoft WDAC OLE DB Provider for SQL Server Remote Code Execution Vulnerability”沿着“url”:“https://msrc.microsoft.com/update-guide/vulnerability/CVE-2022-34731”在参考文献的底部。
我喜欢的格式看起来像这样。
value:Microsoft WDAC OLE DB Provider for SQL Server远程代码执行漏洞
网址:https://msrc.microsoft.com/update-guide/vulnerability/CVE-2022-34731
编辑1:
在遵循Andrej Kesely给出的建议之后,我的代码现在看起来像这样,并且它确实按预期工作。

response = requests.request("GET", url, headers=headers, data={})
nistjson  = response.json()
df = []
for v in nistjson["vulnerabilities"]:
    value = next(d["value"] for d in v["cve"]["descriptions"] if d["lang"] == "en")
    refs = [r["url"] for r in v["cve"]["references"]]
    df.append((value, refs))

df = pd.DataFrame(df, columns=["value", "url"]).explode("url")
print(df)
bwitn5fc

bwitn5fc1#

您可以使用json模块解析Json文件,然后手动构造 Dataframe 。举例来说:

import json

import pandas as pd

with open("data.json", "r") as f_in:
    data = json.load(f_in)

df = []
for v in data["vulnerabilities"]:
    value = next(d["value"] for d in v["cve"]["descriptions"] if d["lang"] == "en")
    refs = [r["url"] for r in v["cve"]["references"]]
    df.append((value, refs))

df = pd.DataFrame(df, columns=["value", "url"]).explode("url")
print(df)

图纸:

value                                                                   url
0  Microsoft WDAC OLE DB provider for SQL Server Remote Code Execution Vulnerability  https://msrc.microsoft.com/update-guide/vulnerability/CVE-2022-34731

编辑:如何从URL读取数据:

import pandas as pd
import requests

url = "https://services.nvd.nist.gov/rest/json/cves/2.0"
data = requests.get(url).json()

df = []
for v in data["vulnerabilities"]:
    value = next(d["value"] for d in v["cve"]["descriptions"] if d["lang"] == "en")
    refs = [r["url"] for r in v["cve"]["references"]]
    df.append((value, refs))

df = pd.DataFrame(df, columns=["value", "url"]).explode("url")
print(df)

图纸:
| 值|URL|
| --|--|
| Sendmail中的debug命令已启用,允许攻击者以root身份执行命令。|http://seclists.org/fulldisclosure/2019/Jun/16|
| Sendmail中的debug命令已启用,允许攻击者以root身份执行命令。|http://www.openwall.com/lists/oss-security/2019/06/05/4|
| Sendmail中的debug命令已启用,允许攻击者以root身份执行命令。|http://www.openwall.com/lists/oss-security/2019/06/06/1|
| Sendmail中的debug命令已启用,允许攻击者以root身份执行命令。|http://www.osvdb.org/195|
等等

相关问题