python-3.x: How to create a DataFrame by looping over a list

bnl4lu3b  posted on 2023-08-08  in  Python

I want to create a DataFrame by looping over a list whose items are dictionaries that themselves contain nested lists.

list = [{'sitemap': [{'path': 'http://test.com',
    'errors': '0',
    'contents': [{'type': 'web', 'submitted': '34801', 'indexed': '4656'}]}]},
 {'sitemap': [{'path': 'https://example.com',
    'errors': '0',
    'contents': [{'type': 'web', 'submitted': '2329'}]}]}]

Initially, this is what I tried:

data_for_df = []

for each in list:
    temp = []
    temp.append(each['sitemap'][0]['path'])
    temp.append(each['sitemap'][0]['errors'])
    temp.append(each['sitemap'][0]['contents'][0]['type'])
    temp.append(each['sitemap'][0]['contents'][0]['submitted'])
    temp.append(each['sitemap'][0]['contents'][0]['indexed'])
    data_for_df.append(temp)

# Each row holds five values, so five column names are needed
df = pd.DataFrame(data_for_df, columns=['path', 'errors', 'type', 'submitted', 'indexed'])


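However, this code raises an error because some key:value pairs are sometimes missing. In this example, the second item has no "indexed" key. When that happens, I would like to get a null value (or substitute one) instead of an error. Can anyone help?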

ma8fv8wu #1

Using pd.json_normalize may be enough in this case:

import pandas as pd

lst = [
    {
        "sitemap": [
            {
                "path": "http://test.com",
                "errors": "0",
                "contents": [{"type": "web", "submitted": "34801", "indexed": "4656"}],
            }
        ]
    },
    {
        "sitemap": [
            {
                "path": "https://example.com",
                "errors": "0",
                "contents": [{"type": "web", "submitted": "2329"}],
            }
        ]
    },
]

df = pd.json_normalize(lst, ['sitemap', ['contents']], [['sitemap', 'path'], ['sitemap', 'errors']])
print(df)

Prints:

  type submitted indexed         sitemap.path sitemap.errors
0  web     34801    4656      http://test.com              0
1  web      2329     NaN  https://example.com              0
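
If you would rather keep the explicit loop from the question, one possible approach is to read each value with dict.get, which returns None instead of raising KeyError when a key is absent. The following is only a minimal sketch: it assumes, as the question's code does, that only the first element of each 'sitemap' and 'contents' list matters, and the names rows, item, sitemap and contents are illustrative choices, not part of the original post.

```python
import pandas as pd

lst = [
    {"sitemap": [{"path": "http://test.com", "errors": "0",
                  "contents": [{"type": "web", "submitted": "34801", "indexed": "4656"}]}]},
    {"sitemap": [{"path": "https://example.com", "errors": "0",
                  "contents": [{"type": "web", "submitted": "2329"}]}]},
]

rows = []
for item in lst:
    # Assumption: only the first sitemap entry and its first contents entry are used
    sitemap = item["sitemap"][0]
    contents = sitemap["contents"][0]
    rows.append({
        "path": sitemap.get("path"),
        "errors": sitemap.get("errors"),
        "type": contents.get("type"),
        "submitted": contents.get("submitted"),
        "indexed": contents.get("indexed"),  # None when the key is missing
    })

# A list of dicts lets pandas take the column names from the keys
df = pd.DataFrame(rows)
print(df)
```

Building each row as a dict rather than a list also keeps the values and column names together, so they cannot drift out of sync the way a separate columns argument can.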
