Why does pandas json_normalize give NaN for nested meta fields?

kse8i1jr  posted on 2023-08-01  in Other

My input is a Python dictionary (JSON-like):

d = {
    "type": "type1",
    "details": {
        "name": "foo",
        "date": {
            "timestamp": "01/02/2023 21:42:44",
            "components": {
                "day": 2,
                "month": 1,
                "year": 2023,
                "time": "21:42:44"
            }
        }
    },
    "infos": {
        "records": [
            {
                "field1": "qux",
                "field2": "baz",
            }
        ],
        "class": "P"
    }
}

I'm using the following code:

import pandas as pd

df = pd.json_normalize(
    d,
    record_path=["infos", "records"],
    meta=[
        "type",
        ["details", "date", "timestamp"],
        ["details", "date", "components", "year"],
        ["infos", "class"]
    ],
    errors="ignore"
)


This gives me the following output:

  field1 field2   type details.date.timestamp details.date.components.year infos.class
0    qux    baz  type1                    NaN                          NaN           P


But I was expecting this:

  field1 field2   type details.date.timestamp details.date.components.year infos.class
0    qux    baz  type1    01/02/2023 21:42:44                         2023           P


Honestly, the meta parameter is driving me crazy! I can't see what I'm doing wrong.
Can you explain the logic behind it?

ogq8wdun  #1

I think you should wrap the value of record_path= in an extra pair of [], so the nested path is passed as a single entry:

df = pd.json_normalize(
    d,
    record_path=[["infos", "records"]],  # <-- put [] here
    meta=[
        "type",
        ["details", "date", "timestamp"],
        ["details", "date", "components", "year"],
        ["infos", "class"],
    ],
    errors="ignore",
)

print(df)

Prints:

  field1 field2   type details.date.timestamp details.date.components.year infos.class
0    qux    baz  type1    01/02/2023 21:42:44                         2023           P
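
As for the logic, here is a possible reading based on the outputs shown in this thread rather than on anything the thread states. With record_path=["infos", "records"], each list element seems to count as one level to descend into, and meta paths then appear to be looked up relative to the level reached. So ["details", "date", "timestamp"] ends up being searched inside "infos" and, with errors="ignore", comes back as NaN, while ["infos", "class"] still resolves. With record_path=[["infos", "records"]], the whole path is a single step, and every meta path is looked up from the top-level dict. The sketch below only reproduces both calls on the d from the question so the difference can be checked side by side; the names meta, flat_path and nested_path are mine, not from the original post.

```python
import pandas as pd

# Same dictionary as in the question.
d = {
    "type": "type1",
    "details": {
        "name": "foo",
        "date": {
            "timestamp": "01/02/2023 21:42:44",
            "components": {"day": 2, "month": 1, "year": 2023, "time": "21:42:44"},
        },
    },
    "infos": {
        "records": [{"field1": "qux", "field2": "baz"}],
        "class": "P",
    },
}

# Same meta specification as in the question.
meta = [
    "type",
    ["details", "date", "timestamp"],
    ["details", "date", "components", "year"],
    ["infos", "class"],
]

# Flat record_path (the question's call): the "details" meta columns are NaN.
flat_path = pd.json_normalize(
    d, record_path=["infos", "records"], meta=meta, errors="ignore"
)

# record_path wrapped in an extra list (the answer's call): meta is filled in.
nested_path = pd.json_normalize(
    d, record_path=[["infos", "records"]], meta=meta, errors="ignore"
)

print(flat_path.iloc[0].to_dict())
print(nested_path.iloc[0].to_dict())
```

Note that errors="ignore" is what turns the failed lookups into NaN; leaving it at the default should surface the missing key as a KeyError instead, which makes this kind of mistake easier to spot.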
