Pandas Dataframe :如何将列表中的嵌套词典展开为新行

oo7oh9g9  于 2022-11-20  发布在  其他
关注(0)|答案(3)|浏览(168)

我正在尝试扁平化API响应。这是响应

data = [{
            "id": 1,
            "status": "Public",
            "Options": [
                  {
                        "id": 8,
                        "pId": 9
                  },
                  {
                      "id": 10,
                        "pId": 11
                  }
               ]
},      
        {
            "id": 2,
            "status": "Public",
            "Options": [
                  {
                        "id": 12,
                        "pId": 13
                  },
                  {
                      "id": 14,
                        "pId": 15
                  }
               ]
}

]

我正在尝试这样做(应用ast literal eval,df.pop和json normalize)。

def pop(child_df, column_value):

    child_df = child_df.dropna(subset=[column_value])
    if isinstance(child_df[column_value][0], str):
        print("yes")
        child_df[column_value] = child_df[column_value].apply(ast.literal_eval)
    normalized_json = [json_normalize(x) for x in child_df.pop(column_value)]
    expanded_child_df = child_df.join(pd.concat(normalized_json, ignore_index=True, sort=False).add_prefix(column_value + '_'))
    expanded_child_df.columns = [str(col).replace('\r','') for col in expanded_child_df.columns]
    expanded_child_df.columns = map(str.lower, expanded_child_df.columns)

    return expanded_child_df

df = pd.DataFrame.from_dict(data)

df2 = pop(df,'Options')

这是我得到的输出

id  status  options_id  options_pid
0   1  Public           8            9
1   2  Public          10           11

但是代码跳过了Options列表中的一些值。

id  status  options_id  options_pid
0   1  Public           8            9
1   1  Public           10           11
2   2  Public          12           13
3   2  Public          14           15

我错过了什么?

xurqigkl

xurqigkl1#

您可以用途:

df=pd.json_normalize(data).explode('Options')
df=df.join(df['Options'].apply(pd.Series).add_prefix('options_')).drop(['Options'],axis=1).drop_duplicates()
print(df)
'''
   id  status  optionsid  optionspId
0   1  Public          8           9
0   1  Public         10          11
1   2  Public         12          13
1   2  Public         14          15
'''
nimxete2

nimxete22#

df = pd.json_normalize(data, record_path="Options", meta=['id','status'], record_prefix='options.')
x9ybnkn6

x9ybnkn63#

df = pd.json_normalize(data).explode('Options')
tmp= df['Options'].apply(pd.Series)
df = pd.concat([df[['id', 'status']], tmp], axis=1)
print(df)

相关问题