How to flatten nested JSON with pd.json_normalize

ws51t4hk  posted on 2021-07-13 in ElasticSearch

I ran an aggregation in Elasticsearch (ES) and got the following result:

{'took': 27,
 'timed_out': False,
 '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0},
 'hits': {'total': {'value': 233, 'relation': 'eq'},
  'max_score': None,
  'hits': []},
 'aggregations': {'sales_over_time': {'buckets': [{'key': 1617235200000,
     'doc_count': 9,
     'name': {'doc_count_error_upper_bound': 0,
      'sum_other_doc_count': 0,
      'buckets': [{'key': '624232499',
        'doc_count': 4,
        'latest_comment': {'hits': {'total': {'value': 4, 'relation': 'eq'},
          'max_score': None,
          'hits': [{'_index': 'data',
            '_type': 'test',
            '_id': 'Hb5Uj3gBm2iwycZfdDvr',
            '_score': None,
            '_source': {'totalsales': 2630149, 'Id': '624232499'},
            'sort': [1617312374760]}]}}},
       {'key': '624232532',
        'doc_count': 4,
        'latest_comment': {'hits': {'total': {'value': 4, 'relation': 'eq'},
          'max_score': None,
          'hits': [{'_index': 'data',
            '_type': 'test',
            '_id': 'q77NjngBm2iwycZf6hdU',
            '_score': None,
            '_source': {'sales': 5810, 'Id': '624232532'},
            'sort': [1617303556611]}]}}},
       {'key': '656625970',
        'doc_count': 1,
        'latest_comment': {'hits': {'total': {'value': 1, 'relation': 'eq'},
          'max_score': None,
          'hits': [{'_index': 'data',
            '_type': 'test',
            '_id': 'Nb4xj3gBm2iwycZfFjKH',
            '_score': None,
            '_source': {'totalsales': 12690, 'Id': '656625970'},
            'sort': [1617310056788]}]}}}]}},

I tried passing the result to pd.json_normalize, for example test_json = pd.json_normalize(result['aggregations']['sales_over_time']['buckets']), and got this:

key doc_count   name.doc_count_error_upper_bound    name.sum_other_doc_count    name.buckets
0   1617235200000   9   0   0   [{'key': '624232499', 'doc_count': 4, 'latest_...
1   1617321600000   9   0   0   [{'key': '624232499', 'doc_count': 4, 'latest_.

So I tried
pd.json_normalize(pd.json_normalize(result['aggregations']['sales_over_time']['buckets']).explode("name.buckets").to_dict(orient="records"))
but it still leaves one nested level, "name.buckets.latest_comment.hits.hits":

key doc_count   name.doc_count_error_upper_bound    name.sum_other_doc_count    name.buckets.key    name.buckets.doc_count  name.buckets.latest_comment.hits.total.value    name.buckets.latest_comment.hits.total.relation name.buckets.latest_comment.hits.max_score  name.buckets.latest_comment.hits.hits
0   1617235200000   9   0   0   624232499   4   4   eq  None    [{'_index': 'data', '_type': 'test', '_...
1   1617235200000   9   0   0   624232532   4   4   eq  None    [{'_index': 'data', '_type': 'test', '_...

How can I flatten all of the nested JSON?

ukxgm1gy  #1

Here js is your sample JSON.
It takes four calls to json_normalize(), each followed by explode() on the embedded list column, with reset_index() where necessary:

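import pandas as pd

# js is the full Elasticsearch response dict. Each explode() turns one column of
# nested lists into one row per element; the next json_normalize() then flattens
# the dicts that explode() exposed.
pd.json_normalize(pd.json_normalize(pd.json_normalize(pd.json_normalize(js)
                   .explode("aggregations.sales_over_time.buckets")
                   .to_dict(orient="records"))
 .explode("aggregations.sales_over_time.buckets.name.buckets")
 .reset_index(drop=True)
 .to_dict(orient="records"))
 .explode("aggregations.sales_over_time.buckets.name.buckets.latest_comment.hits.hits")
 .to_dict(orient="records")
)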
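
If the nesting depth is not known in advance, the same normalize-then-explode pattern can be wrapped in a loop instead of being written out by hand. The following is only a sketch of that idea, not part of the pandas API: flatten_fully is a hypothetical helper name, and sample is a made-up, heavily trimmed stand-in for the response in the question. It assumes every nested list worth expanding holds dicts; lists of scalars (such as 'sort') and empty lists are left untouched.

```python
import pandas as pd


def flatten_fully(data):
    """Hypothetical helper: repeat json_normalize + explode until no
    column containing a non-empty list of dicts remains."""
    df = pd.json_normalize(data)
    while True:
        # Columns where at least one cell is a non-empty list of dicts.
        list_cols = [
            col for col in df.columns
            if df[col].apply(
                lambda v: isinstance(v, list) and len(v) > 0 and isinstance(v[0], dict)
            ).any()
        ]
        if not list_cols:
            return df
        # One row per list element, then let json_normalize expand the dicts.
        exploded = df.explode(list_cols[0]).reset_index(drop=True)
        df = pd.json_normalize(exploded.to_dict(orient="records"))


# Made-up miniature of the aggregation response, for illustration only.
sample = {
    "aggregations": {"sales_over_time": {"buckets": [
        {"key": 1, "name": {"buckets": [
            {"key": "a", "latest_comment": {"hits": {"hits": [
                {"_source": {"Id": "a", "sales": 10}}]}}},
            {"key": "b", "latest_comment": {"hits": {"hits": [
                {"_source": {"Id": "b", "sales": 20}}]}}},
        ]}},
    ]}}
}

flat = flatten_fully(sample)
print(flat.columns.tolist())
```

Only one list column is expanded per pass, so sibling list columns are expanded in later passes and multiply rows against each other. For a response with several independent lists at the same level, it may be better to flatten each one separately.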
