Elasticsearch - count the occurrences of each field value per document

5lhxktic, posted on 2023-01-25 in ElasticSearch

Is it possible to count how many times each distinct value occurs in a list field?
For example, given the following data:

[
    {
      "page": 1,
      "colors": [
        {
          "color": "red"
        },
        {
          "color": "white"
        },
        {
          "color": "red"
        }
      ]
    },
    {
      "page": 2,
      "colors": [
        {
          "color": "yellow"
        },
        {
          "color": "yellow"
        }
      ]
    }
  ]

Is it possible to get a result like this:

[
    {
      "page": 1,
      "colors_count": [
        {
          "Key": "red",
          "Count": 2
        },
        {
          "Key": "white",
          "Count": 1
        }
      ]
    },
    {
      "page": 2,
      "colors_count": [
        {
          "Key": "yellow",
          "Count": 2
        }
      ]
    }
  ]

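I tried a terms aggregation, but it returns the number of documents containing each distinct value rather than the number of occurrences, so for page 1 I get red: 1 and white: 1.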
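To make the difference concrete, here is a small plain-Python sketch (it does not call Elasticsearch). It assumes the sample data from the question; the names `pages`, `doc_counts`, and `occurrence_counts` are made up for illustration. The first function mimics what a terms aggregation reports on a non-nested field, where each document contributes at most 1 per distinct value; the second produces the counts the question actually asks for.

```python
from collections import Counter

# Sample data from the question (hypothetical variable name).
pages = [
    {"page": 1, "colors": [{"color": "red"}, {"color": "white"}, {"color": "red"}]},
    {"page": 2, "colors": [{"color": "yellow"}, {"color": "yellow"}]},
]

def doc_counts(doc):
    """Count each distinct color once per document, as a doc_count would."""
    return dict(Counter({c["color"] for c in doc["colors"]}))

def occurrence_counts(doc):
    """Count every occurrence of each color inside one document."""
    return dict(Counter(c["color"] for c in doc["colors"]))

for doc in pages:
    print(doc["page"], doc_counts(doc), occurrence_counts(doc))
```

The gap between the two results for page 1 is exactly the problem described above: duplicates inside one document collapse unless each list element is counted on its own.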

jaql4c8m #1

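Yes, you can do this. You need to map the field with the nested type and query it with a nested aggregation, so that each element of the list is counted as its own hidden document.
Mapping: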

PUT colors
{
  "mappings": {
    "properties": {
      "page" : { "type": "keyword" },
      "colors": { 
        "type": "nested",
        "properties": {
          "color": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Index the documents:

PUT colors/_doc/1
{
  "page": 1,
  "colors": [
    {
      "color": "red"
    },
    {
      "color": "white"
    },
    {
      "color": "red"
    }
  ]
}

PUT colors/_doc/2
{
  "page": 2,
  "colors": [
    {
      "color": "yellow"
    },
    {
      "color": "yellow"
    }
  ]
}

Query:

GET colors/_search 
{
  "size" :0,
  "aggs": {
    "groupByPage": {
      "terms": {
        "field": "page"
      },
      "aggs": {
        "colors": {
          "nested": {
            "path": "colors"
          },
          "aggs": {
            "genres": {
              "terms": {
                "field": "colors.color"
              }
            }
          }
        }
      }
    }
  }
}
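One thing to watch for: a terms aggregation returns only the top 10 buckets by default, so an index with more than ten pages or colors would need an explicit `size`. The following is a minimal sketch that builds the same request body as a Python dictionary; the helper name `build_color_count_query` and its parameters `page_buckets` and `color_buckets` are hypothetical, and the aggregation names are kept as in the query above. It only constructs the body and does not send it anywhere.

```python
import json

def build_color_count_query(page_buckets=10, color_buckets=10):
    """Build the search body: group by page, then count colors per nested element."""
    return {
        "size": 0,  # no hits needed, only aggregations
        "aggs": {
            "groupByPage": {
                "terms": {"field": "page", "size": page_buckets},
                "aggs": {
                    "colors": {
                        # Step into the nested documents under "colors".
                        "nested": {"path": "colors"},
                        "aggs": {
                            "genres": {
                                "terms": {
                                    "field": "colors.color",
                                    "size": color_buckets,
                                }
                            }
                        },
                    }
                },
            }
        },
    }

print(json.dumps(build_color_count_query(page_buckets=100), indent=2))
```

The resulting dictionary can be serialized and sent as the body of `GET colors/_search` with whichever HTTP or Elasticsearch client is in use.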

Output:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "groupByPage": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "1", // page field value
          "doc_count": 1,
          "colors": {
            "doc_count": 3,
            "genres": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "red",
                  "doc_count": 2
                },
                {
                  "key": "white",
                  "doc_count": 1
                }
              ]
            }
          }
        },
        {
          "key": "2", // page field value
          "doc_count": 1,
          "colors": {
            "doc_count": 2,
            "genres": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "yellow",
                  "doc_count": 2
                }
              ]
            }
          }
        }
      ]
    }
  }
}
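The response nests the counts a few levels deep, so getting the exact shape requested in the question takes a small client-side step. Below is a sketch of that reshaping in Python, assuming the aggregation names `groupByPage`, `colors`, and `genres` from the query above; the function name `to_colors_count` and the trimmed `sample` response are illustrative only.

```python
def to_colors_count(response):
    """Flatten the nested aggregation buckets into page / colors_count records."""
    result = []
    for page_bucket in response["aggregations"]["groupByPage"]["buckets"]:
        color_buckets = page_bucket["colors"]["genres"]["buckets"]
        result.append({
            "page": page_bucket["key"],
            "colors_count": [
                {"Key": b["key"], "Count": b["doc_count"]} for b in color_buckets
            ],
        })
    return result

# Trimmed copy of the aggregation part of the output shown above.
sample = {
    "aggregations": {
        "groupByPage": {
            "buckets": [
                {"key": "1", "doc_count": 1, "colors": {"doc_count": 3, "genres": {
                    "buckets": [{"key": "red", "doc_count": 2},
                                {"key": "white", "doc_count": 1}]}}},
                {"key": "2", "doc_count": 1, "colors": {"doc_count": 2, "genres": {
                    "buckets": [{"key": "yellow", "doc_count": 2}]}}},
            ]
        }
    }
}

for record in to_colors_count(sample):
    print(record)
```

Two caveats apply. The page keys come back as strings ("1", "2") because `page` is mapped as `keyword`. The outer aggregation also groups by `page` value rather than by document, so if several documents shared the same page their color counts would be combined.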
