How do I count documents in Elasticsearch using exclusive name-value attribute filters on a nested type?

Posted by uwopmtnx on 2023-04-11 in ElasticSearch

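`I need a way to count the documents in Elasticsearch that match each of a given set of exclusive name-value attribute pairs, where the attributes field is of the nested type. The output should show, separately for each attribute filter, how many documents satisfy it. A document that satisfies one filter but not another should still be counted for the filter it does satisfy.
Sample data in Elasticsearch: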

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 9,
      "relation": "eq"
    },
    "max_score": 1.0020497,
    "hits": [
      {
        "_index": "INDEX",
        "_type": "_doc",
        "_id": "DOC_1",
        "_score": 1.0020497,
        "_source": {
          "doc": {
            "attributes": [
              {
                "name": "NAME_1",
                "value": "VALUE_1"
              },
              {
                "name": "NAME_2",
                "value": "VALUE_2"
              },
              {
                "name": "NAME_3",
                "value": "VALUE_3"
              }
            ],
            "unique_doc_id": "DOC_1"
          }
        }
      },
      {
        "_index": "INDEX",
        "_type": "_doc",
        "_id": "DOC_2",
        "_score": 1.0020497,
        "_source": {
          "doc": {
            "attributes": [
              {
                "name": "NAME_1",
                "value": "VALUE_1"
              },
              {
                "name": "NAME_2",
                "value": "VALUE_7"
              },
              {
                "name": "NAME_4",
                "value": "VALUE_6"
              }
            ],
            "unique_doc_id": "DOC_2"
          }
        }
      }
    ]
  }
}

Example input:

{
  "attributes": [
    {
      "name": "NAME_1",
      "value": "VALUE_1"
    },
    {
      "name": "NAME_2",
      "value": "VALUE_7"
    },
    {
      "name": "NAME_3",
      "value": "VALUE_30"
    }
  ]
}

Expected output:

[
  {
    "attribute_name": "NAME_1",
    "count": 2
  },
  {
    "attribute_name": "NAME_2",
    "count": 1
  },
  {
    "attribute_name": "NAME_3",
    "count": 0
  }
]

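I tried writing this as a Painless script. The main problem is that it needs a global variable that can store the count and be incremented whenever a filter is satisfied.`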

raogr8fs  #1

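I think you should use one aggregation for each condition you want to check. The aggregations must all sit at the same level, so that each condition is evaluated independently of the others. Here is an example query: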

GET /index/_search
{
  "size": 0,
  "aggs": {
    "Filter by NAME_1": {
      "filter": {
        "nested": {
          "path": "attributes",
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "attributes.name": "NAME_1"
                  }
                },
                {
                  "term": {
                    "attributes.value": "VALUE_1"
                  }
                }
              ]
            }
          }
        }
      }
    },
    "Filter by NAME_2": {
      "filter": {
        "nested": {
          "path": "attributes",
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "attributes.name": "NAME_2"
                  }
                },
                {
                  "term": {
                    "attributes.value": "VALUE_7"
                  }
                }
              ]
            }
          }
        }
      }
    },
    "Filter by NAME_3": {
      "filter": {
        "nested": {
          "path": "attributes",
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "attributes.name": "NAME_3"
                  }
                },
                {
                  "term": {
                    "attributes.value": "VALUE_30"
                  }
                }
              ]
            }
          }
        }
      }
    }
  }
}
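
Since the three aggregations differ only in the name-value pair, the request body can be generated from the input list instead of being written by hand. The following is a minimal sketch in Python, not part of the original answer: the function name `build_count_query` and the `path` parameter are hypothetical, and it assumes attribute names are unique within the input (otherwise the aggregation keys would collide). The sample data in the question nests `attributes` under `doc`, so depending on the actual mapping the path may need to be `doc.attributes` rather than `attributes`.

```python
import json


def build_count_query(attributes, path="attributes"):
    """Build a search body with one filter aggregation per name-value pair.

    `path` is the nested field path; it is an assumption and must match
    the index mapping (for example "doc.attributes").
    """
    aggs = {}
    for attr in attributes:
        # Sibling aggregations are evaluated independently of each other,
        # so each pair gets its own document count.
        aggs[f"Filter by {attr['name']}"] = {
            "filter": {
                "nested": {
                    "path": path,
                    "query": {
                        "bool": {
                            "filter": [
                                {"term": {f"{path}.name": attr["name"]}},
                                {"term": {f"{path}.value": attr["value"]}},
                            ]
                        }
                    },
                }
            }
        }
    # size 0: only the aggregation counts are needed, not the hits.
    return {"size": 0, "aggs": aggs}


requested = [
    {"name": "NAME_1", "value": "VALUE_1"},
    {"name": "NAME_2", "value": "VALUE_7"},
    {"name": "NAME_3", "value": "VALUE_30"},
]
body = build_count_query(requested)
print(json.dumps(body, indent=2))
```

With the default `path`, the printed body has the same structure as the handwritten query and can be sent to the `_search` endpoint as is.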

The response to this query looks like this:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "Filter by NAME_1" : {
      "doc_count" : 2
    },
    "Filter by NAME_2" : {
      "doc_count" : 1
    },
    "Filter by NAME_3" : {
      "doc_count" : 0
    }
  }
}
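
To get from this response to the list format asked for in the question, the `aggregations` section can be reshaped on the client side. The sketch below is one possible way to do that in Python; the function name `counts_from_response` is hypothetical, and it assumes the aggregation names follow the `Filter by <name>` convention used in the query above.

```python
def counts_from_response(response, prefix="Filter by "):
    """Turn the aggregations section into [{attribute_name, count}, ...]."""
    results = []
    for agg_name, agg in response.get("aggregations", {}).items():
        # Skip any aggregation that does not follow the naming convention.
        if not agg_name.startswith(prefix):
            continue
        results.append(
            {
                "attribute_name": agg_name[len(prefix):],
                "count": agg["doc_count"],
            }
        )
    return results


# The aggregations part of the response shown above.
response = {
    "aggregations": {
        "Filter by NAME_1": {"doc_count": 2},
        "Filter by NAME_2": {"doc_count": 1},
        "Filter by NAME_3": {"doc_count": 0},
    }
}
print(counts_from_response(response))
```

A filter that matches nothing still appears in the response with a `doc_count` of 0, so `NAME_3` ends up in the list with a count of 0, as the expected output requires.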
