Elasticsearch filter on a field based on multiple matching terms

Posted by qnakjoqk on 2022-11-02 in ElasticSearch

I want to filter a list of employees by programming language skills such as C, C++ and Java. I am using the Elasticsearch DSL in Java and need the search to match on all of the given terms.
termsQuery returns data matching any of the terms: if at least one term matches, the document is selected.
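To make the difference concrete, here is a minimal plain-Java sketch of the two semantics on in-memory tag lists. No Elasticsearch is involved, and the names TagMatchSketch, matchesAny and matchesAll are made up for illustration: matchesAny mirrors the "any term" behaviour described above, matchesAll is the behaviour the question is asking for.

```java
import java.util.Collections;
import java.util.List;

// Illustrative only: mimics the two matching semantics on in-memory lists.
// TagMatchSketch, matchesAny and matchesAll are hypothetical names.
final class TagMatchSketch {

    // "any" semantics, like a terms query: one shared tag is enough.
    static boolean matchesAny(List<String> docTags, List<String> wanted) {
        return !Collections.disjoint(docTags, wanted);
    }

    // "all" semantics: every wanted tag must be present on the document.
    static boolean matchesAll(List<String> docTags, List<String> wanted) {
        return docTags.containsAll(wanted);
    }

    public static void main(String[] args) {
        List<String> wanted = List.of("JAVA", "C");
        List<String> onlyJava = List.of("JAVA", "PYTHON");
        List<String> both = List.of("C", "JAVA", "GO");

        System.out.println(matchesAny(onlyJava, wanted)); // true
        System.out.println(matchesAll(onlyJava, wanted)); // false, "C" is missing
        System.out.println(matchesAll(both, wanted));     // true
    }
}
```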

I tried the following code, setting minimum_should_match to tags.length so that all given tags must match (an AND operator), but it failed.

QueryBuilder query = QueryBuilders
            .boolQuery()
            .must(
                    QueryBuilders
                            .termsQuery("tags",tags)
            )
            .minimumShouldMatch(tags.length);

I also tried TermsSetQueryBuilder to check the list of terms, but it throws an exception: minimum_should_match_field not set

QueryBuilder query =
            QueryBuilders
                    .boolQuery()
                    .should(
                            new TermsSetQueryBuilder("tags", tags)
                    )
                    .minimumShouldMatch(tags.size());

I also tried to set minimum_should_match_field in TermsSetQuery, but it only accepts a String, not a numeric value or a percentage, as mentioned here. I tried minimum_should_match_field = "2" and minimum_should_match_field = "100%", and even tried setMinimumShouldMatchScript. None of these worked.

QueryBuilder query =
            QueryBuilders
                    .boolQuery()
                    .should(
                            new TermsSetQueryBuilder("tags", tags)
                                    .setMinimumShouldMatchField(tags.size())
                    )
                    .minimumShouldMatch(tags.size());

How can I filter on a field ("tags") by several terms ("tags": ["JAVA", "C"]) so that all of them must match?

UPDATE: My code looks like the following:

public List<Employee> getEmployeesByFilters(List<String> terms) {

    int required_matches = terms.size();
    QueryBuilder query =
            QueryBuilders
                    .boolQuery()
                    .must(
                            new TermsSetQueryBuilder(
                                    "filters", terms)
                                    .setMinimumShouldMatchField("required_matches")
                    );

    NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
            .withQuery(query)
            .build();

    List<Employee> employees = elasticsearchRestTemplate
    .search(nativeSearchQuery, Employee.class)
    .stream().map(SearchHit -> SearchHit.getContent())
    .collect(Collectors.toList());

    return employees;
}

sauutmhj #1

You can simply use a bool query to get the expected result:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "tags": "JAVA" } },
        { "term": { "tags": "C" } }
      ]
    }
  }
}
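For a variable number of tags, the same request can be generated rather than written by hand. The sketch below builds the JSON body with plain string handling, one term clause per tag. The class name BoolMustBody and the method build are hypothetical, and the tags are assumed not to contain characters that need JSON escaping. With the Java DSL used in the question, the equivalent would be a loop that calls must(QueryBuilders.termQuery("tags", tag)) on a single boolQuery().

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: BoolMustBody and build are hypothetical names.
final class BoolMustBody {

    // Builds a bool query whose must array holds one term clause per tag.
    // Assumes field and tags need no JSON escaping.
    static String build(String field, List<String> tags) {
        String clauses = tags.stream()
                .map(tag -> "{\"term\":{\"" + field + "\":\"" + tag + "\"}}")
                .collect(Collectors.joining(","));
        return "{\"query\":{\"bool\":{\"must\":[" + clauses + "]}}}";
    }

    public static void main(String[] args) {
        // Prints the request body for the two tags used in this answer.
        System.out.println(build("tags", List.of("JAVA", "C")));
    }
}
```

Because every clause sits in must, a document has to satisfy all of them, which gives the AND behaviour without any extra field in the index.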

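The terms_set query did not work because it needs a numeric field such as required_matches on each document, set at index time, that holds the number of matching terms required for the document to be returned.
Your indexed document would therefore look like this: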

{
  "name": "Jane Smith",
  "tags": [ "C", "JAVA" ],
  "required_matches": 2
}

Your query would look like this:

{
  "query": {
    "terms_set": {
      "tags": {
        "terms": [ "JAVA", "C" ],
        "minimum_should_match_field": "required_matches"
      }
    }
  }
}
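One consequence worth spelling out: with minimum_should_match_field the threshold comes from each document, not from the query. The plain-Java sketch below imitates that rule on in-memory data. The Doc class and the matches method are hypothetical and not part of any Elasticsearch API: a document is returned when the number of query terms found among its tags is at least its own required_matches.

```java
import java.util.List;

// Illustrative only: an in-memory imitation of the per-document threshold rule.
final class TermsSetFieldSketch {

    // Hypothetical stand-in for an indexed document.
    static final class Doc {
        final List<String> tags;
        final int requiredMatches;

        Doc(List<String> tags, int requiredMatches) {
            this.tags = tags;
            this.requiredMatches = requiredMatches;
        }
    }

    // Counts the distinct query terms present on the document and
    // compares that count with the document's own threshold.
    static boolean matches(Doc doc, List<String> queryTerms) {
        long matched = queryTerms.stream().distinct().filter(doc.tags::contains).count();
        return matched >= doc.requiredMatches;
    }

    public static void main(String[] args) {
        List<String> query = List.of("JAVA", "C");
        Doc jane = new Doc(List.of("C", "JAVA"), 2);
        Doc threeTags = new Doc(List.of("C", "JAVA", "GO"), 3);
        Doc partial = new Doc(List.of("JAVA", "GO"), 1);

        System.out.println(matches(jane, query));      // true
        System.out.println(matches(threeTags, query)); // false: has both tags, but its threshold is 3
        System.out.println(matches(partial, query));   // true: threshold is 1, although "C" is missing
    }
}
```

So a stored count gives "all query terms must match" only when it happens to equal the number of terms in the query, which is worth keeping in mind when choosing the value to index.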

I hope you can create the Java code from this query.

Update:

QueryBuilder query = QueryBuilders.boolQuery()
                .must(new TermsSetQueryBuilder("tags", tags).setMinimumShouldMatchField("required_matches"));

Update 2:

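You can use minimum_should_match_script with params.num_terms as the script source. num_terms is the number of terms given in the query, so a document is returned only when it matches all of them.
Elasticsearch query: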

{
  "query": {
    "terms_set": {
      "tags": {
        "terms": [
          "JAVA",
          "C"
        ],
        "minimum_should_match_script": {
          "source": "params.num_terms"
        },
        "boost": 1
      }
    }
  }
}
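If the request body is sent as raw JSON rather than through the query builders, it can be assembled for any list of tags. This is a minimal sketch with plain string handling: the class name TermsSetScriptBody and the method build are hypothetical, the optional boost is left out, and the terms are assumed to be distinct and free of characters that need JSON escaping.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: TermsSetScriptBody and build are hypothetical names.
final class TermsSetScriptBody {

    // Builds a terms_set query whose threshold script is params.num_terms,
    // so the number of required matches follows the number of query terms.
    static String build(String field, List<String> terms) {
        String quoted = terms.stream()
                .map(term -> "\"" + term + "\"")
                .collect(Collectors.joining(","));
        return "{\"query\":{\"terms_set\":{\"" + field + "\":{"
                + "\"terms\":[" + quoted + "],"
                + "\"minimum_should_match_script\":{\"source\":\"params.num_terms\"}"
                + "}}}}";
    }

    public static void main(String[] args) {
        // Prints the request body for the two tags used in this answer.
        System.out.println(build("tags", List.of("JAVA", "C")));
    }
}
```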

Java code:

// No custom params are passed: num_terms is supplied by Elasticsearch for terms_set scripts.
Map<String, Object> param = new HashMap<>();
Script script = new Script(ScriptType.INLINE, "painless", "params.num_terms", param);

QueryBuilder query = QueryBuilders.boolQuery()
        .must(new TermsSetQueryBuilder("tags", tags).setMinimumShouldMatchScript(script));

2ic8powd #2

For reference, my Java code (working version) looks like this:

// Filters data by a field ("tags") on several terms, matching on all of the terms
public List<Employee> getEmployeesByTags(List<String> tags) {
    // tags: list of terms that must all be present

    // The script returns the number of query terms as the required match count
    Map<String, Object> param = new HashMap<>();
    param.put("num_terms", tags.size());
    Script script = new Script(ScriptType.INLINE, "painless", "params.num_terms", param);

    QueryBuilder query = QueryBuilders
            .boolQuery()
            .must(new TermsSetQueryBuilder("tags", tags)
                    .setMinimumShouldMatchScript(script));

    NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
            .withQuery(query)
            .build();

    // Unwrap each search hit into its Employee payload
    List<Employee> employees = elasticsearchRestTemplate
            .search(nativeSearchQuery, Employee.class)
            .stream()
            .map(hit -> hit.getContent())
            .collect(Collectors.toList());

    return employees;
}
