Elasticsearch NEST: edge n-gram with fuzziness

dgsult0t  posted on 2021-06-10 in ElasticSearch

I am using Elasticsearch.Net and NEST v7.10.0, and I have these settings and mappings for Elasticsearch.

{
    "settings": {
        "index": {
            "analysis": {
                "filter": {},
                "analyzer": {
                    "keyword_analyzer": {
                        "filter": [
                            "lowercase",
                            "asciifolding",
                            "trim"
                        ],
                        "char_filter": [],
                        "type": "custom",
                        "tokenizer": "keyword"
                    },
                    "edge_ngram_analyzer": {
                        "filter": [
                            "lowercase"
                        ],
                        "tokenizer": "edge_ngram_tokenizer"
                    },
                    "edge_ngram_search_analyzer": {
                        "tokenizer": "lowercase"
                    }
                },
                "tokenizer": {
                    "edge_ngram_tokenizer": {
                        "type": "edge_ngram",
                        "min_gram": 2,
                        "max_gram": 50,
                        "token_chars": [
                            "letter"
                        ]
                    }
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "MatchName": {
                "type": "text",
                "fields": {
                    "keywordstring": {
                        "type": "text",
                        "analyzer": "keyword_analyzer"
                    },
                    "edgengram": {
                        "type": "text",
                        "analyzer": "edge_ngram_analyzer",
                        "search_analyzer": "edge_ngram_search_analyzer"
                    },
                    "completion": {
                        "type": "completion"
                    }
                },
                "analyzer": "standard"
            },
            "CompetitionName": {
                "type": "text",
                "fields": {
                    "keywordstring": {
                        "type": "text",
                        "analyzer": "keyword_analyzer"
                    },
                    "edgengram": {
                        "type": "text",
                        "analyzer": "edge_ngram_analyzer",
                        "search_analyzer": "edge_ngram_search_analyzer"
                    },
                    "completion": {
                        "type": "completion"
                    }
                },
                "analyzer": "standard"
            }
        }
    }
}

I have indexed 3 documents with the following values:

{
    "_source": {
        "CompetitionName": "Premiership",
        "MatchName": "Dundee Utd - St Johnstone"
    }
},
{
    "_source": {
        "CompetitionName": "2nd Div, Vastra Gotaland UOF",
        "MatchName": "IF Limhamn Bunkeflo - FC Rosengaard 1917"
    }
},
{
    "_source": {
        "CompetitionName": "Bundesliga",
        "MatchName": "Hertha Berlin - Eintracht Frankfurt"
    }
}

I search both fields for the string "bunde" with Fuzziness.Auto. I want this search to return all of the documents above, but with the query below I get nothing back.

string value = "bunde";
BoolQuery boolQuery = new BoolQuery
{
    Should = new List<QueryContainer>
    {
        new QueryContainer(new FuzzyQuery
        {
            Field = Infer.Field<EventHistoryDoc>(path:eventHistoryDoc => eventHistoryDoc.MatchName),
            Value = value,
            Fuzziness = Fuzziness.Auto,
        }),
        new QueryContainer(new FuzzyQuery
        {
            Field = Infer.Field<EventHistoryDoc>(path:eventHistoryDoc => eventHistoryDoc.CompetitionName),
            Value = value,
            Fuzziness = Fuzziness.Auto,
        })
    }
};

ISearchRequest searchRequest = new SearchRequest
{
    Query = new QueryContainer(boolQuery),
};

var json = _elasticClient.RequestResponseSerializer.SerializeToString(searchRequest);

ISearchResponse<EventHistoryDoc> searchResponse = await _elasticClient.SearchAsync<EventHistoryDoc>(searchRequest);

If I search with the string "bundes", I get only one document:

{
    "_source": {
        "CompetitionName": "Bundesliga",
        "MatchName": "Hertha Berlin - Eintracht Frankfurt"
    }
}

What changes should I make to the settings, the mappings, or the query so that all of the documents above are returned?

x8diyxa7

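I don't know the Elasticsearch NEST syntax, but in JSON format you can get the result as follows.
Below is a working example with the index mapping, the search query, and the search result. (For now I have removed keyword_analyzer and edge_ngram_search_analyzer from the index mapping, since you only want to return all documents using edge n-grams and fuzziness.)
Index mapping: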

{
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "my_tokenizer"
                }
            },
            "tokenizer": {
                "my_tokenizer": {
                    "type": "edge_ngram",
                    "min_gram": 2,
                    "max_gram": 50,
                    "token_chars": [
                        "letter",
                        "digit"
                    ]
                }
            }
        },
        "max_ngram_diff": 50
    },
    "mappings": {
        "properties": {
            "CompetitionName": {
                "type": "text",
                "analyzer": "my_analyzer"
            },
            "MatchName": {
                "type": "text",
                "analyzer": "my_analyzer"
            }
        }
    }
}
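
To see which tokens this analyzer actually writes to the index, the Analyze API can be called from NEST. This is only a sketch: my-index is a placeholder index name, and it assumes an IElasticClient pointed at the cluster where the mapping above was created.

```csharp
using System;
using System.Threading.Tasks;
using Nest;

public static class AnalyzerCheck
{
    // Prints the tokens that my_analyzer emits for one sample value.
    // "my-index" is a placeholder, not a name taken from the answer.
    public static async Task PrintTokensAsync(IElasticClient client)
    {
        var response = await client.Indices.AnalyzeAsync(a => a
            .Index("my-index")
            .Analyzer("my_analyzer")
            .Text("Bundesliga"));

        foreach (var token in response.Tokens)
        {
            Console.WriteLine(token.Token);
        }
    }
}
```

For Bundesliga this should list the prefixes Bu, Bun, Bund and so on up to the full word, since the tokenizer starts at min_gram 2 and this analyzer has no lowercase filter.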

Search query:

{
  "query": {
    "multi_match": {
      "query": "bunde",
      "fuzziness": "AUTO"
    }
  }
}
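
Since the question asks about NEST, here is a rough C# equivalent of the query above. Treat it as a sketch rather than tested code: the EventHistoryDoc class is a guess at the question's document type, the URL and my-index are placeholders, and unlike the JSON it names both fields explicitly. NEST camel-cases inferred field names by default, so the client is configured here to keep MatchName and CompetitionName as they are written in the mapping.

```csharp
using System;
using System.Threading.Tasks;
using Nest;

// Hypothetical document type; the question's EventHistoryDoc is not shown.
public class EventHistoryDoc
{
    public string MatchName { get; set; }
    public string CompetitionName { get; set; }
}

public static class FuzzySearchSketch
{
    // Placeholder URL and index name.
    public static IElasticClient CreateClient()
    {
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("my-index")
            // Keep property names as-is instead of NEST's default camel casing.
            .DefaultFieldNameInferrer(name => name);
        return new ElasticClient(settings);
    }

    // multi_match over both fields with fuzziness AUTO.
    public static Task<ISearchResponse<EventHistoryDoc>> SearchAsync(
        IElasticClient client, string value)
    {
        return client.SearchAsync<EventHistoryDoc>(s => s
            .Query(q => q
                .MultiMatch(m => m
                    .Fields(f => f
                        .Field(d => d.MatchName)
                        .Field(d => d.CompetitionName))
                    .Query(value)
                    .Fuzziness(Fuzziness.Auto))));
    }
}
```

Calling FuzzySearchSketch.SearchAsync(FuzzySearchSketch.CreateClient(), "bunde") and reading response.Documents would then correspond to the hits shown in the search result.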

Search result:

"hits": [
      {
        "_index": "64968421",
        "_type": "_doc",
        "_id": "1",
        "_score": 2.483365,
        "_source": {
          "CompetitionName": "Premiership",
          "MatchName": "Dundee Utd - St Johnstone"
        }
      },
      {
        "_index": "64968421",
        "_type": "_doc",
        "_id": "3",
        "_score": 2.4444416,
        "_source": {
          "CompetitionName": "Bundesliga",
          "MatchName": "Hertha Berlin - Eintracht Frankfurt"
        }
      },
      {
        "_index": "64968421",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.6104546,
        "_source": {
          "CompetitionName": "2nd Div, Vastra Gotaland UOF",
          "MatchName": "IF Limhamn Bunkeflo - FC Rosengaard 1917"
        }
      }
    ]

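The index mapping provided in the question is also correct. Using that same mapping and searching for bunde with the multi_match query shown above returns all three documents, which is the expected result. Presumably this works because the query names no fields, so it also searches the edgengram sub-fields.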
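
To make the matching a bit more concrete, here is a small standalone C# sketch (no Elasticsearch involved) that imitates the question's edge_ngram_analyzer and the edit budget of fuzziness AUTO. It is a simplification under stated assumptions: it uses plain Levenshtein distance and ignores scoring, max_expansions and the transposition rule that Elasticsearch applies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class EdgeNgramFuzzinessSketch
{
    // Lowercased edge n-grams of every run of letters, roughly what an
    // edge_ngram tokenizer with token_chars ["letter"] plus a lowercase
    // filter would index.
    public static IEnumerable<string> EdgeNgrams(string text, int minGram = 2, int maxGram = 50)
    {
        var word = new StringBuilder();
        foreach (char c in text + " ") // trailing space flushes the last word
        {
            if (char.IsLetter(c)) { word.Append(char.ToLowerInvariant(c)); continue; }
            for (int len = minGram; len <= Math.Min(maxGram, word.Length); len++)
                yield return word.ToString(0, len);
            word.Clear();
        }
    }

    // Edit budget of fuzziness AUTO: 0 for 1-2 characters, 1 for 3-5, 2 above that.
    public static int AutoFuzziness(string term) =>
        term.Length <= 2 ? 0 : term.Length <= 5 ? 1 : 2;

    // Plain Levenshtein distance (insert, delete, substitute).
    public static int Levenshtein(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;
        for (int i = 1; i <= a.Length; i++)
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                                   d[i - 1, j - 1] + cost);
            }
        return d[a.Length, b.Length];
    }

    public static void Main()
    {
        const string query = "bunde";
        string[] values =
        {
            "Dundee Utd - St Johnstone",
            "IF Limhamn Bunkeflo - FC Rosengaard 1917",
            "Bundesliga"
        };
        foreach (string value in values)
        {
            // Indexed n-grams that lie within the AUTO edit budget of the query term.
            var hits = EdgeNgrams(value)
                .Where(t => Levenshtein(query, t) <= AutoFuzziness(query));
            Console.WriteLine($"{value}: {string.Join(", ", hits)}");
        }
    }
}
```

Each value has at least one edge n-gram within one edit of bunde (dunde from Dundee, bunke from Bunkeflo, and bunde itself from Bundesliga), which is in line with all three documents coming back once the n-gram fields are searched. Against whole-word tokens such as dundee or bundesliga the same one-edit budget is not enough.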
