Inconsistent search results from Elastic

gdx19jrr  posted on 2021-06-14  in ElasticSearch

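I have an index of roughly 40,000 vessel names.
When I query on part of a vessel name, e.g. "tuc", I get a handful of results. When I shorten the query term to "t", I get many results, but the ones returned by the "tuc" query are not in that result set.
I'm puzzled about the cause. Could the total result set be too large, so that it gets cut off?
Some details:
Query: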

{
"query" : {
    "bool" : {
        "must" : [
            {
                "query_string" : {
                    "fields" : ["vesselName"],
                    "type" : "phrase_prefix",
                    "query" : "T"
                }
            }
        ]
    }
}
}

Result (first hit):

"max_score": 12.450134,
    "hits": [
        {
            "_index": "vesselsindex",
            "_type": "_doc",
            "_id": "06ad4663-42f6-4771-b350-0d3b7a1b3229",
            "_score": 12.450134,
            "_source": {
                "vesselId": "06ad4663-42f6-4771-b350-0d3b7a1b3229",
                "callSign": "FATA",
                "vesselName": "TAAPE"
            }
        },

Result (when using the term "tuc"):

{
            "_index": "vesselsindex",
            "_type": "_doc",
            "_id": "e7bea95c-6819-48b1-b52e-0a8fbaeef1df",
            "_score": 11.831188,
            "_source": {
                "vesselId": "e7bea95c-6819-48b1-b52e-0a8fbaeef1df",
                "callSign": "PBAQ",
                "vesselName": "TUCANA"
            }
        },

Settings:

{
"vesselsindex": {
    "settings": {
        "index": {
            "number_of_shards": "1",
            "provided_name": "vesselsindex",
            "max_result_window": "50000",
            "creation_date": "1604061335143",
            "analysis": {
                "analyzer": {
                    "keywordWithCaseIgnore": {
                        "filter": [
                            "lowercase"
                        ],
                        "type": "custom",
                        "tokenizer": "keyword"
                    }
                }
            },
            "number_of_replicas": "1",
            "uuid": "M-m3nIB5TqeiPNR2NR5zWQ",
            "version": {
                "created": "7060099"
            }
        }
    }
}
}

Stats:

{
"_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
},
"_all": {
    "primaries": {
        "docs": {
            "count": 43510,
            "deleted": 0
        },
        "store": {
            "size_in_bytes": 12762612
        },
n3ipq98p


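This happens because ES returns only the top 10 search results by default, and when you search for T, the top 10 documents are not necessarily the ones matched by the TUC query.
If you want more search results, increase the size param. That is expensive, which is why pagination is usually used instead to keep search queries fast.
The size param can be supplied either as a query parameter or as part of the search request body.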
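As a rough illustration of the two ways to pass size, here is a minimal Python sketch that only builds the request URL and body, without sending anything. The host http://localhost:9200 and the helper names search_url and search_body are assumptions made for this example; the index name and query come from the question.

```python
import json
from urllib.parse import urlencode

ES_URL = "http://localhost:9200"  # assumption: a local node, not from the original post
INDEX = "vesselsindex"            # index name taken from the question


def search_body(prefix, size=10):
    """Return the question's phrase_prefix query with an explicit size in the body."""
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [
                    {
                        "query_string": {
                            "fields": ["vesselName"],
                            "type": "phrase_prefix",
                            "query": prefix,
                        }
                    }
                ]
            }
        },
    }


def search_url(size=None):
    """Return the _search URL, optionally carrying size as a query parameter."""
    url = f"{ES_URL}/{INDEX}/_search"
    if size is not None:
        url += "?" + urlencode({"size": size})
    return url


# Option 1: size as a query parameter on the URL.
print(search_url(100))
# Option 2: size as a top-level key in the request body.
print(json.dumps(search_body("T", size=100), indent=2))
```

Either form should be enough on its own; pick whichever fits how your client builds requests.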
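You can try _search?size=44000 in your search request URL, and it will return all of the search results. That works here because the index holds 43,510 documents and its max_result_window is set to 50000; with the default limit of 10000, a request of that size would be rejected.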
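If fetching everything in one request is too heavy, the pagination mentioned above can be sketched with from and size. The helper below is hypothetical (the name page_windows is not from the post) and only computes the (from, size) pairs. It assumes that from + size must stay within the index's max_result_window, which is 50000 in the settings shown in the question.

```python
def page_windows(total_hits, page_size, max_result_window=10000):
    """Yield (from, size) pairs that cover total_hits without from + size
    exceeding max_result_window."""
    limit = min(total_hits, max_result_window)
    for start in range(0, limit, page_size):
        yield start, min(page_size, limit - start)


# Numbers from the question: 43510 documents, max_result_window of 50000.
for start, size in page_windows(43510, 10000, max_result_window=50000):
    # Each pair would go into the request body as {"from": start, "size": size, ...}.
    print({"from": start, "size": size})
```

For result sets larger than max_result_window, from/size paging is not enough, and the Elasticsearch documentation points to search_after or scroll instead.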
