PHP: searching for an exact phrase with synonyms

x759pob2 posted on 2022-11-28 in PHP

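I am trying to build a query that combines exact phrase matching with synonyms, and I can't figure it out. Also, with the wildcard approach I don't know how to apply fuzziness. Is that even possible with wildcards? It would be great to get the same results for the terms "call of duty", "cod" and "call of dutz".
I created this index: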

PUT exact_search
{
  "settings": {
    "index": {
      "number_of_shards": "1",
      "number_of_replicas": "0",
      "analysis": {
        "analyzer": {
          "analyzer_exact": {
            "type": "custom",
            "tokenizer": "keyword",
            "filter": [
              "lowercase",
              "icu_folding",
              "synonyms"
            ]
          }
        },
        "filter": {
          "synonyms": {
            "type": "synonym",
            "synonyms_path": "synonyms.txt"
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword",
        "fields": {
          "analyzer_exact": {
            "type": "text",
            "analyzer": "analyzer_exact"
          }
        }
      }
    }
  }
}

I filled it with these documents:

POST exact_search/_doc/1
{
  "name": "Hoodie Call of Duty"
}
POST exact_search/_doc/2
{
  "name": "Call of Duty 2"
}
POST exact_search/_doc/3
{
  "name": "Call of Duty: Modern Warfare 2"
}
POST exact_search/_doc/4
{
  "name": "COD: Modern Warfare 2"
}
POST exact_search/_doc/5
{
  "name": "Call of duty"
}
POST exact_search/_doc/6
{
  "name": "Call of the sea"
}
POST exact_search/_doc/7
{
  "name": "Heavy Duty"
}

synonyms.txt looks like this:

cod,call of duty

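What I am trying to achieve is to get all the results (except "Call of the sea" and "Heavy Duty") when I search for "call of duty" or "cod".
So far I have built this query, but it does not work as expected with the search term "cod" (the term "call of duty" works fine):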

GET exact_search/_search
{
  "explain": false, 
  "query":{
    "bool":{
      "must":[
         {
           "wildcard": {
             "name.analyzer_exact": {
               "value": "*cod*"
             }
           }
         }
      ]
    }
  }
}

But the result contains only two items:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "exact_search",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "name" : "COD: Modern Warfare 2"
        }
      },
      {
        "_index" : "exact_search",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "name" : "Call of duty"
        }
      }
    ]
  }
}

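It looks like the synonyms work, because the query returns the "Call of duty" game, but the wildcard seems to be ignored: "Call of Duty 2", for example, is not returned.
I need exact phrase matching because I don't want to get "Heavy Duty" or "Call of the sea" as results (which happens when the words "call" and "duty" are matched individually).
Thanks for pointing me in the right direction.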

wswtfjt7

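I doubt that analyzer_exact produces the synonym tokens you expect while it uses "tokenizer": "keyword". I would change two things, described in the steps below, to make it work. As far as I can tell, the keyword tokenizer emits the whole field value as a single token, so the rule "cod,call of duty" only applies when the entire name is exactly "cod" or "call of duty". That would explain why document 5 ("Call of duty") matches *cod* through its synonym, document 4 matches literally, and "Call of Duty 2" does not match at all.
1. Change the tokenizer from keyword to standard: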
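The behaviour of the keyword tokenizer can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions, not Elasticsearch's actual implementation: keyword_analyze and wildcard_hits are hypothetical helper names, icu_folding is ignored because the sample names are plain ASCII, and fnmatch stands in for the wildcard query.

```python
import fnmatch

# The seven documents from the question.
DOCS = {
    1: "Hoodie Call of Duty",
    2: "Call of Duty 2",
    3: "Call of Duty: Modern Warfare 2",
    4: "COD: Modern Warfare 2",
    5: "Call of duty",
    6: "Call of the sea",
    7: "Heavy Duty",
}

# The single rule from synonyms.txt: cod,call of duty
SYNONYMS = {"cod", "call of duty"}


def keyword_analyze(text):
    """Toy keyword analyzer: the whole value becomes one lowercased token.

    Assumption: the synonym rule only fires when that single token is
    exactly one of the synonyms, in which case both variants are emitted.
    """
    token = text.lower()
    if token in SYNONYMS:
        return set(SYNONYMS)
    return {token}


def wildcard_hits(pattern):
    """Return ids of documents with at least one token matching the pattern."""
    return sorted(
        doc_id
        for doc_id, name in DOCS.items()
        if any(fnmatch.fnmatchcase(tok, pattern) for tok in keyword_analyze(name))
    )


print(wildcard_hits("*cod*"))  # [4, 5]
```

In this model only documents 4 and 5 carry a token containing "cod", which is the same pair of hits the question reports.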

"analyzer_exact": {
    "type": "custom",
    "tokenizer": "standard",
    "filter": [
      "lowercase",
      "synonyms"
    ]
  }

2. Use match_phrase to exclude every name that contains neither "Call of Duty" nor "COD":

{
   "match_phrase": {
     "name.analyzer_exact": "cod"
   }
 }

Response after the change:

{
  "hits": {
    "hits": [
      {
        "_source": {
          "name": "Call of duty"
        }
      },
      {
        "_source": {
          "name": "COD: Modern Warfare 2"
        }
      },
      {
        "_source": {
          "name": "Call of Duty 2"
        }
      },
      {
        "_source": {
          "name": "hoddies Call of Duty"
        }
      },
      {
        "_source": {
          "name": "Call of Duty: Modern Warfare 2"
        }
      }
    ]
  }
}
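Why the standard tokenizer plus match_phrase returns these five names can be sketched with another toy model in Python. It is an approximation under stated assumptions: the real synonym filter emits both variants at the same position, whereas this sketch rewrites "call of duty" to the single token "cod" on both the document and the query side, which gives the same matches for this data set. All function names are hypothetical.

```python
import re

# The seven documents from the question.
DOCS = {
    1: "Hoodie Call of Duty",
    2: "Call of Duty 2",
    3: "Call of Duty: Modern Warfare 2",
    4: "COD: Modern Warfare 2",
    5: "Call of duty",
    6: "Call of the sea",
    7: "Heavy Duty",
}


def standard_tokens(text):
    """Rough stand-in for the standard tokenizer plus lowercase filter."""
    return re.findall(r"[a-z0-9]+", text.lower())


def apply_synonym(tokens, phrase=("call", "of", "duty"), replacement="cod"):
    """Rewrite every occurrence of the multi-word synonym to one token."""
    out, i, n = [], 0, len(phrase)
    while i < len(tokens):
        if tuple(tokens[i:i + n]) == phrase:
            out.append(replacement)
            i += n
        else:
            out.append(tokens[i])
            i += 1
    return out


def analyze(text):
    return apply_synonym(standard_tokens(text))


def phrase_match(query, name):
    """True if the analyzed query occurs as a contiguous run in the name."""
    q, d = analyze(query), analyze(name)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


def search(query):
    return sorted(i for i, name in DOCS.items() if phrase_match(query, name))


print(search("cod"))           # [1, 2, 3, 4, 5]
print(search("call of duty"))  # [1, 2, 3, 4, 5]
```

"Call of the sea" and "Heavy Duty" drop out because neither contains the full phrase in order, which is the exclusion the question asks for.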
