Elasticsearch: force match_phrase to discard full-email results when searching only for the domain

Posted by wlzqhblo on 2023-04-20 in ElasticSearch


I want to use a match_phrase query on an Elasticsearch index to find the string outlook.com in a text field. However, I do not want this query to return results such as something...@outlook.com:

GET /my_index/_search
{
  "size": 1,
  "query": {
    "bool": {
      "should": [],
      "must": [
        {
          "match_phrase": {
            "message": {
              "query": "outlook.com",
              "slop": 0
            }
          }
        }
      ]
    }
  }
}

I think these results appear because the standard analyzer's tokenizer treats @ as a separator and splits something...@outlook.com into [something...], [outlook.com].
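
To make that assumption concrete, here is a minimal Python sketch that only approximates the behaviour described above and, for comparison, plain whitespace splitting. `standard_like` and `whitespace_like` are hypothetical helpers, not Elasticsearch APIs; the real standard tokenizer follows Unicode text segmentation rules and covers many more cases. The tokens an index actually produces can be inspected with the `_analyze` API.

```python
import re

def standard_like(text):
    """Rough stand-in for the standard analyzer: lowercase, then split on
    whitespace and '@', keeping dots that sit inside a token."""
    tokens = re.split(r"[\s@]+", text.lower())
    # Trim punctuation left at the token edges, e.g. a trailing full stop.
    return [t.strip(".,;:!?") for t in tokens if t.strip(".,;:!?")]

def whitespace_like(text):
    """Stand-in for the whitespace tokenizer: split on whitespace only."""
    return text.split()

sample = "write to john@outlook.com today"
print(standard_like(sample))    # 'john' and 'outlook.com' become separate tokens
print(whitespace_like(sample))  # 'john@outlook.com' stays one token
```

In this model the domain becomes a token of its own only under the standard-like split, which is why a phrase search for outlook.com can hit full email addresses.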
I tried the whitespace analyzer so that the text is tokenized as [something...@outlook.com] and full email addresses are no longer returned. But with this query:

GET /my_index/_search
{
  "size": 1,
  "query": {
    "bool": {
      "should": [],
      "must": [
        {
          "match_phrase": {
            "message": {
              "query": "outlook.com",
              "slop": 0,
              "analyzer": "whitespace"
            }
          }
        }
      ]
    }
  }
}

Results like something...@outlook.com are still returned. What should I do?
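
A likely reason the `analyzer` parameter changes nothing: in a `match_phrase` query it only controls how the query string is tokenized, while the documents were already tokenized when they were indexed. The sketch below models this in plain Python, assuming `message` was indexed with the default standard analyzer. The tokenizer functions are simplified stand-ins, not Elasticsearch code.

```python
import re

def index_time_tokens(text):
    # Assumed index-time behaviour (standard-like): lowercase, split on whitespace and '@'.
    return [t for t in re.split(r"[\s@]+", text.lower()) if t]

def search_time_tokens(query):
    # Search-time whitespace analyzer: split the query on whitespace only.
    return query.split()

def phrase_match(doc_tokens, query_tokens):
    # slop 0: the query tokens must appear consecutively in the document.
    n = len(query_tokens)
    return any(doc_tokens[i:i + n] == query_tokens
               for i in range(len(doc_tokens) - n + 1))

doc = "contact john@outlook.com"
indexed = index_time_tokens(doc)            # tokens stored in the index
query = search_time_tokens("outlook.com")   # tokens produced from the query
print(indexed, query, phrase_match(indexed, query))
```

The query yields the single token outlook.com under either analyzer, and that token is already in the index, so the email document still matches.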

Update:

I also tried adding a custom analyzer (equivalent to the whitespace analyzer):

PUT /my_index/_settings
{
  "settings": {
    "analysis": {
      "analyzer": {
        "email_analyzer": {
          "tokenizer": "whitespace",
          "filter": []
        }
      }
    }
  }
}

But using it as the analyzer at search time does not change anything.
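
Defining an analyzer in the index settings does not by itself change how an existing field is indexed. The field mapping has to reference it, and documents indexed earlier need to be reindexed. One possible shape is sketched below as Python dicts that print the request bodies. The index name `my_index_v2` is hypothetical, and `message` is assumed to be a `text` field.

```python
import json

# Hypothetical new index; email_analyzer mirrors the whitespace analyzer from the question.
create_index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "email_analyzer": {"type": "custom", "tokenizer": "whitespace", "filter": []}
            }
        }
    },
    "mappings": {
        "properties": {
            # Apply the analyzer at index time so an email address stays a single token.
            "message": {"type": "text", "analyzer": "email_analyzer"}
        }
    },
}

# Body for POST /_reindex, copying documents so they are analyzed again.
reindex_body = {"source": {"index": "my_index"}, "dest": {"index": "my_index_v2"}}

print("PUT /my_index_v2")
print(json.dumps(create_index_body, indent=2))
print("POST /_reindex")
print(json.dumps(reindex_body, indent=2))
```

With whitespace-only tokenization and no filters, matching becomes case-sensitive, and a domain followed by punctuation (for example "outlook.com,") is indexed as a different token, so this trade-off is worth testing on real data.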

elasticsearch

Source: https://stackoverflow.com/questions/76046485/force-match-phrase-to-discard-results-with-full-email-searching-only-its-domain

1 Answer

ncecgwcz1#

You can use a regexp query instead of match_phrase, like this:

{
  "query": {
    "bool": {
      "must": [
        {
          "regexp": {
            "message": ".*[^@]outlook.com"
          }
        }
      ]
    }
  }
}
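
A regexp query is matched against individual indexed terms, and the pattern has to match the whole term. The Python sketch below uses `re.fullmatch` to approximate that for the pattern above. Python's regex dialect differs from Lucene's, so treat it as an illustration only. The candidate terms are made up.

```python
import re

# Same pattern as in the regexp query above; note the dot is not escaped.
pattern = re.compile(r".*[^@]outlook.com")

def term_matches(term):
    # fullmatch mimics the anchored, whole-term matching of a regexp query.
    return pattern.fullmatch(term) is not None

for term in ["outlook.com", "john@outlook.com", "myoutlook.com", "myoutlookxcom"]:
    print(term, term_matches(term))
```

Under these assumptions the pattern rejects john@outlook.com, but it also rejects the bare term outlook.com, because `[^@]` needs one preceding character, and the unescaped `.` accepts any character. Which terms exist depends on the field's analyzer, so the query is worth checking against real data before relying on it.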

Answered on 2023-04-20
