Using an Elasticsearch text-type field

njthzxwz  posted on 2023-08-03  in  ElasticSearch

Data details:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 3.7750573,
    "hits": [
      {
        "_index": "myindex",
        "_id": "1650421750907600896",
        "_score": 3.7750573,
        "_source": {
          "areaCodeList": "350112201201,0,350112201202"
        }
      }
    ]
  }
}

areaCodeList is a text field that uses the ik tokenizer:

POST /myindex/_analyze
{
  "field": "areaCodeList",
  "text": "350112201201,0,350112201202"
}
{
  "tokens": [
    {
      "token": "350112201201,0,350112201202",
      "start_offset": 0,
      "end_offset": 27,
      "type": "ARABIC",
      "position": 0
    },
    {
      "token": "350112201201",
      "start_offset": 0,
      "end_offset": 12,
      "type": "LETTER",
      "position": 1
    },
    {
      "token": "0",
      "start_offset": 13,
      "end_offset": 14,
      "type": "LETTER",
      "position": 2
    },
    {
      "token": "350112201202",
      "start_offset": 15,
      "end_offset": 27,
      "type": "LETTER",
      "position": 3
    }
  ]
}

Finally, I ran the query below, but the result was empty:

GET myindex/_search
{
  "query": {
    "match": {
      "areaCodeList": "350112201201"
    }
  },
  "_source": ["areaCodeList"]
}


How can I match comma-separated data?

wwwo4jvm  #1

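You can use the pattern analyzer. By default it splits text on every run of non-word characters.
The pattern analyzer uses a regular expression to split text into terms. The regular expression should match the token separators, not the tokens themselves. It defaults to \W+ (all non-word characters). See https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern-analyzer.html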
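
To see why splitting on separators makes the single code searchable, the behaviour described above can be approximated outside Elasticsearch. This is only a rough sketch in Python: `pattern_analyze` and `matches` are hypothetical helper names, Python's `re` module stands in for the Java regex engine Elasticsearch uses, and the matching logic is a simplification of a `match` query (any shared term counts as a hit).

```python
import re


def pattern_analyze(text, pattern=r"\W+", lowercase=True):
    """Roughly emulate a pattern-style analyzer (hypothetical helper).

    The regex matches the separators; whatever lies between them is a term.
    """
    terms = [t for t in re.split(pattern, text) if t]  # drop empty pieces
    return [t.lower() for t in terms] if lowercase else terms


def matches(doc_text, query_text):
    """Simplified match: true if any query term is among the document terms."""
    doc_terms = set(pattern_analyze(doc_text))
    return any(t in doc_terms for t in pattern_analyze(query_text))


doc = "350112201201,0,350112201202"
print(pattern_analyze(doc))          # ['350112201201', '0', '350112201202']
print(matches(doc, "350112201201"))  # True
```

Each comma is a non-word character, so the three codes become separate terms and a query for one of them finds the document.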

POST _analyze
{
  "tokenizer": "pattern",
  "text": "350112201201,0,350112201202"
}

PUT test_code_list
{
  "mappings": {
    "properties": {
      "areaCodeList": {
        "type": "text",
        "analyzer": "pattern"
      }
    }
  }
}
PUT test_code_list/_doc/1
{
  "areaCodeList": "350112201201,0,350112201202"
}

GET test_code_list/_search
{
  "query": {
    "match": {
      "areaCodeList": "350112201201"
    }
  }
}
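
If only commas should act as separators (rather than every non-word character), the pattern analyzer also accepts a `pattern` setting. The sketch below builds such an index body in Python and prints it as JSON. It is an illustration under assumptions, not part of the original answer: the analyzer name `comma_analyzer` and the helper `build_index_body` are hypothetical, and the printed JSON would still have to be sent as the body of a request such as `PUT test_code_list`.

```python
import json


def build_index_body(field="areaCodeList",
                     analyzer_name="comma_analyzer",  # hypothetical name
                     separator=","):
    """Build an index-creation body with a custom pattern analyzer.

    `separator` is a regular expression that matches the token separators.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {"type": "pattern", "pattern": separator}
                }
            }
        },
        "mappings": {
            "properties": {
                field: {"type": "text", "analyzer": analyzer_name}
            }
        },
    }


body = build_index_body()
print(json.dumps(body, indent=2))
```

With this variant, whitespace around the codes would not be trimmed, so it fits best when the stored string contains nothing but codes and commas.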
