Return a value if a field is missing in an Elasticsearch document

Posted by mwkjh3gx on 2023-02-03 in Elasticsearch

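I have an index named user. I am filtering users on LOCATION_ID, and _source consists of PHONE_NUMBER and USER_ID. Some documents have no PHONE_NUMBER data, so only USER_ID is returned for them. Is there any way to get a default or predefined value for PHONE_NUMBER (passed in the query, as we do for a missing field in a count) when the field is absent from the document?
Mapping: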

{
  "PHONE_NUMBER": {
    "type": "long",
    "store": true
  },
  "USER_ID": {
    "type": "long",
    "store": true
  },
  "LOCATION_ID": {
    "type": "long",
    "store": true
  }
}

Query:

{
  "_source":[
     "PHONE_NUMBER",
     "USER_ID"
  ],
  "query":{
     "bool":{
        "must":[
           {
              "terms":{
                 "LOCATION_ID":[
                    "5001"
                 ]
              }
           }
        ],
        "must_not":[
           
        ]
     }
  },
  "from":0,
  "size":2000
}

Response:

{
  "took":0,
  "timed_out":false,
  "_shards":{
     "total":1,
     "successful":1,
     "skipped":0,
     "failed":0
  },
  "hits":{
     "total":{
        "value":4,
        "relation":"eq"
     },
     "max_score":2.0,
     "hits":[
        {
           "_index":"user",
           "_id":"39788",
           "_score":2.0,
           "_source":{
              "USER_ID":39788
           }
        },
        {
           "_index":"user",
           "_id":"30784",
           "_score":2.0,
           "_source":{
              "USER_ID":30784,
              "PHONE_NUMBER":1234567890
           }
        },
        {
           "_index":"user",
           "_id":"36373",
           "_score":2.0,
           "_source":{
              "USER_ID":36373,
              "PHONE_NUMBER":1234567893
           }
        },
        {
           "_index":"user",
           "_id":"36327",
           "_score":2.0,
           "_source":{
              "USER_PROJECT_USER_ID":36327
           }
        }
     ]
  }
}

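In the response above, PHONE_NUMBER is missing from the first and last documents. If the field is missing, I would like a default or predefined value to be returned (set in the query, as we do when counting missing fields).
Expected response: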

{
  "took":0,
  "timed_out":false,
  "_shards":{
     "total":1,
     "successful":1,
     "skipped":0,
     "failed":0
  },
  "hits":{
     "total":{
        "value":4,
        "relation":"eq"
     },
     "max_score":2.0,
     "hits":[
        {
           "_index":"user",
           "_id":"39788",
           "_score":2.0,
           "_source":{
              "USER_ID":39788,
              "PHONE_NUMBER":9876543210      <- default or predefined value (set in the query, as we do in a count for a missing field)
           }
        },
        {
           "_index":"user",
           "_id":"30784",
           "_score":2.0,
           "_source":{
              "USER_ID":30784,
              "PHONE_NUMBER":1234567890
           }
        },
        {
           "_index":"user",
           "_id":"36373",
           "_score":2.0,
           "_source":{
              "USER_ID":36373,
              "PHONE_NUMBER":1234567893
           }
        },
        {
           "_index":"user",
           "_id":"36327",
           "_score":2.0,
           "_source":{
              "USER_PROJECT_USER_ID":36327,
              "PHONE_NUMBER":9876543210      <- default or predefined value (set in the query, as we do in a count for a missing field)
           }
        }
     ]
  }
}

Any help would be greatly appreciated.

ego6inou  #1

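TL;DR

You cannot get this exact result at query time; you would have to do it at ingest time.
However, there is a query-time solution that comes close enough: it uses runtime fields.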

Solution

1. At ingest time

You can set the mapping to:

{
  "PHONE_NUMBER": {
    "type": "long",
    "store": true,
    "null_value": 9876543210 <- the specific / default number (must be the same data type as the field)
  },
  "USER_ID": {
    "type": "long",
    "store": true
  },
  "LOCATION_ID": {
    "type": "long",
    "store": true
  }
}

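Documents whose PHONE_NUMBER is an explicit null will now be indexed with the default value. Note that null_value only affects how the value is indexed and searched: it does not modify _source, and it does not apply to documents in which the field is absent altogether. Another drawback is that it is not dynamic: you cannot change this value at query time.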
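If the default should also appear in _source, another ingest-time option is to fill it in on the client before the documents are sent to the _bulk endpoint. The following is only a minimal sketch in Python under that assumption; the helper names `with_default_phone` and `bulk_body` are hypothetical, and the default value is the placeholder number from the question.

```python
import json

# Placeholder default taken from the question; adjust as needed.
DEFAULT_PHONE_NUMBER = 9876543210


def with_default_phone(doc, default=DEFAULT_PHONE_NUMBER):
    """Return a copy of doc with PHONE_NUMBER set when it is absent or null."""
    filled = dict(doc)
    if filled.get("PHONE_NUMBER") is None:
        filled["PHONE_NUMBER"] = default
    return filled


def bulk_body(index, docs):
    """Build the newline-delimited body of a _bulk request (hypothetical helper)."""
    lines = []
    for doc in docs:
        # Action line, followed by the document line with the default applied.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(with_default_phone(doc)))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


docs = [{"USER_ID": 123456, "PHONE_NUMBER": 12345}, {"USER_ID": 345678}]
body = bulk_body("75278567", docs)
print(body)
```

The resulting string can be sent as the body of POST /_bulk. Since the value is written into the document itself, it is returned in _source, but like null_value it is fixed at index time and cannot be changed per query.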

2. At query time

Setup:

POST /_bulk
{"index":{"_index":"75278567"}}
{"USER_ID":123456,"PHONE_NUMBER":12345}
{"index":{"_index":"75278567"}}
{"USER_ID":234567,"PHONE_NUMBER":234567}
{"index":{"_index":"75278567"}}
{"USER_ID":345678}
{"index":{"_index":"75278567"}}
{"USER_ID":456789}

Using runtime fields, you can create the following query:

GET /75278567/_search
{
  "runtime_mappings": {
    "NUMBER": {
      "type": "keyword",
      "script": {
        "source": """
        if (doc["PHONE_NUMBER"].size() == 0){
          emit("000000")
        } else
        {
          emit(doc["PHONE_NUMBER"].value.toString())
        }
        """
      }
    }
  },
  "fields": [
    "NUMBER"
  ]
}

This gives the following result:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 4,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "75278567",
        "_id": "3njWAYYBArbKoMpIcXFp",
        "_score": 1,
        "_source": {
          "USER_ID": 123456,
          "PHONE_NUMBER": 12345
        },
        "fields": {
          "NUMBER": [
            "12345"
          ]
        }
      },
      {
        "_index": "75278567",
        "_id": "33jWAYYBArbKoMpIcXFp",
        "_score": 1,
        "_source": {
          "USER_ID": 234567,
          "PHONE_NUMBER": 234567
        },
        "fields": {
          "NUMBER": [
            "234567"
          ]
        }
      },
      {
        "_index": "75278567",
        "_id": "4HjWAYYBArbKoMpIcXFp",
        "_score": 1,
        "_source": {
          "USER_ID": 345678
        },
        "fields": {
          "NUMBER": [
            "000000"
          ]
        }
      },
      {
        "_index": "75278567",
        "_id": "4XjWAYYBArbKoMpIcXFp",
        "_score": 1,
        "_source": {
          "USER_ID": 456789
        },
        "fields": {
          "NUMBER": [
            "000000"
          ]
        }
      }
    ]
  }
}

The default value is not in _source, but you can read it from fields.
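If the client needs the shape from the question, with PHONE_NUMBER next to USER_ID, the value from fields can be merged back on the client side. This is a minimal sketch in Python, assuming the search response has already been parsed into a dict; the function name `source_with_default` is hypothetical, and the sample response is a trimmed copy of the result above.

```python
def source_with_default(hit, runtime_field="NUMBER", target="PHONE_NUMBER"):
    """Return a copy of the hit's _source with target filled from fields."""
    source = dict(hit.get("_source", {}))
    # Values under "fields" are always returned as arrays.
    values = hit.get("fields", {}).get(runtime_field, [])
    if target not in source and values:
        source[target] = values[0]
    return source


# Trimmed copy of the search response shown above.
response = {
    "hits": {
        "hits": [
            {
                "_source": {"USER_ID": 123456, "PHONE_NUMBER": 12345},
                "fields": {"NUMBER": ["12345"]},
            },
            {
                "_source": {"USER_ID": 345678},
                "fields": {"NUMBER": ["000000"]},
            },
        ]
    }
}

merged = [source_with_default(hit) for hit in response["hits"]["hits"]]
print(merged)
```

Because the NUMBER runtime field is declared as keyword, the filled-in default is a string, while values that come from _source keep their original numeric type.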
