How to index only specific fields of a document with Elasticsearch

irtuqstp  posted on 2021-06-15  in  ElasticSearch

My requirement is to store only specific fields of the documents I index in Elasticsearch. For example, my document is:

{
  "name":"stev",
  "age":26,
  "salary":25000
}

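That is my document, but I don't want to index the whole thing. I only want to store the name field. I created an index named emp and wrote the following mapping: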

"person" : {
    "_all" : {"enabled" : false},
    "properties" : {
        "name" : {
            "type" : "string", "store" : "yes"
        }
    }
}

When I look at the indexed documents, I see:

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 2,
        "successful": 2,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
            {
                "_index": "test",
                "_type": "test",
                "_id": "AU1_p0xAq8r9iH00jFB_",
                "_score": 1,
                "_source": { }
            },
            {
                "_index": "test",
                "_type": "test",
                "_id": "AU1_lMDCq8r9iH00jFB-",
                "_score": 1,
                "_source": { }
            }
        ]
    }
}

The name field is not in the results. Why? Can anyone help me?

gjmwrych #1

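It's hard to tell from your post what you're doing wrong, but I can give you an example that works.
By default, Elasticsearch indexes whatever source documents you give it. Every time it sees a new document field, it creates a mapping entry for that field with sensible defaults, and it indexes the field by default as well. If you want to exclude fields, you can set "index": "no" and "store": "no" in the mapping for each field you want to exclude. If you want that behavior to be the default for every field, you can use the "_default_" property to specify that fields not be stored (though I wasn't able to make it work for not indexing).
You will probably also want to disable "_source" and use the "fields" parameter in your search queries.
Here is an example. The index definition looks like this: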

PUT /test_index
{
   "mappings": {
      "person": {
         "_all": {
            "enabled": false
         },
         "_source": {
            "enabled": false
         },
         "properties": {
            "name": {
               "type": "string",
               "index": "analyzed", 
               "store": "yes"
            },
            "age": {
                "type": "integer",
                "index": "no",
                "store": "no"
            },
            "salary": {
                "type": "integer",
                "index": "no",
                "store": "no"
            }
         }
      }
   }
}
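
If there are many fields, writing "index": "no" and "store": "no" by hand for each one gets tedious. The following Python snippet is only a sketch of how the same request body could be generated; it is not part of the original example. The helper name `build_mapping` and its parameters are made up for illustration, and the literal option values ("string", "no", "yes") are assumed to match the older mapping syntax used in this thread, so adjust them for your Elasticsearch version.

```python
import json


def build_mapping(doc_type, field_types, keep):
    """Build an index definition that stores only the fields in `keep`.

    field_types maps field name -> Elasticsearch type, e.g. {"name": "string"}.
    Hypothetical helper; it mirrors the legacy mapping syntax shown above.
    """
    properties = {}
    for field, es_type in field_types.items():
        if field in keep:
            # Kept fields are stored and rely on the default "index" setting
            # (for strings in this syntax that default is "analyzed").
            properties[field] = {"type": es_type, "store": "yes"}
        else:
            # Excluded fields are neither searchable nor retrievable.
            properties[field] = {"type": es_type, "index": "no", "store": "no"}
    return {
        "mappings": {
            doc_type: {
                "_all": {"enabled": False},
                "_source": {"enabled": False},
                "properties": properties,
            }
        }
    }


mapping = build_mapping(
    "person",
    {"name": "string", "age": "integer", "salary": "integer"},
    keep={"name"},
)
# Print the JSON body that would be sent with PUT /test_index
print(json.dumps(mapping, indent=2))
```

The printed JSON can be pasted as the body of the PUT request. Note that this still requires knowing every field name up front, which is the same limitation as writing the mapping by hand.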

Then I can add some documents using the bulk API:

POST /test_index/person/_bulk
{"index":{"_id":1}}
{"name":"stev","age":26,"salary":25000}
{"index":{"_id":2}}
{"name":"bob","age":30,"salary":28000}
{"index":{"_id":3}}
{"name":"joe","age":27,"salary":35000}
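
The bulk body is newline-delimited: one action line followed by one document line per document, and the body has to end with a newline. As a rough sketch (the helper name `build_bulk_body` is hypothetical and not from the original example), the body above could be assembled in Python like this:

```python
import json


def build_bulk_body(docs):
    """Return the newline-delimited body for a bulk request.

    `docs` is a list of (doc_id, source_dict) pairs. Hypothetical helper.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))  # action line
        lines.append(json.dumps(source))                      # document line
    # The bulk API expects the final line to end with a newline as well.
    return "\n".join(lines) + "\n"


docs = [
    (1, {"name": "stev", "age": 26, "salary": 25000}),
    (2, {"name": "bob", "age": 30, "salary": 28000}),
    (3, {"name": "joe", "age": 27, "salary": 35000}),
]
body = build_bulk_body(docs)
# This string would be the body of POST /test_index/person/_bulk
print(body, end="")
```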

Because I disabled "_source", a simple query returns only the IDs:

POST /test_index/_search
...
{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 1,
      "hits": [
         {
            "_index": "test_index",
            "_type": "person",
            "_id": "1",
            "_score": 1
         },
         {
            "_index": "test_index",
            "_type": "person",
            "_id": "2",
            "_score": 1
         },
         {
            "_index": "test_index",
            "_type": "person",
            "_id": "3",
            "_score": 1
         }
      ]
   }
}

But if I specify that I want the "name" field, I get it back:

POST /test_index/_search
{
   "fields": [
      "name"
   ]
}
...
{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 1,
      "hits": [
         {
            "_index": "test_index",
            "_type": "person",
            "_id": "1",
            "_score": 1,
            "fields": {
               "name": [
                  "stev"
               ]
            }
         },
         {
            "_index": "test_index",
            "_type": "person",
            "_id": "2",
            "_score": 1,
            "fields": {
               "name": [
                  "bob"
               ]
            }
         },
         {
            "_index": "test_index",
            "_type": "person",
            "_id": "3",
            "_score": 1,
            "fields": {
               "name": [
                  "joe"
               ]
            }
         }
      ]
   }
}

You can verify that the other fields were not stored by running:

POST /test_index/_search
{
   "fields": [
      "name", "age", "salary"
   ]
}

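It returns the same result as the previous query, with only "name" in each hit. I can also show that the "age" field was not indexed by running this query, which would return a document if "age" had been indexed: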

POST /test_index/_search
{
   "fields": [
      "name", "age"
   ],
   "query": {
      "term": {
         "age": {
            "value": 27
         }
      }
   }
}
...
{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 0,
      "max_score": null,
      "hits": []
   }
}
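
To make the difference between "indexed" (searchable) and "stored" (retrievable) concrete, here is a toy in-memory model in Python. It is only an illustration of the behavior seen in the queries above; the class `ToyIndex` is invented for this sketch and says nothing about how Elasticsearch is actually implemented.

```python
class ToyIndex:
    """Toy model of indexed vs. stored fields. Not real Elasticsearch."""

    def __init__(self, indexed, stored):
        self.indexed = set(indexed)  # fields that can be searched
        self.stored = set(stored)    # fields that can be returned
        self.postings = {}           # (field, value) -> set of doc ids
        self.stored_values = {}      # doc id -> {field: value}

    def add(self, doc_id, doc):
        for field, value in doc.items():
            if field in self.indexed:
                self.postings.setdefault((field, value), set()).add(doc_id)
        # Only stored fields are kept; everything else is dropped.
        self.stored_values[doc_id] = {
            f: v for f, v in doc.items() if f in self.stored
        }

    def term(self, field, value, fields):
        """Exact-match lookup that returns only the requested stored fields."""
        ids = sorted(self.postings.get((field, value), set()))
        return [
            {
                "_id": i,
                "fields": {
                    f: [self.stored_values[i][f]]
                    for f in fields
                    if f in self.stored_values[i]
                },
            }
            for i in ids
        ]


idx = ToyIndex(indexed={"name"}, stored={"name"})
idx.add(1, {"name": "stev", "age": 26, "salary": 25000})
idx.add(2, {"name": "bob", "age": 30, "salary": 28000})
idx.add(3, {"name": "joe", "age": 27, "salary": 35000})

age_hits = idx.term("age", 27, ["name", "age"])      # "age" is not indexed
name_hits = idx.term("name", "joe", ["name", "age"])  # "age" is not stored
print(age_hits)   # []
print(name_hits)  # [{'_id': 3, 'fields': {'name': ['joe']}}]
```

Searching on "age" finds nothing because no lookup entries were ever built for it, and asking for "age" in the returned fields yields nothing because its value was never kept.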

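Here is the code I used to play around with this. I wanted to use a "_default_" mapping and/or field to take care of this without having to specify the settings for every field. I was able to get it to not store the data, but each field was still indexed.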
http://sense.qbox.io/gist/d84967923d6c0757dba5f44240f47257ba2fbe50
