How to prevent certain fields from being indexed in Elasticsearch

Posted by htzpubme on 2021-06-10 in ElasticSearch


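I need to prevent certain fields from being indexed in Elasticsearch when their values are like "null" (the string null) or "" (an empty string). In other words, I should be able to retrieve the rest of the fields in the document, but not the fields that have such values in the source. I am using a normalizer as follows: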

{
  "analysis": {
    "normalizer": {
      "my_normalizer": {
        "filter": [
          "uppercase"
        ],
        "type": "custom"
      }
    }
  }
}
Is any additional setting needed in the above, or in the field mapping?
P.S. I am using Elasticsearch 7.6.1.

elasticsearch elasticsearch-mapping

Source: https://stackoverflow.com/questions/64716730/how-to-prevent-certain-fields-from-get-indexed-in-elasticsearch

1 Answer


xtfmy6hx1#

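You can take a look at Elasticsearch ingest pipelines. They are applied before indexing (in your case, before analysis) happens.
Specifically, you can add an ingest pipeline that removes the fields in question when they meet the conditions you listed. For example: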

PUT _ingest/pipeline/remove_invalid_value
{
  "description": "my pipeline that removes empty string and null strings",
  "processors": [
    {
      "remove": {
        "field": "field1",
        "ignore_missing": true,
        "if": "ctx.field1 == \"null\" || ctx.field1 == \"\""
      }
    },
    {
      "remove": {
        "field": "field2",
        "ignore_missing": true,
        "if": "ctx.field2 == \"null\" || ctx.field2 == \"\""
      }
    },
    {
      "remove": {
        "field": "field3",
        "ignore_missing": true,
        "if": "ctx.field3 == \"null\" || ctx.field3 == \"\""
      }
    }
  ]
}

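You can then specify the pipeline in the index request, or set it as the default_pipeline or final_pipeline in the index settings. This setting can also be specified in an index template.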
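As a rough sketch of the two options just mentioned, the requests below first pass the pipeline on a single index request and then set it as the index default. The index name my-index, the document ID and the sample document are assumptions made up for illustration; only the pipeline name comes from the example above.

```console
# Option 1: name the pipeline on an individual index request
# (my-index and the document body are hypothetical)
PUT my-index/_doc/1?pipeline=remove_invalid_value
{
  "field1": "null",
  "field2": "",
  "field3": "some value"
}

# Option 2: make it the default pipeline for every index request on my-index
PUT my-index/_settings
{
  "index.default_pipeline": "remove_invalid_value"
}
```

With option 2 the pipeline runs whenever a request does not name a pipeline itself. If it must run regardless of what the request specifies, index.final_pipeline is probably the setting to look at instead.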

(Script) loop approach

If you do not want to write out a long list of remove processors, you can try a script processor, as follows:

PUT _ingest/pipeline/remove_invalid_fields
{
  "description": "remove fields",
  "processors": [
    {
      "script": {
        "source": """
          for (x in params.to_delete_on_condition) {
                if (ctx[x] == "null" || ctx[x] == "") {
                    ctx.remove(x);
                }
          }
          """,
        "params": {
          "to_delete_on_condition": [
            "field1",
            "field2",
            "field3"
          ]
        }
      }
    }
  ]
}

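It iterates over the list and removes each field whose value matches the condition.
Accessing nested fields in a script is not as trivial as many answers report, but it should be doable. The idea is that nested.field should be accessed as ctx['nested']['field'].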
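One way to check the script pipeline before attaching it to an index is the simulate API. This is only a sketch: the sample document is invented for illustration, and the pipeline name is the one defined above.

```console
# Run a made-up document through the pipeline without indexing it
POST _ingest/pipeline/remove_invalid_fields/_simulate
{
  "docs": [
    {
      "_source": {
        "field1": "null",
        "field2": "",
        "field3": "keep me"
      }
    }
  ]
}
```

If the script behaves as described, the simulated document in the response should contain field3 but neither field1 nor field2.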
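A minimal sketch of that idea for a single nested field is shown below. The pipeline name remove_invalid_nested_field and the field names nested and field are placeholders, not names from the question, and the script has not been run against 7.6.1. The instanceof check is there because ctx['nested'] is null when the parent object is absent, and indexing into null would fail.

```console
PUT _ingest/pipeline/remove_invalid_nested_field
{
  "description": "remove nested.field when it is the string null or empty (hypothetical field names)",
  "processors": [
    {
      "script": {
        "source": """
          // Look up the parent object first; it may be missing
          def parent = ctx['nested'];
          if (parent instanceof Map) {
            def value = parent['field'];
            if (value == "null" || value == "") {
              // Remove the child key from the parent map, not from ctx
              parent.remove('field');
            }
          }
        """
      }
    }
  ]
}
```

To cover several nested paths, the same lookup could be driven from a params list as in the loop above, splitting each path on the dot and walking down one level at a time.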

Answered 2021-06-11
