How can I enrich multiple values in Elasticsearch using an ingest pipeline?

Posted by vuktfyat on 2023-10-17 in ElasticSearch

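I have an Elasticsearch document that contains a field with multiple values, and I want to use an ingest pipeline to enrich each of those values. Here is an example of my document structure:

{ "http.rule.id": ["b41912851a064912b2a589f3a21d0c57", "82045c5fd30045d893272fd8b74e93d6"] }

I also have an enrich index: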

{
  "_index": "enrich-content",
  "_id": "b41912851a064912b2a589f3a21d0c57",
  "_score": 1,
  "_source": {
    "description": "description1",
    "name": "name1",
    "location": "location1",
    "id": "b41912851a064912b2a589f3a21d0c57"
  }
},
{
  "_index": "enrich-content",
  "_id": "82045c5fd30045d893272fd8b74e93d6",
  "_score": 1,
  "_source": {
    "description": "description2",
    "name": "name2",
    "location": "location2",
    "id": "82045c5fd30045d893272fd8b74e93d6"
  }
},
{
  "_index": "enrich-content",
  "_id": "eda384884ff545ae957bfccf47aaba1f",
  "_score": 1,
  "_source": {
    "description": "description3",
    "name": "name3",
    "location": "location3",
    "id": "eda384884ff545ae957bfccf47aaba1f"
  }
}

I have tried two ingest pipelines for the enrichment.
Enrich pipeline 1:

{
  "enrich": {
    "field": "http.rule.id",
    "policy_name": "policy_enrich",
    "target_field": "http.description",
    "ignore_missing": true,
    "ignore_failure": true
  }
}

Enrich pipeline 2:

{
  "foreach": {
    "field": "http.rule.id",
    "processor": {
      "enrich": {
        "field": "_ingest._value",
        "policy_name": "cf-firewall-content",
        "target_field": "http.description",
        "ignore_missing": true,
        "ignore_failure": true
      }
    },
    "ignore_failure": true
  }
}

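However, both pipelines enrich only by the last ID in the array (82045c5fd30045d893272fd8b74e93d6).
What I want is to enrich by all of the IDs and get a result like this: http.description: ["description1", "description2"]
Can someone help me modify my Elasticsearch ingest pipeline configuration to achieve this?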

woobm2wo #1

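You can use the enrich processor's max_matches setting to handle all elements of the array. It caps how many matching enrich documents are added to the target field (allowed values are 1 to 128) and defaults to 1, which is why only one element is matched: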

{
  "enrich": {
    "field": "http.rule.id",
    "policy_name": "source-policy",
    "target_field": "http.description",
    "max_matches": 2,
    "ignore_missing": true,
    "ignore_failure": true
  }
}
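
One way to check this before indexing real data is the simulate pipeline API. The following is a minimal sketch of a request body for POST /_ingest/pipeline/_simulate. It assumes that the source-policy enrich policy has already been created and executed, and that the document stores http.rule.id as nested objects, as in the result shown in this answer. ignore_failure is left out on purpose so that any error (for example a missing policy) appears in the response instead of being silently ignored.

```json
{
  "pipeline": {
    "description": "Sketch only: assumes the source-policy enrich policy exists and has been executed",
    "processors": [
      {
        "enrich": {
          "field": "http.rule.id",
          "policy_name": "source-policy",
          "target_field": "http.description",
          "max_matches": 2,
          "ignore_missing": true
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "http": {
          "rule": {
            "id": [
              "b41912851a064912b2a589f3a21d0c57",
              "82045c5fd30045d893272fd8b74e93d6"
            ]
          }
        }
      }
    }
  ]
}
```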

The http.description target field then contains an array with one element per matched id, each holding the corresponding description:

{
          "http" : {
            "rule" : {
              "id" : [
                "b41912851a064912b2a589f3a21d0c57",
                "82045c5fd30045d893272fd8b74e93d6"
              ]
            },
            "description" : [
              {
                "description" : "description1",
                "id" : "b41912851a064912b2a589f3a21d0c57"
              },
              {
                "description" : "description2",
                "id" : "82045c5fd30045d893272fd8b74e93d6"
              }
            ]
          }
        }
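
Each element carries only description and id, although the enrich index also stores name and location. That would be consistent with a match policy along the following lines. The policy itself is not shown in the question, so this body (for PUT /_enrich/policy/source-policy) is an assumption. The matched field is added to the target together with the enrich_fields, and the policy has to be executed (POST /_enrich/policy/source-policy/_execute) before the processor can use it.

```json
{
  "match": {
    "indices": "enrich-content",
    "match_field": "id",
    "enrich_fields": ["description"]
  }
}
```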

If needed, you can add a script processor to reshape the description array.
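
For instance, to get the flat list the question asks for (http.description: ["description1", "description2"]), a script processor placed after the enrich processor could replace each object with its description string. This is a sketch rather than a tested configuration. It assumes http.description has the array-of-objects shape shown above, and the `if` condition skips documents where the field is not a list.

```json
{
  "script": {
    "description": "Sketch: flatten http.description from objects to plain description strings",
    "lang": "painless",
    "if": "ctx.http?.description instanceof List",
    "source": "def out = []; for (def d : ctx.http.description) { out.add(d.description); } ctx.http.description = out;"
  }
}
```

Keeping the objects instead of flattening them preserves the link between each description and its id, so flattening is only worthwhile when that pairing is not needed downstream.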
