elasticsearch Can we add a custom aggregated value in Elasticsearch?

sg2wtvxw  posted on 2023-06-21  in ElasticSearch

I have three tables: Customer, Product, and Transactions. I use Elasticsearch to build an index that joins the three tables. A document in the index looks like this:

{
    "customer_id": 101,
    "name": "John Doe",
    "transactions":
    [
        {
            "product_id": 11,
            "product_name": "T-Shirt",
            "transaction_id": "TX101",
            "price": "500"
        },
        {
            "product_id": 11,
            "product_name": "T-Shirt",
            "transaction_id": "TX101",
            "price": "600"
        },
        {
            "product_id": 12,
            "product_name": "Shirt",
            "transaction_id": "TX102",
            "price": "1000"
        }
    ]
}

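I want a total_transaction field that contains the sum of the prices of all transactions.
Is it possible to add such a field dynamically, either when building the index or when retrieving the document?
I tried aggregations, but they operate across the whole dataset. Aggregation functions do not work within a single document.
I use a Postgres database. If there is another way to achieve this, that would also be welcome.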

2guxujil  #1

You can do this at index time using an ingest pipeline with a script processor:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "script": {
          "description": "Sum transaction prices",
          "lang": "painless",
          "source": """
                ctx.total_transaction = ctx.transactions.stream().map(x -> Integer.parseInt(x.price)).reduce(0, (a, b) -> a + b);
              """
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "customer_id": 101,
        "name": "John Doe",
        "transactions": [
          {
            "product_id": 11,
            "product_name": "T-Shirt",
            "transaction_id": "TX101",
            "price": "500"
          },
          {
            "product_id": 11,
            "product_name": "T-Shirt",
            "transaction_id": "TX101",
            "price": "600"
          },
          {
            "product_id": 12,
            "product_name": "Shirt",
            "transaction_id": "TX102",
            "price": "1000"
          }
        ]
      }
    }
  ]
}

Response:

{
  "docs" : [
    {
      "doc" : {
        "_source" : {
          "total_transaction" : 2100,
          "name" : "John Doe",
          "customer_id" : 101,
          "transactions" : [
            {
              "transaction_id" : "TX101",
              "product_name" : "T-Shirt",
              "price" : "500",
              "product_id" : 11
            },
            {
              "transaction_id" : "TX101",
              "product_name" : "T-Shirt",
              "price" : "600",
              "product_id" : 11
            },
            {
              "transaction_id" : "TX102",
              "product_name" : "Shirt",
              "price" : "1000",
              "product_id" : 12
            }
          ]
        }
      }
    }
  ]
}
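
The request above only simulates the pipeline. To apply it when documents are actually indexed, the pipeline would need to be stored and then referenced by the indexing request. The following is a minimal sketch of those two requests, built as plain Python data so it runs without a cluster. The pipeline id `sum-transactions`, the document id `1`, and the helper names are assumptions made for illustration; the index name `test` is borrowed from the search example in this answer.

```python
import json

# Painless source copied from the simulated pipeline above.
SCRIPT_SOURCE = (
    "ctx.total_transaction = ctx.transactions.stream()"
    ".map(x -> Integer.parseInt(x.price)).reduce(0, (a, b) -> a + b);"
)


def build_pipeline_request(pipeline_id):
    """Return (method, path, body) for storing the pipeline under pipeline_id."""
    body = {
        "description": "Sum transaction prices",
        "processors": [
            {"script": {"lang": "painless", "source": SCRIPT_SOURCE}}
        ],
    }
    return "PUT", f"_ingest/pipeline/{pipeline_id}", body


def build_index_request(index, doc_id, pipeline_id, source):
    """Return (method, path, body) for indexing one document through the pipeline."""
    return "PUT", f"{index}/_doc/{doc_id}?pipeline={pipeline_id}", source


if __name__ == "__main__":
    # Hypothetical document, shortened from the example in the question.
    customer = {
        "customer_id": 101,
        "name": "John Doe",
        "transactions": [
            {"product_id": 11, "transaction_id": "TX101", "price": "500"},
            {"product_id": 12, "transaction_id": "TX102", "price": "1000"},
        ],
    }
    requests_to_send = [
        build_pipeline_request("sum-transactions"),
        build_index_request("test", 1, "sum-transactions", customer),
    ]
    # Print the requests in a form that can be pasted into the Dev Tools console.
    for method, path, body in requests_to_send:
        print(method, path)
        print(json.dumps(body, indent=2))
```

If every document written to the index should go through the pipeline, setting `index.default_pipeline` on the index is an alternative to passing `?pipeline=` on each request.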

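If transactions is not mapped as nested, you can also do this at search time with a script field. This is slower, because the script has to run for every matching document. Note that the example assumes price is indexed as a number, as the response below shows: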

GET test/_search
{
  "_source": true,
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "total_transaction": {
      "script": {
        "lang": "painless",
        "source": "doc['transactions.price'].stream().reduce(0, (a, b) -> a + b)"
      }
    }
  }
}

Response:

{
  "hits" : {
    "hits" : [
      {
        "_source" : {
          "name" : "John Doe",
          "customer_id" : 101,
          "transactions" : [
            {
              "transaction_id" : "TX101",
              "product_name" : "T-Shirt",
              "price" : 500,
              "product_id" : 11
            },
            {
              "transaction_id" : "TX101",
              "product_name" : "T-Shirt",
              "price" : 600,
              "product_id" : 11
            },
            {
              "transaction_id" : "TX102",
              "product_name" : "Shirt",
              "price" : 1000,
              "product_id" : 12
            }
          ]
        },
        "fields" : {
          "total_transaction" : [
            2100
          ]
        }
      }
    ]
  }
}
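
The "not nested" condition matters because, with a regular object mapping, Elasticsearch flattens an array of objects so that `transactions.price` becomes a single multi-valued field on the document, which is what `doc['transactions.price']` exposes to the script. With a nested mapping each transaction is stored as its own hidden document, so a script running on the parent does not see those values. The sketch below imitates that flattening in plain Python to show which values the script sums. It is only an illustration of the idea, not how Elasticsearch implements it, and `flatten_field` is a hypothetical helper.

```python
def flatten_field(source, path):
    """Collect every value found at a dotted path, descending into lists of objects."""
    values = [source]
    for key in path.split("."):
        next_values = []
        for value in values:
            # Treat a single object and a list of objects the same way.
            items = value if isinstance(value, list) else [value]
            for item in items:
                if isinstance(item, dict) and key in item:
                    next_values.append(item[key])
        values = next_values
    # A leaf that is itself a list contributes each of its elements.
    flat = []
    for value in values:
        flat.extend(value if isinstance(value, list) else [value])
    return flat


# Document from the search response above, reduced to the relevant fields.
document = {
    "customer_id": 101,
    "transactions": [
        {"transaction_id": "TX101", "price": 500},
        {"transaction_id": "TX101", "price": 600},
        {"transaction_id": "TX102", "price": 1000},
    ],
}

prices = flatten_field(document, "transactions.price")
total_transaction = sum(prices)
print(prices, total_transaction)
```

Doc values are not guaranteed to come back in source order, which does not matter for a sum but would matter for any order-dependent calculation.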
