ElasticSearch: Is it possible to filter an aggregation based on the results of its sub-aggregations?

yzuktlbb, posted on 2023-10-17 in ElasticSearch

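I'd like to know whether it is possible to filter an aggregation based on the results of two different sub-aggregations. The use case is getting the list of accounts with a non-zero balance when searching a transactions index, where each transaction's _source looks something like the following: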

"_source": {
  "debit": 50,
  "account_id": "account-id-1"
}

"_source": {
  "credit": 50,
  "account_id": "account-id-1"
}

"_source": {
  "debit": 25,
  "account_id": "account-id-2"
}

"_source": {
  "credit": 50,
  "account_id": "account-id-2"
}

The aggregation query I'm currently using looks like this:

{
  "size": 0,
  "aggs": {
    "terms_account_id": {
      "terms": {
        "field": "account_id"
      },
      "aggs": {
        "sum_debit": {
          "sum": {
            "field": "debit"
          }
        },
        "sum_credit": {
          "sum": {
            "field": "credit"
          }
        }
      }
    }
  }
}

The result looks like this:

"aggregations": {
    "terms_account_id": {      
      "buckets": [
        {
          "key": "account-id-1",
          "sum_credit": { "value": 50 },
          "sum_debit": { "value": 50 }
        },
        {
          "key": "account-id-2",
          "sum_credit": { "value": 50 },
          "sum_debit": { "value": 25 }
        }
      ]
    }
  }

I then loop over terms_account_id.buckets and compare each bucket's sum_debit and sum_credit values to find the accounts with a non-zero balance.
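For reference, the client-side loop described above might look something like the following minimal sketch. It assumes the response shape shown in the result above; the function name `accountsWithNonZeroBalance` is hypothetical, and treating the balance as credit minus debit is an assumption about the sign convention.

```javascript
// Hypothetical helper: keeps only the buckets whose debit and credit sums differ.
// Assumes `response` has the same shape as the aggregation result shown above.
function accountsWithNonZeroBalance(response) {
  const buckets = response.aggregations.terms_account_id.buckets;
  return buckets
    .filter((b) => b.sum_credit.value !== b.sum_debit.value)
    .map((b) => ({
      account_id: b.key,
      // Credit minus debit; the sign convention is an assumption.
      balance: b.sum_credit.value - b.sum_debit.value,
    }));
}

// Sample data copied from the result shown above.
const sampleResponse = {
  aggregations: {
    terms_account_id: {
      buckets: [
        { key: 'account-id-1', sum_credit: { value: 50 }, sum_debit: { value: 50 } },
        { key: 'account-id-2', sum_credit: { value: 50 }, sum_debit: { value: 25 } },
      ],
    },
  },
};

console.log(accountsWithNonZeroBalance(sampleResponse));
// → [ { account_id: 'account-id-2', balance: 25 } ]
```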
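What I'd like to know is whether some kind of aggregation filter can be applied to the terms_account_id aggregation, one that looks at the sum_credit and sum_debit sub-aggregations and omits a bucket from the results when the two are equal, so that the final result looks like this: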

"aggregations": {
    "terms_account_id": {      
      "buckets": [
        {
          "key": "account-id-2",
          "sum_credit": { "value": 50 },
          "sum_debit": { "value": 25 }
        }
      ]
    }
  }

or even:

"aggregations": {
    "terms_account_id": {      
      "buckets": [
        {
          "key": "account-id-2",
          "sum_total": { "value": 25 }
        }
      ]
    }
  }

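I have been reading the Elasticsearch documentation and have seen that aggregations can be filtered based on _source fields, but I can't find anything about filtering based on sub-aggregations. Since the code I have works as it is, this may simply be the intended way to use Elasticsearch, but I'd like to know whether more of the work can be done in the search itself so that unneeded values are not returned.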


ddarikpa #1

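I ended up using a bucket selector aggregation with a small script that compares the other two aggregations. The script keeps a bucket only when the credit and debit sums differ, treating a null sum as 0.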

{
  "size": 0,
  "aggs": {
    "agg_terms_account_id": {
      "terms": {
        "field": "account_id"
      },
      "aggs": {
        "agg_sum_debit": {
          "sum": {
            "field": "debit"
          }
        },
        "agg_sum_credit": {
          "sum": {
            "field": "credit"
          }
        },
        "agg_bucket_selector_null": {
          "bucket_selector": {
            "buckets_path": {
              "aggSumCredit": "agg_sum_credit",
              "aggSumDebit": "agg_sum_debit"
            },
            "script": "(params.aggSumCredit == null ? 0 : params.aggSumCredit) != (params.aggSumDebit == null ? 0 : params.aggSumDebit)"
          }
        }
      }
    }
  }
}
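For the sum_total variant mentioned in the question, one possible approach is to add a `bucket_script` pipeline aggregation next to the `bucket_selector`. The sketch below only builds the request body as a plain object and has not been run against a cluster here. The function name `buildNonZeroAccountsQuery` and the aggregation name `non_zero_only` are hypothetical, the credit-minus-debit sign is an assumption, and the null checks from the script above are omitted for brevity.

```javascript
// Hypothetical helper that builds the search request body as a plain object.
// The aggregation names follow the question; `non_zero_only` is made up here.
function buildNonZeroAccountsQuery(accountField = 'account_id') {
  // Both pipeline aggregations read the same two sibling sum aggregations.
  const bucketsPath = { credit: 'sum_credit', debit: 'sum_debit' };
  return {
    size: 0,
    aggs: {
      terms_account_id: {
        terms: { field: accountField },
        aggs: {
          sum_debit: { sum: { field: 'debit' } },
          sum_credit: { sum: { field: 'credit' } },
          // Keep a bucket only when the two sums differ.
          non_zero_only: {
            bucket_selector: {
              buckets_path: bucketsPath,
              script: 'params.credit != params.debit',
            },
          },
          // Add a per-bucket net value (assumed to be credit minus debit).
          sum_total: {
            bucket_script: {
              buckets_path: bucketsPath,
              script: 'params.credit - params.debit',
            },
          },
        },
      },
    },
  };
}

console.log(JSON.stringify(buildNonZeroAccountsQuery(), null, 2));
```

With a body like this, each remaining bucket should carry a sum_total value in addition to sum_credit and sum_debit, rather than replacing them.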

I also thought I should include the bodybuilder.js syntax, since that is what I ended up using:

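// Assumes the bodybuilder package is installed (npm install bodybuilder).
const bodybuilder = require('bodybuilder')

const body = bodybuilder()
    .aggregation('terms', 'account_id', (a) => {
      // bodybuilder names aggregations agg_<type>_<field> by default,
      // which is where agg_sum_debit and agg_sum_credit come from.
      a.aggregation('sum', 'debit')
      a.aggregation('sum', 'credit')
      // Keep only the buckets whose credit and debit sums differ (null counts as 0).
      return a.aggregation('bucket_selector', null, {
        buckets_path: {
          aggSumCredit: 'agg_sum_credit',
          aggSumDebit: 'agg_sum_debit'
        },
        script: '(params.aggSumCredit == null ? 0 : params.aggSumCredit) != (params.aggSumDebit == null ? 0 : params.aggSumDebit)'
      })
    })
    .size(0)
    .build() // turn the builder into the plain request body object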
