Is it possible to compute a "distinct sum" and a "distinct average" in Elasticsearch?

rm5edbpk  posted on 2023-03-01  in ElasticSearch

How can I compute a "distinct average" in Elasticsearch? I have some denormalized data that looks like this:

{ "record_id" : "100", "cost" : 42 }
{ "record_id" : "200", "cost" : 67 }
{ "record_id" : "200", "cost" : 67 }
{ "record_id" : "200", "cost" : 67 }
{ "record_id" : "400", "cost" : 11 }
{ "record_id" : "400", "cost" : 11 }
{ "record_id" : "500", "cost" : 10 }
{ "record_id" : "600", "cost" : 99 }

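Note that for a given "record_id", the "cost" is always the same.
So, given the data above:
1. How do I get the average of the "cost" field, counting each "record_id" only once? The result would be (42+67+11+10+99)/5=45.8
2. How do I get the SUM of the "cost" field, counting each "record_id" only once? The result would be 42+67+11+10+99=229
Can I use a "terms" aggregation followed by a combination of "first" and "average" sub-aggregations? I was thinking of something along these lines: elasticsearch calculate average of unique values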
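
For reference, the two expected results can be checked outside Elasticsearch. The following is only a minimal Python sketch of the intended "distinct" semantics, using the sample documents from the question; the helper name `distinct_sum_and_avg` is made up for illustration and is not part of any Elasticsearch API.

```python
# Sample documents copied from the question.
docs = [
    {"record_id": "100", "cost": 42},
    {"record_id": "200", "cost": 67},
    {"record_id": "200", "cost": 67},
    {"record_id": "200", "cost": 67},
    {"record_id": "400", "cost": 11},
    {"record_id": "400", "cost": 11},
    {"record_id": "500", "cost": 10},
    {"record_id": "600", "cost": 99},
]


def distinct_sum_and_avg(rows):
    """Count each record_id once, then sum and average the costs."""
    cost_by_id = {}
    for row in rows:
        # The cost is identical for every duplicate of a record_id,
        # so keeping the first one seen is enough.
        cost_by_id.setdefault(row["record_id"], row["cost"])
    total = sum(cost_by_id.values())
    average = total / len(cost_by_id) if cost_by_id else None
    return total, average


total, average = distinct_sum_and_avg(docs)
print(total, average)
```

This matches the hand calculation in the question: five distinct record IDs, a sum of 229 and an average of 45.8.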

polhcujo1#

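That is not going to work with terms aggs. Here is what you can do with a Painless script instead.
Indexing (your actual mapping may differ from the default one that gets generated, especially the .keyword part on record_id):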

POST _bulk
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"100","cost":42}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"200","cost":67}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"200","cost":67}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"200","cost":67}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"400","cost":11}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"400","cost":11}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"500","cost":10}
{"index":{"_index":"uniques","_type":"_doc"}}
{"record_id":"600","cost":99}

Then the aggregation:

GET uniques/_search
{
  "size": 0,
  "aggs": {
    "terms": {
      "scripted_metric": {
        "init_script": "state.id_map = [:]; state.sum = 0.0; state.elem_count = 0.0;",
        "map_script": """
          // Count each record_id only once (within the current shard).
          def id = doc['record_id.keyword'].value;
          if (!state.id_map.containsKey(id)) {
            state.id_map[id] = true;
            state.elem_count++;
            state.sum += doc['cost'].value;
          }
        """,
        "combine_script": """
            def sum = state.sum;
            def avg = sum / state.elem_count;
            
            def stats = [:];
            stats.sum = sum;
            stats.avg = avg;
            
            return stats
        """,
        "reduce_script": "return states"
      }
    }
  }
}
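
One caveat worth keeping in mind: a scripted metric runs its init, map and combine scripts once per shard, and the reduce script here returns the per-shard results without merging them. The Python sketch below mimics those four phases to show what that could mean in practice. It is a simulation only, not Elasticsearch code; the function names and the way documents are split into "shards" are assumptions made for illustration.

```python
docs = [
    {"record_id": "100", "cost": 42},
    {"record_id": "200", "cost": 67},
    {"record_id": "200", "cost": 67},
    {"record_id": "200", "cost": 67},
    {"record_id": "400", "cost": 11},
    {"record_id": "400", "cost": 11},
    {"record_id": "500", "cost": 10},
    {"record_id": "600", "cost": 99},
]


def init_state():
    # Mirrors init_script: an empty id map plus running totals.
    return {"id_map": {}, "sum": 0.0, "elem_count": 0.0}


def map_doc(state, doc):
    # Mirrors map_script: only the first document per record_id is counted.
    rid = doc["record_id"]
    if rid not in state["id_map"]:
        state["id_map"][rid] = True
        state["elem_count"] += 1
        state["sum"] += doc["cost"]


def combine(state):
    # Mirrors combine_script: one stats object per shard.
    # The zero check only keeps this Python sketch from raising on an empty shard.
    count = state["elem_count"]
    avg = state["sum"] / count if count else float("nan")
    return {"sum": state["sum"], "avg": avg}


def run(shards):
    # Mirrors reduce_script ("return states"): per-shard results are
    # collected into a list and returned as-is, without merging.
    states = []
    for shard_docs in shards:
        state = init_state()
        for doc in shard_docs:
            map_doc(state, doc)
        states.append(combine(state))
    return states


one_shard = run([docs])
# Hypothetical split where documents for record_id "200" end up on both shards.
two_shards = run([docs[:3], docs[3:]])
print(one_shard)
print(two_shards)
```

With a single shard the simulation produces one entry with the expected sum and average. With the hypothetical two-shard split it produces two entries, and record_id "200" is counted on both, so the per-shard numbers no longer add up to the distinct totals. If the real index has more than one primary shard, the reduce script would probably need to merge ID maps across shards rather than return `states` directly.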

which yields:

...
"aggregations" : {
    "terms" : {
      "value" : [
        {
          "avg" : 45.8,
          "sum" : 229.0
        }
      ]
    }
  }
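
Because the reduce script returns a list, client code has to unwrap it. Here is a small Python sketch of how the values might be pulled out of a response that has already been parsed into a dict; the `response` literal reproduces only the aggregations part shown above, and `extract_distinct_stats` is a hypothetical helper, not a client library function.

```python
# Only the "aggregations" part of the response shown above, as a parsed dict.
response = {
    "aggregations": {
        "terms": {
            "value": [
                {"avg": 45.8, "sum": 229.0}
            ]
        }
    }
}


def extract_distinct_stats(resp, agg_name="terms"):
    """Return (sum, avg) from the scripted metric result.

    Raises ValueError when the list does not hold exactly one entry,
    since multiple entries would mean unmerged per-shard results.
    """
    entries = resp["aggregations"][agg_name]["value"]
    if len(entries) != 1:
        raise ValueError(f"expected 1 result, got {len(entries)}")
    entry = entries[0]
    return entry["sum"], entry["avg"]


distinct_sum, distinct_avg = extract_distinct_stats(response)
print(distinct_sum, distinct_avg)
```

Failing loudly on more than one entry is a deliberate choice in this sketch: silently taking the first element could hide a multi-shard result.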
