Elasticsearch: get the list of most frequent items for a specific event and exclude those with other events

Posted by s6fujrry on 2023-01-25 in ElasticSearch

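I have an index that records a success/failure flag for specific items. I want to get the unique list of items that failed and never succeeded. Usually, looking at the events in time order, a success is expected after a failure.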
Sample input

| Date | User | Status |
| ------ | ------ | ------ |
| Jan 1 | user 1 | failure |
| Jan 2 | user 1 | success |
| Jan 1 | user 2 | failure |
| Jan 2 | user 2 | failure |

The output would be only user 2.

Answer #1, by 4si2a6ki

TL;DR

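I am not sure there is a simple way to do this purely in Elasticsearch, but you can fetch all the statuses for each user and then filter out the users whose statuses include success.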

Solution

To set up:

POST _bulk
{"index":{"_index":"75225454"}}
{"user":"user 1","status":"success"}
{"index":{"_index":"75225454"}}
{"user":"user 1","status":"success"}
{"index":{"_index":"75225454"}}
{"user":"user 2","status":"failure"}
{"index":{"_index":"75225454"}}
{"user":"user 2","status":"failure"}
{"index":{"_index":"75225454"}}
{"user":"user 3","status":"success"}
{"index":{"_index":"75225454"}}
{"user":"user 3","status":"failure"}
{"index":{"_index":"75225454"}}
{"user":"user 4","status":"failure"}

Here is the query:

GET /75225454/_search
{
  "size": 0,
  "aggs": {
    "users": {
      "terms": {
        "field": "user.keyword",
        "size": 10
      },
      "aggs": {
        "status": {
          "terms": {
            "field": "status.keyword",
            "size": 10
          }
        }
      }
    }
  }
}

You should get:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 7,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "users": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "user 1",
          "doc_count": 2,
          "status": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "success",
                "doc_count": 2
              }
            ]
          }
        },
        {
          "key": "user 2",
          "doc_count": 2,
          "status": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "failure",
                "doc_count": 2
              }
            ]
          }
        },
        {
          "key": "user 3",
          "doc_count": 2,
          "status": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "failure",
                "doc_count": 1
              },
              {
                "key": "success",
                "doc_count": 1
              }
            ]
          }
        },
        {
          "key": "user 4",
          "doc_count": 1,
          "status": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "failure",
                "doc_count": 1
              }
            ]
          }
        }
      ]
    }
  }
}

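Now you just need to filter out the users that have either

  1. status.buckets.length > 1
  2. status.buckets[0].key == 'success'
    and you should be good. The users that remain (user 2 and user 4 in this data) have only ever failed.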
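
The client-side filtering step could be sketched in Python roughly as follows. This is only an illustration, not part of the original answer: the function name `users_never_succeeded` and the `sample_response` dictionary are made up for the example, and the response is a trimmed copy of the aggregation output shown earlier. It assumes `success` and `failure` are the only status values, in which case checking that no status bucket has the key `success` is equivalent to the two conditions listed above.

```python
def users_never_succeeded(response):
    """Return the user keys that have at least one failure and no success.

    `response` is the parsed JSON body of the terms-aggregation search
    (hypothetical helper, named only for this example).
    """
    failed_only = []
    for user_bucket in response["aggregations"]["users"]["buckets"]:
        # Collect the distinct status keys seen for this user.
        statuses = {b["key"] for b in user_bucket["status"]["buckets"]}
        # Keep the user only if they failed and never succeeded.
        if "failure" in statuses and "success" not in statuses:
            failed_only.append(user_bucket["key"])
    return failed_only


# Trimmed copy of the aggregation response from the answer.
sample_response = {
    "aggregations": {
        "users": {
            "buckets": [
                {"key": "user 1", "doc_count": 2,
                 "status": {"buckets": [{"key": "success", "doc_count": 2}]}},
                {"key": "user 2", "doc_count": 2,
                 "status": {"buckets": [{"key": "failure", "doc_count": 2}]}},
                {"key": "user 3", "doc_count": 2,
                 "status": {"buckets": [{"key": "failure", "doc_count": 1},
                                        {"key": "success", "doc_count": 1}]}},
                {"key": "user 4", "doc_count": 1,
                 "status": {"buckets": [{"key": "failure", "doc_count": 1}]}},
            ]
        }
    }
}

print(users_never_succeeded(sample_response))  # ['user 2', 'user 4']
```

Note that the query uses `"size": 10` on both terms aggregations, so with more than 10 users the response would be truncated and this filter would only see the top buckets; the size would need to be raised or the results paginated in that case.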
