Elasticsearch: fetching the documents in an aggregation bucket with pagination

Posted by pprl5pva on 2023-08-03 in ElasticSearch

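I am working with the GeoTile grid aggregation in Elasticsearch. After grouping locations into buckets, I want to fetch the documents that belong to one bucket, page by page (using search_after). Has anyone done this, and how can I do it? Thank you.
Here is the GeoTile aggregation I use: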

GET /index-name/_doc/_search
{
  "aggs": {
     "result": {
        "geotile_grid": {
          "field": "location",
          "precision": 12
        }
     }
   }
}

The result looks like:

{
  "took" : 3,
  "hits" : {
    "total" : {
      "value" : 39,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ... ]
  },
  "aggregations" : {
    "result" : {
      "buckets" : [
        {
          "key" : "12/3519/1597",
          "doc_count" : 36
        },
        {
          "key" : "12/3520/1597",
          "doc_count" : 3
        }
      ]
    }
  }
}


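For example, how can I get the 36 documents in the "12/3519/1597" bucket?
I have tried converting the GeoTile key "12/3519/1597" to a bounding box, either by following this article or by using GeoTileUtils from the Elasticsearch source code.
However, when I convert the key "12/3519/1597" from the example above to a bounding box and query all documents inside that box, the results span 2 buckets. The x=3520 bucket contains documents with lon=129.375, which lie exactly on the right edge of the box.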
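For reference, the key-to-bounding-box conversion described above can be sketched in Python. This is a minimal sketch that assumes the standard Web Mercator (slippy map) tile formulas; the function name `tile_to_bbox` is made up for illustration and is not part of any Elasticsearch API.

```python
import math

def tile_to_bbox(key: str) -> dict:
    """Convert a "zoom/x/y" geotile key to a bounding box (hypothetical helper)."""
    zoom, x, y = (int(part) for part in key.split("/"))
    tiles = 1 << zoom  # number of tiles along each axis at this zoom level

    def lon(col: int) -> float:
        # Longitude of the left edge of tile column `col`.
        return col / tiles * 360.0 - 180.0

    def lat(row: int) -> float:
        # Latitude of the top edge of tile row `row` (Web Mercator).
        n = math.pi * (1.0 - 2.0 * row / tiles)
        return math.degrees(math.atan(math.sinh(n)))

    return {
        "top_left": {"lat": lat(y), "lon": lon(x)},
        "bottom_right": {"lat": lat(y + 1), "lon": lon(x + 1)},
    }

box = tile_to_bbox("12/3519/1597")
print(box)
```

The right edge of this box (lon 129.375) is also the left edge of tile x=3520, which is why a plain bounding-box query picks up documents from the neighbouring bucket.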


Answer #1 (vybvopom)

You can nest a top_hits aggregation to get the documents of each geotile bucket.
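A minimal sketch of such a request body, built as a Python dict so it can be passed to any client. The sub-aggregation name `docs` and the helper `top_hits_per_tile` are placeholders chosen for illustration. Note that top_hits pages with `from`/`size` rather than search_after, and as far as I know its window is capped by the `index.max_inner_result_window` setting (100 by default), so it suits small buckets better than deep pagination.

```python
import json

def top_hits_per_tile(field: str, precision: int, size: int, offset: int = 0) -> dict:
    """Build a geotile_grid aggregation with a nested top_hits (hypothetical helper)."""
    return {
        "size": 0,  # skip the top-level hits, only the buckets are needed
        "aggs": {
            "result": {
                "geotile_grid": {"field": field, "precision": precision},
                "aggs": {
                    # "docs" is an arbitrary name for the nested aggregation
                    "docs": {"top_hits": {"from": offset, "size": size}}
                },
            }
        },
    }

body = top_hits_per_tile("location", 12, size=10)
print(json.dumps(body, indent=2))
```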
You can also use the geo grid query to filter the documents of a single tile.

GET kibana_sample_data_logs/_search
{
  "size": 1,
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "geo_grid": {
            "geo.coordinates": {
              "geotile": "5/9/12"
            }
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}

Response:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 675,
      "relation": "eq"
    },
    "max_score": 0,
    "hits": [
      {
        "_index": ".ds-kibana_sample_data_logs-2023.07.12-000001",
        "_id": "NM-ISokB7DQkCI7yJZQ-",
        "_score": 0,
        "_source": {
          "agent": "Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1",
          "bytes": 8973,
          "clientip": "213.50.214.248",
          "extension": "rpm",
          "geo": {
            "srcdest": "US:VN",
            "src": "US",
            "dest": "VN",
            "coordinates": {
              "lat": 40.19349528,
              "lon": -76.76340361
            }
          },
          "host": "artifacts.elastic.co",
          "index": "kibana_sample_data_logs",
          "ip": "213.50.214.248",
          "machine": {
            "ram": 12884901888,
            "os": "win 8"
          },
          "memory": null,
          "message": "213.50.214.248 - - [2018-09-10T11:39:18.812Z] \"GET /beats/metricbeat/metricbeat-6.3.2-i686.rpm HTTP/1.1\" 200 8973 \"-\" \"Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1\"",
          "phpmemory": null,
          "referer": "http://www.elastic-elastic-elastic.com/success/daniel-tani",
          "request": "/beats/metricbeat/metricbeat-6.3.2-i686.rpm",
          "response": 200,
          "tags": [
            "success",
            "info"
          ],
          "@timestamp": "2023-08-21T11:39:18.812Z",
          "url": "https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-6.3.2-i686.rpm",
          "utc_time": "2023-08-21T11:39:18.812Z",
          "event": {
            "dataset": "sample_web_logs"
          },
          "bytes_gauge": 8973,
          "bytes_counter": 65621715
        }
      }
    ]
  }
}
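Since the question asks about search_after, here is a hedged sketch of how the geo grid filter might be combined with it. It only builds request bodies; sending them is left to whatever client you use. The helper name `tile_page_body` and the sort fields `created_at` and `doc_id` are assumptions for illustration: search_after needs a sort whose keys together identify a document uniquely, so substitute fields from your own mapping.

```python
import json
from typing import Optional

def tile_page_body(field: str, tile: str, size: int, sort: list,
                   search_after: Optional[list] = None) -> dict:
    """Build one page of a geo_grid-filtered search (hypothetical helper)."""
    body = {
        "size": size,
        "query": {
            "bool": {"filter": [{"geo_grid": {field: {"geotile": tile}}}]}
        },
        "sort": sort,
    }
    if search_after is not None:
        # Sort values of the last hit from the previous page.
        body["search_after"] = search_after
    return body

# Placeholder sort: `created_at` and `doc_id` are assumed field names.
sort = [{"created_at": "asc"}, {"doc_id": "asc"}]

first_page = tile_page_body("location", "12/3519/1597", 10, sort)

# In real use this comes from response["hits"]["hits"][-1]["sort"];
# the values below are placeholders, not real data.
last_sort = [1691020800000, "placeholder-id"]
next_page = tile_page_body("location", "12/3519/1597", 10, sort, last_sort)

print(json.dumps(next_page, indent=2))
```

Repeat with the `sort` values of each page's last hit until a page comes back with fewer than `size` hits.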


Answer #2 (jdgnovmf)

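For newer ES versions (starting from 8.8), you can use @Nathan Reese's solution.
However, on older versions (mine is 7.10), I used GeoTileUtils from Elasticsearch to convert the geotile key (z/x/y) to a bounding box.
But you have to pay attention to the edges of the bounding box. The geotile aggregation does not assign points lying on the right edge or the bottom edge to the tile. To exclude the points on those edges, I used a Painless script, as follows: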

GET /index-name/_doc/_search
{
  "size": 3,
  "query": {
    "bool": {
      "filter": [
        { 
          "geo_bounding_box": {
            "location": {
              "top_left": {
                "lat": 36.80928470205938, "lon": 129.287109375
              },
              "bottom_right": {
                "lat": 36.73888412439431, "lon": 129.37500
              }
            }
          }
        },
        {
          "script": {
            "script": {
              "source": "doc['location'].lon < params.maxLon && doc['location'].lat > params.minLat",
              "lang": "painless",
              "params": {
                "minLat": 36.73888412439431,
                "maxLon": 129.37500
              }
            }
          }
        }
      ]
    }
  }
}
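The edge behaviour described above can be checked with a small Python sketch of the point-to-tile mapping. It assumes the standard Web Mercator tile formula with floor rounding, which I believe matches what geotile_grid does, but treat that as an assumption; `point_to_tile` is a made-up helper name.

```python
import math

def point_to_tile(lat: float, lon: float, zoom: int) -> str:
    """Map a point to its "zoom/x/y" tile key (hypothetical helper)."""
    tiles = 1 << zoom
    # floor() makes the left and top edges inclusive and the
    # right and bottom edges exclusive.
    x = math.floor((lon + 180.0) / 360.0 * tiles)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * tiles)
    # Clamp so that lon=180 or extreme latitudes stay inside the grid.
    x = min(max(x, 0), tiles - 1)
    y = min(max(y, 0), tiles - 1)
    return f"{zoom}/{x}/{y}"

inside = point_to_tile(36.78, 129.3, 12)
on_right_edge = point_to_tile(36.78, 129.375, 12)
print(inside, on_right_edge)
```

A point with lon=129.375 falls into column 3520, not 3519, which is why the bounding box of tile 12/3519/1597 has to be treated as half-open on its right and bottom sides.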
