Date histogram aggregation in ElasticSearch

7jmck4yq posted on 2021-06-10 in ElasticSearch

I want to filter and fetch data from ElasticSearch. I have tried, but what I have so far does not serve my purpose. My data looks like this:

[
   {
      "id":1,
      "title":"Sample news",
      "date":"2020-09-17",
      "regulation":[
         {
            "id":1,
            "name":"sample name",
            "date":"2020-09-17"
         },
         {
            "id":2,
            "name":"sample name 1",
            "date":"2020-09-18"
         }
      ]
   },
   {
      "id":2,
      "title":"Sample news 1",
      "date":"2020-09-17",
      "regulation":[
         {
            "id":1,
            "name":"sample name",
            "date":"2020-09-18"
         },
         {
            "id":2,
            "name":"sample name 1",
            "date":"2020-09-17"
         }
      ]
   }
]

I want to filter the data and get it back in this shape:

year: {
  month: {
   day: {
    news: int,
    regulations: int,
   }
 }
}

That is, the news and regulations for each day are counted under a year/month/day hierarchy. So far I can get data like this:

"2020-09-17" : {
          "key_as_string" : "2020-09-17",
          "key" : 1600300800000,
          "doc_count" : 1
        },
        "2020-09-18" : {
          "key_as_string" : "2020-09-18",
          "key" : 1600387200000,
          "doc_count" : 0
        },
        "2020-09-19" : {
          "key_as_string" : "2020-09-19",
          "key" : 1600473600000,
          "doc_count" : 0
        },

using:

GET /news/_search?size=0
{
  "aggs": {
    "news_over_time": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "day",
        "keyed": true,
        "format": "yyyy-MM-dd"
      }
    }
  }
}

But this does not serve my purpose. How can I achieve this with elasticsearch and elasticsearch-dsl?
Expected response:

2020: {
  09: {
   17: {
    news: 2,
    regulation: 2
   },
   18: {
    news: 0,
    regulation: 2
   }
 }
}
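
To make the target shape concrete, the expected counts can be reproduced client-side from the two sample documents. This is only a minimal sketch in plain Python, not an Elasticsearch query; `expected_counts` is a hypothetical helper name, and the documents are reduced to their date fields.

```python
from collections import Counter

# The two sample documents from the question, reduced to the date fields.
docs = [
    {"date": "2020-09-17",
     "regulation": [{"date": "2020-09-17"}, {"date": "2020-09-18"}]},
    {"date": "2020-09-17",
     "regulation": [{"date": "2020-09-18"}, {"date": "2020-09-17"}]},
]


def expected_counts(docs):
    """Count news and regulations per day, grouped as year -> month -> day."""
    news = Counter(doc["date"] for doc in docs)
    regulations = Counter(r["date"] for doc in docs for r in doc["regulation"])
    result = {}
    for key in sorted(set(news) | set(regulations)):
        year, month, day = key.split("-")  # keys look like "2020-09-17"
        result.setdefault(year, {}).setdefault(month, {})[day] = {
            "news": news[key],  # Counter returns 0 for a missing day
            "regulations": regulations[key],
        }
    return result


print(expected_counts(docs))
# {'2020': {'09': {'17': {'news': 2, 'regulations': 2}, '18': {'news': 0, 'regulations': 2}}}}
```

Both news items are dated 2020-09-17, while the four regulations split evenly between the 17th and the 18th, which matches the expected response.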
2mbi3lxu, answer #1

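The news date and the regulation date are two different fields: one belongs to the parent document and the other to the nested documents. Because of that, I am not entirely sure your requirement can be met directly (I am still exploring the same problem myself). The following query should work for you, though: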

GET news/_search
{
  "size": 0,
  "aggs": {
    "news_over_time": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "day",
        "keyed": true,
        "format": "yyyy-MM-dd"
      }
    },
    "regulations_over_time": {
      "nested": {
        "path": "regulation"
      },
      "aggs": {
        "regulation": {
          "date_histogram": {
            "field": "regulation.date",
            "calendar_interval": "day",
            "keyed": true,
            "format": "yyyy-MM-dd"
          }
        }
      }
    }
  }
}

It returns results in the following form:

"aggregations" : {
"regulations_over_time" : { //<=== Regulations over time based on regulationDate
  "doc_count" : 9,
  "regulation" : {
    "buckets" : {
      "2020-09-17" : {
        "key_as_string" : "2020-09-17",
        "key" : 1600300800000,
        "doc_count" : 5
      },
      "2020-09-18" : {
        "key_as_string" : "2020-09-18",
        "key" : 1600387200000,
        "doc_count" : 4
      }
    }
  }
},
"news_over_time" : { //<======= news over time based on news date
  "buckets" : {
    "2020-09-17" : {
      "key_as_string" : "2020-09-17",
      "key" : 1600300800000,
      "doc_count" : 2
    },
    "2020-09-18" : {
      "key_as_string" : "2020-09-18",
      "key" : 1600387200000,
      "doc_count" : 2
    }
  }
}
}
}

If needed, you can merge the two results on the client side.
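
One way to do that merge is sketched below in plain Python. It assumes the response uses the aggregation names from the query above (`news_over_time`, `regulations_over_time`, `regulation`) with `"keyed": true` and `yyyy-MM-dd` bucket keys; `merge_daily_counts` is a hypothetical helper name, not part of any Elasticsearch client.

```python
from collections import defaultdict


def merge_daily_counts(aggregations):
    """Fold the two keyed date histograms into {year: {month: {day: counts}}}."""
    news = aggregations["news_over_time"]["buckets"]
    # The nested aggregation wraps its histogram one level deeper.
    regulations = aggregations["regulations_over_time"]["regulation"]["buckets"]

    tree = defaultdict(lambda: defaultdict(dict))
    for day_key in sorted(set(news) | set(regulations)):
        year, month, day = day_key.split("-")  # keys look like "2020-09-17"
        tree[year][month][day] = {
            "news": news.get(day_key, {}).get("doc_count", 0),
            "regulations": regulations.get(day_key, {}).get("doc_count", 0),
        }
    # Convert the defaultdicts back to plain dicts for easy serialisation.
    return {year: dict(months) for year, months in tree.items()}


# The "aggregations" part of the response shown above, trimmed to doc_count.
aggregations = {
    "regulations_over_time": {
        "doc_count": 9,
        "regulation": {"buckets": {
            "2020-09-17": {"doc_count": 5},
            "2020-09-18": {"doc_count": 4},
        }},
    },
    "news_over_time": {"buckets": {
        "2020-09-17": {"doc_count": 2},
        "2020-09-18": {"doc_count": 2},
    }},
}

print(merge_daily_counts(aggregations))
# {'2020': {'09': {'17': {'news': 2, 'regulations': 5}, '18': {'news': 2, 'regulations': 4}}}}
```

A day that appears in only one of the two histograms gets a count of 0 for the other, so the day-level entries always carry both keys.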

utugiqy6, answer #2

I am not sure what response you expect, but if you want the number of news items per day, this is the request you are looking for:

GET /news/_search?size=0
{
  "aggs": {
    "news_over_time": {
      "date_histogram": {
        "field": "regulation.date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      }
    }
  }
}
