I have some documents with the following structure (heavily simplified for this example):
"documents": [
{
"name": "Document 1",
"collections" : [
{
"id": 30,
"title" : "Research"
},
{
"id": 45,
"title" : "Events"
},
{
"id" : 52,
"title" : "International"
}
]
},
{
"name": "Document 2",
"collections" : [
{
"id": 45,
"title" : "Events"
},
{
"id" : 63,
"title" : "Development"
}
]
}
]
I need an aggregation over the collections. When I do it like this, it works fine:
"aggs": {
"collections": {
"terms": {
"field": "collections.title",
"size": 30
}
}
}
I get a good result, as expected:
"aggregations" : {
"collections" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "Research",
"doc_count" : 18
},
{
"key" : "Events",
"doc_count" : 14
},
{
"key" : "International",
"doc_count" : 13
},
{
"key" : "Development",
"doc_count" : 8
}
]
}
}
However, I also want to include the id of each collection. So I tried this:
"aggs": {
"collections": {
"terms": {
"field": "collections.title",
"size": 30
},
"aggs": {
"id": {
"terms": {
"field": "collections.id",
"size": 1
}
}
}
}
}
This is the result:
"aggregations" : {
"collections" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "Research",
"doc_count" : 18,
"id" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "30",
"doc_count" : 1
}
]
}
},
{
"key" : "Events",
"doc_count" : 14,
"id" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "45",
"doc_count" : 1
}
]
}
},
{
"key" : "International",
"doc_count" : 13,
"id" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "52",
"doc_count" : 1
}
]
}
},
{
"key" : "Development",
"doc_count" : 8,
"id" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "45",
"doc_count" : 1
}
]
}
}
]
}
}
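At first glance it looks fine. But look closely at the last element, Development (scroll down). The id should be 63, but it is 45. It is not clear to me why this happens, and I cannot find a way to fix it. I also tried multi_terms, but it gives similar results. I think the problem is related to the fact that a document can have multiple collections. Does anyone know the correct way to solve this?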
1 Answer
The reason is that in an object-type mapping there is no relationship between "title" and "id". Elasticsearch flattens everything internally, so:
becomes:
Elasticsearch does not know that id 30 belongs to Research, or that id 45 belongs to Events.
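To make the flattening concrete, here is a small local simulation in Python. It does not talk to Elasticsearch; it only mimics how an array of objects ends up as independent multi-value fields, using the two sample documents from the question. The helper names `flatten` and `top_id_per_title` are made up for this illustration, and the tie-breaking order (count descending, then key ascending) is an assumption about how a terms aggregation orders equal counts.

```python
from collections import Counter

# The two sample documents from the question.
documents = [
    {"name": "Document 1", "collections": [
        {"id": 30, "title": "Research"},
        {"id": 45, "title": "Events"},
        {"id": 52, "title": "International"},
    ]},
    {"name": "Document 2", "collections": [
        {"id": 45, "title": "Events"},
        {"id": 63, "title": "Development"},
    ]},
]


def flatten(doc):
    """Mimic object-type indexing: one multi-value list per leaf field."""
    return {
        "name": doc["name"],
        "collections.id": [c["id"] for c in doc["collections"]],
        "collections.title": [c["title"] for c in doc["collections"]],
    }


def top_id_per_title(docs):
    """Terms on title with a size-1 terms sub-aggregation on id, over flattened docs."""
    flat = [flatten(d) for d in docs]
    titles = sorted({t for d in flat for t in d["collections.title"]})
    result = {}
    for title in titles:
        # A bucket holds whole documents, so every id in those documents is a candidate.
        bucket = [d for d in flat if title in d["collections.title"]]
        counts = Counter(i for d in bucket for i in set(d["collections.id"]))
        # Assumed ordering: document count descending, then key ascending on ties.
        result[title] = min(counts, key=lambda k: (-counts[k], k))
    return result


flat_doc2 = flatten(documents[1])
top_ids = top_id_per_title(documents)
print(flat_doc2)
print(top_ids)
```

On this sample the Development bucket contains only Document 2, whose flattened ids are 45 and 63 with one document each, so the size-1 sub-aggregation picks 45. That is the same mismatch the question describes.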
You have to use the "nested" type to preserve the relationship between the properties of each object in the array. https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
Solution: use the nested field type
Mapping
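A minimal sketch of what such a mapping body might look like, written as a Python dict so it can be printed as JSON and sent with any HTTP client. The field types chosen for `name`, `id` and `title` are assumptions, not taken from the question, so adjust them to your data.

```python
import json

# Hypothetical index-creation body; only "type": "nested" is essential here.
mapping = {
    "mappings": {
        "properties": {
            "name": {"type": "text"},  # assumed type
            "collections": {
                "type": "nested",  # keeps each {id, title} pair together
                "properties": {
                    "id": {"type": "integer"},  # assumed type
                    "title": {"type": "keyword"},  # assumed type; terms needs an aggregatable field
                },
            },
        }
    }
}

print(json.dumps(mapping, indent=2))
```

The type of an existing field cannot be changed from object to nested in place, so this would normally go into a new index, followed by reindexing the documents.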
Documents
Query
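A sketch of how the aggregation could be wrapped in a `nested` aggregation once `collections` is mapped as nested. The aggregation names `collections`, `by_title` and `id` are arbitrary labels chosen for this example, and the body is again a plain Python dict printed as JSON.

```python
import json

# Hypothetical search body, assuming "collections" is mapped as a nested field.
query = {
    "size": 0,  # only aggregations are needed, no hits
    "aggs": {
        "collections": {
            "nested": {"path": "collections"},  # step into the nested objects
            "aggs": {
                "by_title": {
                    "terms": {"field": "collections.title", "size": 30},
                    "aggs": {
                        # Evaluated per nested object, so id and title stay paired.
                        "id": {"terms": {"field": "collections.id", "size": 1}}
                    },
                }
            },
        }
    },
}

print(json.dumps(query, indent=2))
```

Inside a nested aggregation, `doc_count` counts nested objects rather than parent documents. If parent-document counts are needed, a `reverse_nested` sub-aggregation is the usual way to get back to them.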
Result
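The following is a local simulation, not Elasticsearch output. It shows the pairing one would expect when each `{id, title}` object is aggregated on its own, which is roughly what the nested type provides. It uses the two sample documents from the question, and `nested_top_id_per_title` is a name invented for this sketch.

```python
from collections import Counter, defaultdict

# The two sample documents from the question.
documents = [
    {"name": "Document 1", "collections": [
        {"id": 30, "title": "Research"},
        {"id": 45, "title": "Events"},
        {"id": 52, "title": "International"},
    ]},
    {"name": "Document 2", "collections": [
        {"id": 45, "title": "Events"},
        {"id": 63, "title": "Development"},
    ]},
]


def nested_top_id_per_title(docs):
    """Count ids per title, treating every nested object as its own unit."""
    counts = defaultdict(Counter)
    for doc in docs:
        for item in doc["collections"]:
            # Only the id stored in the same object as the title is counted.
            counts[item["title"]][item["id"]] += 1
    # Size-1 pick per title: count descending, then key ascending on ties.
    return {
        title: min(c, key=lambda k: (-c[k], k))
        for title, c in counts.items()
    }


pairs = nested_top_id_per_title(documents)
print(pairs)
```

Because each object is counted separately, Development is paired with 63 and never sees the 45 that belongs to Events in the same document.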
For more details, you can read an article I wrote on this:
https://opster.com/guides/elasticsearch/data-architecture/elasticsearch-nested-field-object-field/