Getting distinct results with Elasticsearch

knsnq2tg  posted on 2021-06-15 in ElasticSearch

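I am using FOSElasticaBundle with Symfony2 in my project. The MySQL database has entry and user tables, and each entry belongs to one user.
From all the entries in the database, I want to fetch only one entry per user.
Entry representation: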

[
  {
    "id": 1,
    "name": "Hello world",
    "user": {
      "id": 17,
      "username": "foo"
    }
  },
  {
    "id": 2,
    "name": "Lorem ipsum",
    "user": {
      "id": 15,
      "username": "bar"
    }
  },
  {
    "id": 3,
    "name": "Dolar sit amet",
    "user": {
      "id": 17,
      "username": "foo"
    }
  }
]

The expected result is:

[
  {
    "id": 1,
    "name": "Hello world",
    "user": {
      "id": 17,
      "username": "foo"
    }
  },
  {
    "id": 2,
    "name": "Lorem ipsum",
    "user": {
      "id": 15,
      "username": "bar"
    }
  }
]

But it returns every entry in the table. I tried adding an aggregation to the Elasticsearch query, but nothing changed.

$distinctAgg = new \Elastica\Aggregation\Terms("distinctAgg");
$distinctAgg->setField("user.id");
$distinctAgg->setSize(1);

$query->addAggregation($distinctAgg);

Is there a way to do this with a terms filter or some other approach? Any help would be appreciated. Thank you.

mznpcxlj  #1

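Aggregations are not easy to understand when you are used to MySQL's GROUP BY.
First, aggregation results are not returned in hits but in aggregations. So once you have the search results, you have to fetch the aggregations like this: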

$results = $search->search();
$aggregationsResults = $results->getAggregations();

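Second, aggregations do not return the source documents. With the aggregation from your example, you would only learn that user ID 17 has 2 entries and user ID 15 has 1 entry.
For example, with this query: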

{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "byUser": {
      "terms": {
        "field": "user.id"
      }
    }
  }
}

Result:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 1,
      "hits": [ ... ]
   },
   "aggregations": {
      "byUser": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": 17,
               "doc_count": 2
            },
            {
               "key": 15,
               "doc_count": 1
            }
         ]
      }
   }
}
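
To make the shape of that response concrete, here is a minimal plain-PHP sketch that reads the per-user counts out of the aggregations section. It assumes the section has already been decoded into an array shaped like the response above (for instance the array returned by getAggregations()). The helper name countEntriesPerUser is hypothetical and not part of Elastica.

```php
<?php
// "aggregations" section copied from the response above, as a PHP array.
$aggregations = [
    'byUser' => [
        'doc_count_error_upper_bound' => 0,
        'sum_other_doc_count' => 0,
        'buckets' => [
            ['key' => 17, 'doc_count' => 2],
            ['key' => 15, 'doc_count' => 1],
        ],
    ],
];

/**
 * Hypothetical helper: map each bucket key (user id) to its document count.
 */
function countEntriesPerUser(array $aggregations, string $aggName): array
{
    $counts = [];
    foreach ($aggregations[$aggName]['buckets'] ?? [] as $bucket) {
        $counts[$bucket['key']] = $bucket['doc_count'];
    }
    return $counts;
}

$counts = countEntriesPerUser($aggregations, 'byUser');
print_r($counts);
```

Note that the buckets carry only a key and a count; no entry fields such as name are available at this level.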

If you want the documents themselves, as you would get from GROUP BY in MySQL, you have to use a top_hits sub-aggregation:

{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "byUser": {
      "terms": {
        "field": "user.id"
      },
      "aggs": {
        "results": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}

Result:

{
   "took": 3,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 1,
      "hits": [ ... ]
   },
   "aggregations": {
      "byUser": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": 17,
               "doc_count": 2,
               "results": {
                  "hits": {
                     "total": 2,
                     "max_score": 1,
                     "hits": [
                        {
                           "_index": "test_stackoverflow",
                           "_type": "test1",
                           "_id": "1",
                           "_score": 1,
                           "_source": {
                              "id": 1,
                              "name": "Hello world",
                              "user": {
                                 "id": 17,
                                 "username": "foo"
                              }
                           }
                        }
                     ]
                  }
               }
            },
            {
               "key": 15,
               "doc_count": 1,
               "results": {
                  "hits": {
                     "total": 1,
                     "max_score": 1,
                     "hits": [
                        {
                           "_index": "test_stackoverflow",
                           "_type": "test1",
                           "_id": "2",
                           "_score": 1,
                           "_source": {
                              "id": 2,
                              "name": "Lorem ipsum",
                              "user": {
                                 "id": 15,
                                 "username": "bar"
                              }
                           }
                        }
                     ]
                  }
               }
            }
         ]
      }
   }
}
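
As a rough sketch of how the one-entry-per-user list from the question could be rebuilt from this response, the plain-PHP snippet below walks the buckets and keeps the first _source of each. It assumes the aggregations section is available as an array shaped like the response above (trimmed here to the fields that are used). The helper name firstEntryPerBucket is hypothetical and not part of Elastica.

```php
<?php
// Trimmed "aggregations" section of the top_hits response above, as a PHP array.
$aggregations = [
    'byUser' => [
        'buckets' => [
            [
                'key' => 17,
                'doc_count' => 2,
                'results' => ['hits' => ['total' => 2, 'hits' => [
                    ['_id' => '1', '_source' => [
                        'id' => 1,
                        'name' => 'Hello world',
                        'user' => ['id' => 17, 'username' => 'foo'],
                    ]],
                ]]],
            ],
            [
                'key' => 15,
                'doc_count' => 1,
                'results' => ['hits' => ['total' => 1, 'hits' => [
                    ['_id' => '2', '_source' => [
                        'id' => 2,
                        'name' => 'Lorem ipsum',
                        'user' => ['id' => 15, 'username' => 'bar'],
                    ]],
                ]]],
            ],
        ],
    ],
];

/**
 * Hypothetical helper: collect the first top_hits document of every bucket.
 */
function firstEntryPerBucket(array $aggregations, string $aggName, string $topHitsName): array
{
    $entries = [];
    foreach ($aggregations[$aggName]['buckets'] ?? [] as $bucket) {
        $hits = $bucket[$topHitsName]['hits']['hits'] ?? [];
        if ($hits !== []) {
            // Each bucket holds at most "size" hits; keep the first one.
            $entries[] = $hits[0]['_source'];
        }
    }
    return $entries;
}

$entries = firstEntryPerBucket($aggregations, 'byUser', 'results');
echo json_encode($entries, JSON_PRETTY_PRINT), PHP_EOL;
```

Which entry ends up first in a bucket depends on how top_hits orders its hits, so if a specific entry per user is needed (for example the newest), a sort inside top_hits is probably required.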

More information on this page: https://www.elastic.co/blog/top-hits-aggregation
