I have the following structure in an Elasticsearch index:
{
  "_index" : "hotel",
  "_type" : "_doc",
  "_id" : "13171",
  "_score" : 6.072218,
  "_source" : {
    "_class" : "hotel",
    "id" : 13171,
    "places" : [
      {
        "type" : "MAIN_LOCATION",
        "placeId" : 2032
      }
    ],
    "numberOfRecommendations" : 0
  }
},
{
  "_index" : "hotel",
  "_type" : "_doc",
  "_id" : "7146",
  "_score" : 6.072218,
  "_source" : {
    "_class" : "hotel",
    "id" : 7146,
    "places" : [
      {
        "type" : "MAIN_LOCATION",
        "placeId" : 2032
      }
    ],
    "numberOfRecommendations" : 1
  }
},
{
  "_index" : "hotel",
  "_type" : "_doc",
  "_id" : "7146",
  "_score" : 6.072218,
  "_source" : {
    "_class" : "hotel",
    "id" : 7146,
    "places" : [
      {
        "type" : "AFFILIATE",
        "placeId" : 2032
      }
    ],
    "numberOfRecommendations" : 3
  }
}
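Note that places is a nested field whose entries have one of two types, MAIN_LOCATION and AFFILIATE. I am building an aggregation that computes, for a given place, the number of hotels that have it as their main location and the total number of recommendations of those hotels.
For the MAIN_LOCATION example above, I should get 2 hotels and a numberOfRecommendations total of 1.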
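To make the expected numbers concrete, here is a minimal plain-Java sketch that recomputes them from the three sample hits. The classes `Hotel` and `Place` and the helper methods are hypothetical stand-ins introduced only for this illustration; they are not part of the question's code and do not touch Elasticsearch.

```java
import java.util.List;

// Hypothetical in-memory model of the three sample hits (illustration only).
class ExpectedStatsSketch {

    static final class Place {
        final String type;
        final long placeId;

        Place(String type, long placeId) {
            this.type = type;
            this.placeId = placeId;
        }
    }

    static final class Hotel {
        final long id;
        final int numberOfRecommendations;
        final List<Place> places;

        Hotel(long id, int numberOfRecommendations, List<Place> places) {
            this.id = id;
            this.numberOfRecommendations = numberOfRecommendations;
            this.places = places;
        }
    }

    // The three hits exactly as listed in the question.
    static List<Hotel> sampleHotels() {
        return List.of(
            new Hotel(13171, 0, List.of(new Place("MAIN_LOCATION", 2032))),
            new Hotel(7146, 1, List.of(new Place("MAIN_LOCATION", 2032))),
            new Hotel(7146, 3, List.of(new Place("AFFILIATE", 2032))));
    }

    // True if the hotel has a place entry with the given type and place id.
    static boolean hasPlace(Hotel hotel, String type, long placeId) {
        return hotel.places.stream()
            .anyMatch(p -> p.type.equals(type) && p.placeId == placeId);
    }

    // Number of hotels attached to the place with the given type.
    static long hotelCount(List<Hotel> hotels, String type, long placeId) {
        return hotels.stream().filter(h -> hasPlace(h, type, placeId)).count();
    }

    // Sum of the root-level numberOfRecommendations over those hotels.
    static int totalRecommendations(List<Hotel> hotels, String type, long placeId) {
        return hotels.stream()
            .filter(h -> hasPlace(h, type, placeId))
            .mapToInt(h -> h.numberOfRecommendations)
            .sum();
    }

    public static void main(String[] args) {
        List<Hotel> hotels = sampleHotels();
        System.out.println("hotels = " + hotelCount(hotels, "MAIN_LOCATION", 2032));
        System.out.println(
            "recommendations = " + totalRecommendations(hotels, "MAIN_LOCATION", 2032));
    }
}
```

For place 2032 and type MAIN_LOCATION this prints 2 hotels and 1 recommendation (0 + 1); the AFFILIATE hit with 3 recommendations is excluded.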
I am using Java and wrote the following code:
public List<PlaceHotelStats> getHotelOfferStats() {
  // Create aggregation filter for considering only places with PlaceType from filter (in current
  // case main location)
  String placeFilterAggregationName = "placeFilter";
  BoolQueryBuilder nestedPlaceQuery = boolQuery();
  nestedPlaceQuery.must(termQuery("places.type", "MAIN_LOCATION"));
  nestedPlaceQuery.must(termsQuery("places.placeId", filter.getPlaceIds()));
  AggregationBuilder placeAggregationFilter =
      AggregationBuilders.filters(placeFilterAggregationName, nestedPlaceQuery);
  // Add Terms filter to group by field placeId and then add sub aggregation for
  // totalRecommendations to have buckets
  String aggregationGroupByPlaceId = "group_by_place_id";
  var includedPlaceIds = filter.getPlaceIds().stream().mapToLong(l -> l).toArray();
  TermsAggregationBuilder aggregationBuilders =
      AggregationBuilders.terms(aggregationGroupByPlaceId)
          .field("places.placeId")
          .size(filter.getPlaceIds().size())
          .includeExclude(new IncludeExclude(includedPlaceIds, null))
          .subAggregation(
              AggregationBuilders.sum("totalRecommendationsForPlace")
                  .field("numberOfRecommendations"));
  // Add place term aggregation along with recommendation to Filter aggregation
  placeAggregationFilter.subAggregation(aggregationBuilders);
  // The final aggregation which has filter first then subaggregation of place terms with buckets
  // and review counts
  var nestedPlacesAggregation =
      AggregationBuilders.nested(NESTED_PLACES_AGGREGATION_NAME, PLACES)
          .subAggregation(placeAggregationFilter);
  var query =
      new NativeSearchQueryBuilder()
          .withQuery(builder.query())
          .addAggregation(nestedPlacesAggregation)
          .build();
  var result = elasticsearchOperations.search(query, EsHotel.class, ALIAS_COORDS);
  if (!result.hasAggregations()) {
    throw new IllegalStateException("No aggregations found after query with aggregations!");
  }
  ParsedFilters aggregationParsedFilters =
      ((ParsedNested) result.getAggregations().get(NESTED_PLACES_AGGREGATION_NAME))
          .getAggregations()
          .get(placeFilterAggregationName);
  var buckets =
      ((ParsedTerms)
              aggregationParsedFilters
                  .getBuckets()
                  .get(0)
                  .getAggregations()
                  .get(aggregationGroupByPlaceId))
          .getBuckets();
  List<PlaceHotelStats> placeHotelStats = new ArrayList<>();
  buckets.forEach(
      bucket ->
          placeHotelStats.add(
              new PlaceHotelStats(
                  bucket.getKeyAsNumber().longValue(),
                  Math.toIntExact(bucket.getDocCount()),
                  getTotalRecommendationsForPlace(bucket))));
  // Return the list that was populated above
  return placeHotelStats;
}

private int getTotalRecommendationsForPlace(Terms.Bucket bucket) {
  var aggregationTotalRecommendation =
      bucket.getAggregations().get("totalRecommendationsForPlace");
  if (aggregationTotalRecommendation != null) {
    return (int) ((ParsedSum) aggregationTotalRecommendation).getValue();
  }
  return 0;
}
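This gives me the correct total count for each place, but not the correct sum of recommendations.
I checked the Elasticsearch query, and it looks like this: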
{
  "query": {
    "bool" : {
      "must" : [
        {
          "nested" : {
            "query" : {
              "bool" : {
                "must" : [
                  {
                    "term" : {
                      "places.type" : {
                        "value" : "MAIN_LOCATION",
                        "boost" : 1.0
                      }
                    }
                  },
                  {
                    "terms" : {
                      "places.placeId" : [
                        7146
                      ],
                      "boost" : 1.0
                    }
                  }
                ],
                "adjust_pure_negative" : true,
                "boost" : 1.0
              }
            },
            "path" : "places",
            "ignore_unmapped" : false,
            "score_mode" : "min",
            "boost" : 1.0
          }
        },
        {
          "nested" : {
            "query" : {
              "exists" : {
                "field" : "places",
                "boost" : 1.0
              }
            },
            "path" : "places",
            "ignore_unmapped" : false,
            "score_mode" : "none",
            "boost" : 1.0
          }
        }
      ],
      "adjust_pure_negative" : true,
      "boost" : 1.0
    }
  },
  "aggs": {
    "nestedPlaces": {
      "nested": {"path": "places"},
      "aggregations": {
        "placeFilter": {
          "filters": {
            "filters": [
              {
                "bool": {
                  "must": [
                    {"term": {"places.type": {"value": "MAIN_LOCATION", "boost": 1.0}}},
                    {"terms": {"places.placeId": [7146], "boost": 1.0}}
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1.0
                }
              }
            ],
            "other_bucket": false,
            "other_bucket_key": "_other_"
          },
          "aggregations": {
            "group_by_place_id": {
              "terms": {
                "field": "places.placeId",
                "size": 193,
                "min_doc_count": 1,
                "shard_min_doc_count": 0,
                "show_term_doc_count_error": false,
                "order": [
                  {"_count": "desc"},
                  {"_key": "asc"}
                ],
                "include": ["7146"]
              },
              "aggregations": {
                "totalRecommendationsForPlace": {
                  "sum": {
                    "field": "numberOfRecommendations"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
In the current output of the query, the hotel total is correct, but the recommendation total is wrong and is always 0, which means the sub-aggregation is not working as expected:
"aggregations" : {
  "nestedPlaces" : {
    "doc_count" : 7,
    "placeFilter" : {
      "buckets" : [
        {
          "doc_count" : 3,
          "group_by_place_id" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : 2032,
                "doc_count" : 3,
                "totalRecommendationsForPlace" : {
                  "value" : 0.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}
I don't know what I am doing wrong.
1 Answer
wsxa1bj1 #1
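Your query is basically correct, up to the point where you try to sum numberOfRecommendations. That field lives at the root level of the document, not in the nested documents themselves, so inside the nested context the sum aggregation finds no values and returns 0. You first need to add a reverse_nested aggregation to get back to the top-level document; only then can you apply the sum aggregation.
PS: If the number of recommendations can differ depending on the type (main location or affiliate), then you should store that number at the nested level, in which case your query will work as is.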
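The effect of reverse_nested described above can be illustrated with a small plain-Java model. This is a sketch under stated assumptions, not the Elasticsearch implementation: `HotelDoc`, `PlaceDoc` and all method names are hypothetical, and the model only mimics the fact that a nested document carries its own fields plus a link to its parent. In the question's builder code, the corresponding change would presumably be to wrap the sum in `AggregationBuilders.reverseNested("backToHotel")` (the aggregation name is made up here), add that as the sub-aggregation of the terms aggregation, and read the sum through the resulting `ParsedReverseNested` bucket.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Toy model only: hypothetical classes, not the Elasticsearch API.
class ReverseNestedSketch {

    // Root (hotel) document; numberOfRecommendations lives here.
    static final class HotelDoc {
        final long id;
        final int numberOfRecommendations;

        HotelDoc(long id, int numberOfRecommendations) {
            this.id = id;
            this.numberOfRecommendations = numberOfRecommendations;
        }
    }

    // Nested (place) document; it has only its own fields plus a link to its parent.
    static final class PlaceDoc {
        final String type;
        final long placeId;
        final HotelDoc parent;

        PlaceDoc(String type, long placeId, HotelDoc parent) {
            this.type = type;
            this.placeId = placeId;
            this.parent = parent;
        }
    }

    // Nested documents of the three sample hits from the question.
    static List<PlaceDoc> sampleNestedDocs() {
        return List.of(
            new PlaceDoc("MAIN_LOCATION", 2032, new HotelDoc(13171, 0)),
            new PlaceDoc("MAIN_LOCATION", 2032, new HotelDoc(7146, 1)),
            new PlaceDoc("AFFILIATE", 2032, new HotelDoc(7146, 3)));
    }

    // Models the nested + filter + terms steps: one bucket of nested docs for a place id.
    static List<PlaceDoc> mainLocationBucket(List<PlaceDoc> docs, long placeId) {
        return docs.stream()
            .filter(p -> p.type.equals("MAIN_LOCATION") && p.placeId == placeId)
            .collect(Collectors.toList());
    }

    // Sum evaluated in the nested context: a PlaceDoc has no numberOfRecommendations
    // field of its own, so every nested document contributes nothing.
    static int sumInNestedContext(List<PlaceDoc> bucket) {
        return bucket.stream().mapToInt(p -> 0).sum();
    }

    // Models reverse_nested: step back to the distinct parent documents, then sum there.
    static int sumAfterReverseNested(List<PlaceDoc> bucket) {
        Set<HotelDoc> parents = bucket.stream()
            .map(p -> p.parent)
            .collect(Collectors.toCollection(LinkedHashSet::new));
        return parents.stream().mapToInt(h -> h.numberOfRecommendations).sum();
    }

    public static void main(String[] args) {
        List<PlaceDoc> bucket = mainLocationBucket(sampleNestedDocs(), 2032);
        System.out.println("doc_count = " + bucket.size());
        System.out.println("sum in nested context = " + sumInNestedContext(bucket));
        System.out.println("sum after reverse_nested = " + sumAfterReverseNested(bucket));
    }
}
```

With the sample data the bucket holds 2 nested documents; the sum is 0 in the nested context and 1 after stepping back to the parent hotels, which matches the result the question expects.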