How to order by a per-group "row_number" in ElasticSearch

Posted by ct3nt3jp on 2022-11-22 in ElasticSearch

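I have a database of products; each product has an ID, a name, a manufacturer ID, a category ID and a user score.
I want to retrieve all products in a given category, sorted by user score, while avoiding many products from the same manufacturer being listed next to each other.
With the following query they all end up clustered together: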

SELECT
P.ProductId, P.Name, P.ManufacturerId, P.UserScore
FROM Products P
WHERE P.CategoryId = 1
ORDER BY P.UserScore

That is the result I get in T-SQL.

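In T-SQL I came up with the following solution, in which products are grouped by manufacturer into runs of no more than 2 items, and it fits my needs very well: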

SELECT T.*
FROM (
       SELECT
             P.ProductId, P.Name, P.ManufacturerId, P.UserScore,
             ROW_NUMBER() OVER (PARTITION BY P.ManufacturerId ORDER BY P.UserScore DESC) RN
       FROM Products P
       WHERE P.CategoryId = 1
) T
ORDER BY T.UserScore / CEILING(RN/2.0) DESC
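
To make the sort key in the query above easier to follow, here is a minimal Python sketch of the same arithmetic. The `diversify` function, its `per_group` parameter and the sample products are hypothetical and only serve as illustration; they are not part of the original question. Each product's score is divided by `ceil(rn / per_group)`, so the first two products of a manufacturer keep their score, the next two are halved, and so on.

```python
from collections import defaultdict
from math import ceil


def diversify(products, per_group=2):
    """Mimic ROW_NUMBER() OVER (PARTITION BY ManufacturerId ORDER BY UserScore DESC)
    followed by ORDER BY UserScore / CEILING(RN / per_group) DESC."""
    # Partition the products by manufacturer.
    by_manufacturer = defaultdict(list)
    for product in products:
        by_manufacturer[product["ManufacturerId"]].append(product)

    keyed = []
    for group in by_manufacturer.values():
        # Within a partition, the highest score gets row number 1.
        group.sort(key=lambda p: p["UserScore"], reverse=True)
        for rn, product in enumerate(group, start=1):
            sort_key = product["UserScore"] / ceil(rn / per_group)
            keyed.append((sort_key, product))

    # Final ordering by the penalised score, highest first.
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [product for _, product in keyed]


# Hypothetical sample data: manufacturer A has three strong products.
products = [
    {"ProductId": "A1", "ManufacturerId": "A", "UserScore": 100},
    {"ProductId": "A2", "ManufacturerId": "A", "UserScore": 90},
    {"ProductId": "A3", "ManufacturerId": "A", "UserScore": 80},
    {"ProductId": "B1", "ManufacturerId": "B", "UserScore": 70},
    {"ProductId": "B2", "ManufacturerId": "B", "UserScore": 50},
]

ordered_ids = [p["ProductId"] for p in diversify(products)]
print(ordered_ids)  # ['A1', 'A2', 'B1', 'B2', 'A3']
```

In this sample, A3 has a raw score of 80 but is the third product of its manufacturer, so its key becomes 80 / 2 = 40 and it drops below both products of manufacturer B.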

How can I write an ElasticSearch query that mimics this behaviour?
Any ideas?
The index in Elasticsearch would look something like this; it is just an abstract example:

{"ProductId": "157072", "Name": "Product 157072", "ManufacturerId": "7790", "UserScore": "100000", "CategoryId": "1"},
{"ProductId": "296881", "Name": "Product 296881", "ManufacturerId": "6921", "UserScore": "35400", "CategoryId": "1"},
{"ProductId": "353924", "Name": "Product 353924", "ManufacturerId": "54616", "UserScore": "25000", "CategoryId": "1"},
...

d4so4syb1#

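You can use the collapse search feature to group the results by manufacturer:
https://www.elastic.co/guide/en/elasticsearch/reference/current/collapse-search-results.html
By default, collapsing returns only the top-sorted document for each manufacturer. Use "inner_hits" to control what is returned for each collapsed group.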

# Indexing Documents
POST test_so/_bulk
{ "index" : {} }
{"ProductId": "157072", "Name": "Product 157072", "ManufacturerId": "7790", "UserScore": 100000, "CategoryId": "1"}
{ "index" : {} }
{"ProductId": "296881", "Name": "Product 296881", "ManufacturerId": "6921", "UserScore": 35400, "CategoryId": "1"}
{ "index" : {} }
{"ProductId": "353924", "Name": "Product 353924", "ManufacturerId": "54616", "UserScore": 25000, "CategoryId": "1"}

# Filtering by Category: 1, collapsing by Manufacturer and sorting by UserScore
POST test_so/_search
{
  "query": {
    "term": {
      "CategoryId.keyword": {
        "value": "1"
      }
    }
  },
  "collapse": {
    "field": "ManufacturerId.keyword"
  }, 
  "sort": [
    {
      "UserScore": {
        "order": "desc"
      }
    }
  ]
}

Result

{
  "took": 22,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      {
        "_index": "test_so",
        "_id": "0amBPYQBJRm5qR4vd6NE",
        "_score": null,
        "_source": {
          "ProductId": "157072",
          "Name": "Product 157072",
          "ManufacturerId": "7790",
          "UserScore": 100000,
          "CategoryId": "1"
        },
        "fields": {
          "ManufacturerId.keyword": [
            "7790"
          ]
        },
        "sort": [
          100000
        ]
      },
      {
        "_index": "test_so",
        "_id": "0qmBPYQBJRm5qR4vd6NE",
        "_score": null,
        "_source": {
          "ProductId": "296881",
          "Name": "Product 296881",
          "ManufacturerId": "6921",
          "UserScore": 35400,
          "CategoryId": "1"
        },
        "fields": {
          "ManufacturerId.keyword": [
            "6921"
          ]
        },
        "sort": [
          35400
        ]
      },
      {
        "_index": "test_so",
        "_id": "06mBPYQBJRm5qR4vd6NE",
        "_score": null,
        "_source": {
          "ProductId": "353924",
          "Name": "Product 353924",
          "ManufacturerId": "54616",
          "UserScore": 25000,
          "CategoryId": "1"
        },
        "fields": {
          "ManufacturerId.keyword": [
            "54616"
          ]
        },
        "sort": [
          25000
        ]
      }
    ]
  }
}
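
The answer mentions "inner_hits" without showing it, so here is a hedged Python sketch that builds a request body with that option added. It assumes the same index and field names as the query above; the helper `build_collapse_query`, its parameters and the inner hits name `top_by_manufacturer` are hypothetical. It only constructs and prints the JSON body, which could then be sent to `POST test_so/_search`.

```python
import json


def build_collapse_query(category_id, per_manufacturer=2):
    """Build a search body that collapses on the manufacturer and asks for
    up to `per_manufacturer` products per collapsed group via inner_hits."""
    return {
        "query": {"term": {"CategoryId.keyword": {"value": category_id}}},
        "collapse": {
            "field": "ManufacturerId.keyword",
            "inner_hits": {
                # Hypothetical name; the inner hits are returned under this key.
                "name": "top_by_manufacturer",
                "size": per_manufacturer,
                "sort": [{"UserScore": {"order": "desc"}}],
            },
        },
        # The collapsed groups themselves are ordered by their best score.
        "sort": [{"UserScore": {"order": "desc"}}],
    }


body = build_collapse_query("1")
print(json.dumps(body, indent=2))
```

Note that this returns the products grouped per manufacturer (one top-level hit per manufacturer, with its best products nested inside), which is not the same as the interleaved ordering produced by the T-SQL query in the question. If that exact ordering is required, one option would be to re-rank the returned hits on the client side.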

cvxl0en22#

Assuming all items in a group have the same values, try the following. That is why I use First():

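// Filter by category, sort by score, then keep the first (highest-scoring) product per manufacturer
var results = products.Where(x => x.CategoryId == 1)
      .OrderByDescending(x => x.UserScore)
      .GroupBy(x => x.ManufacturerId)
      // g is a group of products, so call First() on the group and then read its properties
      .Select(g => new {ProductId = g.First().ProductId, Name = g.First().Name, ManufacturerId = g.Key, UserScore = g.First().UserScore});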
