OpenSearch/Elasticsearch: sorting with equal weight/priority on two parameters

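I'm trying to work out how to solve this in OpenSearch (an Elasticsearch solution would be fine too).
Essentially, I have an index of jobs, and I'm trying to sort them by two parameters with equal weight: subscription tier and popularity score (each is a field on every job document).
Normally when you sort, you sort by one criterion first and then by the other. What I need instead is to blend them, giving each a 50/50 weight.
When jobs are sorted by relevance (the default), we want the order to combine each job's subscription tier with its individual relevance score, according to a weight w.
Jobs would be ranked by a weighted score, for example:
Weighted score = (r1 × w) + (r2 × (1 − w)), where:
r1 = the job's position in the ranking for a given search when only relevance is considered, and r2 = the job's position in the ranking for the same search when only subscription is considered.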
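To make the formula concrete, here is a minimal Python sketch of the weighted rank score computed client-side. The job IDs, the two input rankings, and the helper name `weighted_rank_scores` are hypothetical. Position 1 is the best rank, so a lower weighted score means a better final position.

```python
def weighted_rank_scores(relevance_order, subscription_order, w=0.5):
    """Return {job_id: (r1 * w) + (r2 * (1 - w))} for jobs in both rankings.

    r1 and r2 are 1-based positions in each ranking (1 = best).
    """
    r1 = {job: pos for pos, job in enumerate(relevance_order, start=1)}
    r2 = {job: pos for pos, job in enumerate(subscription_order, start=1)}
    return {job: r1[job] * w + r2[job] * (1 - w) for job in r1}


# Hypothetical rankings of the same three jobs under each criterion.
by_relevance = ["job-a", "job-b", "job-c"]
by_subscription = ["job-c", "job-a", "job-b"]

scores = weighted_rank_scores(by_relevance, by_subscription, w=0.5)
# Lower combined position means a better final rank.
final_order = sorted(scores, key=scores.get)
```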
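The problem, however, is that I would need to run multiple searches to get each job's rank under each sort criterion, which would be very inefficient. I'm trying to see whether I can solve this within OpenSearch.
For example, I tried computing it as a script score function using just the two fields, but the fields are completely unrelated and there is no normalization between them, so giving them equal weight is challenging.
Here is what I have tried so far. First, add some test documents: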

POST _bulk
{"index":{"_index":"tier-sort","_id":"1"}}
{"title":"Job 1","popularity_score":"0.105","bid":"100"}
{"index":{"_index":"tier-sort","_id":"2"}}
{"title":"Job 2","popularity_score":"0.06","bid":"50"}
{"index":{"_index":"tier-sort","_id":"3"}}
{"title":"Job 3","popularity_score":"0.099","bid":"25"}
{"index":{"_index":"tier-sort","_id":"4"}}
{"title":"Job 4","popularity_score":"0.155","bid":"5"}
{"index":{"_index":"tier-sort","_id":"5"}}
{"title":"Job 5","popularity_score":"0.028","bid":"100"}
{"index":{"_index":"tier-sort","_id":"6"}}
{"title":"Job 6","popularity_score":"0.118","bid":"100"}
{"index":{"_index":"tier-sort","_id":"7"}}
{"title":"Job 7","popularity_score":"0.186","bid":"50"}
{"index":{"_index":"tier-sort","_id":"8"}}
{"title":"Job 8","popularity_score":"0.019","bid":"25"}
{"index":{"_index":"tier-sort","_id":"9"}}
{"title":"Job 9","popularity_score":"0.081","bid":"5"}
{"index":{"_index":"tier-sort","_id":"10"}}
{"title":"Job 10","popularity_score":"0.124","bid":"100"}
{"index":{"_index":"tier-sort","_id":"11"}}
{"title":"Job 11","popularity_score":"0.163","bid":"100"}
{"index":{"_index":"tier-sort","_id":"12"}}
{"title":"Job 12","popularity_score":"0.025","bid":"50"}
{"index":{"_index":"tier-sort","_id":"13"}}
{"title":"Job 13","popularity_score":"0.16","bid":"25"}
{"index":{"_index":"tier-sort","_id":"14"}}
{"title":"Job 14","popularity_score":"0.119","bid":"5"}
{"index":{"_index":"tier-sort","_id":"15"}}
{"title":"Job 15","popularity_score":"0.16","bid":"100"}

Then I tried using script scores so that each factor contributes half of the ranking:

GET tier-sort/_search
{
  "size": 100,
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "script_score": {
            "script": "doc['popularity_score'].value"
          }
        },
        {
          "script_score": {
            "script": "doc['bid'].value"
          }
        }
      ]
    }
  }
}
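For reference, when `score_mode` and `boost_mode` are left unset, `function_score` multiplies the function results together and then multiplies that product by the query score. The Python sketch below mimics that arithmetic for three of the test documents, assuming both fields are read as numbers and that `match_all` scores every document 1.0. The helper name `default_function_score` is hypothetical.

```python
# Three of the test documents, with both fields treated as numbers (an assumption).
docs = [
    {"title": "Job 1", "popularity_score": 0.105, "bid": 100},
    {"title": "Job 2", "popularity_score": 0.06, "bid": 50},
    {"title": "Job 4", "popularity_score": 0.155, "bid": 5},
]

QUERY_SCORE = 1.0  # match_all gives every document the same score


def default_function_score(doc):
    # score_mode "multiply": the two script_score results are multiplied.
    functions = doc["popularity_score"] * doc["bid"]
    # boost_mode "multiply": the result is multiplied by the query score.
    return QUERY_SCORE * functions


ranked = sorted(docs, key=default_function_score, reverse=True)
order = [d["title"] for d in ranked]
```

In this sketch Job 4 has the highest popularity of the three but lands last, because its bid is small and the bid values span a far larger range.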

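The problem, however, is normalization: bid and popularity are on completely different scales. How can this be achieved in Elasticsearch? Is there a way to do this natively?
Thanks in advance!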

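There are two ways to change the ranking of Elasticsearch/OpenSearch search results:
1. Add boosting logic (such as script_score, function_score, or rank features) to change the final _score.
2. Sort on a field, or specify custom sort logic. By default ES sorts on _score, but if you specify a sort that is not on _score, the boosting logic is ignored, _score is returned as null, and only the sort clause takes effect.

If your two factors are on different scales, rank features can help you normalize them effectively. For example:
Add some documents: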

POST _bulk
{"index":{"_index":"tier-sort","_id":"1"}}
{"title":"Job 1","rank":{"popularity_score":0.105,"bid":100}}
{"index":{"_index":"tier-sort","_id":"2"}}
{"title":"Job 2","rank":{"popularity_score":0.06,"bid":50}}
{"index":{"_index":"tier-sort","_id":"3"}}
{"title":"Job 3","rank":{"popularity_score":0.099,"bid":25}}
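In Elasticsearch, the `rank_feature` query only runs against fields mapped as `rank_feature` or `rank_features`, so the index needs an explicit mapping before these documents are indexed. The Python sketch below builds one possible request body for `PUT tier-sort`, assuming `rank` is mapped as a single `rank_features` field. Whether your OpenSearch version supports these types is something to verify, and an existing `tier-sort` index with a different mapping would need to be deleted or renamed first.

```python
import json

# Assumed mapping: "rank" holds a sparse map of positive numeric features.
index_body = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "rank": {"type": "rank_features"},
        }
    }
}

# Body to send with `PUT tier-sort` before indexing any documents.
print(json.dumps(index_body, indent=2))
```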

Apply rank_feature in the query:

GET tier-sort/_search
{
  "size": 100,
  "query": {
    "bool": {
       "should": [
         {
           "rank_feature": {
             "field": "rank.popularity_score",
             "saturation": {},
             "boost": 0.5
           }
         },
         {
           "rank_feature": {
             "field": "rank.bid",
             "saturation": {},
             "boost": 0.5
           }
         }
       ]
    }
  }
}
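As a rough illustration of why this normalizes the two fields: the `saturation` function maps a feature value S to S / (S + pivot), which always lies between 0 and 1 for positive values. The Python sketch below applies that formula to the three documents with a 0.5 boost on each feature. The pivot values are assumptions chosen by hand. With `"saturation": {}` the engine derives its own pivot from the indexed values, so real scores will differ.

```python
def saturation(value, pivot):
    # Saturation function: value / (value + pivot), in (0, 1) for positive inputs.
    return value / (value + pivot)


# Hand-picked pivots (assumptions), one per feature, roughly mid-range.
PIVOTS = {"popularity_score": 0.1, "bid": 50}

docs = {
    "Job 1": {"popularity_score": 0.105, "bid": 100},
    "Job 2": {"popularity_score": 0.06, "bid": 50},
    "Job 3": {"popularity_score": 0.099, "bid": 25},
}


def combined_score(features, boost=0.5):
    # Each should-clause contributes boost * saturation(feature value).
    return sum(boost * saturation(features[name], PIVOTS[name]) for name in PIVOTS)


scores = {title: combined_score(f) for title, f in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Both features now contribute on a comparable 0 to 1 scale, so neither one dominates simply because its raw values are larger.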

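You can choose among the different built-in functions of the rank_feature query and adjust the pivot to control the result. You can also use the explain API to see in detail how the score is computed, which helps you check whether the query behaves as expected.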
