elasticsearch: When using ES update_by_query, how can I confirm that all documents were updated correctly?

czq61nw1, posted on 2023-08-03 in ElasticSearch

http://localhost:9200/user/_update_by_query?conflicts=proceed

{
    "script": {
        "source": "ctx._source.age = params.age",
        "lang": "painless",
        "params": {
            "age": 10
        }
    },
    "size": 1000,
    "query": {
        "term": {
            "age": {
                "value": 20,
                "boost": 1.0
            }
        }
    }
}

Response:

{
    "took": 28,
    "timed_out": false,
    "total": 1,
    "updated": 1,
    "deleted": 0,
    "batches": 1,
    "version_conflicts": 0,
    "noops": 0,
    "retries": {
        "bulk": 0,
        "search": 0
    },
    "throttled_millis": 0,
    "requests_per_second": -1.0,
    "throttled_until_millis": 0,
    "failures": []
}

  • total: the number of documents that were successfully processed.
  • updated: the number of documents that were successfully updated.

I don't know what the total property really means. If it is the number of documents matched by the query, I can check whether total equals updated.
// Solution 1

if (response.getTotal() == response.getUpdated()) {
    // all updated
} else {
    // some documents failed, retry later
}


// Solution 2

if (response.getVersionConflicts() == 0 && CollectionUtils.isEmpty(response.getFailures())) {
    // all updated
} else {
    // some documents failed, retry later
}

eqzww0vc1#

Edit:

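If you ran the _update_by_query request with wait_for_completion=false (a query parameter, not a request header) but did not keep the returned task_id, you can look it up among the stored task results with the following API call: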

GET .tasks/_search?size=1000

When you run the _update_by_query API, Elasticsearch creates a task for it. If you saved the task_id, you can check the task's status with the following command:

GET /_tasks/r1A2WoRbTwKZ516z6NEs5A:36619
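
The same lookup can be done from Java. Below is a minimal sketch that uses only the JDK's built-in HTTP client (Java 11+), not the Elasticsearch client library. The class name `TaskStatusLookup` and its methods are hypothetical names chosen for illustration, and the sketch assumes an unsecured cluster at the given base URL.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of any Elasticsearch library.
class TaskStatusLookup {

    // Builds the URL of the task status endpoint, e.g. <base>/_tasks/<node>:<id>.
    static URI taskUri(String baseUrl, String taskId) {
        return URI.create(baseUrl + "/_tasks/" + taskId);
    }

    // Sends GET /_tasks/<taskId> and returns the raw JSON body as a string.
    // Assumes no authentication or TLS setup is required.
    static String fetchTaskStatus(String baseUrl, String taskId) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(taskUri(baseUrl, taskId)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Only prints the URL, so it runs without a cluster being available.
        System.out.println(taskUri("http://localhost:9200", "r1A2WoRbTwKZ516z6NEs5A:36619"));
    }
}
```

Calling `fetchTaskStatus("http://localhost:9200", taskId)` against a running cluster should return the same JSON that the console request above returns; parsing it is left to whichever JSON library you already use.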


If you did not save the task_id, you can use the _tasks API to find it, as long as the update by query is still running.

curl -X GET "localhost:9200/_tasks?detailed=true&actions=*byquery&pretty"


https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update-by-query.html#docs-update-by-query-fetch-tasks

Use wait_for_completion=false to get the task_id back as output.

For example:

PUT test
POST test/_update_by_query?conflicts=proceed&wait_for_completion=false
#response:
{
  "task": "dwXMZTfZQYuGH6FLEIxcZQ:365044451"
}

GET _tasks/dwXMZTfZQYuGH6FLEIxcZQ:365044451
#response
{
  "completed": true,
  "task": {
    "node": "dwXMZTfZQYuGH6FLEIxcZQ",
    "id": 365044451,
    "type": "transport",
    "action": "indices:data/write/update/byquery",
    "status": {
      "total": 21,
      "updated": 21,
      "created": 0,
      "deleted": 0,
      "batches": 1,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1,
      "throttled_until_millis": 0
    },
    "description": "update-by-query [test]",
    "start_time_in_millis": 1689251128726,
    "running_time_in_nanos": 6638281,
    "cancellable": true,
    "cancelled": false,
    "headers": {
      "trace.id": "1697d4a40bdfadc1494a10adcb994d95"
    }
  },
  "response": {
    "took": 6,
    "timed_out": false,
    "total": 21,
    "updated": 21,
    "created": 0,
    "deleted": 0,
    "batches": 1,
    "version_conflicts": 0,
    "noops": 0,
    "retries": {
      "bulk": 0,
      "search": 0
    },
    "throttled": "0s",
    "throttled_millis": 0,
    "requests_per_second": -1,
    "throttled_until": "0s",
    "throttled_until_millis": 0,
    "failures": []
  }
}

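  1. completed shows whether the _update_by_query has finished.
  2. task.status.total shows the total number of documents.
  3. task.status.updated shows the number of updated documents.
    Note: if slicing is enabled, you cannot see the counts until the process has finished. The status of a task for a sliced request only contains the status of the slices that have completed.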
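
Putting the counters together, a check for "everything matched was updated" might look like the sketch below. It is plain Java with no Elasticsearch dependency: `UpdateByQueryCheck`, `Counters`, and `allUpdated` are hypothetical names, and the fields simply mirror the JSON keys in the response above. It assumes that total counts the documents the query matched and that noops (documents the script deliberately skipped) are acceptable; drop the noops term if they are not.

```java
import java.util.List;

// Hypothetical helper; copy the values from the response or task status JSON.
class UpdateByQueryCheck {

    // Plain holder for the counters shown in the response above.
    static final class Counters {
        final long total;
        final long updated;
        final long noops;
        final long versionConflicts;
        final boolean timedOut;
        final List<String> failures;

        Counters(long total, long updated, long noops, long versionConflicts,
                 boolean timedOut, List<String> failures) {
            this.total = total;
            this.updated = updated;
            this.noops = noops;
            this.versionConflicts = versionConflicts;
            this.timedOut = timedOut;
            this.failures = failures;
        }
    }

    // True only if nothing went wrong and every processed document
    // was either updated or intentionally skipped.
    static boolean allUpdated(Counters c) {
        boolean clean = !c.timedOut && c.versionConflicts == 0 && c.failures.isEmpty();
        return clean && c.total == c.updated + c.noops;
    }

    public static void main(String[] args) {
        Counters ok = new Counters(21, 21, 0, 0, false, List.of());
        Counters conflicted = new Counters(21, 20, 0, 1, false, List.of());
        System.out.println(allUpdated(ok));         // prints true
        System.out.println(allUpdated(conflicted)); // prints false
    }
}
```

This combines both checks from the question: the conflict and failure test catches documents that were skipped because of errors, while the total comparison catches any remaining mismatch. For an asynchronous request, run it only once completed is true.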
