Equivalent of resuming a search and scroll operation in Elasticsearch

bxgwgixi, posted on 2021-06-14 in ElasticSearch

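Say I have 1000 documents in an index. I can fetch 20 docs at a time using search and scroll, either until the last document or stopping earlier based on iteration_count. Newer data (say 500 documents) may be inserted into the same index over time, and I want to continue the search and scroll from where I last stopped. I came across search_after, but I don't think it can be used together with scroll.
Is there any way to resume a search and scroll?
P.S. It cannot be a simple search; it has to be a scroll query.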


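# Search and scroll in batches of 20

from elasticsearch import Elasticsearch

# Assumption: a local cluster; adjust the URL for your deployment
es = Elasticsearch("http://localhost:9200")

index = "demo"
batch_size = 20
scroll_interval = "5m"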

# Cap the number of scroll iterations using the document count taken before
# the first search, so that records inserted afterwards are ignored

count = es.count(index=index, body={})['count']
iteration_count = count // batch_size

data = []

result = es.search(
    index=index, 
    body={},
    size=batch_size,
    scroll=scroll_interval)

for hit in result["hits"]["hits"]:
    data.append(hit['_source'])

scroll_id = result['_scroll_id']
scroll_size = result["hits"]["total"]["value"]

i = 0
# Stop when a scroll returns no hits or the iteration cap is reached
while scroll_size > 0 and i < iteration_count:

    print("Scrolling ({})...".format(i), scroll_size, iteration_count)

    result = es.scroll(scroll_id=scroll_id, scroll=scroll_interval)
    scroll_id = result["_scroll_id"]
    scroll_size = len(result['hits']['hits'])

    for hit in result["hits"]["hits"]:
        # list.append takes exactly one argument (ignore_index is a pandas option)
        data.append(hit['_source'])

    i += 1
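
For comparison, here is a rough sketch of the search_after route mentioned in the question. It is not a scroll query, so it may not meet the constraint above. A scroll context reflects the index as it was at the first search and expires after the keep-alive (5m here). A search_after cursor is only the sort values of the last hit, so it can be stored and reused later. The helper name fetch_batch and the sort fields created_at and doc_id are hypothetical. The sketch assumes every document has a sortable, effectively unique combination of such fields, and that new documents sort after existing ones.

```python
from typing import Any, Dict, List, Optional, Tuple


def fetch_batch(
    es: Any,
    index: str,
    batch_size: int,
    last_sort: Optional[List[Any]] = None,
) -> Tuple[List[Dict[str, Any]], Optional[List[Any]]]:
    """Fetch one batch of documents, resuming after `last_sort` if given.

    Returns the documents and the cursor to pass to the next call.
    """
    body: Dict[str, Any] = {
        "size": batch_size,
        "query": {"match_all": {}},
        # search_after needs a deterministic sort order.
        # "created_at" and "doc_id" are hypothetical field names.
        "sort": [{"created_at": "asc"}, {"doc_id": "asc"}],
    }
    if last_sort is not None:
        body["search_after"] = last_sort

    result = es.search(index=index, body=body)
    hits = result["hits"]["hits"]
    docs = [hit["_source"] for hit in hits]

    # Each hit carries the sort values used to order it; the values of the
    # last hit become the cursor. Keep the old cursor if the batch is empty.
    next_sort = hits[-1]["sort"] if hits else last_sort
    return docs, next_sort
```

With an es client like the one in the code above, the first call would be docs, cursor = fetch_batch(es, "demo", 20). Persisting cursor (for example as JSON) and passing it back as last_sort in a later run continues from the last document seen, and documents inserted in the meantime are picked up as long as they sort after it.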
