Using the OpenSearch Python bulk API to insert data into multiple indices

Posted by r8uurelv on 2022-11-22 in ElasticSearch

This document shows how to insert bulk data into multiple indices with a POST request in curl: https://opensearch.org/docs/latest/opensearch/index-data/
Suppose I have data in this format:

[
{ "index": { "_index": "index-2022-06-08", "_id": "<id>" } }
{ "A JSON": "document" }
{ "index": { "_index": "index-2022-06-09", "_id": "<id>" } }
{ "A JSON": "document" }
{ "index": { "_index": "index-2022-06-10", "_id": "<id>" } }
{ "A JSON": "document" }
]

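The bulk request should take the index name from each action line, for example "_index": "index-2022-06-08".
I am trying to do the same thing with the opensearch-py library, but I cannot find any example snippet for it. I send the request from AWS Lambda in this form: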

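from opensearchpy import OpenSearch, RequestsHttpConnection, helpers

# host, awsauth, logs and index_name are defined elsewhere in the Lambda handler
client = OpenSearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

resp = helpers.bulk(client, logs, index=index_name, max_retries=3)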

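Here I have to pass index_name as a parameter to the bulk request, so the index name is not taken from the data itself. If I leave index_name out of the parameters, I get the error 4xx index_name missing.
I have also been reading the source code of the bulk helper: https://github.com/opensearch-project/opensearch-py/blob/main/opensearchpy/helpers/actions.py#L373
There, index_name does not appear to be a required parameter.
Can anyone help me figure out what I am missing?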

Source: https://stackoverflow.com/questions/72632710/using-opensearch-python-bulk-api-to-insert-data-to-multiple-indices

1 Answer

kyxcudwk1#

I ran into the same problem and found the solution in the bulk helpers documentation of elasticsearch-py.
Call the bulk helper like this:

resp = helpers.bulk(
    self.opensearch,
    actions,
    max_retries=3,
)

where actions is a list of dictionaries like the following:

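[{
    # 'index' inserts or replaces the document; an 'update' action
    # expects a 'doc' field instead of '_source'
    '_op_type': 'index',
    '_index': 'index-name',
    '_id': 42,
    '_source': {
        "title": "Hello World!",
        "body": "..."
    }
}]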
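The alternating action/document list from the question can be converted into this dictionary format before it is handed to helpers.bulk. The following is a minimal sketch, not part of the original answer: the function name pairs_to_actions is hypothetical, and it assumes every action line is a dict with exactly one key (the operation name) and that every operation except delete is followed by one body line.

```python
from typing import Any, Dict, Iterable, Iterator


def pairs_to_actions(lines: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Turn bulk-API style action/document lines into helper-style action dicts.

    Hypothetical helper for illustration only.
    """
    it = iter(lines)
    for header in it:
        # Each action line has a single key: the operation name.
        (op_type, meta), = header.items()
        action = {"_op_type": op_type, **meta}
        if op_type == "delete":
            # A delete action has no body line after it.
            yield action
            continue
        body = next(it)
        if op_type == "update":
            # An update body already carries keys such as "doc",
            # so it is merged at the top level.
            action.update(body)
        else:
            action["_source"] = body
        yield action


logs = [
    {"index": {"_index": "index-2022-06-08", "_id": "1"}},
    {"A JSON": "document"},
    {"index": {"_index": "index-2022-06-09", "_id": "2"}},
    {"A JSON": "document"},
]
actions = list(pairs_to_actions(logs))

# With a configured client, the result could then be sent without an
# index argument, for example:
# helpers.bulk(client, actions, max_retries=3)
```

Because each resulting dictionary carries its own '_index', no index argument should be needed on the helpers.bulk call.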

_op_type can be set as an additional field to define which operation should be performed for the document (index, update, delete, ...).
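As a rough illustration of how the action dictionary differs per operation, here is a small sketch. The make_action function and the VALID_OP_TYPES set are hypothetical names introduced for this example, and the index names and IDs are placeholders. It assumes that index and create carry the document under '_source', update carries a partial document under 'doc', and delete carries no body.

```python
from typing import Any, Dict, Optional

# Operation names accepted by the bulk API.
VALID_OP_TYPES = {"index", "create", "update", "delete"}


def make_action(
    op_type: str,
    index: str,
    doc_id: Any,
    body: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build one helper-style action dict (hypothetical helper)."""
    if op_type not in VALID_OP_TYPES:
        raise ValueError(f"unsupported op type: {op_type!r}")
    action: Dict[str, Any] = {"_op_type": op_type, "_index": index, "_id": doc_id}
    if op_type == "delete":
        # Delete needs only the metadata.
        return action
    if body is None:
        raise ValueError(f"{op_type!r} requires a body")
    if op_type == "update":
        # Partial document to merge into the existing one.
        action["doc"] = body
    else:
        # Full document for index/create.
        action["_source"] = body
    return action


actions = [
    make_action("index", "index-2022-06-08", "1", {"title": "Hello World!"}),
    make_action("update", "index-2022-06-09", "2", {"title": "Updated title"}),
    make_action("delete", "index-2022-06-10", "3"),
]
```

Each action names its own target index, so a single bulk call can mix operations across several indices.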
I hope this helps anyone who runs into the same problem!

2022-11-22
