llama_index [Bug]: Async search client not initialized when running azureaisearch with use_async=True

rjzwgtxy posted 3 months ago in Other

When I run the following code, I get an error:

from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.azureaisearch import (
    AzureAISearchVectorStore,
    IndexManagement,
)

# index_client, index_name, EMBED_SIZE and documents are defined elsewhere.
vector_store = AzureAISearchVectorStore(
    search_or_index_client=index_client,
    index_name=index_name,
    index_management=IndexManagement.CREATE_IF_NOT_EXISTS,
    id_field_key="id",
    chunk_field_key="chunk",
    embedding_field_key="embedding",
    embedding_dimensionality=EMBED_SIZE,
    metadata_string_field_key="metadata",
    doc_id_field_key="doc_id",
    language_analyzer="en.lucene",
    vector_algorithm_type="exhaustiveKnn",
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    show_progress=True,
    use_async=True,
)

The error message is:

Traceback (most recent call last):
File "/Users/xxx/Projects/xxx/src/index_file.py", line 205, in run_batch
VectorStoreIndex.from_documents(
File "/opt/homebrew/Caskroom/miniforge/base/envs/xxx/lib/python3.11/site-packages/llama_index/core/indices/base.py", line 145, in from_documents
return cls(
^^^^
File "/opt/homebrew/Caskroom/miniforge/base/envs/xxx/lib/python3.11/site-packages/llama_index/core/indices/vector_store/base.py", line 78, in __init__
super().__init__(
File "/opt/homebrew/Caskroom/miniforge/base/envs/xxx/lib/python3.11/site-packages/llama_index/core/indices/base.py", line 94, in __init__
index_struct = self.build_index_from_nodes(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniforge/base/envs/xxx/lib/python3.11/site-packages/llama_index/core/indices/vector_store/base.py", line 314, in build_index_from_nodes
return self._build_index_from_nodes(nodes, **insert_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniforge/base/envs/xxx/lib/python3.11/site-packages/llama_index/core/indices/vector_store/base.py", line 283, in _build_index_from_nodes
run_async_tasks(tasks)
File "/opt/homebrew/Caskroom/miniforge/base/envs/xxx/lib/python3.11/site-packages/llama_index/core/async_utils.py", line 77, in run_async_tasks
outputs: List[Any] = asyncio_run(_gather())
^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniforge/base/envs/xxx/lib/python3.11/site-packages/llama_index/core/async_utils.py", line 33, in asyncio_run
return loop.run_until_complete(coro)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniforge/base/envs/xxx/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniforge/base/envs/uambot/lib/python3.11/site-packages/llama_index/core/async_utils.py", line 75, in _gather
return await asyncio.gather(*tasks_to_execute)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniforge/base/envs/xxx/lib/python3.11/site-packages/llama_index/core/indices/vector_store/base.py", line 198, in _async_add_nodes_to_index
new_ids = await self._vector_store.async_add(nodes_batch, **insert_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniforge/base/envs/xxx/lib/python3.11/site-packages/llama_index/vector_stores/azureaisearch/base.py", line 768, in async_add
raise ValueError("Async Search client not initialized")
ValueError: Async Search client not initialized

mitkmikd1#

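To resolve the "Async Search client not initialized" error when running AzureAISearchVectorStore with use_async=True in version 0.10.67.post1, make sure the search_or_index_client you provide is an AsyncSearchClient or an AsyncSearchIndexClient. The error occurs because the store's async client was never initialized.
Here is a code snippet showing how to initialize AzureAISearchVectorStore with an async client: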

from azure.search.documents.aio import SearchClient as AsyncSearchClient
from azure.search.documents.indexes.aio import SearchIndexClient as AsyncSearchIndexClient
from llama_index.vector_stores.azureaisearch import AzureAISearchVectorStore

# endpoint, index_name and credential are defined elsewhere.

# Initialize the async search client
async_search_client = AsyncSearchClient(endpoint, index_name, credential)

# or
async_index_client = AsyncSearchIndexClient(endpoint, credential)

# Create the AzureAISearchVectorStore with the async client
vector_store = AzureAISearchVectorStore(
    search_or_index_client=async_search_client,  # or async_index_client
    id_field_key="id",
    chunk_field_key="chunk",
    embedding_field_key="embedding",
    metadata_string_field_key="metadata",
    doc_id_field_key="doc_id",
    use_async=True,
)


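Make sure the `search_or_index_client` you pass is an instance of `AsyncSearchClient` or `AsyncSearchIndexClient` to avoid the initialization error. In addition, if you are using `AsyncSearchIndexClient`, make sure to provide `index_name`, since it is required for the store to initialize correctly.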


bcs8qyzn2#

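It looks like the Azure vector store only supports being passed either a sync client or an async client, not both.
It should probably be updated to accept both? In the meantime, you can try passing in an async client.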
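
To make the failure mode concrete, here is a toy model of a store that keeps separate sync and async client slots and fills only the one matching the client it was given. Every class in it (`SyncClient`, `AsyncClient`, `ToyVectorStore`) is a hypothetical stand-in and not the llama_index or Azure SDK code; only the error message is taken from the traceback above.

```python
import asyncio
from typing import List, Optional, Union


class SyncClient:
    """Hypothetical stand-in for a synchronous search client."""

    def upload(self, docs: List[str]) -> List[str]:
        return [f"sync-{i}" for i, _ in enumerate(docs)]


class AsyncClient:
    """Hypothetical stand-in for an asynchronous search client."""

    async def upload(self, docs: List[str]) -> List[str]:
        return [f"async-{i}" for i, _ in enumerate(docs)]


class ToyVectorStore:
    """Toy store: only the slot matching the supplied client type is filled."""

    def __init__(self, client: Union[SyncClient, AsyncClient]) -> None:
        self._client: Optional[SyncClient] = None
        self._async_client: Optional[AsyncClient] = None
        if isinstance(client, AsyncClient):
            self._async_client = client
        else:
            self._client = client

    async def async_add(self, docs: List[str]) -> List[str]:
        # The message mirrors the traceback; the surrounding logic is assumed.
        if self._async_client is None:
            raise ValueError("Async Search client not initialized")
        return await self._async_client.upload(docs)


def try_async_add(store: ToyVectorStore) -> str:
    """Run async_add and return either the new ids or the error text."""
    try:
        return ",".join(asyncio.run(store.async_add(["a", "b"])))
    except ValueError as exc:
        return str(exc)


sync_result = try_async_add(ToyVectorStore(SyncClient()))
async_result = try_async_add(ToyVectorStore(AsyncClient()))
print(sync_result)
print(async_result)
```

In this model a store built from a sync client fails on the async path with the same message as in the report, while a store built from an async client succeeds.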

fhg3lkii3#

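Hi @logan-markewich, thanks for the reply. I tried using AsyncSearchClient before opening this issue, but it did not work. I will come back with more trace information...
Interestingly, this used to work, but I forgot to pin the llama_index version in my environment. The code ran fine without importing from .aio, so I guess some update implemented better async usage, but I have lost the working environment...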

wlp8pajw4#

I tried using the .aio module and now it works. I wonder whether something small was off in my environment setup, because it did not run on the first attempt.
However, I now get this unclosed client session warning, and I don't know how to handle it properly:
@dosu
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x172c65e50>
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x172c10d70>, 160881.12069375)]']
connector: <aiohttp.connector.TCPConnector object at 0x172c88810>
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x172c3f410>
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x172c103d0>, 160881.764186583)]']
connector: <aiohttp.connector.TCPConnector object at 0x172c3fe10>

oogrdqng5#

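To handle the unclosed client session and connector warnings when using AsyncSearchClient from the .aio module, make sure the async client is closed properly after use. You can use an async with statement to manage the async client's lifecycle.
Here is an example: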

from azure.search.documents.aio import SearchClient as AsyncSearchClient

async def perform_search():
    async with AsyncSearchClient(endpoint, index_name, credential) as client:
        results = await client.search(search_text="example")
        # Process results

With the async with statement, the client is closed properly when the block exits, so no client sessions or connectors are left unclosed (see https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/vector_stores/llama-index-vector-stores-azureaisearch/llama_index/vector_stores/azureaisearch/base.py).
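
The lifecycle point can be checked without the Azure SDK. The sketch below uses a hypothetical `DemoAsyncClient` (a stand-in, not a real library class) to show that `async with` closes the client on normal exit and when the block raises, and that an explicit `try`/`finally` with `close()` has the same effect.

```python
import asyncio
from typing import List


class DemoAsyncClient:
    """Hypothetical stand-in for an async client that owns a network session."""

    def __init__(self) -> None:
        self.closed = False

    async def search(self, search_text: str) -> List[str]:
        if self.closed:
            raise RuntimeError("client is closed")
        return [f"hit for {search_text}"]

    async def close(self) -> None:
        self.closed = True

    async def __aenter__(self) -> "DemoAsyncClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and when the block raises.
        await self.close()


async def with_context_manager() -> DemoAsyncClient:
    async with DemoAsyncClient() as client:
        await client.search("example")
    return client


async def with_explicit_close() -> DemoAsyncClient:
    # Same guarantee when the client has to outlive a single block.
    client = DemoAsyncClient()
    try:
        await client.search("example")
    finally:
        await client.close()
    return client


async def with_failure() -> DemoAsyncClient:
    client = DemoAsyncClient()
    try:
        async with client:
            raise ValueError("simulated indexing failure")
    except ValueError:
        pass
    return client


managed = asyncio.run(with_context_manager())
explicit = asyncio.run(with_explicit_close())
failed = asyncio.run(with_failure())
print(managed.closed, explicit.closed, failed.closed)
```

When the client is handed to a vector store and used across several calls, the explicit `try`/`finally` form may be the closer fit; whether that alone silences the aiohttp warnings in this setup is an assumption, not something confirmed in the thread.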


3phpmpom6#

@dosu This is the code I use to run indexing. Could you suggest how I should update it to use the logic you provided?

async_index_client = AsyncSearchIndexClient(
    endpoint=search_service_endpoint,
    credential=search_service_credential,
    api_version=search_service_api_version,
)

vector_store = AzureAISearchVectorStore(
    search_or_index_client=async_index_client,
    index_name=index_name,
    index_management=IndexManagement.CREATE_IF_NOT_EXISTS,
    id_field_key="id",
    chunk_field_key="chunk",
    embedding_field_key="embedding",
    embedding_dimensionality=EMBED_SIZE,
    metadata_string_field_key="metadata",
    doc_id_field_key="doc_id",
    language_analyzer="en.lucene",
    vector_algorithm_type="exhaustiveKnn",
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

for document in tqdm(documents, total=len(documents), desc="Processing files"):
    try:
        VectorStoreIndex.from_documents(
            [document],
            storage_context=storage_context,
            use_async=True,
        )
    except Exception as exc:
        # Assumed handler: the posted snippet ends at the call above.
        print(f"Failed to index document: {exc}")
