llama_index [Bug]:

iyfjxgzm  posted 5 months ago in Other

Bug description

When running an ingestion pipeline on FastAPI, I get RuntimeError: no running event loop. The errors come from the llamaindex asyncio_run function (from llama_index.core.async_utils import asyncio_run).
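For context, here is a minimal standard-library sketch of when Python raises this error. `loop_state` is a hypothetical helper written for this illustration and is not part of llama_index. The point is that `asyncio.get_running_loop()` only succeeds in a thread that is currently executing a loop, so the same call fails at module level and inside a worker thread.

```python
import asyncio


def loop_state():
    """Hypothetical helper: report whether this thread has a running loop."""
    try:
        asyncio.get_running_loop()
        return "running"
    except RuntimeError as exc:
        return str(exc)


async def check_from_coroutine():
    # Called directly from a coroutine, so a loop is running in this thread.
    in_coroutine = loop_state()
    # asyncio.to_thread runs the function in a worker thread, which has
    # no running loop of its own.
    in_worker_thread = await asyncio.to_thread(loop_state)
    return in_coroutine, in_worker_thread


at_module_level = loop_state()
in_coroutine, in_worker_thread = asyncio.run(check_from_coroutine())
```

Whether the pipeline in this report reaches `asyncio_run` from a worker thread is an assumption; the sketch only shows the conditions under which the error message appears.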

Version

0.10.41

Steps to reproduce

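  1. Create an endpoint on FastAPI that indexes documents
  2. Use IngestionPipeline to index the documents into a Pinecone vector store, with the following transformations:
     • SentenceSplitter
     • KeywordExtractor
  3. Call the indexing endpoint concurrently with aiohttp, using documents of about 1,000 words each
  4. The error appears frequently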

Relevant logs/traceback

Task exception was never retrieved
future: <Task finished name='Task-87799' coro=<AsyncClient.aclose() done, defined at D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpx\_client.py:2011> exception=RuntimeError('Event loop is closed')>
Traceback (most recent call last):
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\llama_index\core\async_utils.py", line 29, in asyncio_run
    loop = asyncio.get_running_loop()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: no running event loop

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpx\_client.py", line 2018, in aclose
    await self._transport.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpx\_transports\default.py", line 385, in aclose
    await self._pool.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpcore\_async\connection_pool.py", line 313, in aclose
    await self._close_connections(closing_connections)
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpcore\_async\connection_pool.py", line 305, in _close_connections
    await connection.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpcore\_async\connection.py", line 171, in aclose
    await self._connection.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpcore\_async\http11.py", line 265, in aclose
    await self._network_stream.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpcore\_backends\anyio.py", line 55, in aclose
    await self._stream.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\anyio\streams\tls.py", line 202, in aclose
    await self.transport_stream.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 1191, in aclose
    self._transport.close()
  File "C:\Python311\Lib\asyncio\proactor_events.py", line 109, in close
    self._loop.call_soon(self._call_connection_lost, None)
  File "C:\Python311\Lib\asyncio\base_events.py", line 762, in call_soon
    self._check_closed()
  File "C:\Python311\Lib\asyncio\base_events.py", line 520, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Task exception was never retrieved
future: <Task finished name='Task-87800' coro=<AsyncClient.aclose() done, defined at D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpx\_client.py:2011> exception=RuntimeError('Event loop is closed')>
Traceback (most recent call last):
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\llama_index\core\async_utils.py", line 29, in asyncio_run
    loop = asyncio.get_running_loop()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: no running event loop

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpx\_client.py", line 2018, in aclose
    await self._transport.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpx\_transports\default.py", line 385, in aclose
    await self._pool.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpcore\_async\connection_pool.py", line 313, in aclose
    await self._close_connections(closing_connections)
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpcore\_async\connection_pool.py", line 305, in _close_connections
    await connection.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpcore\_async\connection.py", line 171, in aclose
    await self._connection.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpcore\_async\http11.py", line 265, in aclose
    await self._network_stream.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\httpcore\_backends\anyio.py", line 55, in aclose
    await self._stream.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\anyio\streams\tls.py", line 202, in aclose
    await self.transport_stream.aclose()
  File "D:\PROJECT\aidoc-llm-service\.venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 1191, in aclose
    self._transport.close()
  File "C:\Python311\Lib\asyncio\proactor_events.py", line 109, in close
    self._loop.call_soon(self._call_connection_lost, None)
  File "C:\Python311\Lib\asyncio\base_events.py", line 762, in call_soon
    self._check_closed()
  File "C:\Python311\Lib\asyncio\base_events.py", line 520, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

wwwo4jvm #2

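Hey @upchunk, I'm here to help you with bugs, questions, or becoming a contributor. Let's work through this together!
To resolve the RuntimeError: no running event loop error when running an ingestion pipeline on FastAPI, you need to make sure an event loop is running. Here is an example of how to modify your code so that one is: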

import asyncio
from llama_index.core.ingestion.pipeline import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

async def run_pipeline(documents):
    pipeline = IngestionPipeline(
        transformations=[
            SentenceSplitter(chunk_size=512, chunk_overlap=20),
            OpenAIEmbedding(),
        ],
    )
    nodes = await pipeline.arun(documents=documents)
    return nodes

# Ensure an event loop is running
documents = [...]  # Your documents here
asyncio.run(run_pipeline(documents))

Alternatively, you can use the asyncio_run utility function provided by the llamaindex library, which handles the event loop more effectively:

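from llama_index.core.async_utils import asyncio_run
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding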

async def run_pipeline(documents):
    pipeline = IngestionPipeline(
        transformations=[
            SentenceSplitter(chunk_size=512, chunk_overlap=20),
            OpenAIEmbedding(),
        ],
    )
    nodes = await pipeline.arun(documents=documents)
    return nodes

# Ensure an event loop is running using llamaindex's asyncio_run
documents = [...]  # Your documents here
asyncio_run(run_pipeline(documents))

These approaches should help manage the event loop and resolve the RuntimeError: no running event loop error.

To continue talking to Dosu, mention @dosu.

55ooxyrt #3

I have tried both with and without asyncio_run, and the problem still occurs either way. I think the problem is in the asyncio_run helper function itself.


nwlls2ji #4

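If the problem persists even when using the asyncio_run helper, it may be due to how the event loop is managed in your FastAPI application. Here are a few steps to make sure the event loop is managed correctly and to avoid the RuntimeError: no running event loop error:

  1. Make sure the event loop is running:
    Make sure the event loop is managed correctly in FastAPI. FastAPI should handle the event loop for you, but if you run tasks manually, make sure they are properly awaited.

  2. Use nest_asyncio:
    If you run into nested event loops, especially in environments such as Jupyter notebooks, you can use nest_asyncio to allow nested async operations.

  3. Check for event loop closure:
    Make sure the event loop is not closed prematurely. This can happen if there are unhandled exceptions or if the loop is closed explicitly.

  4. Example integration with FastAPI:
    Here is an example of how to integrate IngestionPipeline with FastAPI so that the event loop is managed correctly: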

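from fastapi import FastAPI

app = FastAPI()

async def main():
    # Your code here
    pass

@app.get("/")
async def read_root():
    # FastAPI already runs this handler inside its event loop, so await the
    # coroutine directly; asyncio.run() would raise RuntimeError here.
    await main()

This setup lets FastAPI's own event loop run the coroutine by awaiting it directly. nest_asyncio is only needed when nested event loops cannot be avoided. If the problem persists, it may be worth checking whether other parts of your code close the event loop or leave it in an invalid state.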
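As a standard-library sketch of the difference between awaiting a coroutine and calling `asyncio.run()` from inside an async handler: the handler names below are hypothetical stand-ins for FastAPI endpoints, and `work` stands in for something like `pipeline.arun(...)`. The outer `asyncio.run()` calls play the role of the server's event loop.

```python
import asyncio


async def work():
    # Stand-in for the real async job, e.g. an ingestion pipeline run.
    await asyncio.sleep(0)
    return "done"


async def handler_with_asyncio_run():
    # A loop is already running here, so asyncio.run() refuses to start another.
    coro = work()
    try:
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # the coroutine was never scheduled; close it to avoid a warning
        return f"error: {exc}"


async def handler_with_await():
    # Awaiting reuses the loop that is already running.
    return await work()


run_result = asyncio.run(handler_with_asyncio_run())
await_result = asyncio.run(handler_with_await())
```

In this sketch `run_result` carries the error message while `await_result` is the value returned by `work`, which is why async endpoints should await the pipeline's async methods.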


hxzsmxv2 #5

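@upchunk Strange, that code is inside a try/except, so why isn't the exception being caught? 🙃
In any case, if you're using FastAPI, you should be using the pipeline's async entry points, if you aren't already.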
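One possible reading of the traceback, offered as a sketch and not as a diagnosis: the first RuntimeError may well have been caught, and it is printed only because a later exception was raised while the except block was still active. Python records the caught exception as `__context__` of the new one and prints it under "During handling of the above exception, another exception occurred". `run_sync` below is a hypothetical helper that mimics the try/except pattern; it is not the llama_index implementation.

```python
import asyncio


def run_sync(coro):
    """Hypothetical helper: run `coro` to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so start one. Anything raised
        # while this block is active records the caught RuntimeError as its
        # __context__.
        return asyncio.run(coro)
    coro.close()  # never scheduled, so close it to avoid a warning
    raise RuntimeError("a loop is already running; await the coroutine instead")


async def fails_later():
    raise ValueError("failure inside the coroutine")


try:
    run_sync(fails_later())
except ValueError as exc:
    # The caught "no running event loop" error travels along as context.
    first_error = exc.__context__
```

Under this reading, the "no running event loop" lines are context, and the error that actually escaped is the later "Event loop is closed" raised while httpx was closing its client.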


wgx48brx #6

I can't actually reproduce this:

from fastapi import FastAPI
from llama_index.core import Document
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core.extractors import KeywordExtractor

text = Document.example().text * 1000

app = FastAPI()

@app.get("/")
async def hello():
  return {"Hello": "World"}

@app.get("/test")
async def run_test():
  pipeline = IngestionPipeline(transformations=[
      SentenceSplitter(),
      KeywordExtractor(),
      OpenAIEmbedding(),
    ]
  )
  await pipeline.arun(documents=[Document(text=text), Document(text=text)])
  return "Complete"

if __name__ == "__main__":
  import uvicorn
  uvicorn.run(app)

ig9co6j1 #7

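I use aiohttp to send requests to the endpoint for 30,000 data sources (PDFs) stored on a cloud MongoDB server. Each item contains 1,000-2,000 words of text, and the requests are limited by asyncio.Semaphore(10) to avoid hitting the LLM rate limit. With async the whole run takes about 20 hours in total; without async it would take even longer.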
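The throttling described here can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: `index_document` and `index_all` are hypothetical names, and `asyncio.sleep` stands in for the real aiohttp request to the indexing endpoint.

```python
import asyncio

MAX_IN_FLIGHT = 10  # same limit as the semaphore described above


async def index_document(doc_id, sem, stats):
    # Only MAX_IN_FLIGHT coroutines can be inside this block at once.
    async with sem:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # placeholder for the real HTTP request
        stats["active"] -= 1
        return doc_id


async def index_all(n_docs):
    # Create the semaphore inside the running loop that will use it.
    sem = asyncio.Semaphore(MAX_IN_FLIGHT)
    stats = {"active": 0, "peak": 0}
    tasks = (index_document(i, sem, stats) for i in range(n_docs))
    results = await asyncio.gather(*tasks)
    return results, stats["peak"]


results, peak = asyncio.run(index_all(50))
```

All 50 coroutines are created up front, but the semaphore keeps the number of simultaneous requests at or below the limit, and `gather` returns the results in submission order.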
