While load-testing a vLLM-served model with multiple concurrent threads, the service logs the error below.
Model: Qwen2-72B-GPTQ-Int4
Error message: Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/sse_starlette/sse.py", line 281, in __call__
    await wrap(partial(self.listen_for_disconnect, receive))
  File "/usr/local/lib/python3.10/dist-packages/sse_starlette/sse.py", line 270, in wrap
    await func()
  File "/usr/local/lib/python3.10/dist-packages/sse_starlette/sse.py", line 221, in listen_for_disconnect
    message = await receive()
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 580, in receive
    await self.message_event.wait()
  File "/usr/lib/python3.10/asyncio/locks.py", line 214, in wait
    await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7f81f02e3be0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 419, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette
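For context, the first traceback appears to be the normal shutdown path rather than the root fault: sse_starlette runs listen_for_disconnect() inside an anyio cancel scope while it blocks on receive(), and when the stream ends (or a client thread drops the connection mid-stream during a load test) the scope cancels that pending wait, which surfaces as asyncio.exceptions.CancelledError. A minimal sketch of that mechanism (the function names are illustrative, not sse_starlette's actual code):

```python
import asyncio

async def listen_for_disconnect(event: asyncio.Event) -> None:
    # Stand-in for `await receive()` blocking on message_event.wait()
    # in uvicorn's httptools_impl.receive().
    await event.wait()

async def demo() -> str:
    event = asyncio.Event()
    task = asyncio.create_task(listen_for_disconnect(event))
    await asyncio.sleep(0)  # let the listener reach its await and suspend
    task.cancel()           # what the cancel scope does when the stream ends
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled"
    return "finished"

print(asyncio.run(demo()))  # → cancelled
```

The second traceback is what matters: the server re-raised something while handling that cancellation, which is why it is logged as "Exception in ASGI application".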
3 answers
gr8qqesn1#
I hit the same problem with vllm==0.4.0 and Llama-2-13b.
zazmityj2#
Same here.
mlmc2os53#
Same on 0.5.0.post1 with Qwen1.5-14B, though the error does not appear every time.
It seems to be some bug triggered when prefix caching is enabled.
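If the failures really do correlate with prefix caching (an assumption based on the observation above, not a confirmed diagnosis), one way to rule it out is to launch the server without that opt-in flag. A sketch, assuming the OpenAI-compatible entrypoint; the model path and port are placeholders:

```shell
# Prefix caching is opt-in in vLLM: simply omit --enable-prefix-caching
# when starting the server to check whether the error still occurs.
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2-72B-Instruct-GPTQ-Int4 \
    --port 8000
```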