vllm Fatal Python error: Segmentation fault

yfjy0ee7 posted 2 months ago in Python

*** SIGSEGV received at time=1709021359 on cpu 44 ***

PC: @ 0x7f3c5f628350 (unknown) (unknown)
@ 0x7f3c945c8630 (unknown) (unknown)
@ 0x55b11321eaf0 1247139872 (unknown)
... and at least 2 more frames
[2024-02-27 16:09:19,106 E 19394 19394] logging.cc:361: *** SIGSEGV received at time=1709021359 on cpu 44 ***
[2024-02-27 16:09:19,108 E 19394 19394] logging.cc:361: PC: @ 0x7f3c5f628350 (unknown) (unknown)
[2024-02-27 16:09:19,108 E 19394 19394] logging.cc:361: @ 0x7f3c945c8630 (unknown) (unknown)
[2024-02-27 16:09:19,110 E 19394 19394] logging.cc:361: @ 0x55b11321eaf0 1247139872 (unknown)
[2024-02-27 16:09:19,110 E 19394 19394] logging.cc:361: @ ... and at least 2 more frames
Fatal Python error: Segmentation fault
Stack (most recent call first):
File "/root/.conda/envs/py39/lib/python3.9/site-packages/torch/cuda/graphs.py", line 77 in capture_begin
File "/root/.conda/envs/py39/lib/python3.9/site-packages/torch/cuda/graphs.py", line 192 in __enter__
File "/root/.conda/envs/py39/lib/python3.9/site-packages/vllm/worker/model_runner.py", line 782 in capture
File "/root/.conda/envs/py39/lib/python3.9/site-packages/vllm/worker/model_runner.py", line 725 in capture_model
File "/root/.conda/envs/py39/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115 in decorate_context
File "/root/.conda/envs/py39/lib/python3.9/site-packages/vllm/worker/worker.py", line 160 in warm_up_model
File "/root/.conda/envs/py39/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 1006 in _run_workers
File "/root/.conda/envs/py39/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 360 in _init_cache
File "
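
The traceback above is in the format Python's `faulthandler` prints, with the most recent call first. To make it easier to see where the crash happened, here is a minimal sketch that pulls the frames out of such a dump. The helper names `parse_faulthandler_frames` and `innermost_frame_in` are hypothetical and only for illustration, and the sample text is a shortened copy of the frames from this report.

```python
import re

# faulthandler prints one frame per line as: File "<path>", line <N> in <function>
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+) in (?P<func>\S+)')


def parse_faulthandler_frames(text):
    """Return (path, line, function) tuples in dump order (most recent call first)."""
    frames = []
    for raw in text.splitlines():
        m = FRAME_RE.search(raw)
        if m:
            frames.append((m.group("path"), int(m.group("line")), m.group("func")))
    return frames


def innermost_frame_in(frames, package):
    """Return the first (innermost) frame located under site-packages/<package>/, or None."""
    marker = f"/site-packages/{package}/"
    for frame in frames:
        if marker in frame[0]:
            return frame
    return None


# Shortened copy of the traceback from this report (illustrative input only).
SAMPLE = '''\
Fatal Python error: Segmentation fault
Stack (most recent call first):
  File "/root/.conda/envs/py39/lib/python3.9/site-packages/torch/cuda/graphs.py", line 77 in capture_begin
  File "/root/.conda/envs/py39/lib/python3.9/site-packages/vllm/worker/model_runner.py", line 782 in capture
  File "/root/.conda/envs/py39/lib/python3.9/site-packages/vllm/worker/model_runner.py", line 725 in capture_model
'''

frames = parse_faulthandler_frames(SAMPLE)
print(frames[0][2])                            # function where the segfault was raised
print(innermost_frame_in(frames, "vllm")[1:])  # innermost vLLM frame: (line, function)
```

On this input the innermost frame is `capture_begin` in `torch/cuda/graphs.py`, reached from vLLM's `capture_model` during `warm_up_model`, so the segfault occurs while CUDA graphs are being captured. One thing that may be worth trying, as an assumption not confirmed anywhere in this thread, is constructing the engine with vLLM's `enforce_eager=True` option, which skips CUDA graph capture.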
