Describe the bug
A clear and concise description of what the bug is.
To Reproduce
To help us reproduce this bug, please provide the following information:
- Your Python version.
- The version of xinference you are using.
- Versions of crucial packages.
- Full stack trace of the error.
- Minimized code to reproduce the error.
2024-02-28 03:45:45,757 xinference.api.restful_api 188628 ERROR Chat completion stream got an error: [address=0.0.0.0:43203, pid=188725] probability tensor contains either `inf`, `nan` or element < 0
Traceback (most recent call last):
  File "/new_data2/xuyeqin-data/projects/inference/xinference/api/restful_api.py", line 1257, in stream_results
    async for item in iterator:
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/api.py", line 340, in __anext__
    return await self._actor_ref.__xoscar_next__(self._uid)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
    ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message) # type: ignore
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
    ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/api.py", line 431, in __xoscar_next__
    raise e
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/api.py", line 417, in __xoscar_next__
    r = await asyncio.to_thread(_wrapper, gen)
    ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
    ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/xoscar/api.py", line 402, in _wrapper
    return next(_gen)
  File "/new_data2/xuyeqin-data/projects/inference/xinference/core/model.py", line 257, in _to_json_generator
    for v in gen:
  File "/new_data2/xuyeqin-data/projects/inference/xinference/model/llm/utils.py", line 470, in _to_chat_completion_chunks
    for i, chunk in enumerate(chunks):
    ^^^^^^^^^^^^^^^^^
  File "/new_data2/xuyeqin-data/projects/inference/xinference/model/llm/pytorch/core.py", line 253, in generator_wrapper
    for completion_chunk, completion_usage in generate_stream(
    ^^^^^^^^^^^^^^^^^
  File "/home/xuyeqin/miniconda3/miniconda/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
    ^^^^^^^^^^^^^^^^^
  File "/new_data2/xuyeqin-data/projects/inference/xinference/model/llm/pytorch/utils.py", line 214, in generate_stream
    indices = torch.multinomial(probs, num_samples=2)
    ^^^^^^^^^^^^^^^^^
RuntimeError: [address=0.0.0.0:43203, pid=188725] probability tensor contains either `inf`, `nan` or element < 0
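The failure comes from the torch.multinomial call in generate_stream: by the time sampling runs, the probability tensor already contains inf, nan, or a negative value. The sketch below is illustrative only (it is not code from xinference); it reproduces the same RuntimeError with a hand-built tensor and shows one defensive way to sanitize the probabilities before sampling.

import torch

# Illustrative only: torch.multinomial rejects any probability tensor that
# contains inf, nan, or a negative entry.
probs = torch.tensor([0.5, float("nan"), 0.5])
try:
    torch.multinomial(probs, num_samples=2)
except RuntimeError as err:
    print(err)  # probability tensor contains either `inf`, `nan` or element < 0

# One defensive option: zero out invalid entries, and fall back to a uniform
# distribution if nothing valid remains.
probs = torch.nan_to_num(probs, nan=0.0, posinf=0.0, neginf=0.0).clamp(min=0.0)
if probs.sum() <= 0:
    probs = torch.ones_like(probs)
indices = torch.multinomial(probs, num_samples=2)
print(indices)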
Expected behavior
A clear and concise description of what you expected to happen.
Additional context
Add any other context about the problem here.
2 answers
nwsw7zdq1#
qwen1.5 GPTQ int8 works fine with torch == 2.1.2, but this error appears with torch == 2.2.0.
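If the regression really does track the torch upgrade, a quick check of the installed version shows which side of that boundary an environment is on. This is only a hedged sketch, not part of xinference; the 2.2.0 threshold comes from the report above.

import torch
from packaging import version

# Warn when the installed torch is 2.2.0 or newer, the version reported above
# to break qwen1.5 GPTQ int8 sampling; 2.1.2 was reported to work.
if version.parse(torch.__version__) >= version.parse("2.2.0"):
    print(f"torch {torch.__version__} installed; consider downgrading, "
          "e.g. pip install torch==2.1.2")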
rjzwgtxy2#
Similar issue: #733.