I merged an 8x7b model with a LoRA adapter and saved it with torch.save(model.state_dict(), 'path_to_model.pt'). However, when I run inference on the merged model with vLLM, I hit this error:
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/vllm/entrypoints/llm.py", line 93, in __init__
self.llm_engine = LLMEngine.from_engine_args(engine_args)
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 246, in from_engine_args
engine = cls(*engine_configs,
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 107, in __init__
self._init_workers_ray(placement_group)
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 194, in _init_workers_ray
self._run_workers(
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 750, in _run_workers
self._run_workers_in_batch(workers, method, *args, **kwargs))
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 727, in _run_workers_in_batch
all_outputs = ray.get(all_outputs)
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
return fn(*args, **kwargs)
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
return func(*args, **kwargs)
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/ray/_private/worker.py", line 2624, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(KeyError): ray::RayWorkerVllm.execute_method() (pid=2596933, ip=192.254.110.7, actor_id=afac0d35c8217a762419a5cc01000000, repr=<vllm.engine.ray_utils.RayWorkerVllm object at 0x7efd70ee22e0>)
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/vllm/engine/ray_utils.py", line 32, in execute_method
return executor(*args, **kwargs)
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/vllm/worker/worker.py", line 72, in load_model
self.model_runner.load_model()
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/vllm/worker/model_runner.py", line 36, in load_model
self.model = get_model(self.model_config)
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/vllm/model_executor/model_loader.py", line 124, in get_model
model.load_weights(model_config.model, model_config.download_dir,
File "/home/zhh/miniconda3/envs/vllm/lib/python3.9/site-packages/vllm/model_executor/models/mixtral.py", line 525, in load_weights
param = params_dict[name]
KeyError: 'model.embed_tokens.weight'
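The KeyError means the parameter names in the saved checkpoint do not match the names vLLM builds into its `params_dict` from the HF Mixtral architecture. A common cause (assumed here, not confirmed by the traceback) is that `torch.save(model.state_dict(), ...)` was called on a still-wrapped peft model, so every key carries a `base_model.model.` prefix; the usual fix is to call peft's `merge_and_unload()` and then `save_pretrained()` instead of `torch.save`. A minimal, self-contained sketch of the mismatch using plain dicts (the prefix and tensor placeholders are illustrative):

```python
# Names vLLM expects (taken from the traceback's failing lookup).
params_dict = {"model.embed_tokens.weight": None, "lm_head.weight": None}

# Names as they might appear in a torch.save() dump of a peft-wrapped
# model (hypothetical prefix; inspect your own .pt file's keys to confirm).
checkpoint = {
    "base_model.model.model.embed_tokens.weight": "tensor-0",
    "base_model.model.lm_head.weight": "tensor-1",
}

def strip_peft_prefix(name: str, prefix: str = "base_model.model.") -> str:
    """Map a peft-wrapped parameter name back to its bare HF name."""
    return name[len(prefix):] if name.startswith(prefix) else name

# A direct lookup fails exactly like the traceback above.
try:
    params_dict[next(iter(checkpoint))]
except KeyError as exc:
    print("KeyError:", exc)

# After stripping the wrapper prefix, every key resolves.
for name, tensor in checkpoint.items():
    params_dict[strip_peft_prefix(name)] = tensor
print(sorted(params_dict))
```

Renaming keys by hand is a workaround; saving the merged model in the standard HF directory format avoids the mismatch entirely.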
5 answers

cig3rfwq1#
Same here.
xxls0lw82#
I have the same problem. Is there a known fix?
cidc1ykv3#
Please share a code snippet so we can see what is going on.
b09cbbtk4#
dgsult0t5#
I was using the image from ghcr.io/mistralai/mistral-src/vllm:latest, which is two months old. I switched to the vllm/vllm-openai:latest image and it works with safetensors files.
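Since the working setup loads safetensors files, it can help to verify that the model directory is actually in the HF checkpoint layout vLLM loads from, rather than a bare .pt file. A rough stdlib-only check (the file-name patterns follow the standard HF conventions; the helper itself is illustrative, not a vLLM API):

```python
import glob
import os
import tempfile

def looks_like_hf_checkpoint(model_dir: str) -> bool:
    """True if the directory has config.json plus *.safetensors or
    pytorch_model*.bin weight shards, i.e. the layout vLLM can load."""
    has_config = os.path.isfile(os.path.join(model_dir, "config.json"))
    has_weights = bool(
        glob.glob(os.path.join(model_dir, "*.safetensors"))
        or glob.glob(os.path.join(model_dir, "pytorch_model*.bin"))
    )
    return has_config and has_weights

# Demo: an empty dir fails; one with config.json + a safetensors shard passes.
with tempfile.TemporaryDirectory() as d:
    empty_ok = looks_like_hf_checkpoint(d)
    open(os.path.join(d, "config.json"), "w").close()
    open(os.path.join(d, "model-00001-of-00019.safetensors"), "w").close()
    full_ok = looks_like_hf_checkpoint(d)
print(empty_ok, full_ok)
```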