DeepSpeed-MII cannot run Yi-34B-Chat => ValueError: Unsupported q_ratio: 7

Posted by ctrmrzij 4 months ago in Other

Hello, DeepSpeed team!

Thank you for your hard work!

As the title says, the "01-ai/Yi-34B-Chat" model fails to run with DeepSpeed-MII version 0.2.3. The error message is as follows:

[rank0]: Traceback (most recent call last):
 [rank0]: File "/workspaces/deepspeedmiienv/src/mii_serv.py", line 16, in <module>
 [rank0]: main()
 [rank0]: File "/workspaces/deepspeedmiienv/src/mii_serv.py", line 6, in main
 [rank0]: pipe = mii.pipeline("01-ai/Yi-34B-Chat")
 [rank0]: File "/usr/local/lib/python3.10/dist-packages/mii/api.py", line 207, in pipeline
 [rank0]: inference_engine = load_model(model_config)
 [rank0]: File "/usr/local/lib/python3.10/dist-packages/mii/modeling/models.py", line 17, in load_model
 [rank0]: inference_engine = build_hf_engine(
 [rank0]: File "/usr/local/lib/python3.10/dist-packages/deepspeed/inference/v2/engine_factory.py", line 129, in build_hf_engine
 [rank0]: return InferenceEngineV2(policy, engine_config)
 [rank0]: File "/usr/local/lib/python3.10/dist-packages/deepspeed/inference/v2/engine_v2.py", line 83, in __init__
 [rank0]: self._model = self._policy.build_model(self._config, self._base_mp_group)
 [rank0]: File "/usr/local/lib/python3.10/dist-packages/deepspeed/inference/v2/model_implementations/inference_policy_base.py", line 156, in build_model
 [rank0]: self.model = self.instantiate_model(engine_config, mp_group)
 [rank0]: File "/usr/local/lib/python3.10/dist-packages/deepspeed/inference/v2/model_implementations/llama_v2/policy.py", line 17, in instantiate_model
 [rank0]: return Llama2InferenceModel(config=self._model_config, engine_config=engine_config, base_mp_group=mp_group)
 [rank0]: File "/usr/local/lib/python3.10/dist-packages/deepspeed/inference/v2/model_implementations/inference_transformer_base.py", line 217, in __init__
 [rank0]: self.make_attn_layer()
 [rank0]: File "/usr/local/lib/python3.10/dist-packages/deepspeed/inference/v2/model_implementations/inference_transformer_base.py", line 334, in make_attn_layer
 [rank0]: self.attn = heuristics.instantiate_attention(attn_config, self._engine_config)
 [rank0]: File "/usr/local/lib/python3.10/dist-packages/deepspeed/inference/v2/modules/heuristics.py", line 53, in instantiate_attention
 [rank0]: return DSSelfAttentionRegistry.instantiate_config(config)
 [rank0]: File "/usr/local/lib/python3.10/dist-packages/deepspeed/inference/v2/modules/module_registry.py", line 39, in instantiate_config
 [rank0]: return cls.registry[config_bundle.name](config_bundle.config, config_bundle.implementation_config)
 [rank0]: File "/usr/local/lib/python3.10/dist-packages/deepspeed/inference
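
A note on where the 7 may come from. Assuming `q_ratio` in the error denotes the number of query heads that share each key/value head under grouped-query attention (an interpretation the truncated traceback does not confirm), the value 7 is consistent with a model that has 56 attention heads and 8 key/value heads. Those head counts are assumed here from the published Yi-34B configuration and should be checked against the model's own `config.json`. The helper below is a hypothetical illustration in plain Python, not a DeepSpeed or MII API:

```python
def q_ratio(num_attention_heads: int, num_key_value_heads: int) -> int:
    """Return how many query heads share each key/value head.

    Hypothetical helper for illustration only; it is not part of
    DeepSpeed or DeepSpeed-MII.
    """
    if num_key_value_heads <= 0:
        raise ValueError("num_key_value_heads must be positive")
    if num_attention_heads % num_key_value_heads != 0:
        raise ValueError("query heads must be a multiple of key/value heads")
    return num_attention_heads // num_key_value_heads


# Assumed head counts for Yi-34B; verify against the model's config.json.
ASSUMED_YI_34B_ATTENTION_HEADS = 56
ASSUMED_YI_34B_KV_HEADS = 8

if __name__ == "__main__":
    ratio = q_ratio(ASSUMED_YI_34B_ATTENTION_HEADS, ASSUMED_YI_34B_KV_HEADS)
    print(f"q_ratio = {ratio}")
```

If the assumption holds, the failure would be specific to this head layout rather than to the model weights, which is worth mentioning when reporting the issue.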
