DeepSpeed-MII fails to load relatively large OPT models (opt-6.7b, opt-30b)

42fyovps · asked 6 months ago

Hi everyone, I'm new to DeepSpeed-MII and have tried the provided pipeline.py example several times.

At first, everything worked fine with smaller models such as opt-125m and opt-1.3b. However, with larger models such as opt-6.7b, loading the model fails.

To reproduce the issue, simply load the model with pipeline and do nothing else:

from mii import pipeline
pipe = pipeline("facebook/opt-6.7b")

It then prints the following error message:

[2023-11-15 02:42:20,499] [INFO] [huggingface_engine.py:86:parameters] Loading checkpoint: /root/.cache/huggingface/hub/models--facebook--opt-6.7b/snapshots/a45aa65bbeb77c1558bc99bedc6779195462dab0/pytorch_model-00001-of-00002.bi
Traceback (most recent call last):
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/model_implementations/inference_policy_base.py", line 66, in map_param
    self._non_transformer_params.set_dependency(name, parameter)
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/model_implementations/layer_container_base.py", line 283, in set_dependency
    raise ValueError(
ValueError: Could not find a mapping for dependency "decoder.embed_tokens.weight". Check that it is included in the ``MAPPING_PARAMS``. See docstring for more on ``MAPPING_PARAMS``

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pipeline.py", line 2, in <module>
    pipe = pipeline("facebook/opt-6.7b")
  File "/root/yufan/DeepSpeed-MII/mii/api.py", line 159, in pipeline
    inference_engine = load_model(model_config)
  File "/root/yufan/DeepSpeed-MII/mii/modeling/models.py", line 17, in load_model
    inference_engine = build_hf_engine(
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/engine_factory.py", line 46, in build_hf_engine
    return InferenceEngineV2(policy, engine_config)
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/engine_v2.py", line 65, in __init__
    self._model = self._policy.build_model(self._config, self._base_mp_group)
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/model_implementations/inference_policy_base.py", line 111, in build_model
    self.populate_model_parameters()
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/model_implementations/inference_policy_base.py", line 151, in populate_model_parameters
    container_map.map_param(name, parameter)
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/model_implementations/inference_policy_base.py", line 71, in map_param
    raise ValueError(f"Cannot find container for {name}, please double check the Containers/ContainerMap")
ValueError: Cannot find container for decoder.embed_tokens.weight, please double check the Containers/ContainerMap

My environment is built from a clean 11.8.0-cudnn8-devel-ubuntu22.04 Docker image, and I created a completely fresh Python 3.8.18 conda environment for DeepSpeed-MII, then installed it with pip install deepspeed-mii. Since the problem occurs while loading the model, I believe it is unrelated to the hardware.

Based on the error message, my hypothesis is that DeepSpeed-MII may have a bug when loading OPT models whose checkpoints are split across multiple bin files. It seems that after loading only one bin file, the model loader reports the model as incomplete and ignores the remaining bin files.
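
For reference, the index file of a sharded checkpoint records which parameter names live in which bin file; this is a minimal sketch for inspecting it (the snapshot hash in the path is a placeholder):

import json

# Placeholder path: point this at the cached snapshot on your machine.
index_path = "/root/.cache/huggingface/hub/models--facebook--opt-6.7b/snapshots/<snapshot>/pytorch_model.bin.index.json"

with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

# weight_map maps each parameter name to the .bin shard that stores it.
for name, shard in sorted(weight_map.items()):
    if "embed_tokens" in name:
        print(name, "->", shard)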

tez616oj #1

@MeloYang05 I was able to reproduce this error. It looks like the layer names in the checkpoints of some OPT models differ slightly. For example, in OPT-1.3b this layer is model.decoder.embed_tokens.weight; note the leading model. that OPT-6.7b lacks.
I'm working with the other DeepSpeed developers on a solution that supports both. I'll share an update when I can.
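
To illustrate the kind of normalization needed (a sketch only, not the actual DeepSpeed fix), stripping the optional model. prefix makes both checkpoint layouts resolve to the same container name:

def canonicalize(name: str) -> str:
    # Sketch only: OPT-1.3b checkpoints prefix parameter names with
    # "model." while OPT-6.7b checkpoints do not, so strip it if present.
    prefix = "model."
    return name[len(prefix):] if name.startswith(prefix) else name

assert canonicalize("model.decoder.embed_tokens.weight") == "decoder.embed_tokens.weight"
assert canonicalize("decoder.embed_tokens.weight") == "decoder.embed_tokens.weight"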

x8diyxa7 #2

Hi @MeloYang05, I have a fix for this error. We should now support all OPT model sizes except the 350m model, which differs from the others in a few ways that we will address in a future PR.
I'm waiting for the unit tests on this PR to pass: microsoft/DeepSpeed#4694
If you want to test it before it is merged:

pip uninstall deepspeed deepspeed-mii -y
pip install git+https://github.com/microsoft/deepspeed.git@mrwyattii/infv2-fix-OPT
pip install git+https://github.com/microsoft/deepspeed-mii.git
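
After installing from those branches, you can re-run the original repro to verify the fix; a minimal check (max_new_tokens is a standard MII generation argument):

from mii import pipeline

# Re-run the original repro against the branch builds installed above.
pipe = pipeline("facebook/opt-6.7b")
print(pipe(["DeepSpeed is"], max_new_tokens=32))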

cngwdvgl #3

Hi @mrwyattii, thanks for the quick reply! I'll run some benchmarks with the larger OPT models today.

56lgkhnf #4

Hi @mrwyattii. It looks like there are still some bugs with the opt-2.7b model. On my machine, loading opt-2.7b reports the following error:

Traceback (most recent call last):
  File "pipeline.py", line 32, in <module>
    pipe = pipeline(f"/root/yufan/models/{model_name}")
  File "/root/yufan/DeepSpeed-MII/mii/api.py", line 159, in pipeline
    inference_engine = load_model(model_config)
  File "/root/yufan/DeepSpeed-MII/mii/modeling/models.py", line 17, in load_model
    inference_engine = build_hf_engine(
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/engine_factory.py", line 110, in build_hf_engine
    return InferenceEngineV2(policy, engine_config)
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/engine_v2.py", line 83, in __init__
    self._model = self._policy.build_model(self._config, self._base_mp_group)
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/model_implementations/inference_policy_base.py", line 156, in build_model
    self.model = self.instantiate_model(engine_config, mp_group)
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/model_implementations/opt/policy.py", line 17, in instantiate_model
    return OPTInferenceModel(config=self._model_config, engine_config=engine_config, base_mp_group=mp_group)
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/model_implementations/inference_transformer_base.py", line 208, in __init__
    self.make_attn_layer()
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/model_implementations/inference_transformer_base.py", line 324, in make_attn_layer
    self.attn = heuristics.instantiate_attention(attn_config, self._engine_config)
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/modules/heuristics.py", line 53, in instantiate_attention
    return DSSelfAttentionRegistry.instantiate_config(config)
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/modules/module_registry.py", line 39, in instantiate_config
    return cls.registry[config_bundle.name](config_bundle.config, config_bundle.implementation_config)
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/modules/implementations/attention/dense_blocked_attention.py", line 79, in __init__
    self._kv_copy = LinearBlockedKVCopy(self._config.head_size, self._config.n_heads_q,
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/inference/v2/kernels/ragged_ops/linear_blocked_kv_rotary/linear_blocked_kv_copy.py", line 39, in __init__
    raise ValueError("Unsupported head size: {}, supported_head_sizes are {}".format(
ValueError: Unsupported head size: 80, supported_head_sizes are [64, 128]
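
The head size of 80 follows from opt-2.7b's architecture: per its config.json, hidden_size is 2560 and num_attention_heads is 32, so each head is 2560 / 32 = 80 dimensions, which is outside the kernel's supported sizes of 64 and 128:

# opt-2.7b: hidden_size=2560, num_attention_heads=32 (from config.json),
# so head_size = 2560 // 32 = 80, not in the supported list [64, 128].
hidden_size, n_heads = 2560, 32
print(hidden_size // n_heads)  # 80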

dba5bblo #5

@MeloYang05 - You are right, there is also a problem with the 2.7b model; I did not test against it. I also noticed that we currently do not support the 350m model. I will add support for both of these size variants in another PR soon. Thanks for your patience.
