python: How to replace a PyTorch model layer's tensor with another tensor of the same shape in a Huggingface model?

Posted by izj3ouym on 2023-01-24 in Python


Given a Huggingface model, for example:

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased", num_labels=2)

I can access a layer's tensor like this:

# Shape [1024, 1024]
model.state_dict()["bert.encoder.layer.0.attention.self.query.weight"]

[out]:

tensor([[ 0.0167, -0.0422, -0.0425,  ...,  0.0302, -0.0341,  0.0251],
        [ 0.0323,  0.0347, -0.0041,  ..., -0.0722,  0.0031, -0.0351],
        [ 0.0387, -0.0293, -0.0694,  ...,  0.0492,  0.0201, -0.0727],
        ...,
        [ 0.0035,  0.0081, -0.0337,  ...,  0.0460,  0.0268,  0.0747],
        [ 0.0513,  0.0131,  0.0735,  ..., -0.0127,  0.0144, -0.0400],
        [ 0.0385,  0.0013, -0.0272,  ...,  0.0148,  0.0399,  0.0339]])

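Suppose I also have another tensor of the same shape that I have defined in advance. In this example I create a random tensor for illustration, but it could be any predefined tensor.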

import torch
replacement_layer = torch.rand([1024, 1024])

Note: my goal is not to replace the layer with a random tensor, but to replace it with a predefined tensor.

When I try to replace the layer's tensor through state_dict(), it does not seem to work:

import torch
from transformers import AutoModelForSequenceClassification

# The model with a layer that we want to replace.
model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased", num_labels=2)

# A replacement layer.
replacement_layer = torch.rand([1024, 1024])

# Replacing the layer in the statedict.
model.state_dict()["bert.encoder.layer.0.attention.self.query.weight"] = replacement_layer

# Check that the layer is replaced. No, it is not =(
assert torch.equal(
    model.state_dict()["bert.encoder.layer.0.attention.self.query.weight"], 
    replacement_layer)

How can I replace a PyTorch model layer's tensor with another tensor of the same shape in a Huggingface model?


Source: https://stackoverflow.com/questions/73635388/how-to-replace-pytorch-model-layers-tensor-with-another-layer-of-same-shape-in

2 Answers

pxyaymoc1#

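The state_dict is a bit special: each call to state_dict() builds a new dictionary describing the model's parameters, so assigning to one of its keys changes only that temporary dictionary, not the model itself.
You can access the model's layers directly with dot notation. Note that the 0 in the key is an index into the list of encoder layers, not an attribute name, so it is written layer[0]. You also need to wrap the tensor in a torch.nn.Parameter for it to work in the model.
So this should work: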

model.bert.encoder.layer[0].attention.self.query.weight = torch.nn.Parameter(replacement_layer)
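
To make the difference concrete, here is a minimal sketch that uses a small torch.nn.Linear as a hypothetical stand-in for the BERT model (so nothing has to be downloaded). It illustrates, under that assumption, that assigning into the dictionary returned by state_dict() leaves the module untouched, while assigning a Parameter to the module attribute takes effect.

```python
import torch
from torch import nn

# Hypothetical stand-in for the Huggingface model: one small linear layer.
model = nn.Linear(4, 4)
replacement = torch.rand(4, 4)

# Every call to state_dict() builds a new OrderedDict.
sd_a = model.state_dict()
sd_b = model.state_dict()
dicts_are_distinct = sd_a is not sd_b

# Rebinding a key only changes that one dictionary; the module is untouched.
sd_a["weight"] = replacement
changed_by_dict_assignment = torch.equal(model.weight, replacement)

# The dictionary values still share storage with the live parameters,
# which is why in-place writes (rather than rebinding a key) can reach the model.
shares_storage = sd_b["weight"].data_ptr() == model.weight.data_ptr()

# Assigning a Parameter on the module itself does replace the weight.
model.weight = nn.Parameter(replacement)
changed_by_attr_assignment = torch.equal(model.weight, replacement)

print(dicts_are_distinct, changed_by_dict_assignment,
      shares_storage, changed_by_attr_assignment)
```

The same reasoning should carry over to model.bert.encoder.layer[0].attention.self.query.weight, since that is an ordinary parameter of a torch.nn.Linear submodule.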

Or, in full:

# Note I used the base model for testing
import torch
from transformers import AutoModelForSequenceClassification

# The model with a layer that we want to replace.
model: torch.nn.Module = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# A replacement layer.
replacement_layer = torch.rand([768, 768])

model.bert.encoder.layer[0].attention.self.query.weight = torch.nn.Parameter(replacement_layer)

# Check that the layer is replaced
assert torch.equal(
    model.state_dict()["bert.encoder.layer.0.attention.self.query.weight"],
    replacement_layer)

assert torch.equal(
    model.bert.encoder.layer[0].attention.self.query.weight,
    replacement_layer)

print("Success!")
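
One caveat with assigning a fresh torch.nn.Parameter: an optimizer created before the assignment still holds a reference to the old Parameter object. A possible alternative, sketched below on a small torch.nn.Linear stand-in (an assumption for illustration, not the original model), is to copy the new values in place under torch.no_grad(), which keeps the same Parameter object. The SGD optimizer here is likewise only a hypothetical example.

```python
import torch
from torch import nn

# Hypothetical stand-in for the Huggingface model and its training setup.
model = nn.Linear(4, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
replacement = torch.rand(4, 4)

param_before = model.weight

# Overwrite the values in place; no_grad() keeps autograd from tracking the copy.
with torch.no_grad():
    model.weight.copy_(replacement)

# The Parameter object is unchanged, only its contents differ.
same_object = model.weight is param_before
values_match = torch.equal(model.weight, replacement)

# The optimizer created earlier still points at the live parameter.
optimizer_sees_param = any(
    p is model.weight
    for group in optimizer.param_groups
    for p in group["params"]
)

print(same_object, values_match, optimizer_sees_param)
```

For the model in the question, the equivalent call would be model.bert.encoder.layer[0].attention.self.query.weight.copy_(replacement_layer) inside the same no_grad() block.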

Answered 2023-01-24

waxmsbnn2#

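Update a copy of the state_dict (an ordered dict), then load the updated state_dict back into the model with load_state_dict(), which copies the values into the model's existing parameters.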

import torch
from transformers import AutoModelForSequenceClassification

# The model with a layer that we want to replace.
model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased", num_labels=2)

# A replacement layer.
replacement_layer = torch.rand([1024, 1024])

# Get a (shallow) copy of the state_dict.
state_dict_copy = model.state_dict().copy()

# Replace the specific entry with new data of the same shape as the old tensor.
state_dict_copy["bert.encoder.layer.0.attention.self.query.weight"] = replacement_layer

# Load the updated state_dict back into the model.
model.load_state_dict(state_dict_copy)

# Check that the layer is replaced. This time the assertion passes.
assert torch.equal(
    model.state_dict()["bert.encoder.layer.0.attention.self.query.weight"],
    replacement_layer)
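
A variation on this approach, sketched here on a small torch.nn.Sequential stand-in (an assumption, to avoid downloading the real model), is to pass only the entry you want to change and set strict=False, so that load_state_dict() ignores the keys that are absent from the dictionary. The returned object reports which keys were missing or unexpected, which is a handy check that the key name was spelled correctly.

```python
import torch
from torch import nn

# Hypothetical stand-in for the Huggingface model: two small linear layers.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
replacement = torch.rand(4, 4)

# Load just one entry; strict=False tolerates the keys we did not provide.
result = model.load_state_dict({"0.weight": replacement}, strict=False)

# Keys of the model that were not in our partial dictionary.
missing = sorted(result.missing_keys)
# Keys in our dictionary that the model does not have (a typo would show up here).
unexpected = list(result.unexpected_keys)

# load_state_dict() copies values, so the model does not alias `replacement`.
values_match = torch.equal(model[0].weight, replacement)
is_a_copy = model[0].weight.data_ptr() != replacement.data_ptr()

print(missing, unexpected, values_match, is_a_copy)
```

With the model from the question, the partial dictionary would contain the single key "bert.encoder.layer.0.attention.self.query.weight".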

Answered 2023-01-24