Given a Huggingface model, for example:
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased", num_labels=2)
I can access a layer's tensor like this:
# Shape [1024, 1024]
model.state_dict()["bert.encoder.layer.0.attention.self.query.weight"]
[out]:
tensor([[ 0.0167, -0.0422, -0.0425, ..., 0.0302, -0.0341, 0.0251],
[ 0.0323, 0.0347, -0.0041, ..., -0.0722, 0.0031, -0.0351],
[ 0.0387, -0.0293, -0.0694, ..., 0.0492, 0.0201, -0.0727],
...,
[ 0.0035, 0.0081, -0.0337, ..., 0.0460, 0.0268, 0.0747],
[ 0.0513, 0.0131, 0.0735, ..., -0.0127, 0.0144, -0.0400],
[ 0.0385, 0.0013, -0.0272, ..., 0.0148, 0.0399, 0.0339]])
Given another tensor of the same shape that I have defined in advance (in this example I create a random tensor for illustration, but it could be any predefined tensor):
import torch
replacement_layer = torch.rand([1024, 1024])
Note: I am not asking how to replace a layer with a random tensor, but how to replace it with a predefined tensor.
When I try to replace the layer tensor through state_dict(), it doesn't seem to work:
import torch
from transformers import AutoModelForSequenceClassification
# The model with a layer that we want to replace.
model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased", num_labels=2)
# A replacement layer.
replacement_layer = torch.rand([1024, 1024])
# Replacing the layer in the statedict.
model.state_dict()["bert.encoder.layer.0.attention.self.query.weight"] = replacement_layer
# Check that the layer is replaced. No, it is not =(
assert torch.equal(
model.state_dict()["bert.encoder.layer.0.attention.self.query.weight"],
replacement_layer)
How can I replace the tensor of a PyTorch model layer in a Huggingface model with another tensor of the same shape?
2 Answers
pxyaymoc1#
The state_dict is a special thing: it is a dynamically generated copy, not the actual contents of the model, if that makes sense.
You can access the model's layers directly using dot notation. Note that 0 here is usually an index, not a string key. You also need to wrap the tensor in a torch Parameter for it to work inside the model. So this should work:
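Something along these lines (a minimal sketch based on the description above, reusing the model and replacement_layer defined in the question):
import torch
from torch import nn
# Wrap the predefined tensor in a Parameter and assign it to the module attribute directly.
model.bert.encoder.layer[0].attention.self.query.weight = nn.Parameter(replacement_layer)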
Or in full:
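A complete sketch under the same assumptions (the bert-large-uncased model and the [1024, 1024] replacement tensor from the question):
import torch
from torch import nn
from transformers import AutoModelForSequenceClassification

# The model with a layer that we want to replace.
model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased", num_labels=2)

# A replacement layer (any predefined tensor of the same shape).
replacement_layer = torch.rand([1024, 1024])

# Replace the weight directly on the module, wrapping the tensor in a Parameter.
model.bert.encoder.layer[0].attention.self.query.weight = nn.Parameter(replacement_layer)

# Check that the layer is replaced; state_dict() now reflects the new weights.
assert torch.equal(
    model.state_dict()["bert.encoder.layer.0.attention.self.query.weight"],
    replacement_layer)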
waxmsbnn2#
Update a copy of the state_dict (an ordered dict), then reload the model's weights from the updated state_dict.
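A minimal sketch of that approach using load_state_dict, again reusing the model and replacement_layer from the question:
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased", num_labels=2)
replacement_layer = torch.rand([1024, 1024])

# Take the state_dict, overwrite the entry for the target layer, and load it back into the model.
state_dict = model.state_dict()
state_dict["bert.encoder.layer.0.attention.self.query.weight"] = replacement_layer
model.load_state_dict(state_dict)

# Verify that the model now holds the replacement tensor.
assert torch.equal(
    model.state_dict()["bert.encoder.layer.0.attention.self.query.weight"],
    replacement_layer)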