Transferring TensorFlow weights to an equivalent PyTorch model

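I have an old U-Net implementation in TensorFlow that was trained on custom data, with the weights saved as an .hdf5 file. I now want to move my code to PyTorch and have implemented an equivalent model there, but I am having trouble using the weights in the new PyTorch model. To convert them, I copy the weights from TensorFlow, layer by layer, into a dictionary shaped like the PyTorch model's state_dict (as shown in the code) and load the model from that new dictionary. However, the resulting PyTorch model does not produce output similar to the TensorFlow model's (the output is garbage).
Is there something I am missing here? Note that in every layer I have to transpose the weights so that they match the PyTorch format. I suspect the problem is there, but I don't know how to fix it. Any guidance on how to debug this would also be helpful.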

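import numpy as np
import tensorflow as tf
import torch

# UNet is my own PyTorch implementation of the model (defined elsewhere).

def weight_loading(pretrained_weights):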
    # Load the weights
    tf_model = tf.keras.models.load_model(pretrained_weights)
    tf_weights = tf_model.get_weights()

    # Load the PyTorch model
    pt_model = UNet() #implemented based on the previous model (by myself)
    initial_state_dict = pt_model.state_dict()
    new_state_dict = {}
    with torch.no_grad():
        x = 0
        for i, layer in enumerate(pt_model.modules()):
            if isinstance(layer, torch.nn.Conv2d):
                # extract the weights and biases from the TensorFlow weights
                weight_tf = tf_weights[x*2]
                bias_tf = tf_weights[x*2+1]
             
                # convert the weights and biases to PyTorch format
                weight_pt = torch.tensor(weight_tf.transpose())
                bias_pt = torch.tensor(bias_tf)

                # get the name of the weight and bias tensors
                weight_name = list(pt_model.named_parameters())[x*2][0]
                bias_name = list(pt_model.named_parameters())[x*2+1][0]

                # set the weights and biases in the PyTorch model state_dict
                new_state_dict[weight_name]= weight_pt
                new_state_dict[bias_name] = bias_pt

                x = x + 1

            if isinstance(layer, torch.nn.ConvTranspose2d):
                weight_tf = tf_weights[x*2]
                bias_tf = tf_weights[x*2+1]
                
                # convert the weights and biases to PyTorch format
                weight_pt = torch.tensor(np.transpose(weight_tf, (2, 3, 0, 1)))
                bias_pt = torch.tensor(bias_tf)

                # get the name of the weight and bias tensors
                weight_name = list(pt_model.named_parameters())[x*2][0]
                bias_name = list(pt_model.named_parameters())[x*2+1][0]

                # set the weights and biases in the PyTorch model state_dict
                new_state_dict[weight_name] = weight_pt
                new_state_dict[bias_name] = bias_pt

                x = x + 1

    # load the new generated state_dict to pt_model
    pt_model.load_state_dict(new_state_dict)
    return pt_model

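In this code I copy the weights from the TensorFlow model to the PyTorch model, layer by layer. Every layer is either a Conv2d or a ConvTranspose2d. I expected that, after loading the converted weights into the PyTorch model and running it on an image, the output would be similar to the TensorFlow model's output for the same image. Instead, the two outputs are very different.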
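One thing worth checking in the code above is the axis order. Assuming a Keras `Conv2D` kernel is stored as (kernel_h, kernel_w, in_channels, out_channels) and `torch.nn.Conv2d` expects (out_channels, in_channels, kernel_h, kernel_w), a bare `.transpose()` reverses all four axes and therefore also swaps kernel height and width. A minimal NumPy-only sketch of that effect, using a random stand-in array rather than real weights:

```python
import numpy as np

# Stand-in for a Keras Conv2D kernel: (kernel_h, kernel_w, in_channels, out_channels).
rng = np.random.default_rng(0)
kernel_hwio = rng.standard_normal((3, 3, 4, 8)).astype(np.float32)

# Explicit permutation to the PyTorch Conv2d layout: (out, in, kernel_h, kernel_w).
kernel_oihw = np.transpose(kernel_hwio, (3, 2, 0, 1))

# A bare .transpose() reverses every axis: (out, in, kernel_w, kernel_h).
kernel_reversed = kernel_hwio.transpose()

# For square kernels the shapes agree, so a shape check alone would not catch this...
same_shape = kernel_oihw.shape == kernel_reversed.shape
# ...but the values differ: each spatial filter has height and width swapped.
same_values = np.array_equal(kernel_oihw, kernel_reversed)
spatially_swapped = np.array_equal(kernel_oihw, np.swapaxes(kernel_reversed, 2, 3))

print(same_shape, same_values, spatially_swapped)
```

Because the shapes match for square kernels, `load_state_dict` would accept either array, which may be why the mismatch shows up only as wrong outputs.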
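Update: I compared the outputs of the two models after the first max-pooling in the U-Net (that is, after the first two conv layers), and they differ slightly (whereas the output of a randomly initialized PyTorch model is very different).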
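A layer-by-layer comparison like this can be made quantitative. A minimal sketch, assuming the TensorFlow activations are channels-last (N, H, W, C), the PyTorch ones are channels-first (N, C, H, W), and both have already been converted to NumPy arrays; `max_abs_diff` is a hypothetical helper name and the data is random stand-in data:

```python
import numpy as np

def max_abs_diff(tf_act_nhwc, pt_act_nchw):
    """Largest absolute difference between a channels-last and a channels-first activation."""
    # Bring the TensorFlow activation into the PyTorch layout before comparing.
    tf_as_nchw = np.transpose(tf_act_nhwc, (0, 3, 1, 2))
    if tf_as_nchw.shape != pt_act_nchw.shape:
        raise ValueError(f"shape mismatch: {tf_as_nchw.shape} vs {pt_act_nchw.shape}")
    return float(np.max(np.abs(tf_as_nchw - pt_act_nchw)))

# Stand-in data: the same activation in both layouts.
rng = np.random.default_rng(0)
act_nhwc = rng.standard_normal((1, 16, 16, 8)).astype(np.float32)
act_nchw = np.transpose(act_nhwc, (0, 3, 1, 2)).copy()

print(max_abs_diff(act_nhwc, act_nchw))         # identical activations
print(max_abs_diff(act_nhwc, act_nchw + 1e-3))  # small uniform offset
```

Running such a check after each layer in turn shows the first layer at which the difference jumps.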

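I was finally able to solve the problem above (almost). Apparently you have to state explicitly that you want the TensorFlow weights converted to a torch FloatTensor (as shown below). So I replaced this: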

# convert the weights and biases to PyTorch format
weight_pt = torch.tensor(np.transpose(weight_tf, (2, 3, 0, 1)))
bias_pt = torch.tensor(bias_tf)

# get the name of the weight and bias tensors
weight_name = list(pt_model.named_parameters())[x*2][0]
bias_name = list(pt_model.named_parameters())[x*2+1][0]

# set the weights and biases in the PyTorch model state_dict
new_state_dict[weight_name] = weight_pt
new_state_dict[bias_name] = bias_pt

with the following code (which also cleans things up and drops the model loading step):

layer.weight.data = torch.tensor(weight_tf.transpose(2, 3, 0, 1), dtype=torch.float)
layer.bias.data = torch.tensor(bias_tf, dtype=torch.float)
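The same idea can be written with an explicit shape check, which `.data` assignment does not perform. This is only a sketch for the `Conv2d` case, assuming the Keras kernel layout is (kernel_h, kernel_w, in_channels, out_channels), in which case the permutation to PyTorch's (out_channels, in_channels, kernel_h, kernel_w) is (3, 2, 0, 1); the permutation (2, 3, 0, 1) would instead produce (in_channels, out_channels, kernel_h, kernel_w). `load_conv2d_from_keras` is a hypothetical helper name and the arrays are random stand-ins:

```python
import numpy as np
import torch

def load_conv2d_from_keras(layer, kernel_hwio, bias):
    """Copy a Keras-style Conv2D kernel and bias into a torch.nn.Conv2d in place."""
    # Assumed layouts: Keras (kh, kw, in, out) -> PyTorch (out, in, kh, kw).
    weight = torch.from_numpy(
        np.ascontiguousarray(np.transpose(kernel_hwio, (3, 2, 0, 1)))
    )
    bias_t = torch.from_numpy(np.ascontiguousarray(bias))
    if weight.shape != layer.weight.shape:
        raise ValueError(f"weight shape {tuple(weight.shape)} != {tuple(layer.weight.shape)}")
    if bias_t.shape != layer.bias.shape:
        raise ValueError(f"bias shape {tuple(bias_t.shape)} != {tuple(layer.bias.shape)}")
    with torch.no_grad():
        # copy_ casts to the parameter's dtype (float32 by default) and keeps its device.
        layer.weight.copy_(weight)
        layer.bias.copy_(bias_t)

# Stand-in data, deliberately float64 to show the cast.
layer = torch.nn.Conv2d(in_channels=4, out_channels=8, kernel_size=3)
rng = np.random.default_rng(0)
kernel = rng.standard_normal((3, 3, 4, 8))
bias = rng.standard_normal(8)

load_conv2d_from_keras(layer, kernel, bias)
print(layer.weight.dtype, tuple(layer.weight.shape))
```

Copying in place keeps each parameter's dtype, so no explicit `dtype=torch.float` is needed, and a wrong permutation fails loudly whenever it changes the shape.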

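After this change, I got almost identical results after the first convolution layer. (The two outputs are still slightly different, but I ignored that.)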
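In addition, I changed every nn.ConvTranspose2d to a plain upsampling followed by a convolution (as implemented in the original source code). After that, the final output of my model was close enough to the TensorFlow model's. I think the PyTorch implementation of upsampling and conv2d with even kernel sizes differs from TensorFlow's, and that this causes the remaining difference in the outputs. However, since the difference is so small and does not affect our goal, we ignored it.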
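One way such a difference can arise with even kernel sizes, offered here as a possible explanation rather than a confirmed cause: TensorFlow's `padding='same'` gives any odd leftover padding to the bottom/right side, while the integer `padding` argument of `torch.nn.Conv2d` pads both sides equally. The sketch below computes the per-side padding along one dimension following TensorFlow's documented SAME rule; `tf_same_padding` is a hypothetical helper name:

```python
import math

def tf_same_padding(in_size, kernel, stride=1):
    """Return (pad_before, pad_after) along one dimension under TensorFlow's SAME rule."""
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + kernel - in_size, 0)
    before = total // 2
    # Any odd leftover goes to the end (bottom or right).
    return before, total - before

print(tf_same_padding(16, 3))  # odd kernel: symmetric padding
print(tf_same_padding(16, 2))  # even kernel: asymmetric padding
```

In PyTorch the asymmetric case could be reproduced by padding explicitly with `torch.nn.functional.pad` and then running the convolution with `padding=0`.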
