Keras: extracting feature embeddings from images

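I am trying to extract feature embeddings from images using TensorFlow.js.
Elsewhere, I use PyTorch with ResNet152 to extract feature embeddings, and that works well.
Here is an example of how I extract those embeddings.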

import torch
import torchvision.models as models
from torchvision import transforms
from PIL import Image

# Load the model
resnet152_torch = models.resnet152(pretrained=True)

# Collect all child modules of the model except the last one (the fully
# connected layer), so that the average pooling layer becomes the final layer.
layers = list(resnet152_torch.children())[:-1]

resnet152 = torch.nn.Sequential(*layers)

# Set the truncated model to evaluation mode.
resnet152.eval()

# Load and preprocess the image, it's already 224x224
image_path = "test.png" 
img = Image.open(image_path).convert("RGB")

# Define the image transformation
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Apply the preprocessing steps to the image
img_tensor = preprocess(img).unsqueeze(0)

with torch.no_grad():
    # Get the image features from the ResNet-152 model
    img_features = resnet152(img_tensor)

print(img_features.squeeze())

Essentially, I take a pre-trained model and remove its last layer to get my feature embeddings.
The result of the script above is:

tensor([0.2098, 0.4687, 0.0914,  ..., 0.0309, 0.0919, 0.0480])

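So now I want to do something similar with TensorFlow.js.
The first thing I need is a ResNet152 model that can be used with TensorFlow.js. So I created the following Python script to export ResNet152 in Keras format.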

from tensorflow.keras.applications import ResNet152
from tensorflow.keras.models import save_model

# Load the pre-trained ResNet-152 model, including the top (fully connected) layer
resnet152 = ResNet152(weights='imagenet')

# Freeze the weights so the model is not trainable
resnet152.trainable = False

# Save the ResNet-152 model
save_model(resnet152, "resnet152.h5")

Then I used the "tensorflowjs_converter" utility to convert the Keras (.h5) model to TensorFlow.js format.

tensorflowjs_converter --input_format keras resnet152.h5 resnet152

Once I had the model in the proper format (I think), I switched to JavaScript.

import * as tf from '@tensorflow/tfjs-node';
import fs from 'fs';

async function main() {
    const model = await tf.loadLayersModel('file://resnet152/model.json');

    const modelWithoutFinalLayer = tf.model({
        inputs: model.input,
        outputs: model.getLayer('avg_pool').output
    });

    // Load the image from disk
    const image = fs.readFileSync('example_images/test.png'); // This is the exact same image file.
    const imageTensor = tf.node.decodeImage(image, 3);
    const preprocessedInput = tf.div(tf.sub(imageTensor, [123.68, 116.779, 103.939]), [58.393, 57.12, 57.375]);

    const batchedInput = preprocessedInput.expandDims(0);
    const embeddings = modelWithoutFinalLayer.predict(batchedInput).squeeze();

    embeddings.print();

    return;
}

await main();

The result of the script above is:

Tensor
    [0, 0, 0, ..., 0, 0, 0.029606]

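Looking at the first three output values from the two scripts, I expected some variation, but not this much.
What should I do? Is this much variation expected? Am I doing something wrong?
Any help would be greatly appreciated.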

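The related Keras issue 18810, "Differences in feature embeddings in Keras and torch models", was just closed with the following comment:
The first thing to check is whether your preprocessing is correct. The best practice is to always use keras.applications.xxxx.preprocess_input, which in this case is keras.applications.resnet.preprocess_input.
I would not expect the intermediate features or predictions to necessarily be close, because many of the models were retrained from scratch.
Part of your problem may come from differences in how the image is preprocessed before it is fed to the model. By standardizing on the preprocessing recommended for the Keras model, you ensure the input is in the format that model expects, which removes one potential source of discrepancy.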
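To make the preprocessing point concrete, here is a minimal per-pixel sketch in plain Python of the two conventions involved. It assumes that keras.applications.resnet.preprocess_input follows the "caffe" convention (swap RGB to BGR, subtract the per-channel ImageNet mean, no division by a standard deviation), whereas the torchvision pipeline in the question scales to [0, 1] and normalizes in RGB order. The function names caffe_style and torch_style are hypothetical and exist only for this illustration; in real code, call preprocess_input itself.

```python
# Per-pixel illustration of two ImageNet preprocessing conventions.
# Assumption: Keras' resnet.preprocess_input uses the "caffe" convention,
# and the torchvision pipeline in the question uses ToTensor + Normalize.

CAFFE_MEAN_BGR = (103.939, 116.779, 123.68)  # per-channel means, BGR order
TORCH_MEAN_RGB = (0.485, 0.456, 0.406)       # per-channel means, RGB order
TORCH_STD_RGB = (0.229, 0.224, 0.225)        # per-channel std devs, RGB order


def caffe_style(rgb):
    """Swap RGB to BGR and subtract the mean; values stay on the 0-255 scale."""
    r, g, b = rgb
    bgr = (b, g, r)
    return tuple(c - m for c, m in zip(bgr, CAFFE_MEAN_BGR))


def torch_style(rgb):
    """Scale to [0, 1], then subtract the mean and divide by the std, in RGB order."""
    return tuple(
        (c / 255.0 - m) / s
        for c, m, s in zip(rgb, TORCH_MEAN_RGB, TORCH_STD_RGB)
    )


if __name__ == "__main__":
    pixel = (255, 128, 0)  # an arbitrary orange pixel, RGB, 0-255
    print("caffe-style:", caffe_style(pixel))
    print("torch-style:", torch_style(pixel))
```

If that assumption holds, the JavaScript snippet in the question feeds the Keras weights torch-style input (RGB order, divided by a standard deviation), which may account for part of the mismatch. Matching the preprocessing would still not make the two embeddings agree, since the weights were trained separately.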
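As for your concern about the significant differences between the PyTorch and Keras feature embeddings: different implementations of the same model architecture (ResNet152 in this case) may not produce similar intermediate features or predictions.
Many models in different frameworks are retrained from scratch, so the learned features differ even though the architecture is the same.
It helps to understand that, because of differences in the training process, this kind of divergence in outputs is common and expected even when the underlying architecture is identical.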
