PyTorch convolutional autoencoder

k5hmc34c | 2023-01-26

Hi, I have a project where I need to build a convolutional autoencoder trained on the MNIST database, with the constraint that I cannot use any pooling. My embedding dimension is 16, and the output of my encoder needs to be a 256 * 16 * 1 * 1 tensor.
I wrote the following class to define my encoder:

class AutoEncoderCNN(nn.Module):
    def __init__(self, nb_channels, embedding_dim):
        super(AutoEncoderCNN, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=1),
            nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=5, stride=1),
            nn.ReLU(),
            nn.Conv2d(128, 256, kernel_size=5, stride=1),
            nn.ReLU(),
            nn.Conv2d(128, 256, kernel_size=5, stride=1),
            nn.ReLU()
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(256, 128, kernel_size=5, stride=1),
            nn.ReLU(),
            nn.ConvTranspose2d(128, 64, kernel_size=5, stride=1),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, kernel_size=5, stride=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 16, kernel_size=5, stride=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, kernel_size=5, stride=1),
            nn.Sigmoid()
        )

    def encode(self, x):
        x = self.encoder(x)  # TO COMPLETE
        return x

    def decode(self, x):
        x = self.decoder(x)  # TO COMPLETE
        return x

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

But when I try to train the network, I get this dimension error:

RuntimeError: Given groups=1, weight of size [32, 1, 5, 5], expected input[1, 256, 28, 28] to have 1 channels, but got 256 channels instead

My loss function:

loss_function = nn.MSELoss(size_average=None, reduce=None, reduction='mean')

My optimizer:

optimizer =  optim.Adam(modelcnn.parameters(), lr=learning_rate)

My DataLoader:

mnistTrainLoader = DataLoader(mnistTrainSet_clean, batch_size=batch_size,shuffle=True, num_workers=0)

My training loop:

# Training procedure for the model, using a data loader, an optimizer and a number of epochs
def train(model, data_loader, opt, n_epochs):
    losses = []
    i = 0
    for epoch in range(n_epochs):  # Loop over the epochs
        running_loss = 0.0

        for features, labels in data_loader:

            # TO COMPLETE
            # Forward pass
            labels_pred = model(features)  # Equivalent to model.forward(features)

            # Compute the loss
            loss = loss_function(labels_pred, labels)

            # Save the loss for later plotting
            losses.append(loss.item())

            # Clear the previous gradients
            optimizer.zero_grad()

            # Compute the gradients (backpropagation)
            loss.backward()

            # Update the weights: one optimizer step
            optimizer.step()

            # Print statistics
            running_loss += loss.item()
            if i % 10 == 9:
                print('[Epoch: %d, iteration: %5d] loss: %.3f' %
                      (epoch + 1, i + 1, running_loss / 10))
                running_loss = 0.0
            i += 1

    print('Training finished')
    return losses

I have tried many things to fix this, but nothing works. Can anyone help me?


gstyhher1#

In your encoder, you repeat:

nn.Conv2d(128, 256, kernel_size=5, stride=1),
nn.ReLU(),
nn.Conv2d(128, 256, kernel_size=5, stride=1),
nn.ReLU()

Just remove the duplicated block and the shapes will match.
Note: the output of your encoder will have shape batch_size * 256 * h' * w'. 256 is the number of channels produced by the last convolution of the encoder, and h', w' depend on the size h, w of the input image after it has gone through the convolutional layers.
You use neither nb_channels nor embedding_dim anywhere, and I don't understand what you mean by embedding_dim, since you only use convolutions and no fully connected layers.
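
For illustration, here is a minimal shape-check sketch (assuming 28 x 28 MNIST inputs and the asker's encoder with the duplicated layer removed). With kernel_size=5, stride=1 and no padding, each convolution shrinks the spatial size by 4:

import torch
from torch import nn

# The asker's encoder with the duplicated Conv2d(128, 256, ...) removed.
encoder = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=5, stride=1),     # 28 -> 24
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=5, stride=1),    # 24 -> 20
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=5, stride=1),    # 20 -> 16
    nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=5, stride=1),   # 16 -> 12
    nn.ReLU(),
    nn.Conv2d(128, 256, kernel_size=5, stride=1),  # 12 -> 8
    nn.ReLU(),
)

dummy = torch.zeros(1, 1, 28, 28)  # (batch, channels, h, w)
print(encoder(dummy).shape)        # torch.Size([1, 256, 8, 8]), i.e. h' = w' = 8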

========== EDIT ==========

Following the discussion in the comments below, I'll leave this code here to, I hope, inspire you (and let me know whether it works):

from torch import nn
import torch
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

data = datasets.MNIST(root='data', train=True, download=True, transform=ToTensor())

class AutoEncoderCNN(nn.Module):
  def __init__(self):
    super(AutoEncoderCNN, self).__init__()
    self.encoder = nn.Sequential(
        nn.Conv2d(1, 32, kernel_size=5, stride=1),
        nn.ReLU(),
        nn.Conv2d(32, 64, kernel_size=5, stride=1),
        nn.ReLU(),
        nn.Conv2d(64, 128, kernel_size=5, stride=1),
        nn.ReLU(),
        nn.Conv2d(128, 256, kernel_size=5, stride=1),
        nn.ReLU(),
    )
    self.decoder = nn.Sequential(
        nn.ConvTranspose2d(256, 128, kernel_size=5, stride=1),
        nn.ReLU(),
        nn.ConvTranspose2d(128, 64, kernel_size=5, stride=1),
        nn.ReLU(),
        nn.ConvTranspose2d(64, 32, kernel_size=5, stride=1),
        nn.ReLU(),
        nn.ConvTranspose2d(32, 1, kernel_size=5, stride=1),
        nn.Sigmoid()      
    )
          
  def forward(self, x):
      x = self.encoder(x)
      x = self.decoder(x)
      return x
  
model = AutoEncoderCNN()
mnistTrainLoader = DataLoader(data,
                              batch_size=32, shuffle=True, num_workers=0)

loss_function = nn.MSELoss(size_average=None, reduce=None, reduction='mean')
optimizer =  torch.optim.Adam(model.parameters(), lr=1e-3)
losses = []
i = 0
running_loss = .0
for epoch in range(100):
  for features, _ in mnistTrainLoader:
    y = model(features)
    loss = loss_function(y, features)
    losses.append(loss.item())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    running_loss += loss.item()
    if i % 10 == 9:    
        print('[Epoque : %d, iteration: %5d] loss: %.3f'%
              (epoch + 1, i + 1, running_loss / 10))
        running_loss = 0.0
    i+=1

====== ADDING THE CHANNEL DIMENSION ======

The problem is actually in how the dataset is created: since it contains grayscale images, the PyTorch MNIST dataset helper returns images without a channel dimension. The convolutions need that dimension, so we have to add it.
Instead of loading the dataset this way:

X_train = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transforms.ToTensor()).data
print(X_train.shape) # torch.Size([60000, 28, 28])

we load it like this:

X_train = torchvision.datasets.MNIST(root='./data', train=True, download=True).data[:,None,:,:]/255.
# /255. to have floats between 0 and 1 instead of unsigned int
print(X_train.shape) # torch.Size([60000, 1, 28, 28])

Another way to handle this is to add the channel dimension to the input x inside the model class.
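
A minimal sketch of that alternative, assuming x arrives as a batch of 28 x 28 grayscale images without a channel axis:

def forward(self, x):
    # If x has shape (batch, 28, 28), insert the channel axis that Conv2d expects.
    if x.dim() == 3:
        x = x.unsqueeze(1)  # -> (batch, 1, 28, 28)
    x = self.encoder(x)
    x = self.decoder(x)
    return x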
