keras Python自动编码器返回混合图像

iqxoj9l9  于 2023-02-04  发布在  Python
关注(0)|答案(1)|浏览(145)

我试图创建Python自动编码器图像压缩,但我得到的是混合的图像。
下面是我的代码:

path1 = 'C:\\Users\\klaud\\Desktop\\images\\'
all_images = []
subjects = os.listdir(path1)
numberOfSubject = len(subjects)
print('Number of Subjects: ', numberOfSubject)
for number1 in range(0, numberOfSubject):  # numberOfSubject
  path2 = (path1 + subjects[number1] + '/')
  sequences = os.listdir(path2)
  numberOfsequences = len(sequences)
  for number2 in range(0, numberOfsequences):
    path3 = path2 + sequences[number2]
    img = cv2.imread(path3, 0)
    img = img.reshape(512, 512, 1)
    all_images.append(img)

x_train = np.array([all_images[0], all_images[1]])
x_test = np.array(all_images[2:])

print("X TRAIN \n")
print(x_train)
print("X TEST \n")
print(x_test)

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

print (x_train.shape)
print (x_test.shape)

latent_dim = 4

class Autoencoder(Model):
  def __init__(self, latent_dim):
    super(Autoencoder, self).__init__()
    self.latent_dim = latent_dim
    self.encoder = tf.keras.Sequential([
      layers.Flatten(),
      layers.Dense(latent_dim, activation='relu'),
    ])
    self.decoder = tf.keras.Sequential([
      layers.Dense(262144, activation='sigmoid'),
      layers.Reshape((512, 512))
    ])

  def call(self, x):
    encoded = self.encoder(x)
    decoded = self.decoder(encoded)
    return decoded

autoencoder = Autoencoder(latent_dim)

autoencoder.compile(optimizer='adam', loss=losses.MeanSquaredError())

autoencoder.fit(x_train, x_train,
                epochs=10,
                shuffle=True,
                validation_data=(x_test, x_test))

encoded_imgs = autoencoder.encoder(x_test).numpy()
decoded_imgs = autoencoder.decoder(encoded_imgs).numpy()


n = 6
plt.figure(figsize=(20, 6))
for i in range(n):
  # display original
  ax = plt.subplot(2, n, i + 1)
  plt.imshow(x_test[i])
  plt.title("original")
  plt.gray()
  ax.get_xaxis().set_visible(False)
  ax.get_yaxis().set_visible(False)

  # display reconstruction
  ax = plt.subplot(2, n, i + 1 + n)
  plt.imshow(decoded_imgs[i])
  plt.title("reconstructed")
  plt.gray()
  ax.get_xaxis().set_visible(False)
  ax.get_yaxis().set_visible(False)
plt.show()

这就是我得到的结果:

我不知道这是自动解码器的问题还是matplotlib显示图形的方式有问题?我试着改变了几乎所有的东西,如果我在编译程序时没有得到错误,那么它是混淆的图像。我很感激任何有用的建议!

8hhllhi2

8hhllhi21#

这看起来像您只训练了两个图像:

x_train = np.array([all_images[0], all_images[1]])

在这种情况下,网络可以通过学习重现微小的输入空间来达到(局部)最小值--不管输入如何,它只会吐出两种图像的混合物。
要解决这个问题,您绝对需要一个更大的训练数据集!它不需要是ImageNet,但可能像the Stanford dogs这样的东西会很合适。

相关问题