numpy cannot pick the correct shape for a training image dataset

Posted by ep6jt1vc on 2023-06-23 in Other

I am trying to train a model on imbalanced data using a CNN architecture I developed, and then evaluate its performance.
Here is my code:

# Import necessary libraries
from sklearn.utils import shuffle
import numpy as np
import tensorflow as tf

# Convert one-hot encoded labels back to their integer form
train_labels_copy = np.argmax(train_labels, axis=1)

# Split data into airplane/car and others
airplane_car_indices = np.where((train_labels_copy == 0) | (train_labels_copy == 1))[0]
other_indices = np.where((train_labels_copy != 0) & (train_labels_copy != 1))[0]

# Shuffle the airplane/car indices first, so the 20% kept below is a random sample
np.random.shuffle(airplane_car_indices)

# Separate airplane/car and other images and labels
airplane_car_images = train_images[airplane_car_indices]
airplane_car_labels = train_labels[airplane_car_indices]  # use train_labels instead of train_labels_copy
other_images = train_images[other_indices]
other_labels = train_labels[other_indices]  # use train_labels instead of train_labels_copy

# Calculate 20% of the airplane and car class data (the number of samples to keep)
keep_n = int(0.2 * len(airplane_car_indices))

# Keep only 20% of airplane and car class data
airplane_car_images = airplane_car_images[:keep_n]
airplane_car_labels = airplane_car_labels[:keep_n]

# Combine imbalanced airplane/car and other data
train_images_imbalanced = np.concatenate((airplane_car_images, other_images))
train_labels_imbalanced = np.concatenate((airplane_car_labels, other_labels))

# Shuffle the imbalanced data
train_images_imbalanced, train_labels_imbalanced = shuffle(train_images_imbalanced, train_labels_imbalanced)

# Train the same model on the imbalanced data
history_imbalanced = cnn.fit(train_images_imbalanced, train_labels_imbalanced, batch_size=32, epochs=20, validation_data=(test_images, test_labels))

# Plot the loss and accuracy graphs
plot_loss_and_accuracy(history_imbalanced)

# Predict the test data
predictions_imbalanced = cnn.predict(test_images)

# Convert prediction probabilities to class labels
predictions_imbalanced = np.argmax(predictions_imbalanced, axis=1)

# Print the classification report
print(classification_report(np.argmax(test_labels, axis=1), predictions_imbalanced, target_names=class_names))

The error is:

Epoch 1/20
1312/1313 [============================>.] - ETA: 0s - loss: 0.4442 - accuracy: 0.8443
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[15], line 37
     34 train_images_imbalanced, train_labels_imbalanced = shuffle(train_images_imbalanced, train_labels_imbalanced)
     36 # Train the same model on the imbalanced data
---> 37 history_imbalanced = cnn.fit(train_images_imbalanced, train_labels_imbalanced, batch_size=32, epochs=20, validation_data=(test_images, test_labels))
     39 # Plot the loss and accuracy graphs
     40 plot_loss_and_accuracy(history_imbalanced)

File c:\Python311\Lib\site-packages\keras\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File ~\AppData\Local\Temp\__autograph_generated_filesf8xdq9z.py:15, in outer_factory.<locals>.inner_factory.<locals>.tf__test_function(iterator)
     13 try:
     14     do_return = True
---> 15     retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
     16 except:
     17     do_return = False

ValueError: in user code:
...
    File "c:\Python311\Lib\site-packages\keras\backend.py", line 5559, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

I tried to fix this by using the to_categorical function from Keras utils to convert the labels in train_labels_imbalanced to categorical format. I replaced the following line:

train_labels_imbalanced = np.concatenate((airplane_car_labels, other_labels))

with:

train_labels_imbalanced = np.concatenate((to_categorical(airplane_car_labels, num_classes=10), other_labels))

But then I got this error:

ValueError                                Traceback (most recent call last)
Cell In[14], line 31
     29 # Combine imbalanced airplane/car and other data
     30 train_images_imbalanced = np.concatenate((airplane_car_images, other_images))
---> 31 train_labels_imbalanced = np.concatenate((to_categorical(airplane_car_labels, num_classes=10), other_labels))
     33 # Shuffle the imbalanced data
     34 train_images_imbalanced, train_labels_imbalanced = shuffle(train_images_imbalanced, train_labels_imbalanced)

File <__array_function__ internals>:200, in concatenate(*args, **kwargs)

ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 3 dimension(s) and the array at index 1 has 2 dimension(s)

kx5bkwkv #1

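I suspect the problem is the shape of your validation data: history_imbalanced = cnn.fit(train_images_imbalanced, train_labels_imbalanced, batch_size=32, epochs=20, validation_data=(test_images, test_labels))
So how do you create test_images and test_labels? As the first log shows, the training pass of the epoch had essentially finished (1312/1313 steps), and the error was raised only when evaluation on the validation data started (the traceback goes through tf__test_function). Perhaps you forgot to apply one-hot encoding to test_labels.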
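
One way to check this, as a minimal sketch only: compare the label shapes and one-hot encode just the arrays that still hold integer class ids. The helper name ensure_one_hot and the small stand-in arrays below are hypothetical and not part of NumPy or Keras; the sketch assumes 10 classes, as in your to_categorical call. It also sidesteps your second error, which arises because to_categorical was applied to labels that were already one-hot, so it added a third dimension.

```python
import numpy as np


def ensure_one_hot(labels, num_classes):
    """Return labels as a 2-D one-hot array of shape (n_samples, num_classes).

    Hypothetical helper for illustration; not a NumPy or Keras function.
    """
    labels = np.asarray(labels)
    # Already 2-D with one column per class: treat as one-hot and leave untouched.
    if labels.ndim == 2 and labels.shape[1] == num_classes:
        return labels
    # Integer class ids, shaped (n,) or (n, 1) as some dataset loaders return them.
    if labels.ndim == 1 or (labels.ndim == 2 and labels.shape[1] == 1):
        ids = labels.reshape(-1).astype(int)
        return np.eye(num_classes, dtype="float32")[ids]
    raise ValueError(f"unexpected label shape {labels.shape}")


# Stand-in data (assumption): integer test labels and one-hot training labels.
test_labels_raw = np.array([[0], [3], [9]])
train_labels_one_hot = np.eye(10, dtype="float32")[[1, 2]]

test_labels_fixed = ensure_one_hot(test_labels_raw, num_classes=10)
train_labels_same = ensure_one_hot(train_labels_one_hot, num_classes=10)

print(test_labels_fixed.shape)  # (3, 10)
print(train_labels_same.shape)  # (2, 10)
```

If printing test_labels.shape in your notebook gives something like (n, 1) or (n,) while the model outputs 10 values per sample, encoding test_labels once before calling cnn.fit should make the target and output shapes compatible.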
