pytorch: How can I change the input shape so that there is no error in a 3D CNN?

Posted by vyswwuz2 on 2023-08-05 in Other

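I am trying to create a 3D CNN classification model that takes 10 grayscale images of size (320, 864) as input and produces a single output. When I run training, the following error pops up:
RuntimeError: Given groups=1, weight of size [16, 1, 3, 3, 3], expected input[2, 10, 1, 320, 864] to have 1 channels, but got 10 channels instead
The code below loads the input. I want the input shape to be [2, 1, 10, 320, 864]. What should I change in the code that loads the input?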

def __getitem__(self, idx):
  video_idx = idx // 260
  frame_idx = idx % 260 + 41
  video_dir = self.video_dirs[video_idx]
  frames = []

  # Get all files in the directory
  all_files = os.listdir(video_dir)

  # Select only .jpg files
  jpg_files = [file for file in all_files if file.endswith('.jpg')]

  # Extract the number from the file name and sort
  numbered_files = sorted(jpg_files, key=lambda x: int(re.findall(r'\d+', x)[-1]))

  for i in range(frame_idx, frame_idx + 10):
      # Get the file with the corresponding number
      frame_file = numbered_files[i-1]  # -1 because indexing starts from 0
      frame_path = os.path.join(video_dir, frame_file)
      print(f"Loading image from {frame_path}")
      frame = cv2.imread(frame_path)
      if frame is None:
          raise ValueError(f"Could not load image at {frame_path}")
      frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # Convert to grayscale
      frame = frame.reshape(1, frame.shape[0], frame.shape[1])  # Add channel dimension
      frame = torch.tensor(frame, dtype=torch.float32)
      print(frame.shape)
      frames.append(frame)

  frames = torch.stack(frames)  # Stack frames along the channel dimension
  label = self.labels[idx]
  return frames, label

pytorch

Source: https://stackoverflow.com/questions/76718249/how-can-i-change-the-input-shape-so-that-there-is-no-error-in-3dcnn

1 Answer

yfwxisqw1#

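In the for loop, you create data of shape (1, 320, 864), which is then stacked into shape (10, 1, 320, 864), because torch.stack joins a list of tensors along a new leading dimension. Finally, the data loader batches the videos along another new dimension, so the final batch shape is (2, 10, 1, 320, 864). The problem is that the model expects the second axis to be the channel axis and the third axis to be the temporal (depth) axis, and here they are the other way around. To fix this, simply remove the following line from the for loop: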

frame = frame.reshape(1, frame.shape[0], frame.shape[1])

and **add the channel dimension after torch.stack**:

frames = frames.reshape(1, frames.shape[0], frames.shape[1], frames.shape[2])

Now the shape of a single sample (before batching) is (1, 10, 320, 864), as expected.
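
The shape change can be checked with a minimal sketch. This is not the asker's dataset class. The `make_clip` helper, the zero-filled frames standing in for the `cv2` loading loop, and the reduced 32 x 48 frame size are assumptions for illustration. The `nn.Conv3d(1, 16, 3)` layer is inferred from the weight size [16, 1, 3, 3, 3] in the error message.

```python
import torch
import torch.nn as nn

# Hypothetical small frame size so the sketch runs quickly;
# the question uses 320 x 864 frames.
T, H, W = 10, 32, 48


def make_clip(num_frames=T, height=H, width=W):
    """Stand-in for the loading loop: build one sample of shape (1, T, H, W)."""
    # Each frame stays 2-D (H, W); no per-frame channel dimension is added.
    frames = [torch.zeros(height, width, dtype=torch.float32)
              for _ in range(num_frames)]
    clip = torch.stack(frames)   # (T, H, W): frames stacked along a new leading axis
    return clip.unsqueeze(0)     # (1, T, H, W): channel axis added after stacking


clip = make_clip()

# Mimic what a DataLoader with batch_size=2 does: stack samples along a new axis.
batch = torch.stack([make_clip(), make_clip()])   # (2, 1, T, H, W)

# First layer assumed from the error message: 1 input channel, 16 filters, 3x3x3 kernel.
conv = nn.Conv3d(in_channels=1, out_channels=16, kernel_size=3)
out = conv(batch)

print(clip.shape, batch.shape, out.shape)
```

Here `clip.unsqueeze(0)` does the same job as the `reshape` call above. Another option that should be equivalent is to keep the original loop unchanged and call `frames.permute(1, 0, 2, 3)` after `torch.stack`, which swaps the frame and channel axes.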

Answered on 2023-08-05
