python-3.x: Dimension error when a BatchNorm layer follows an LSTM layer

kxe2p93d posted on 2023-11-20 in Python

I am trying to train a model with the following architecture:

self.lstm1 = nn.LSTM(in_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)
self.batchnorm1 = nn.BatchNorm1d(hidden_channels)

self.lstm2 = nn.LSTM(hidden_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)
self.batchnorm2 = nn.BatchNorm1d(hidden_channels)

self.lstm3 = nn.LSTM(hidden_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)
self.batchnorm3 = nn.BatchNorm1d(hidden_channels)

self.fc1 = nn.Linear(hidden_channels, out_channels)

When I train the network, I get this error as soon as the data reaches the batchnorm1 layer:

RuntimeError: running_mean should contain 770 elements not 128


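Can you tell me where the error is?
I tried using permute to change the output dimensions from (32, 770, 128) to (32, 128, 770), but I still get an error, just a different one.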
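
For context, `nn.BatchNorm1d` treats dimension 1 of a 3-D input as the feature (channel) axis, while a `batch_first=True` LSTM returns `(batch, seq_len, hidden)`. The sketch below reproduces the mismatch with the shapes quoted in the question. The random tensor stands in for the real LSTM output, and `try_call` is a hypothetical helper. The second failure it shows is only one plausible cause of the "different error" after permuting, since the question's `forward` method is not shown.

```python
import torch
import torch.nn as nn

# Shapes quoted in the question: batch 32, sequence length 770, hidden size 128.
batch, seq_len, hidden = 32, 770, 128
lstm_out = torch.randn(batch, seq_len, hidden)  # stand-in for a batch_first LSTM output

bn = nn.BatchNorm1d(hidden)
next_lstm = nn.LSTM(hidden, hidden, batch_first=True)


def try_call(fn, x):
    """Hypothetical helper: return None on success, else the error message."""
    try:
        fn(x)
        return None
    except (RuntimeError, ValueError) as err:
        return str(err)


# (N, L, C) passed directly: BatchNorm1d reads dim 1 (770) as the feature count.
bn_error = try_call(bn, lstm_out)
print("BatchNorm1d on (32, 770, 128):", bn_error)

# After permuting to (N, C, L) the normalization itself goes through ...
permuted = bn(lstm_out.permute(0, 2, 1))
print("BatchNorm1d output shape:", tuple(permuted.shape))

# ... but the next LSTM expects the feature size (128) on the last axis, not 770.
lstm_error = try_call(next_lstm, permuted)
print("LSTM on (32, 128, 770):", lstm_error)
```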


nbnkbykc1#

Batch normalization is performed before the LSTM layers (regardless of the order in which you add them to the network), so you should set:

in_features = 770
self.batchnorm1 = nn.BatchNorm1d(in_features)
self.lstm1 = nn.LSTM(in_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)

self.batchnorm2 = nn.BatchNorm1d(hidden_channels)
self.lstm2 = nn.LSTM(hidden_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)

self.batchnorm3 = nn.BatchNorm1d(hidden_channels)
self.lstm3 = nn.LSTM(hidden_channels, hidden_channels, num_layers, batch_first=True, dropout=dropout_prob)

self.fc1 = nn.Linear(hidden_channels, out_channels)
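
As an alternative to sizing the first BatchNorm layer to the sequence length, the question's original `BatchNorm1d(hidden_channels)` layers can be kept if each LSTM output is permuted to `(batch, hidden, seq_len)` before normalization and permuted back afterwards. This is a minimal sketch and not the answer's method above. The class name, the `_norm` helper, the default `num_layers` and `dropout_prob`, the example `in_channels=16` and `out_channels=4`, and the choice to feed only the last time step into `fc1` are all assumptions, since the original `forward` method is not shown.

```python
import torch
import torch.nn as nn


class LSTMBatchNormNet(nn.Module):  # hypothetical name, not from the original post
    def __init__(self, in_channels, hidden_channels, out_channels,
                 num_layers=2, dropout_prob=0.1):  # defaults are assumptions
        super().__init__()
        self.lstm1 = nn.LSTM(in_channels, hidden_channels, num_layers,
                             batch_first=True, dropout=dropout_prob)
        self.batchnorm1 = nn.BatchNorm1d(hidden_channels)
        self.lstm2 = nn.LSTM(hidden_channels, hidden_channels, num_layers,
                             batch_first=True, dropout=dropout_prob)
        self.batchnorm2 = nn.BatchNorm1d(hidden_channels)
        self.lstm3 = nn.LSTM(hidden_channels, hidden_channels, num_layers,
                             batch_first=True, dropout=dropout_prob)
        self.batchnorm3 = nn.BatchNorm1d(hidden_channels)
        self.fc1 = nn.Linear(hidden_channels, out_channels)

    @staticmethod
    def _norm(bn, x):
        # (N, L, C) -> (N, C, L) so BatchNorm1d sees C features, then back to (N, L, C).
        return bn(x.permute(0, 2, 1)).permute(0, 2, 1).contiguous()

    def forward(self, x):
        x, _ = self.lstm1(x)                 # (N, L, hidden)
        x = self._norm(self.batchnorm1, x)
        x, _ = self.lstm2(x)
        x = self._norm(self.batchnorm2, x)
        x, _ = self.lstm3(x)
        x = self._norm(self.batchnorm3, x)
        # Assumption: predict from the last time step only.
        return self.fc1(x[:, -1, :])         # (N, out_channels)


# Example sizes: 32 and 770 come from the question, 16 and 4 are made up.
model = LSTMBatchNormNet(in_channels=16, hidden_channels=128, out_channels=4)
out = model(torch.randn(32, 770, 16))
print(tuple(out.shape))
```

With this layout the normalization statistics are computed per hidden feature across the batch and all time steps, so the model does not depend on a fixed sequence length of 770.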
