Loss and predictions are nan when training an LSTM encoder-decoder model in Python

6ju8rftf posted in Java on 2021-07-13

I am a beginner with neural networks and am trying to build a basic encoder-decoder model for relation extraction. The inputs in the code are small examples used to check the code (they have the same form as the real data). The index_* lists hold the indices of the text tokens and the tags_* lists hold the indices of the tags. The problem is that both the loss and the predictions come out as nan. Can anyone help?
Here is the output:

2021-04-27 23:59:22.275718: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-27 23:59:22.276153: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-04-27 23:59:25.994977: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
WARNING:tensorflow:Model was constructed with shape (None, 5) for input KerasTensor(type_spec=TensorSpec(shape=(None, 5), dtype=tf.float32, name='embedding_input'), name='embedding_input', description="created by layer 'embedding_input'"), but it was called on an input with incompatible shape (None, 1).
WARNING:tensorflow:Model was constructed with shape (None, 5) for input KerasTensor(type_spec=TensorSpec(shape=(None, 5), dtype=tf.float32, name='embedding_input'), name='embedding_input', description="created by layer 'embedding_input'"), but it was called on an input with incompatible shape (None, 1).
1/1 - 13s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
1/1 - 0s - loss: nan
[[[nan]
  [nan]
  [nan]
  [nan]
  [nan]]

 [[nan]
  [nan]
  [nan]
  [nan]
  [nan]]]
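
The two shape warnings above already point at one problem before even looking at the code: the training loop calls fit() on W_train[i], a plain Python list of five integers, so Keras treats it as five samples of length 1 instead of one sequence of length max_s, which is exactly what the "(None, 1)" warning says. Below is a minimal sketch of feeding the padded data as a single NumPy batch instead (it assumes the W_train, T_train and blmodel names from the code below, and that the output-layer/loss mismatch discussed after the code is fixed as well):

import numpy as np

# Hypothetical reshaping sketch: W_train / T_train come from create_same_length()
# and blmodel from build_model() in the code below.
W_arr = np.array(W_train)   # shape (4, 5): four left-padded token sequences
T_arr = np.array(T_train)   # shape (4, 5): one tag id per token position
blmodel.fit(W_arr, T_arr, epochs=5, batch_size=2, verbose=2)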

Here is my code:

import pickle
from keras import Sequential
from keras.layers import Bidirectional, LSTM, Dense, Embedding, Dropout, Activation, Softmax, TimeDistributed

def create_same_length(tokens, tags, max_s):
    # Left-pad each token sequence and its tag sequence with zeros so that
    # every sequence has length max_s (index 0 is reserved for padding).
    W = []
    T = []
    for i in range(len(tokens)):
        w = [0] * (max_s - len(tokens[i]))
        t = [0] * (max_s - len(tokens[i]))
        w += tokens[i]
        W.append(w)
        t += tags[i]
        T.append(t)
    return W, T

def build_model(W_train, T_train, max_s, W_dev, T_dev, souce_size, tag_size):
    # Embedding -> bidirectional LSTM encoder -> LSTM decoder -> per-timestep output layer.
    model = Sequential()
    model.add(
        Embedding(input_dim=souce_size + 1,
                  output_dim=300,
                  input_length=max_s,
                  mask_zero=True))
    model.add(
        Bidirectional(LSTM(300, return_sequences=True), merge_mode='concat'))
    model.add(LSTM(300, return_sequences=True))
    model.add(TimeDistributed(Dense(1, activation='softmax')))
    model.add(Dropout(0.3))
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
    return model

if __name__ == '__main__':
    index_train = [[1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11, 12], [13, 14]]
    tags_train = [[1, 2, 1], [3, 2, 1, 1], [3, 2, 2, 4, 5], [1, 1]]
    max_s_train = 5
    index_dev = [[1, 2, 3], [13, 14, 15]]
    max_s_dev = 3
    tags_dev = [[1, 2, 1], [1, 2, 3]]
    souce_size = 15
    tag_size = 5
    max_s = max(max_s_train, max_s_dev)  #,max_test)
    W_train, T_train = create_same_length(index_train, tags_train, max_s)
    W_dev, T_dev = create_same_length(index_dev, tags_dev, max_s)
    # W_test=create_same_length(index_test,max_s)
    blmodel = build_model(W_train, T_train, max_s, W_dev, T_dev, souce_size,
                          tag_size)
    for epoch in range(5):
        # fit model for one epoch on this sequence
        # blmodel.fit(W_train, T_train, epochs=1, batch_size=32, verbose=2)
        for i in range(len(W_train)):
            blmodel.fit(W_train[i],
                        T_train[i],
                        epochs=1,
                        batch_size=32,
                        verbose=2)
    T_dev_pre = blmodel.predict(W_dev, batch_size=None, verbose=0)
    print(T_dev_pre)
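
A likely (but unverified) cause of the nan values: TimeDistributed(Dense(1, activation='softmax')) has a single output unit, so the softmax always emits 1.0, and the Dropout layer placed after it can zero that output during training; categorical_crossentropy then normalizes over a single all-zero unit (0 divided by 0), which yields nan, and once the loss is nan the weights and therefore the predictions become nan as well. The targets are also integer tag ids, while categorical_crossentropy expects one-hot vectors. Below is a minimal sketch of an output head that is consistent with integer tag targets; it assumes tag id 0 is reserved for padding (so there are tag_size + 1 classes) and swaps in sparse_categorical_crossentropy so the tags do not need one-hot encoding. It is a sketch, not the original model:

# Hedged alternative to build_model(): one softmax unit per tag class,
# dropout moved before the output layer, and a sparse loss for integer tags.
from keras import Sequential
from keras.layers import (Bidirectional, LSTM, Dense, Embedding, Dropout,
                          TimeDistributed)

def build_model_sketch(max_s, souce_size, tag_size):
    model = Sequential()
    model.add(Embedding(input_dim=souce_size + 1,   # +1 for the padding index 0
                        output_dim=300,
                        input_length=max_s,
                        mask_zero=True))
    model.add(Bidirectional(LSTM(300, return_sequences=True), merge_mode='concat'))
    model.add(LSTM(300, return_sequences=True))
    model.add(Dropout(0.3))                          # regularize before, not after, the output
    model.add(TimeDistributed(Dense(tag_size + 1, activation='softmax')))
    model.compile(loss='sparse_categorical_crossentropy', optimizer='rmsprop')
    return model

With this head, the W_arr and T_arr arrays from the sketch above can be passed to fit() directly, since sparse_categorical_crossentropy accepts a (batch, timesteps) array of integer tag ids.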
