python-3.x How to use a saved pre-trained LSTM model to make new classifications

ymzxtsji · published 2023-02-26 in Python
Follow (0) | Answers (1) | Views (163)

I have a simple pre-trained LSTM model built with Keras and TensorFlow. I trained, compiled, and fit it, ran a test prediction on a simple sentence, and it worked. Then I saved the model with model.save('sentanalysis.h5') and everything was fine. Afterwards I loaded it with keras.models.load_model(); it loads without errors, but when I try model.predict() I get an array of floats that shows nothing related to the classes:
How can I use the pre-trained model to make new classifications? The dataset I used to train it is very simple, a CSV with text and sentiment columns and nothing else. Can you help me? Here is the model code:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import nlp
import random
from keras.preprocessing.text import Tokenizer
from keras_preprocessing.sequence import pad_sequences

dataset = nlp.load_dataset('csv', data_files={'train':'/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_train_final.csv',
                                              'test': '/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_test_final.csv',
                                              'validation': '/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_val_final.csv'})
train = dataset['train']
val = dataset['validation']
test = dataset['test']

def get_tweet(data):
    tweets = [x['Text'] for x in data]
    labels = [x['behavior'] for x in data]
    return tweets, labels

tweets, labels = get_tweet(train)

tokenizer = Tokenizer(num_words=10000, oov_token='<UNK>')
tokenizer.fit_on_texts(tweets)

maxlen = 140

def get_sequences(tokenizer, tweets):
    sequences = tokenizer.texts_to_sequences(tweets)
    padded = pad_sequences(sequences, truncating='post', padding='post', maxlen=maxlen)
    return padded

padded_train_seq = get_sequences(tokenizer, tweets)

classes = set(labels)
class_to_index = dict((c, i) for i, c in enumerate(classes))
index_to_class = dict((v, k) for k, v in class_to_index.items())
names_to_ids = lambda labels: np.array([class_to_index.get(x) for x in labels])
train_labels = names_to_ids(labels)

model = tf.keras.models.Sequential([
    tf.keras.layers.Embedding(10000, 16, input_length=maxlen),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(20, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(20)),
    tf.keras.layers.Dense(6, activation='softmax')
])
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

val_tweets, val_labels = get_tweet(val)
val_seq = get_sequences(tokenizer, val_tweets)
val_labels= names_to_ids(val_labels)
h = model.fit(
     padded_train_seq, train_labels,
     validation_data=(val_seq, val_labels),
     epochs=8#,
     #callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=2)]
)

test_tweets, test_labels=get_tweet(test)
test_seq = get_sequences(tokenizer, test_tweets)
test_labels=names_to_ids(test_labels)
model.evaluate(test_seq, test_labels)

# This code works when I run the previous code first
sentence = 'I am very happy now'
sequence = tokenizer.texts_to_sequences([sentence])
paddedSequence = pad_sequences(sequence, truncating = 'post', padding='post', maxlen=maxlen)
p = model.predict(np.expand_dims(paddedSequence[0], axis=0))[0]
pred_class=index_to_class[np.argmax(p).astype('uint8')]
print('Sentence: ', sentence)
print('Sentiment: ', pred_class)

This is how I save and load my model without running the previous code:

model.save('/content/drive/MyDrive/Proyect/BehaviorClassifier/twitterBehaviorClassifier.h5')
model = keras.models.load_model('/content/drive/MyDrive/Proyect/BehaviorClassifier/twitterBehaviorClassifier.h5')

#### ISSUE HERE
new = ["I am very happy"]
tokenizer = Tokenizer(num_words=10000, oov_token='<UNK>')
tokenizer.fit_on_texts(new)
seq = tokenizer.texts_to_sequences(new)
padded = pad_sequences(seq, maxlen=140)
pred = model.predict(padded)

And this is what I get:

1/1 [==============================] - 0s 29ms/step
[[7.0648360e-01 1.1568426e-01 1.7581969e-01 7.2872970e-04 4.2903548e-04
  8.5460022e-04]]

I have read some of the docs, but nothing has helped.

n9vozmp4 (Answer 1)

So, from your model code you have this:
tf.keras.layers.Dense(6, activation='softmax')
Assuming you have 6 different sentiment classes, the output you see from model.predict() is the probability that the input belongs to each class: a 70.6% probability that it is sentiment class 0, 11.5% that it is class 1, 17.5% that it is class 2, and so on.
The usual way to post-process these results is to take the class with the highest probability as the prediction, using np.argmax(pred). For the output you posted that gives 0, which can be read as: the model thinks your tweet belongs to class 0 with 70.6% confidence.
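
A minimal sketch of that post-processing, assuming the index_to_class mapping from your training script is still available at prediction time (pred and padded are the names from your snippet above):

import numpy as np

# pred is the (1, 6) probability array returned by model.predict(padded);
# index_to_class is the dict built in the training script (index -> class name).
pred = model.predict(padded)
class_id = int(np.argmax(pred[0]))      # index of the highest probability
confidence = float(pred[0][class_id])   # e.g. ~0.706 for the output you posted
print('Predicted class:', index_to_class[class_id])
print('Confidence:', confidence)

With the output shown above this would print class 0 with roughly 70.6% confidence.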
