Suppose I have a train.py file that contains the logic for training a model and then saves its parameters to a directory named weights/:
x_train, x_test, y_train, y_test = train_test_split(x, y)
model = compile()
model.fit(x_train, y_train)
model.save_weights("weights/")
Another file, evaluate.py, contains the logic for evaluating the model's performance, with its parameters loaded from the weights/ directory:
x_train, x_test, y_train, y_test = train_test_split(x, y)
model = compile()
model.load_weights("weights/")
model.evaluate(x_test, y_test)
My question is: in evaluate.py, is the statement x_train, x_test, y_train, y_test = train_test_split(x, y) correct, or should I load the same test set that was split off in train.py? In that case, train.py would be:
x_train, x_test, y_train, y_test = train_test_split(x, y)
np.save("x_test", x_test)
np.save("y_test", y_test)
model = compile()
model.fit(x_train, y_train)
model.save_weights("weights/")
and evaluate.py would be:
x_test = np.load("x_test.npy")  # np.save appends ".npy" to the file name
y_test = np.load("y_test.npy")
model = compile()
model.load_weights("weights/")
model.evaluate(x_test, y_test)
1 Answer
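I think the simplest way to handle evaluation is to keep the training and test data separate: the model learns its weights on the training set, and in the evaluation phase you check its metrics on the test set. You do not need to split the data again in evaluate.py. train_test_split shuffles by default, so without a fixed seed a second call produces a different split, and some of the new test samples may already have been used for training. I also recommend specifying random_state when splitting the dataset to get reproducible results.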
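As a minimal sketch of both options, the snippet below uses toy arrays in place of the real x and y (an assumption for illustration, as are the seed value 42 and the temporary directory). It checks that two calls to train_test_split with the same random_state return the same test set, and that a test set written with np.save can be read back with np.load when the ".npy" suffix is included.

```python
import os
import tempfile

import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for the real x and y (hypothetical, for illustration only).
x = np.arange(40).reshape(20, 2)
y = np.arange(20)

# Option 1: fix random_state so train.py and evaluate.py obtain the same split.
_, x_test_a, _, y_test_a = train_test_split(x, y, random_state=42)
_, x_test_b, _, y_test_b = train_test_split(x, y, random_state=42)
same_split = np.array_equal(x_test_a, x_test_b) and np.array_equal(y_test_a, y_test_b)

# Option 2: persist the test set in train.py and load it in evaluate.py.
# np.save appends ".npy" when the name lacks it, so load using the full name.
out_dir = tempfile.mkdtemp()
np.save(os.path.join(out_dir, "x_test"), x_test_a)
np.save(os.path.join(out_dir, "y_test"), y_test_a)
x_test_loaded = np.load(os.path.join(out_dir, "x_test.npy"))
y_test_loaded = np.load(os.path.join(out_dir, "y_test.npy"))
round_trip_ok = (
    np.array_equal(x_test_a, x_test_loaded)
    and np.array_equal(y_test_a, y_test_loaded)
)

print(same_split, round_trip_ok)
```

Saving the test set is probably the more robust of the two, since it still works if x and y are later reloaded in a different order or the dataset changes between runs. A fixed random_state only reproduces the split when the input data is identical.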