TensorFlow: Best practice for loading test data in separate Python files for model training and evaluation: should I run train_test_split again or load the test data?

Posted by hs1ihplo on 2023-10-23 in Python


Suppose I have a train.py file that contains the logic for training a model and then saves its parameters to a directory named weights/:

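from sklearn.model_selection import train_test_split

# x and y hold the full dataset, loaded earlier
x_train, x_test, y_train, y_test = train_test_split(x, y)
model = compile()  # placeholder for building and compiling the model
model.fit(x_train, y_train)
model.save_weights("weights/")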

Another file, evaluate.py, contains the logic for evaluating the model's performance, with its parameters loaded from the weights/ directory:

x_train, x_test, y_train, y_test = train_test_split(x, y)
model = compile()
model.load_weights("weights/")
model.evaluate(x_test, y_test)

My question is: in evaluate.py, is the statement x_train, x_test, y_train, y_test = train_test_split(x, y) correct, or should I load the same test set that was split off in train.py? In that case, train.py would be:

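import numpy as np
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y)
# Persist the held-out split; np.save appends ".npy" to the file name
np.save("x_test", x_test)
np.save("y_test", y_test)
model = compile()  # placeholder for building and compiling the model
model.fit(x_train, y_train)
model.save_weights("weights/")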

and evaluate.py would be:

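import numpy as np

# np.save wrote x_test.npy and y_test.npy, so load them by their full file names
x_test = np.load("x_test.npy")
y_test = np.load("y_test.npy")
model = compile()  # placeholder for building and compiling the model
model.load_weights("weights/")
model.evaluate(x_test, y_test)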
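
The save-and-load round trip in the second variant can be checked in isolation. The following is only a minimal sketch: the small stand-in arrays and the temporary directory are assumptions made for illustration and are not part of the original scripts. It shows that np.save adds the ".npy" suffix when it is missing, while np.load expects the full file name.

```python
import os
import tempfile

import numpy as np

# Stand-in arrays (assumption); in train.py these would come from train_test_split.
x_test = np.arange(12, dtype=np.float32).reshape(4, 3)
y_test = np.array([0, 1, 0, 1])

# Hypothetical output location used only for this sketch.
out_dir = tempfile.mkdtemp()

# np.save appends ".npy" when the file name does not already end with it.
np.save(os.path.join(out_dir, "x_test"), x_test)
np.save(os.path.join(out_dir, "y_test"), y_test)
saved_files = sorted(os.listdir(out_dir))

# np.load does not add the suffix, so the full file name is required.
x_loaded = np.load(os.path.join(out_dir, "x_test.npy"))
y_loaded = np.load(os.path.join(out_dir, "y_test.npy"))

# The loaded arrays should match what was saved, element for element.
round_trip_ok = bool(
    np.array_equal(x_test, x_loaded) and np.array_equal(y_test, y_loaded)
)
```

Saving the arrays this way means evaluate.py never has to touch the splitting logic at all; it only needs the two files and the weights.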

tensorflow

Source: https://stackoverflow.com/questions/77307413/best-practice-for-loading-test-data-in-separate-python-files-for-model-training

1 Answer


7d7tgy0s1#

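I think the simple way to handle model evaluation is to keep the training data and the test data separate: the model learns its weights on the training set, and in the evaluation phase you check the model's metrics on the test data. You don't need to split the data again in evaluate.py. I would also suggest specifying random_state when splitting the dataset, so that the results are reproducible.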

train_test_split(X, y, random_state=42)
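
To make the effect of random_state concrete, here is a small sketch assuming scikit-learn's train_test_split and toy arrays in place of the real data. The split helper and the seed values are hypothetical names chosen for illustration. It checks that two calls with the same seed on the same data return the same test set, and counts how many test rows from a differently seeded split were training rows in the first split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for the real x and y (assumption for this sketch).
x = np.arange(200).reshape(100, 2)  # 100 unique rows
y = np.arange(100)


def split(seed):
    """Hypothetical helper that train.py and evaluate.py could both call."""
    return train_test_split(x, y, test_size=0.25, random_state=seed)


# Split as train.py would see it.
x_train_a, x_test_a, y_train_a, y_test_a = split(42)
# Split as evaluate.py would see it when it repeats the call with the same seed.
_, x_test_b, _, y_test_b = split(42)

same_split = bool(
    np.array_equal(x_test_a, x_test_b) and np.array_equal(y_test_a, y_test_b)
)

# A different seed shuffles differently, so part of this "test" set
# consists of rows the model was trained on in the seed-42 split.
_, x_test_c, _, _ = split(7)
train_rows = set(map(tuple, x_train_a))
leaked = sum(1 for row in map(tuple, x_test_c) if row in train_rows)
```

Repeating the split only reproduces the test set if the data, its order, test_size, and random_state are all identical in both files. Leaving random_state unset gives a fresh shuffle on every run, which is the situation the leaked count illustrates.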

Answered 2023-10-23
