PyTorch: loading a GPU-trained BERTopic model on CPU?

Posted by pkln4tw6 on 2023-01-20 in Other


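I trained a BERTopic model on a GPU, and now I want to load it on a CPU for visualization. When I try to do so, I get: RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
When I try the suggested fix (shown below), I get the same error. I have seen some fixes that suggest saving the model without its embedding model, but I would rather not retrain and re-save unless that is the last option. I would also appreciate it if someone could explain what this embedding model is and what is happening under the hood.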

topic_model = torch.load(args.model, map_location=torch.device('cpu'))

pytorch

Source: https://stackoverflow.com/questions/74860769/loading-a-gpu-trained-bertopic-model-on-cpu

1 Answer


vu8f3i0k1#

If you want to save a BERTopic model without its embedding model, you can run the following:

from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups
from sentence_transformers import SentenceTransformer

docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']

# Train the model
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
topic_model = BERTopic(embedding_model=embedding_model)
topics, probs = topic_model.fit_transform(docs)

# Save the model without the embedding model
topic_model.save("my_model", save_embedding_model=False)

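As long as you are not using any cuML sub-models in BERTopic, this should prevent GPU/CPU issues.
"I have seen some fixes that suggest saving the model without its embedding model, but I would rather not retrain and re-save unless that is the last option. I would also appreciate it if someone could explain what this embedding model is and what is happening under the hood."
The embedding model is typically a pre-trained model that does not actually learn from the input data. There are options for having it learn during training, but that requires custom components in BERTopic. In other words, when you use a pre-trained model, there is no problem with dropping it when you save the topic model, because nothing needs to be retrained.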
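To make the "under the hood" part more concrete, here is a toy sketch using only the standard library. It is not BERTopic's actual implementation: ToyEmbedder and ToyTopicModel are hypothetical stand-ins, and the only point is that the embedder holds no state learned from your documents, so it can be left out of the saved file and re-attached after loading.

```python
import pickle


class ToyEmbedder:
    """Hypothetical stand-in for a pre-trained embedding model (fixed behavior)."""

    def __init__(self, name):
        self.name = name

    def encode(self, docs):
        # Deterministic fake "embedding": (character count, vowel count).
        return [(len(d), sum(c in "aeiou" for c in d)) for d in docs]


class ToyTopicModel:
    """Hypothetical stand-in for a topic model that references an embedder."""

    def __init__(self, embedder):
        self.embedder = embedder
        self.topics_ = None  # the only state learned during fit

    def fit(self, docs):
        # "Learn" one topic id per document from its embedding.
        self.topics_ = [vec[0] % 2 for vec in self.embedder.encode(docs)]
        return self

    def __getstate__(self):
        # Drop the embedder when pickling; keep only the learned state.
        state = self.__dict__.copy()
        state["embedder"] = None
        return state


docs = ["cats purr", "dogs bark loudly", "birds sing"]
model = ToyTopicModel(ToyEmbedder("toy-v1")).fit(docs)

blob = pickle.dumps(model)                 # "save" without the embedder
restored = pickle.loads(blob)              # "load" on another machine
restored.embedder = ToyEmbedder("toy-v1")  # re-attach a fresh pre-trained embedder
```

The learned topics survive the round trip, and the re-attached embedder behaves the same as the original because nothing in it was trained on the documents.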
Concretely, first save the topic model in the GPU environment without the embedding model:

topic_model.save("my_model", save_embedding_model=False)

Then load the saved BERTopic model in the CPU environment and pass in the pre-trained embedding model:

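from bertopic import BERTopic
from sentence_transformers import SentenceTransformer

# Re-create the pre-trained embedding model and attach it while loading
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
topic_model = BERTopic.load("my_model", embedding_model=embedding_model)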
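The two steps can also be wrapped in small helpers. This is a sketch under assumptions rather than a verified recipe: the helper names are hypothetical, passing device="cpu" to SentenceTransformer is an optional extra to make the placement explicit, and re-saving an already trained model on the GPU machine (instead of retraining) assumes the original file still loads there.

```python
def resave_without_embedding_model(src_path, dst_path):
    """Run on the GPU machine: reload the trained model and save it again
    without the embedding model. No retraining is involved."""
    from bertopic import BERTopic  # imported lazily; heavy dependency

    topic_model = BERTopic.load(src_path)
    topic_model.save(dst_path, save_embedding_model=False)


def load_on_cpu(path, model_name="all-MiniLM-L6-v2"):
    """Run on the CPU machine: load the topic model and attach an embedding
    model that is explicitly placed on the CPU."""
    from bertopic import BERTopic
    from sentence_transformers import SentenceTransformer

    embedding_model = SentenceTransformer(model_name, device="cpu")
    return BERTopic.load(path, embedding_model=embedding_model)
```

On the CPU machine, something like topic_model = load_on_cpu("my_model") followed by topic_model.visualize_topics() should then be enough for visualization, provided no cuML sub-models were used.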

You can read more about the role of the embedding model here.

