Python: using BERT on CPU — module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu

juud5qan · posted 2023-09-29 · in Python

I'm trying to use my fine-tuned BERT for visualization on my local machine. The model parameters are saved in a file named trained_model.pt. When I try to load and use it, I get the following error:

import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel, BertForMaskedLM

# Load pre-trained model tokenizer (vocabulary)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Tokenized input
text = "[CLS] Who was Jim Henson ? [SEP] Jim Henson was a puppeteer [SEP]"
tokenized_text = tokenizer.tokenize(text)

# Mask a token that we will try to predict back with `BertForMaskedLM`
masked_index = 8
tokenized_text[masked_index] = '[MASK]'
assert tokenized_text == ['[CLS]', 'who', 'was', 'jim', 'henson', '?', '[SEP]', 'jim', '[MASK]', 'was', 'a', 'puppet', '##eer', '[SEP]']

# Convert token to vocabulary indices
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)
# Define sentence A and B indices associated to 1st and 2nd sentences (see paper)
segments_ids = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

# Convert inputs to PyTorch tensors
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])

# Load pre-trained model (weights)
model = torch.load('trained_model.pt', map_location=torch.device('cpu'))
model.eval()

# Predict hidden states features for each layer
with torch.no_grad():
    encoded_layers, _ = model(tokens_tensor, segments_tensors, output_all_encoded_layers=False)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-32-65859554cc9e> in <module>
      5 # Predict hidden states features for each layer
      6 with torch.no_grad():
----> 7     encoded_layers, _ = model(tokens_tensor, segments_tensors, output_all_encoded_layers=False)

/opt/anaconda3/envs/bert/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

/opt/anaconda3/envs/bert/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py in forward(self, *inputs, **kwargs)
    151         for t in chain(self.module.parameters(), self.module.buffers()):
    152             if t.device != self.src_device_obj:
--> 153                 raise RuntimeError("module must have its parameters and buffers "
    154                                    "on device {} (device_ids[0]) but found one of "
    155                                    "them on device: {}".format(self.src_device_obj, t.device))

RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu

Can someone explain how to fix this? By the way, I constantly run into problems with DataParallel. I'm not sure whether it's the cause of this error, but that feature really does seem troublesome.


pnwntuvh1#

This problem occurs when your model was trained across multiple GPUs with nn.DataParallel and you then try to run it on CPU. In that case you need to adapt the model so it is compatible with CPU inference.
To convert the trained model back into a plain PyTorch model, access the inner model wrapped by nn.DataParallel via its .module attribute, then run inference:

model = model.module
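A minimal sketch of this unwrap-then-run pattern, using a tiny nn.Linear as a stand-in for the fine-tuned BERT model (the stand-in module and shapes are illustrative, not from the original post):

```python
import torch
import torch.nn as nn

# Stand-in for the fine-tuned model; the real code would load trained_model.pt.
net = nn.Linear(4, 2)
wrapped = nn.DataParallel(net)  # how the model was wrapped during training

# Unwrap the DataParallel container and move the bare model to CPU.
model = wrapped.module.to('cpu')
model.eval()

# CPU inference now works without the device check in DataParallel.forward.
with torch.no_grad():
    out = model(torch.randn(1, 4))
print(out.shape)  # torch.Size([1, 2])
```

Calling the unwrapped `model` bypasses `DataParallel.forward`, which is where the device check that raises the RuntimeError lives.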

r7xajy2e2#

When I trained with model = torch.nn.DataParallel(model), I had to save with torch.save(model.module.state_dict(), PATH). That solved the problem.
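A sketch of this save/load pattern, again with a small nn.Linear standing in for the fine-tuned BERT (file name and shapes are illustrative):

```python
import torch
import torch.nn as nn

# Model as it looked during multi-GPU training.
net = nn.DataParallel(nn.Linear(4, 2))

# Save the underlying module's state_dict, not the DataParallel wrapper,
# so the saved keys carry no "module." prefix.
torch.save(net.module.state_dict(), 'trained_model.pt')

# Later, on a CPU-only machine: build the bare model and load the weights.
model = nn.Linear(4, 2)
state = torch.load('trained_model.pt', map_location='cpu')
model.load_state_dict(state)
model.eval()
```

Saving the state_dict rather than the whole pickled model (as the question's torch.load of a full model object does) also avoids deserializing the DataParallel wrapper in the first place.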
