pytorch 如何使用微调过的huggingface模型进行推理?

5us2dqdw  于 2023-03-02  发布在  其他
关注(0)|答案(1)|浏览(154)

我已经使用the IMDB dataset微调了一个Huggingface模型,并且我能够使用训练器通过trainer.predict(test_ds_encoded)对测试集进行预测,但是,当使用具有虚拟标签特性(全部为-1,而不是0和1)的推理集进行同样的操作时,训练器抛出了一个错误:

/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [1,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [2,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [3,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [4,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [5,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [6,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [7,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [8,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [9,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [10,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [11,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [12,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [13,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [14,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [15,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [16,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [17,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [18,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [19,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [20,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [21,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [22,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [23,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [24,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [25,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [26,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [27,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [28,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [29,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [30,0,0] Assertion `t >= 0 && t < n_classes` failed.
/usr/local/src/pytorch/aten/src/ATen/native/cuda/Loss.cu:257: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [31,0,0] Assertion `t >= 0 && t < n_classes` failed.
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_23/4156768683.py in <module>
----> 1 trainer.predict(inference_ds_encoded)

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in predict(self, test_dataset, ignore_keys, metric_key_prefix)
  2694         eval_loop = self.prediction_loop if self.args.use_legacy_prediction_loop else self.evaluation_loop
  2695         output = eval_loop(
-> 2696             test_dataloader, description="Prediction", ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix
  2697         )
  2698         total_batch_size = self.args.eval_batch_size * self.args.world_size

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in evaluation_loop(self, dataloader, description, prediction_loss_only, ignore_keys, metric_key_prefix)
  2819                 )
  2820             if logits is not None:
-> 2821                 logits = self._pad_across_processes(logits)
  2822                 logits = self._nested_gather(logits)
  2823                 if self.preprocess_logits_for_metrics is not None:

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in _pad_across_processes(self, tensor, pad_index)
  2953             return tensor
  2954         # Gather all sizes
-> 2955         size = torch.tensor(tensor.shape, device=tensor.device)[None]
  2956         sizes = self._nested_gather(size).cpu()
  2957 

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

然后我删除了标签功能:trainer.predict(inference_ds_encoded.remove_columns('label')),但仍然出现错误:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_23/899960315.py in <module>
----> 1 trainer.predict(inference_ds_encoded.remove_columns('label'))

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in predict(self, test_dataset, ignore_keys, metric_key_prefix)
   2694         eval_loop = self.prediction_loop if self.args.use_legacy_prediction_loop else self.evaluation_loop
   2695         output = eval_loop(
-> 2696             test_dataloader, description="Prediction", ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix
   2697         )
   2698         total_batch_size = self.args.eval_batch_size * self.args.world_size

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in evaluation_loop(self, dataloader, description, prediction_loss_only, ignore_keys, metric_key_prefix)
   2796 
   2797             # Prediction step
-> 2798             loss, logits, labels = self.prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
   2799             inputs_decode = inputs["input_ids"] if args.include_inputs_for_metrics else None
   2800 

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in prediction_step(self, model, inputs, prediction_loss_only, ignore_keys)
   2999         """
   3000         has_labels = all(inputs.get(k) is not None for k in self.label_names)
-> 3001         inputs = self._prepare_inputs(inputs)
   3002         if ignore_keys is None:
   3003             if hasattr(self.model, "config"):

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in _prepare_inputs(self, inputs)
   2261         handling potential state.
   2262         """
-> 2263         inputs = self._prepare_input(inputs)
   2264         if len(inputs) == 0:
   2265             raise ValueError(

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in _prepare_input(self, data)
   2243         """
   2244         if isinstance(data, Mapping):
-> 2245             return type(data)({k: self._prepare_input(v) for k, v in data.items()})
   2246         elif isinstance(data, (tuple, list)):
   2247             return type(data)(self._prepare_input(v) for v in data)

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in <dictcomp>(.0)
   2243         """
   2244         if isinstance(data, Mapping):
-> 2245             return type(data)({k: self._prepare_input(v) for k, v in data.items()})
   2246         elif isinstance(data, (tuple, list)):
   2247             return type(data)(self._prepare_input(v) for v in data)

/opt/conda/lib/python3.7/site-packages/transformers/trainer.py in _prepare_input(self, data)
   2253                 # may need special handling to match the dtypes of the model
   2254                 kwargs.update(dict(dtype=self.args.hf_deepspeed_config.dtype()))
-> 2255             return data.to(**kwargs)
   2256         return data
   2257 

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

我还尝试使用经过训练的模型对象进行预测,但得到了不同的错误:

text = ["I like the film it's really exciting!", "I hate the movie, it's so boring!!"]
encoding = tokenizer(text)
outputs = model(**encoding)
predictions = outputs.logits.argmax(-1)

错误:

AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_23/94414684.py in <module>
      1 text = ["I like the film it's really exciting!", "I hate the movie, it's so boring!!"]
      2 encoding = tokenizer(text)
----> 3 outputs = model(**encoding)
      4 predictions = outputs.logits.argmax(-1)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.7/site-packages/transformers/models/distilbert/modeling_distilbert.py in forward(self, input_ids, attention_mask, head_mask, inputs_embeds, labels, output_attentions, output_hidden_states, return_dict)
    752             output_attentions=output_attentions,
    753             output_hidden_states=output_hidden_states,
--> 754             return_dict=return_dict,
    755         )
    756         hidden_state = distilbert_output[0]  # (bs, seq_len, dim)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.7/site-packages/transformers/models/distilbert/modeling_distilbert.py in forward(self, input_ids, attention_mask, head_mask, inputs_embeds, output_attentions, output_hidden_states, return_dict)
    549             raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time")
    550         elif input_ids is not None:
--> 551             input_shape = input_ids.size()
    552         elif inputs_embeds is not None:
    553             input_shape = inputs_embeds.size()[:-1]

AttributeError: 'list' object has no attribute 'size'

我的代码可以在Kaggle上找到:https://www.kaggle.com/code/georgeliu/imdb-text-classification-with-transformers.

qco9c6ql

qco9c6ql1#

正如@Enes Altınışık在评论中指出的那样,使用CPU重新运行代码(从而重新启动Kaggle示例)起作用了,之后,我用GPU加速尝试了相同的代码,它也起作用了。

相关问题