pytorch: How do I get the loss from Hugging Face's pipeline method so that I can fine-tune the model?

epggiuax posted on 2022-11-09 in Other

I'm trying to use this model on Hugging Face for question answering (QA). The code from its link is:

from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
model_name = "deepset/roberta-base-squad2"

# a) Get predictions

nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and let people easily switch between frameworks.'
}
res = nlp(QA_input)

# b) Load model & tokenizer

model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

print(res)
>>>
{'score': 0.2117144614458084,
 'start': 59,
 'end': 84,
 'answer': 'gives freedom to the user'}

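However, I don't know how to obtain a loss so that I can fine-tune this model. I've been looking at the Hugging Face tutorial, but I didn't see anything other than using the Trainer method or another training method from the link (which is not for QA):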

import torch
# transformers.AdamW is deprecated; use the PyTorch optimizer instead
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Same as before

checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
sequences = [
    "I've been waiting for a HuggingFace course my whole life.",
    "This course is amazing!",
]
batch = tokenizer(sequences, padding=True, truncation=True, return_tensors="pt")

# This is new

batch["labels"] = torch.tensor([1, 1])

optimizer = AdamW(model.parameters())
loss = model(**batch).loss
loss.backward()
optimizer.step()

Suppose the correct answer is "freedom to the user" rather than "gives freedom to the user".

Answer #1, by db2dz4w8:

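You don't need to compute the loss yourself for this. Hugging Face provides a Trainer class that you can use to train your model. It is also optimized for Hugging Face models and incorporates many deep learning best practices you may find useful. See: https://huggingface.co/docs/transformers/main_classes/trainer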
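If you do want the loss explicitly, one option is to skip the pipeline (it returns post-processed predictions, not a loss) and call the model directly. As far as I know, the question-answering model classes in transformers return a `loss` when `start_positions` and `end_positions` are passed as token indices. The sketch below is only an illustration under some assumptions: the helper name `char_span_to_token_span`, the function `one_training_step`, and the learning rate are made up for this example, and the gold answer is located with a plain string search in the context.

```python
def char_span_to_token_span(offsets, sequence_ids, start_char, end_char):
    """Map a character span of the context to (start_token, end_token).

    offsets: one (char_start, char_end) pair per token, as produced by a
        fast tokenizer called with return_offsets_mapping=True.
    sequence_ids: one entry per token (None for special tokens, 0 for the
        question, 1 for the context), as produced by BatchEncoding.sequence_ids().
    end_char is exclusive.
    """
    start_token = end_token = None
    for i, ((tok_start, tok_end), seq_id) in enumerate(zip(offsets, sequence_ids)):
        if seq_id != 1 or tok_start == tok_end:
            continue  # skip question tokens, special tokens, empty tokens
        if start_token is None and tok_end > start_char:
            start_token = i  # first context token reaching past start_char
        if tok_start < end_char:
            end_token = i  # last context token starting before end_char
    if start_token is None or end_token is None or end_token < start_token:
        raise ValueError("answer span not found in the tokenized context")
    return start_token, end_token


def one_training_step(model_name="deepset/roberta-base-squad2", lr=2e-5):
    """Run one optimizer step on a single example and return the loss value.

    Hypothetical helper for illustration; lr=2e-5 is an assumed value.
    """
    # Imported here so the helper above can be used without these packages.
    import torch
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    question = "Why is model conversion important?"
    context = (
        "The option to convert models between FARM and transformers gives "
        "freedom to the user and let people easily switch between frameworks."
    )
    answer_text = "freedom to the user"  # the gold answer from the question
    start_char = context.index(answer_text)
    end_char = start_char + len(answer_text)

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForQuestionAnswering.from_pretrained(model_name)
    model.train()

    enc = tokenizer(
        question,
        context,
        truncation="only_second",
        return_offsets_mapping=True,  # requires a fast tokenizer
        return_tensors="pt",
    )
    # The model does not accept offset_mapping, so remove it before the call.
    offsets = enc.pop("offset_mapping")[0].tolist()
    start_tok, end_tok = char_span_to_token_span(
        offsets, enc.sequence_ids(0), start_char, end_char
    )

    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    outputs = model(
        **enc,
        start_positions=torch.tensor([start_tok]),
        end_positions=torch.tensor([end_tok]),
    )
    loss = outputs.loss  # loss over the start and end position logits
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Calling `one_training_step()` downloads the checkpoint and performs a single update; a real fine-tuning run would loop over batches of a dataset. Trainer appears to rely on the same mechanism: if each example in the dataset carries `start_positions` and `end_positions`, the model returns a loss and Trainer optimizes it for you.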
