Describe the bug
I am trying to run the LLM few-shot example (https://github.com/ludwig-ai/ludwig/blob/master/examples/llm_few_shot_learning/simple_model_training.py) on Google Colab and hit the following error during the training phase.
==== Logs ====
INFO:ludwig.models.llm:Done.
INFO:ludwig.utils.tokenizers:Loaded HuggingFace implementation of facebook/opt-350m tokenizer
INFO:ludwig.trainers.trainer:Tuning batch size...
INFO:ludwig.utils.batch_size_tuner:Tuning batch size...
INFO:ludwig.utils.batch_size_tuner:Exploring batch_size=1
INFO:ludwig.utils.checkpoint_utils:Successfully loaded model weights from /tmp/tmpnqj9shge/latest.ckpt.
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-15-cbbfc30da30b> in <cell line: 6>()
4 preprocessed_data, # tuple Ludwig Dataset objects of pre-processed training data
5 output_directory, # location of training results stored on disk
----> 6 ) = model.train(
7 dataset=df,experiment_name="simple_experiment", model_name="simple_model", skip_save_processed_input=False)
8
10 frames
/usr/local/lib/python3.10/dist-packages/ludwig/models/llm.py in _remove_left_padding(self, input_ids_sample)
629 else:
630 pad_idx = 0
--> 631 input_ids_sample_no_padding = input_ids_sample[pad_idx + 1 :]
632
633 # Start from the first BOS token
IndexError: Dimension specified as 0 but tensor has no dimensions
==== End of logs ====
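For what it's worth, the final IndexError is characteristic of slicing a zero-dimensional (scalar) tensor. Below is a minimal sketch of the failure mode plus a defensive guard; the function name and the simplified padding logic are hypothetical illustrations, not Ludwig's actual implementation:

```python
import torch

def remove_left_padding_safe(input_ids_sample, pad_token_id):
    # Guard: a 0-dim (scalar) tensor cannot be sliced, which appears to be
    # what triggers "tensor has no dimensions" in the traceback above.
    if input_ids_sample.dim() == 0:
        input_ids_sample = input_ids_sample.unsqueeze(0)
    # Strip everything up to and including the last pad token found
    # (a simplified stand-in for Ludwig's left-padding removal).
    pad_positions = (input_ids_sample == pad_token_id).nonzero()
    pad_idx = int(pad_positions[-1]) if len(pad_positions) else -1
    return input_ids_sample[pad_idx + 1:]
```

With the guard in place, both a normal left-padded sequence and a degenerate scalar sample are handled without raising.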
The config is as follows:
config = yaml.unsafe_load(
    """
model_type: llm
model_name: facebook/opt-350m
generation:
  temperature: 0.1
  top_p: 0.75
  top_k: 40
  num_beams: 4
  max_new_tokens: 64
prompt:
  task: "Classify the sample input as either negative, neutral, or positive."
  retrieval:
    type: semantic
    k: 3
    model_name: paraphrase-MiniLM-L3-v2
input_features:
  - name: review
    type: text
output_features:
  - name: label
    type: category
    preprocessing:
      fallback_label: "neutral"
    decoder:
      type: category_extractor
      match:
        "negative":
          type: contains
          value: "negative"
        "neutral":
          type: contains
          value: "neutral"
        "positive":
          type: contains
          value: "positive"
preprocessing:
  split:
    type: fixed
trainer:
  type: finetune
  epochs: 2
"""
)
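For context, the retrieval block in this config asks Ludwig to select the k=3 labeled examples most semantically similar to each query and insert them into the few-shot prompt. The selection step can be sketched in plain Python with cosine similarity over stand-in embedding vectors (a real run would use paraphrase-MiniLM-L3-v2 sentence embeddings; every name below is illustrative):

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_examples(query_embedding, labeled_examples, k=3):
    # labeled_examples: list of (embedding, review_text, label) tuples.
    # Returns the k examples most similar to the query; these would then be
    # inserted into the few-shot prompt ahead of the query itself.
    ranked = sorted(
        labeled_examples,
        key=lambda ex: cosine_similarity(query_embedding, ex[0]),
        reverse=True,
    )
    return ranked[:k]
```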
Environment (please complete the following information):
OS: Colab
Python version: 3.10
Ludwig version: 0.8
Additional context
6 answers
kt06eoxx1#
Hey @chayanray, thanks for flagging this issue! This is a known problem. I will create a fix for it by the end of the day, and you should be able to test it out.
ajsxfq5m2#
Hi @chayanray, I have created a fix and tested the same example notebook you are trying to run here: #3432
Things appear to be working correctly for me. Could you pull this branch and check whether it resolves the issue?
rryofs0p3#
@arnavgarg1 , No still the same error.
INFO:ludwig.models.llm:Done.
INFO:ludwig.utils.tokenizers:Loaded HuggingFace implementation of facebook/opt-350m tokenizer
INFO:ludwig.trainers.trainer:Tuning batch size...
INFO:ludwig.utils.batch_size_tuner:Tuning batch size...
INFO:ludwig.utils.batch_size_tuner:Exploring batch_size=1
INFO:ludwig.utils.checkpoint_utils:Successfully loaded model weights from /tmp/tmphojxmpa7/latest.ckpt.
IndexError Traceback (most recent call last)
in <cell line: 6>()
4 preprocessed_data, # tuple Ludwig Dataset objects of pre-processed training data
5 output_directory, # location of training results stored on disk
----> 6 ) = model.train(
7 dataset=df,experiment_name="simple_experiment", model_name="simple_model", skip_save_processed_input=False)
8
10 frames
/usr/local/lib/python3.10/dist-packages/ludwig/utils/llm_utils.py in remove_left_padding(input_ids_sample, tokenizer)
35 bos_idx = 0
36
---> 37 input_ids_no_bos = input_ids_no_padding[bos_idx:].unsqueeze(0)
38 return input_ids_no_bos
IndexError: Dimension specified as 0 but tensor has no dimensions
I am also attaching the notebook for your reference.
Ludwig_Few_Shot_Training (1).ipynb.zip
rdrgkggo4#
Hi @chayanray, I had a chance to take a look at your notebook and have a few questions:
Did you intentionally change the trainer type to finetune? I just wanted to make sure this change was something you made deliberately. By doing this, you are no longer running few-shot learning; you are running LLM fine-tuning instead. If you want to do few-shot learning, you can simply remove the entire trainer section from your config and things will work correctly. I was able to run it this way successfully using your notebook.
With this config and my latest branch, I was also able to run fine-tuning successfully.
I will call out that our current LLM fine-tuning implementation has a few known issues, and I am going to be landing a fix for all of them by the end of the week. I'd keep a lookout for that, and also happy to follow up on this thread once that PR lands.
Let me know if this helps and if you're able to confirm that things work!
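Assuming the suggestion above, the few-shot variant of the config would keep everything else and simply drop the trainer block, e.g.:

```yaml
model_type: llm
model_name: facebook/opt-350m
prompt:
  task: "Classify the sample input as either negative, neutral, or positive."
  retrieval:
    type: semantic
    k: 3
    model_name: paraphrase-MiniLM-L3-v2
# input_features, output_features, and preprocessing as in the original config
# (no trainer section: Ludwig then runs few-shot learning rather than fine-tuning)
```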
mf98qq945#
Feel free to let us know once you can confirm, so that we can close this issue!
vwoqyblh6#
@arnavgarg1: Changing the output type from category to text makes the error go away and lets training proceed. However, the prediction step no longer works as expected. This probably needs a closer look.
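For reference, the workaround described in this comment amounts to the following change, shown only to document the observation; it sidesteps the error but, as noted, breaks prediction, so it is not a recommended fix:

```yaml
output_features:
  - name: label
    type: text  # was `category`; training now runs, but prediction misbehaves
```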