Describe the bug
Found in the tutorial: https://ludwig.ai/latest/examples/nlu/
Using the same NLU dataset: ludwig experiment --dataset nlu.csv --config config.yaml
nlu.csv
config.yaml:
input_features:
  - name: utterance
    type: text
    encoder:
      type: rnn
      cell_type: lstm
      bidirectional: true
      num_layers: 2
      reduce_output: null
    preprocessing:
      tokenizer: space
output_features:
  - name: intent
    type: category
    reduce_input: sum
    decoder:
      num_fc_layers: 1
      output_size: 64
  - name: slots
    type: sequence
    decoder:
      type: tagger
Error:
...
  File "/home/martin/.virtualenvs/ludwig/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/martin/.virtualenvs/ludwig/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (2x512 and 256x256)
Training: 0%|
9 answers
mefy6pfw1#
We're looking into it.
kmynzznz2#
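Hi @martinb-bb,
Thanks for raising this issue!
The shape mismatch suggests that some modules are not being initialized correctly. I don't think this is a complicated fix, and we should replace this example with a larger one.
Next quarter, we plan to back all examples and tutorials with actual tests that run as part of our CI, which should help keep the examples intact.
I plan to take a closer look at this next week.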
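As a rough illustration of the mismatch in the traceback (not a diagnosis of the actual bug), the two shapes can be checked by hand. The sketch below assumes a state size of 256 per direction, which would make a bidirectional LSTM emit 512 features; that assumption is not confirmed anywhere in this thread, and `can_multiply` is a hypothetical helper.

```python
def can_multiply(mat1_shape, mat2_shape):
    """Return True when a (rows x inner) by (inner x cols) product is defined."""
    return mat1_shape[1] == mat2_shape[0]


# Shapes copied from the RuntimeError in the report.
mat1 = (2, 512)    # activations reaching the linear layer
mat2 = (256, 256)  # weight matrix the linear layer multiplies by

# Assumption: 256 units per direction, with both directions concatenated.
state_size = 256
bidirectional_width = 2 * state_size

print(can_multiply(mat1, mat2))        # inner dimensions 512 and 256 differ
print(bidirectional_width == mat1[1])  # 512 matches the incoming activations
```

If that assumption holds, the linear layer was sized for a single direction while receiving the output of both.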
pokxtpni3#
Thanks a lot, @justinxzhao! I'll keep an eye out for updates.
gdrx4gfi4#
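Sorry for the inconvenience, @martinb-bb.
I was able to reproduce the issue on Ludwig v0.6.4, but it appears to be fixed in v0.7, where training proceeds normally.
Could you install Ludwig from scratch for your experiment?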
inb24sb25#
@justinxzhao Works like a charm! Thanks for the help. 😄
One last question:
How could I get this NLU model to train and run inference using the pure Python API rather than the CLI? To integrate this library into our terminal, I need to be able to do everything in pure Python.
(I would look through the docs, but there is nothing on that NLU page about that. Thanks!)
kgqe7b3p6#
Hi @martinb-bb, here is an example Python script you can use:
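The script from the original reply is not reproduced in this copy of the thread. What follows is a minimal sketch of training through the Python API instead, not the author's script. It assumes the `config.yaml` from the report rewritten as a dict, `nlu.csv` in the working directory, and a hypothetical wrapper named `train_nlu_model`; `LudwigModel` and `train` come from `ludwig.api`.

```python
import logging

# The config.yaml from the report, written as a plain dict.
config = {
    "input_features": [
        {
            "name": "utterance",
            "type": "text",
            "encoder": {
                "type": "rnn",
                "cell_type": "lstm",
                "bidirectional": True,
                "num_layers": 2,
                "reduce_output": None,
            },
            "preprocessing": {"tokenizer": "space"},
        }
    ],
    "output_features": [
        {
            "name": "intent",
            "type": "category",
            "reduce_input": "sum",
            "decoder": {"num_fc_layers": 1, "output_size": 64},
        },
        {"name": "slots", "type": "sequence", "decoder": {"type": "tagger"}},
    ],
}


def train_nlu_model(dataset_path="nlu.csv"):
    """Train on the CSV and return the model plus its results directory."""
    # Imported lazily so the config can be inspected without Ludwig installed.
    from ludwig.api import LudwigModel

    model = LudwigModel(config, logging_level=logging.INFO)
    train_stats, preprocessed_data, output_directory = model.train(dataset=dataset_path)
    return model, output_directory
```

Calling `train_nlu_model()` would be the rough equivalent of the `ludwig experiment` command in the report, minus the evaluation step.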
a6b3iqyw7#
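@justinxzhao Great! Thank you 🙏😊
I'd like to ask one last question:
Once the model is trained, how would you use it for inference?
Since you give it a string as input and expect to get intent + slots back, is there anything special involved?
This should answer all my questions!
Thanks in advance 😊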
sg24os4d8#
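Hi @martinb-bb,
Depending on your needs, there are a few deployment options you can try:
In both cases, the user should be prepared to provide a string as input and receive intent + slots back.
Depending on your quality requirements, it may be worth benchmarking two single-task models against one multi-task model.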
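One way to get from a raw string to intent + slots in Python is to load the saved model and call `predict`. This is a sketch under assumptions rather than one of the options listed in the original reply: `model_dir` is a placeholder for wherever training wrote the model, and `build_inputs` and `predict_intent_and_slots` are hypothetical helper names.

```python
def build_inputs(utterances):
    """Wrap raw strings in the column layout the model was trained on."""
    return {"utterance": list(utterances)}


def predict_intent_and_slots(model_dir, utterances):
    """Load a trained model and return its predictions for the utterances."""
    # Imported lazily so build_inputs can be used without Ludwig installed.
    from ludwig.api import LudwigModel

    model = LudwigModel.load(model_dir)
    # predict returns the predictions together with an output directory.
    predictions, _ = model.predict(dataset=build_inputs(utterances))
    return predictions
```

The returned predictions should contain columns for both output features, `intent` and `slots`; the exact column names are best checked on the actual result.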
9w11ddsr9#
@justinxzhao Thanks for your input. All is working now!
I have one last clarification to make about my model output.
Input:
Output:
What confuses me is that [slots] has more values than the input. The intent works great, but my slots do not match up 1:1.
Input = 10 words, output = array of 12 items with a double EOS statement.
Ignoring all slot classifications, is there a reason why this happens? Perhaps I am overlooking something. (I have attached the dataset in case you want to check it out.)
Very much appreciate your support!
nlu.csv
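A possible workaround for lining the slots up with the input words is to drop marker entries before pairing. This sketch does not explain why the extra items appear. It assumes the two surplus items are special symbols such as the `<EOS>` mentioned above; the symbol names, the space tokenization, and the helper `align_slots` are all assumptions.

```python
def align_slots(utterance, predicted_slots,
                special_symbols=("<SOS>", "<EOS>", "<PAD>")):
    """Pair each space-separated token with its slot, skipping marker symbols.

    Returns the (token, slot) pairs and a flag telling whether the token
    and slot counts matched once the markers were removed.
    """
    tokens = utterance.split(" ")
    slots = [s for s in predicted_slots if s not in special_symbols]
    return list(zip(tokens, slots)), len(tokens) == len(slots)


# Made-up example: three words, five predicted items, two of them markers.
pairs, lengths_match = align_slots("wake me up", ["O", "O", "O", "<EOS>", "<EOS>"])
print(pairs, lengths_match)
```

When the flag comes back `False`, the pairs are truncated to the shorter sequence and should not be trusted.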