allennlp: Set layers that are frozen to eval mode in BERT during training

mhd8tkvw · posted 6 months ago in Other

Hello,

I'd like to know whether there is a way to switch BERT's dropout and layer-norm layers to eval mode during training when we set the requires_grad parameter of PretrainedBert to False.

At line 294 of allennlp/allennlp/modules/token_embedders/bert_token_embedder.py there is a `for param in model.parameters():` loop, but I found that even if I modify that initialization code to also put the module into eval mode, the AllenNLP trainer loop still calls model.train() on the whole model and switches BERT back into training mode.
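
For reference, here is a minimal sketch of what that freezing loop amounts to and why calling .eval() at construction time is not enough on its own. The import path and model name are illustrative (older AllenNLP versions wrap pytorch_pretrained_bert); the actual library code differs in its details.

```python
import torch
from pytorch_pretrained_bert import BertModel  # the BERT implementation older AllenNLP versions wrap

bert = BertModel.from_pretrained("bert-base-uncased")

# Freezing: no gradients are computed for BERT's weights.
for param in bert.parameters():
    param.requires_grad = False

# Naive fix: also switch dropout/layer-norm to their deterministic (eval) behaviour.
bert.eval()

# Problem: the trainer later calls model.train() on the full model, which
# recursively puts every sub-module (including this BERT) back into training
# mode, undoing the bert.eval() call above.
```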

My current workaround is to set those layers to eval mode inside the model's forward call. Is there a better way?
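
For concreteness, a minimal sketch of that workaround. The class and attribute names (MyClassifier, bert_encoder) are placeholders, not AllenNLP API, and the encoder here is any module mapping a tensor to a hidden_dim-sized tensor.

```python
import torch

class MyClassifier(torch.nn.Module):
    """Toy model with a frozen BERT-like encoder; names are illustrative."""

    def __init__(self, bert_encoder: torch.nn.Module, hidden_dim: int, num_labels: int):
        super().__init__()
        self.bert_encoder = bert_encoder
        # Freeze the encoder's weights.
        for param in self.bert_encoder.parameters():
            param.requires_grad = False
        self.classifier = torch.nn.Linear(hidden_dim, num_labels)

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # Workaround: re-enter eval mode on every call, because the trainer's
        # model.train() has flipped the frozen encoder back to training mode.
        self.bert_encoder.eval()
        encoded = self.bert_encoder(inputs)
        return self.classifier(encoded)
```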

oxiaedzo 1#

Per the discussion thread, I'm marking this as contributions welcome.

cgyqldqp 2#

Putting in my comment from the discourse thread:
Hmm, sounds like we’d want to override model.train() to handle this properly. It also sounds a bit messy to get correct for everything, but if you can think of a clean solution, I think this is definitely a problem that we’d want to fix in the library. Feel free to open an issue about this in the repo, and I’ll mark it as “contributions welcome”.
You should be able to override train() on your own model class, also. That would be a good way to test this to see if it’s possible to do it in a clean way that will generalize to other models. If you can, then a PR to add it to the base model class would be lovely.
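
A minimal sketch of what overriding train() could look like; MyClassifier and bert_encoder are again placeholder names, and a general fix in the base Model class would need some way to know which sub-modules are meant to stay frozen.

```python
import torch

class MyClassifier(torch.nn.Module):
    def __init__(self, bert_encoder: torch.nn.Module):
        super().__init__()
        self.bert_encoder = bert_encoder
        for param in self.bert_encoder.parameters():
            param.requires_grad = False

    def train(self, mode: bool = True) -> "MyClassifier":
        # Let the trainer switch the whole model as usual...
        super().train(mode)
        # ...then force the frozen encoder back into eval mode so its dropout
        # and layer-norm layers stay deterministic while the rest trains.
        self.bert_encoder.eval()
        return self
```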
