Link: https://pan.baidu.com/s/1is8yodQ5QmgsIyLogKF24w?pwd=5yf9
Extraction code: 5yf9
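For training, I used a Java library to first convert all the kanji in the Japanese text to katakana, then converted the katakana to romaji and used that as the input...
To use the model, you first need to convert all the Japanese text to katakana... then convert it to the corresponding romaji; feeding that in is enough to synthesize speech.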
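For anyone who wants to reproduce the katakana-to-romaji step in Python, here is a minimal sketch. It is not the Java library mentioned above, and the mapping table, the function name `katakana_to_romaji`, and the Hepburn-style spelling choices are all assumptions made for illustration. It only covers the second step; converting kanji to katakana needs a dictionary-based analyzer and is not shown.

```python
# Minimal katakana -> romaji sketch (Hepburn-like spelling).
# Assumption: this table and these rules are illustrative only; they are not
# the mapping used by the Java library from the original post.

BASE = {
    "ア": "a", "イ": "i", "ウ": "u", "エ": "e", "オ": "o",
    "カ": "ka", "キ": "ki", "ク": "ku", "ケ": "ke", "コ": "ko",
    "サ": "sa", "シ": "shi", "ス": "su", "セ": "se", "ソ": "so",
    "タ": "ta", "チ": "chi", "ツ": "tsu", "テ": "te", "ト": "to",
    "ナ": "na", "ニ": "ni", "ヌ": "nu", "ネ": "ne", "ノ": "no",
    "ハ": "ha", "ヒ": "hi", "フ": "fu", "ヘ": "he", "ホ": "ho",
    "マ": "ma", "ミ": "mi", "ム": "mu", "メ": "me", "モ": "mo",
    "ヤ": "ya", "ユ": "yu", "ヨ": "yo",
    "ラ": "ra", "リ": "ri", "ル": "ru", "レ": "re", "ロ": "ro",
    "ワ": "wa", "ヲ": "o", "ン": "n",
    "ガ": "ga", "ギ": "gi", "グ": "gu", "ゲ": "ge", "ゴ": "go",
    "ザ": "za", "ジ": "ji", "ズ": "zu", "ゼ": "ze", "ゾ": "zo",
    "ダ": "da", "ヂ": "ji", "ヅ": "zu", "デ": "de", "ド": "do",
    "バ": "ba", "ビ": "bi", "ブ": "bu", "ベ": "be", "ボ": "bo",
    "パ": "pa", "ピ": "pi", "プ": "pu", "ペ": "pe", "ポ": "po",
}
SMALL_Y = {"ャ": "ya", "ュ": "yu", "ョ": "yo"}
VOWELS = "aiueo"


def katakana_to_romaji(text: str) -> str:
    """Convert a katakana string to romaji; unknown characters pass through."""
    out = []
    geminate = False  # set when a small tsu (ッ) was just seen
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "ッ":
            geminate = True
        elif ch == "ー":
            # Long-vowel mark: repeat the previous vowel, if there is one.
            if out and out[-1][-1:] in VOWELS and out[-1]:
                out.append(out[-1][-1])
        elif ch in BASE:
            syllable = BASE[ch]
            nxt = text[i + 1] if i + 1 < len(text) else ""
            # Digraphs such as キャ (kya) or シャ (sha).
            if nxt in SMALL_Y and len(syllable) > 1 and syllable.endswith("i"):
                stem = syllable[:-1]
                glide = SMALL_Y[nxt]
                syllable = stem + (glide[1:] if stem in ("sh", "ch", "j") else glide)
                i += 1  # the small kana has been consumed
            if geminate and syllable[0] not in VOWELS:
                # Double the consonant; "ch" is written "tch" in Hepburn.
                syllable = ("t" if syllable.startswith("ch") else syllable[0]) + syllable
            geminate = False
            out.append(syllable)
        else:
            geminate = False
            out.append(ch)
        i += 1
    return "".join(out)


print(katakana_to_romaji("コンニチハ"))  # konnichiha
print(katakana_to_romaji("ガッコウ"))    # gakkou
print(katakana_to_romaji("ラーメン"))    # raamen
```

Whatever spelling scheme is chosen, the same converter has to be used for both training and inference, otherwise the model sees character sequences it was never trained on.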
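Results:
Romaji for the Japanese text:
In my tests, the first few words or the last few words are recognized correctly and sound fairly good, but the words in the middle are not recognized well. This did not noticeably improve from 18k, when training first fully converged, through 95k. I don't know whether the dataset is still too small or whether something went wrong during training. I'd appreciate advice from anyone with relevant training experience.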
2 answers
8ehkhllq 1#
Do I need to modify some other file as well? Using this .pt directly raises an error:
RuntimeError: Error(s) in loading state_dict for Tacotron:
    size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([66, 512]) from checkpoint, the shape in current model is torch.Size([75, 512]).
    size mismatch for encoder_proj.weight: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([128, 1024]).
    size mismatch for decoder.attn_rnn.weight_ih: copying a param with shape torch.Size([384, 768]) from checkpoint, the shape in current model is torch.Size([384, 1280]).
    size mismatch for decoder.rnn_input.weight: copying a param with shape torch.Size([1024, 640]) from checkpoint, the shape in current model is torch.Size([1024, 1152]).
    size mismatch for decoder.stop_proj.weight: copying a param with shape torch.Size([1, 1536]) from checkpoint, the shape in current model is torch.Size([1, 2048]).
Traceback:
File "I:\MockingBird\venv\lib\site-packages\streamlit\scriptrunner\script_runner.py", line 443, in run_script
    exec(code, module.__dict__)
File "C:\Users\Naught\AppData\Local\Temp\tmpedw34yc.py", line 14, in <module>
    render_streamlit_ui()
File "I:\MockingBird\MockingBird\mkgui\base\ui\streamlit_ui.py", line 864, in render_streamlit_ui
    session_state.output_data = opyrator(input=input_data_obj)
File "I:\MockingBird\MockingBird\mkgui\base\core.py", line 203, in __call__
    return self.function(input_obj, **kwargs)
File "I:\MockingBird\MockingBird\mkgui\app.py", line 134, in synthesize
    specs = current_synt.synthesize_spectrograms(texts, embeds)
File "I:\MockingBird\MockingBird\synthesizer\inference.py", line 93, in synthesize_spectrograms
    self.load()
File "I:\MockingBird\MockingBird\synthesizer\inference.py", line 71, in load
    self._model.load(self.model_fpath, self.device)
File "I:\MockingBird\MockingBird\synthesizer\models\base.py", line 51, in load
    self.load_state_dict(checkpoint["model_state"], strict=False)
File "I:\MockingBird\venv\lib\site-packages\torch\nn\modules\module.py", line 1671, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
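One way to narrow down an error like this is to compare parameter shapes between the checkpoint and the freshly built model before loading anything. The sketch below does that on plain name-to-shape mappings, using the five shapes from the error message as sample data; the helper names `shapes_of` and `find_shape_mismatches` are made up for illustration and are not part of MockingBird.

```python
from typing import Dict, List, Mapping, Sequence, Tuple

Shape = Tuple[int, ...]


def shapes_of(state_dict) -> Dict[str, Shape]:
    """Turn a name -> tensor mapping into a name -> shape-tuple mapping."""
    return {name: tuple(tensor.shape) for name, tensor in state_dict.items()}


def find_shape_mismatches(
    ckpt: Mapping[str, Sequence[int]],
    model: Mapping[str, Sequence[int]],
) -> List[Tuple[str, Shape, Shape]]:
    """List (name, checkpoint shape, model shape) for parameters that differ."""
    mismatches = []
    for name in sorted(set(ckpt) & set(model)):
        if tuple(ckpt[name]) != tuple(model[name]):
            mismatches.append((name, tuple(ckpt[name]), tuple(model[name])))
    return mismatches


# Sample data copied from the error message above.
checkpoint_shapes = {
    "encoder.embedding.weight": (66, 512),
    "encoder_proj.weight": (128, 512),
    "decoder.attn_rnn.weight_ih": (384, 768),
    "decoder.rnn_input.weight": (1024, 640),
    "decoder.stop_proj.weight": (1, 1536),
}
current_model_shapes = {
    "encoder.embedding.weight": (75, 512),
    "encoder_proj.weight": (128, 1024),
    "decoder.attn_rnn.weight_ih": (384, 1280),
    "decoder.rnn_input.weight": (1024, 1152),
    "decoder.stop_proj.weight": (1, 2048),
}

mismatches = find_shape_mismatches(checkpoint_shapes, current_model_shapes)
for name, ckpt_shape, model_shape in mismatches:
    print(f"{name}: checkpoint {ckpt_shape} vs current model {model_shape}")
```

With PyTorch installed, the two mappings could be built with `shapes_of(torch.load(path, map_location="cpu")["model_state"])` and `shapes_of(model.state_dict())`, where `path` and `model` are placeholders and the `"model_state"` key is taken from the traceback. Two things stand out in the numbers. The embedding has 66 rows in the checkpoint and 75 in the current model, which suggests the two were built with symbol tables of different sizes. The other four weights all differ by exactly 512 in their input dimension, which would be consistent with the current code feeding in an extra 512-dimensional vector that the checkpoint's model did not have, though that is only a guess from the shapes. Note also that `strict=False` only tolerates missing or unexpected keys; a size mismatch still raises.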
pes8fvy9 2#
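Maybe it's a problem with settings such as the batch size? Different Python versions also seem to cause all sorts of problems. To be honest, this model's quality doesn't even reach an average level; I only ran the training briefly back then. If you're interested, you could try training it yourself. But if your goal is a reasonably good Japanese TTS, it would be better to train on top of another project that already has better ready-made support. Most people working with this project focus on English and Chinese, so doing Japanese on your own feels rather hard.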