php Google Cloud TTS引擎忽略大块文本上的break SSML标签

pgky5nke  于 2023-05-05  发布在  PHP
关注(0)|答案(2)|浏览(140)

我试图从一个大段落中生成音频,但Google忽略了SSML标签。
如果我将文本修剪到三行左右,则标记会受到尊重,但超过这一点会使它们不可见。

$body = [
  'input' => [
    'ssml' => '<speak><p>Client Acquisition Engine Funnel For Creative Professionals</p><break time="3000ms"/><p>Most creative professionals I know are great at what they do, they have happy customers and deliver a high standard of work. Most however aren\'t very adept at acquiring new clients and have fallen into the habit of relying too much on referrals.</p><break time="3000ms"/><p>I\'ll be honest here, I used to do that as well.</p><break time="3000ms"/><p>For a long time, I just thought: "Well, if keep turning out good work I\'ll keep getting referrals"?</p><break time="3000ms"/><p>Turns out, that\'s not the case at all. Relying on referrals is a highly risky \'strategy\' and is not recommended (I wrote about it here).</p><break time="3000ms"/><p>Deep down somewhere in your conscious you already know this and so every now and then you throw some money into Facebook & Google Ads and after burning through a fair amount cash and bearing little fruit, decide its not for you and carry on.</p><break time="3000ms"/><p>Also, relying on the 20% of your clients who bring in 80% of your revenue (those 1 or 2 hero long term clients who\'ve helped sustain your freelance business for years) is also a highly risky situation to be in. You wake in the middle of the night, every now and then, in a cold sweat.. they\'ve left you!</p><break time="3000ms"/><p>Phew! It was only a dream...</p><break time="3000ms"/><p>I was exactly like that, but then I started to develop and systemise my sales & marketing processes. So that would be able to build a predictable lead flow, attracting more ideal prospects, delivering value, pre-qualifying leads, inviting those with highest engagement to book time on my calendar, all the while automating 80% of that process.</p><break time="3000ms"/><p>The key is to understand your own sales process, breaking each of the stages down into steps and look to automating as much of the repetitive tasks as you can. Doing so will result in being able to attract and manage more leads, filter out the ideal customers, which in turn gives you me more time, money and freedom.</p><break time="3000ms"/><p>Sounds complicated, and if you don\'t have a clear pathway, it can quickly end up turning into a rabbit hole.</p><break time="3000ms"/><p>But you can stop faffing about and get on the right track by downloading the Client Acquisition Engine Funnel Map & Toolkit today.</p><break time="3000ms"/></speak>',
  ],
  'voice' => ['language_code' => 'en-US', 'name' => 'en-US-Wavenet-D'],
  'audioConfig' => ['audio_encoding' => 'MP3'],
];

$result = json_decode($client->post('https://texttospeech.googleapis.com/v1beta1/text:synthesize?key=[REDACTED]', $body)->getBody());

file_put_contents('test.mp3', base64_decode($result['audioContent']));

请求成功通过,但缺少中断标记。这不仅可以通过代码复制,而且可以在服务网站上的公共Playground上复制。

pbpqsu0x

pbpqsu0x1#

请打开一个Issue Tracker,如果你认为这是一个错误,以便文本到语音系统工程师可以修复你遇到的SSML标签的问题。

a8jjtwal

a8jjtwal2#

我也打了。我了解到,如果SSML标记有任何类型的错误,它们都被忽略,没有任何类型的错误处理。真的无助的谷歌不提供反馈,或文档解释说,这种情况发生,但这是谷歌为你。我的解决方案:

  • 转义不允许的字符(&、“、'、〈、〉)的输入文本。一个快速的正则表达式替换应该可以完成任务。
  • 确保ssml标记中的“标记使用\字符转义

例如,这将不起作用:

<speak>The man said "hi". <break time="2s" /> I smiled.</speak>

但这应该:

<speak>The man said &quot;hi&quot;. <break time=\"2s\" /> I smiled.</speak>

相关问题