Amazon Polly
Developer Guide

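Speech Mark Examples

The following speech mark examples show how to make common requests and the output that each request generates.

Example 1: Speech Marks Without SSML

The following example shows what the requested metadata looks like for a simple sentence: "Mary had a little lamb." For simplicity, this example does not include SSML speech marks.

The following AWS CLI example is formatted for Linux, Unix, and macOS. For Windows, replace the backslash (\) Unix continuation character at the end of each line with a caret (^), use double quotation marks (") around the input text, and use single quotation marks (') for interior tags.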

    aws polly synthesize-speech \
      --output-format json \
      --voice-id Joanna \
      --text 'Mary had a little lamb.' \
      --speech-mark-types='["viseme", "word", "sentence"]' \
      MaryLamb.txt

When you make this request, Amazon Polly returns the following in the .txt file:

    {"time":0,"type":"sentence","start":0,"end":23,"value":"Mary had a little lamb."}
    {"time":6,"type":"word","start":0,"end":4,"value":"Mary"}
    {"time":6,"type":"viseme","value":"p"}
    {"time":73,"type":"viseme","value":"E"}
    {"time":180,"type":"viseme","value":"r"}
    {"time":292,"type":"viseme","value":"i"}
    {"time":373,"type":"word","start":5,"end":8,"value":"had"}
    {"time":373,"type":"viseme","value":"k"}
    {"time":460,"type":"viseme","value":"a"}
    {"time":521,"type":"viseme","value":"t"}
    {"time":604,"type":"word","start":9,"end":10,"value":"a"}
    {"time":604,"type":"viseme","value":"@"}
    {"time":643,"type":"word","start":11,"end":17,"value":"little"}
    {"time":643,"type":"viseme","value":"t"}
    {"time":739,"type":"viseme","value":"i"}
    {"time":769,"type":"viseme","value":"t"}
    {"time":799,"type":"viseme","value":"t"}
    {"time":882,"type":"word","start":18,"end":22,"value":"lamb"}
    {"time":882,"type":"viseme","value":"t"}
    {"time":964,"type":"viseme","value":"a"}
    {"time":1082,"type":"viseme","value":"p"}
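Speech mark output is a stream of JSON objects, one per line, rather than a single JSON document, so it has to be parsed line by line. The following is a minimal Python sketch of one way to do that. The helper names `parse_speech_marks` and `group_by_type` are hypothetical and not part of any AWS SDK, and the sample is a shortened copy of the output above.

```python
import json
from collections import defaultdict

# A shortened copy of the Example 1 output: one JSON object per line.
SAMPLE = """\
{"time":0,"type":"sentence","start":0,"end":23,"value":"Mary had a little lamb."}
{"time":6,"type":"word","start":0,"end":4,"value":"Mary"}
{"time":6,"type":"viseme","value":"p"}
{"time":73,"type":"viseme","value":"E"}
{"time":373,"type":"word","start":5,"end":8,"value":"had"}
"""


def parse_speech_marks(text):
    """Parse line-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def group_by_type(marks):
    """Group speech marks by their "type" field, preserving time order."""
    grouped = defaultdict(list)
    for mark in marks:
        grouped[mark["type"]].append(mark)
    return dict(grouped)


marks = parse_speech_marks(SAMPLE)
grouped = group_by_type(marks)
for kind, items in grouped.items():
    print(kind, [item["value"] for item in items])
```

To process a real result, the contents of the output file (MaryLamb.txt in this example) could be read and passed to `parse_speech_marks` in place of `SAMPLE`.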

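In this output, each part of the text is broken out in terms of speech marks:

  • The sentence "Mary had a little lamb."

  • Each word in the text: "Mary", "had", "a", "little", and "lamb".

  • The viseme for each sound in the corresponding audio stream: "p", "E", "r", "i", and so on. For more information about visemes, see Visemes and Amazon Polly.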
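The `start` and `end` fields of sentence and word marks locate each element in the input text, and `time` gives its position in the audio stream in milliseconds. A common use is highlighting words as the audio plays. The sketch below illustrates the idea with the word marks from this example; the function names are hypothetical, and it assumes the offsets count bytes of the UTF-8 encoded input (for this ASCII sentence, bytes and characters coincide).

```python
# Input text and word marks copied from Example 1.
TEXT = "Mary had a little lamb."
WORD_MARKS = [
    {"time": 6, "type": "word", "start": 0, "end": 4, "value": "Mary"},
    {"time": 373, "type": "word", "start": 5, "end": 8, "value": "had"},
    {"time": 604, "type": "word", "start": 9, "end": 10, "value": "a"},
    {"time": 643, "type": "word", "start": 11, "end": 17, "value": "little"},
    {"time": 882, "type": "word", "start": 18, "end": 22, "value": "lamb"},
]


def span_for(text, mark):
    """Return the part of the input covered by a mark's start/end offsets.

    Assumption: offsets count bytes of the UTF-8 encoded input text.
    """
    data = text.encode("utf-8")
    return data[mark["start"]:mark["end"]].decode("utf-8")


def highlight_schedule(text, marks):
    """Pair each mark's time (ms into the audio) with the span to highlight."""
    return [(mark["time"], span_for(text, mark)) for mark in marks]


schedule = highlight_schedule(TEXT, WORD_MARKS)
for start_ms, word in schedule:
    print(f"{start_ms:>5} ms  {word}")
```

Viseme marks carry no `start` or `end` fields, so only their `time` and `value` can be used, for example to drive mouth shapes in an animation.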

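Example 2: Speech Marks with SSML

Generating speech marks from SSML-enhanced text is similar to generating them without SSML. Use the synthesize-speech command, and specify the SSML-enhanced text and the speech mark types that you want, as shown in the following example. To keep the example easier to read, we do not include viseme speech marks, but they can be included as well.

The following AWS CLI example is formatted for Linux, Unix, and macOS. For Windows, replace the backslash (\) Unix continuation character at the end of each line with a caret (^), use double quotation marks (") around the input text, and use single quotation marks (') for interior tags.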

    aws polly synthesize-speech \
      --output-format json \
      --voice-id Joanna \
      --text-type ssml \
      --text '<speak><prosody volume="+20dB">Mary had <break time="300ms"/>a little <mark name="animal"/>lamb</prosody></speak>' \
      --speech-mark-types='["sentence", "word", "ssml"]' \
      output.txt

When you make this request, Amazon Polly returns the following in the .txt file:

    {"time":0,"type":"sentence","start":31,"end":95,"value":"Mary had <break time=\"300ms\"\/>a little <mark name=\"animal\"\/>lamb"}
    {"time":6,"type":"word","start":31,"end":35,"value":"Mary"}
    {"time":325,"type":"word","start":36,"end":39,"value":"had"}
    {"time":897,"type":"word","start":40,"end":61,"value":"<break time=\"300ms\"\/>"}
    {"time":1291,"type":"word","start":61,"end":62,"value":"a"}
    {"time":1373,"type":"word","start":63,"end":69,"value":"little"}
    {"time":1635,"type":"ssml","start":70,"end":91,"value":"animal"}
    {"time":1635,"type":"word","start":91,"end":95,"value":"lamb"}
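In this output, the `start` and `end` offsets index into the full SSML input, tags included, which is why the first word begins at offset 31 rather than 0. The `ssml` mark reports the name given in the `<mark>` tag ("animal") as its value, while its offsets cover the tag itself. The following Python sketch checks this against the output above. It assumes the offsets count bytes of the UTF-8 encoded input, and relies on the JSON parser to decode the escaped `\/` and `\"` sequences.

```python
import json

# The SSML input from the command above.
SSML = (
    '<speak><prosody volume="+20dB">Mary had <break time="300ms"/>'
    'a little <mark name="animal"/>lamb</prosody></speak>'
)

# The output from Example 2, one JSON object per line (raw string keeps the escapes).
RAW_OUTPUT = r'''
{"time":0,"type":"sentence","start":31,"end":95,"value":"Mary had <break time=\"300ms\"\/>a little <mark name=\"animal\"\/>lamb"}
{"time":6,"type":"word","start":31,"end":35,"value":"Mary"}
{"time":325,"type":"word","start":36,"end":39,"value":"had"}
{"time":897,"type":"word","start":40,"end":61,"value":"<break time=\"300ms\"\/>"}
{"time":1291,"type":"word","start":61,"end":62,"value":"a"}
{"time":1373,"type":"word","start":63,"end":69,"value":"little"}
{"time":1635,"type":"ssml","start":70,"end":91,"value":"animal"}
{"time":1635,"type":"word","start":91,"end":95,"value":"lamb"}
'''

marks = [json.loads(line) for line in RAW_OUTPUT.splitlines() if line.strip()]

# Assumption: offsets count bytes of the UTF-8 encoded SSML input.
data = SSML.encode("utf-8")
spans = [data[m["start"]:m["end"]].decode("utf-8") for m in marks]

for mark, span in zip(marks, spans):
    print(f'{mark["type"]:<8} {mark["start"]:>3}-{mark["end"]:<3} {span!r}')
```

For sentence and word marks, the slice of the input equals the reported `value`; for the `ssml` mark, the slice is the `<mark name="animal"/>` tag and the `value` is only the name.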