Use StartSpeechSynthesisTask with an Amazon SDK or CLI - Amazon Polly
This is a machine-translated version. If the translation conflicts with the English original, the English original prevails.

Use StartSpeechSynthesisTask with an Amazon SDK or CLI

The following code examples show how to use StartSpeechSynthesisTask.

CLI
Amazon CLI

To synthesize text

The following start-speech-synthesis-task example synthesizes the text in text_file.txt and stores the resulting MP3 file in the specified bucket.

aws polly start-speech-synthesis-task \
    --output-format mp3 \
    --output-s3-bucket-name amzn-s3-demo-bucket \
    --text file://text_file.txt \
    --voice-id Joanna

Output:

{
    "SynthesisTask": {
        "TaskId": "70b61c0f-57ce-4715-a247-cae8729dcce9",
        "TaskStatus": "scheduled",
        "OutputUri": "https://s3.us-east-2.amazonaws.com/amzn-s3-demo-bucket/70b61c0f-57ce-4715-a247-cae8729dcce9.mp3",
        "CreationTime": 1603911042.689,
        "RequestCharacters": 1311,
        "OutputFormat": "mp3",
        "TextType": "text",
        "VoiceId": "Joanna"
    }
}
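The task in this output is only scheduled, so the MP3 file does not exist yet. The following is a minimal sketch, in Python with Boto3, of waiting for the task to finish and then working out which object to download. The helper names wait_for_task and split_output_uri, the retry count, and the delay are assumptions made for illustration. The URI parsing assumes the path-style form shown in the sample output and does not handle other URL styles.

```python
import time
from urllib.parse import urlparse


def split_output_uri(output_uri):
    """Split a path-style S3 URL into (bucket, key).

    Assumes https://s3.<region>.amazonaws.com/<bucket>/<key>, as in the
    sample output. Other URL styles are not handled.
    """
    path = urlparse(output_uri).path.lstrip("/")
    bucket, _, key = path.partition("/")
    if not bucket or not key:
        raise ValueError(f"Unexpected OutputUri format: {output_uri}")
    return bucket, key


def wait_for_task(polly_client, task_id, tries=10, delay=5, sleep=time.sleep):
    """Poll a synthesis task until it is completed or failed.

    polly_client is expected to be a Boto3 Amazon Polly client. Returns
    the final SynthesisTask dictionary, or raises TimeoutError if the
    task is still running after the given number of checks.
    """
    for _ in range(tries):
        response = polly_client.get_speech_synthesis_task(TaskId=task_id)
        task = response["SynthesisTask"]
        if task["TaskStatus"] in ("completed", "failed"):
            return task
        sleep(delay)  # Wait before checking the status again.
    raise TimeoutError(f"Task {task_id} not finished after {tries} checks.")
```

With a real client, this might be called as wait_for_task(boto3.client("polly"), task_id). If the returned TaskStatus is completed, split_output_uri(task["OutputUri"]) gives the bucket name and object key to pass to an Amazon S3 download call.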

For more information, see Creating long audio files in the Amazon Polly Developer Guide.

Python
SDK for Python (Boto3)
Note

There's more on GitHub. Find the complete example and learn how to set up and run it in the Amazon Code Examples Repository.

import json
import logging

from botocore.exceptions import ClientError

logger = logging.getLogger(__name__)


class PollyWrapper:
    """Encapsulates Amazon Polly functions."""

    def __init__(self, polly_client, s3_resource):
        """
        :param polly_client: A Boto3 Amazon Polly client.
        :param s3_resource: A Boto3 Amazon Simple Storage Service (Amazon S3) resource.
        """
        self.polly_client = polly_client
        self.s3_resource = s3_resource
        self.voice_metadata = None

    def do_synthesis_task(
        self,
        text,
        engine,
        voice,
        audio_format,
        s3_bucket,
        lang_code=None,
        include_visemes=False,
        wait_callback=None,
    ):
        """
        Start an asynchronous task to synthesize speech or speech marks, wait for
        the task to complete, retrieve the output from Amazon S3, and return the
        data.

        An asynchronous task is required when the text is too long for near-real time
        synthesis.

        :param text: The text to synthesize.
        :param engine: The kind of engine used. Can be standard or neural.
        :param voice: The ID of the voice to use.
        :param audio_format: The audio format to return for synthesized speech. When
                             speech marks are synthesized, the output format is JSON.
        :param s3_bucket: The name of an existing Amazon S3 bucket that you have
                          write access to. Synthesis output is written to this bucket.
        :param lang_code: The language code of the voice to use. This has an effect
                          only when a bilingual voice is selected.
        :param include_visemes: When True, a second request is made to Amazon Polly
                                to synthesize a list of visemes, using the specified
                                text and voice. A viseme represents the visual position
                                of the face and mouth when saying part of a word.
        :param wait_callback: A callback function that is called periodically during
                              task processing, to give the caller an opportunity to
                              take action, such as to display status.
        :return: The audio stream that contains the synthesized speech and a list
                 of visemes that are associated with the speech audio.
        """
        try:
            kwargs = {
                "Engine": engine,
                "OutputFormat": audio_format,
                "OutputS3BucketName": s3_bucket,
                "Text": text,
                "VoiceId": voice,
            }
            if lang_code is not None:
                kwargs["LanguageCode"] = lang_code
            response = self.polly_client.start_speech_synthesis_task(**kwargs)
            speech_task = response["SynthesisTask"]
            logger.info("Started speech synthesis task %s.", speech_task["TaskId"])

            viseme_task = None
            if include_visemes:
                # Reuse the same request, but ask for viseme speech marks as JSON.
                kwargs["OutputFormat"] = "json"
                kwargs["SpeechMarkTypes"] = ["viseme"]
                response = self.polly_client.start_speech_synthesis_task(**kwargs)
                viseme_task = response["SynthesisTask"]
                logger.info("Started viseme synthesis task %s.", viseme_task["TaskId"])
        except ClientError:
            logger.exception("Couldn't start synthesis task.")
            raise
        else:
            # _wait_for_task is defined in the complete example on GitHub.
            bucket = self.s3_resource.Bucket(s3_bucket)
            audio_stream = self._wait_for_task(
                10, speech_task["TaskId"], "speech", wait_callback, bucket
            )

            visemes = None
            if include_visemes:
                viseme_data = self._wait_for_task(
                    10, viseme_task["TaskId"], "viseme", wait_callback, bucket
                )
                visemes = [
                    json.loads(v) for v in viseme_data.read().decode().split() if v
                ]

            return audio_stream, visemes

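The last step of the example turns the downloaded viseme data into a list of dictionaries. The following is a minimal sketch of that step in isolation, assuming the speech marks arrive as one JSON object per line. The function name parse_speech_marks and the sample bytes are made up for illustration and are not part of the SDK or the example repository. The sketch splits on line breaks rather than on all whitespace, so a mark whose value contains spaces would still parse as a single object.

```python
import io
import json


def parse_speech_marks(stream):
    """Parse line-delimited JSON speech marks from a binary stream.

    Hypothetical helper: assumes UTF-8 text with one JSON object per line.
    """
    text = stream.read().decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Made-up sample data in the assumed one-object-per-line layout.
sample = io.BytesIO(
    b'{"time":6,"type":"viseme","value":"p"}\n'
    b'{"time":73,"type":"viseme","value":"E"}\n'
)
marks = parse_speech_marks(sample)
print([mark["value"] for mark in marks])
```

Each resulting dictionary can then be matched against the audio timeline by its time field.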
SAP ABAP
SDK for SAP ABAP
Note

There's more on GitHub. Find the complete example and learn how to set up and run it in the Amazon Code Examples Repository.

    TRY.
        " Only pass optional parameters if they have values
        IF iv_lang_code IS NOT INITIAL AND iv_s3_key_prefix IS NOT INITIAL.
          oo_result = lo_ply->startspeechsynthesistask(
            iv_engine = iv_engine
            iv_outputformat = iv_audio_format
            iv_outputs3bucketname = iv_s3_bucket
            iv_outputs3keyprefix = iv_s3_key_prefix
            iv_text = iv_text
            iv_voiceid = iv_voice_id
            iv_languagecode = iv_lang_code ).
        ELSEIF iv_lang_code IS NOT INITIAL.
          oo_result = lo_ply->startspeechsynthesistask(
            iv_engine = iv_engine
            iv_outputformat = iv_audio_format
            iv_outputs3bucketname = iv_s3_bucket
            iv_text = iv_text
            iv_voiceid = iv_voice_id
            iv_languagecode = iv_lang_code ).
        ELSEIF iv_s3_key_prefix IS NOT INITIAL.
          oo_result = lo_ply->startspeechsynthesistask(
            iv_engine = iv_engine
            iv_outputformat = iv_audio_format
            iv_outputs3bucketname = iv_s3_bucket
            iv_outputs3keyprefix = iv_s3_key_prefix
            iv_text = iv_text
            iv_voiceid = iv_voice_id ).
        ELSE.
          oo_result = lo_ply->startspeechsynthesistask(
            iv_engine = iv_engine
            iv_outputformat = iv_audio_format
            iv_outputs3bucketname = iv_s3_bucket
            iv_text = iv_text
            iv_voiceid = iv_voice_id ).
        ENDIF.
        MESSAGE 'Speech synthesis task started.' TYPE 'I'.
      CATCH /aws1/cx_plyinvalids3bucketex.
        MESSAGE 'Invalid S3 bucket.' TYPE 'E'.
      CATCH /aws1/cx_plyinvalidssmlex.
        MESSAGE 'Invalid SSML.' TYPE 'E'.
      CATCH /aws1/cx_plylexiconnotfoundex.
        MESSAGE 'Lexicon not found.' TYPE 'E'.
      CATCH /aws1/cx_plyservicefailureex.
        MESSAGE 'Service failure occurred.' TYPE 'E'.
      CATCH /aws1/cx_plytextlengthexcdex.
        MESSAGE 'Text length exceeded maximum.' TYPE 'E'.
    ENDTRY.

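For a complete list of Amazon SDK developer guides and code examples, see Using Amazon Polly with an Amazon SDK. This topic also includes information about getting started and details about previous SDK versions.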