Use StartSpeechSynthesisTask with an Amazon SDK or CLI
The following code examples show how to use StartSpeechSynthesisTask.
- CLI
- Amazon CLI
To synthesize text

The following start-speech-synthesis-task example synthesizes the text in text_file.txt and stores the resulting MP3 file in the specified bucket.

    aws polly start-speech-synthesis-task \
        --output-format mp3 \
        --output-s3-bucket-name amzn-s3-demo-bucket \
        --text file://text_file.txt \
        --voice-id Joanna

Output:

    {
        "SynthesisTask": {
            "TaskId": "70b61c0f-57ce-4715-a247-cae8729dcce9",
            "TaskStatus": "scheduled",
            "OutputUri": "https://s3.us-east-2.amazonaws.com/amzn-s3-demo-bucket/70b61c0f-57ce-4715-a247-cae8729dcce9.mp3",
            "CreationTime": 1603911042.689,
            "RequestCharacters": 1311,
            "OutputFormat": "mp3",
            "TextType": "text",
            "VoiceId": "Joanna"
        }
    }

For more information, see Creating long audio files in the Amazon Polly Developer Guide.

For API details, see StartSpeechSynthesisTask in Amazon CLI Command Reference.
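The OutputUri in the sample output is a path-style S3 URL, so the bucket name and object key can be recovered from it before downloading the audio. The following is a minimal sketch of that step, assuming the URI always has the path-style form shown above; the helper name split_output_uri is hypothetical and is not part of any Amazon SDK.

```python
from urllib.parse import urlparse


def split_output_uri(output_uri):
    """Split a path-style S3 URL into (bucket, key).

    Assumes the form https://s3.<region>.amazonaws.com/<bucket>/<key>,
    as in the sample output. Other URL styles are not handled.
    """
    path = urlparse(output_uri).path.lstrip("/")
    bucket, _, key = path.partition("/")
    if not bucket or not key:
        raise ValueError(f"Not a path-style S3 URL: {output_uri}")
    return bucket, key


bucket, key = split_output_uri(
    "https://s3.us-east-2.amazonaws.com/amzn-s3-demo-bucket/"
    "70b61c0f-57ce-4715-a247-cae8729dcce9.mp3"
)
print(bucket)
print(key)
```

The resulting bucket and key could then be passed to an S3 download call once the task has completed.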
- Python
- SDK for Python (Boto3)
Note
There's more on GitHub. Find the complete example and learn how to set up and run in the Amazon Code Examples Repository.

    import json
    import logging

    from botocore.exceptions import ClientError

    logger = logging.getLogger(__name__)


    class PollyWrapper:
        """Encapsulates Amazon Polly functions."""

        def __init__(self, polly_client, s3_resource):
            """
            :param polly_client: A Boto3 Amazon Polly client.
            :param s3_resource: A Boto3 Amazon Simple Storage Service (Amazon S3) resource.
            """
            self.polly_client = polly_client
            self.s3_resource = s3_resource
            self.voice_metadata = None

        def do_synthesis_task(
            self,
            text,
            engine,
            voice,
            audio_format,
            s3_bucket,
            lang_code=None,
            include_visemes=False,
            wait_callback=None,
        ):
            """
            Start an asynchronous task to synthesize speech or speech marks, wait for
            the task to complete, retrieve the output from Amazon S3, and return the
            data.

            An asynchronous task is required when the text is too long for near-real
            time synthesis.

            :param text: The text to synthesize.
            :param engine: The kind of engine used. Can be standard or neural.
            :param voice: The ID of the voice to use.
            :param audio_format: The audio format to return for synthesized speech. When
                                 speech marks are synthesized, the output format is JSON.
            :param s3_bucket: The name of an existing Amazon S3 bucket that you have
                              write access to. Synthesis output is written to this bucket.
            :param lang_code: The language code of the voice to use. This has an effect
                              only when a bilingual voice is selected.
            :param include_visemes: When True, a second request is made to Amazon Polly
                                    to synthesize a list of visemes, using the specified
                                    text and voice. A viseme represents the visual position
                                    of the face and mouth when saying part of a word.
            :param wait_callback: A callback function that is called periodically during
                                  task processing, to give the caller an opportunity to
                                  take action, such as to display status.
            :return: The audio stream that contains the synthesized speech and a list
                     of visemes that are associated with the speech audio.
            """
            try:
                kwargs = {
                    "Engine": engine,
                    "OutputFormat": audio_format,
                    "OutputS3BucketName": s3_bucket,
                    "Text": text,
                    "VoiceId": voice,
                }
                if lang_code is not None:
                    kwargs["LanguageCode"] = lang_code
                response = self.polly_client.start_speech_synthesis_task(**kwargs)
                speech_task = response["SynthesisTask"]
                logger.info("Started speech synthesis task %s.", speech_task["TaskId"])

                viseme_task = None
                if include_visemes:
                    # Reuse the same arguments, but request viseme speech marks as JSON.
                    kwargs["OutputFormat"] = "json"
                    kwargs["SpeechMarkTypes"] = ["viseme"]
                    response = self.polly_client.start_speech_synthesis_task(**kwargs)
                    viseme_task = response["SynthesisTask"]
                    logger.info("Started viseme synthesis task %s.", viseme_task["TaskId"])
            except ClientError:
                logger.exception("Couldn't start synthesis task.")
                raise
            else:
                # _wait_for_task is a helper method that is not shown in this excerpt.
                bucket = self.s3_resource.Bucket(s3_bucket)
                audio_stream = self._wait_for_task(
                    10, speech_task["TaskId"], "speech", wait_callback, bucket
                )

                visemes = None
                if include_visemes:
                    viseme_data = self._wait_for_task(
                        10, viseme_task["TaskId"], "viseme", wait_callback, bucket
                    )
                    # The viseme output contains one JSON object per line.
                    visemes = [
                        json.loads(v) for v in viseme_data.read().decode().split() if v
                    ]

                return audio_stream, visemes

For API details, see StartSpeechSynthesisTask in Amazon SDK for Python (Boto3) API Reference.
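The Python example above relies on a helper that waits for the task to finish, which is not shown in the excerpt. As a rough illustration of the idea, and not the repository's implementation, the following sketch polls the Boto3 Polly client's get_speech_synthesis_task until the task status is completed or failed. The function name wait_for_task, its parameters, and the default delay are assumptions made for this sketch.

```python
import time


def wait_for_task(polly_client, task_id, max_tries=10, delay_seconds=5,
                  sleep=time.sleep):
    """Poll a synthesis task until it completes, fails, or tries run out.

    Hypothetical helper for illustration. Returns the last SynthesisTask
    dict that was seen, or None when max_tries is 0.
    """
    task = None
    for _ in range(max_tries):
        response = polly_client.get_speech_synthesis_task(TaskId=task_id)
        task = response["SynthesisTask"]
        # "completed" and "failed" are terminal; "scheduled" and
        # "inProgress" mean the task is still being processed.
        if task["TaskStatus"] in ("completed", "failed"):
            break
        sleep(delay_seconds)
    return task


# Example usage (requires Boto3 and valid credentials, so it is left
# commented out here):
#
#   import boto3
#   polly = boto3.client("polly")
#   task = wait_for_task(polly, "70b61c0f-57ce-4715-a247-cae8729dcce9")
#   print(task["TaskStatus"])
```

Passing the sleep function as a parameter keeps the helper easy to exercise with a stub client, without waiting between polls.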
- SAP ABAP
- SDK for SAP ABAP
Note
There's more on GitHub. Find the complete example and learn how to set up and run in the Amazon Code Examples Repository.

    TRY.
        " Only pass optional parameters if they have values
        IF iv_lang_code IS NOT INITIAL AND iv_s3_key_prefix IS NOT INITIAL.
          oo_result = lo_ply->startspeechsynthesistask(
            iv_engine = iv_engine
            iv_outputformat = iv_audio_format
            iv_outputs3bucketname = iv_s3_bucket
            iv_outputs3keyprefix = iv_s3_key_prefix
            iv_text = iv_text
            iv_voiceid = iv_voice_id
            iv_languagecode = iv_lang_code ).
        ELSEIF iv_lang_code IS NOT INITIAL.
          oo_result = lo_ply->startspeechsynthesistask(
            iv_engine = iv_engine
            iv_outputformat = iv_audio_format
            iv_outputs3bucketname = iv_s3_bucket
            iv_text = iv_text
            iv_voiceid = iv_voice_id
            iv_languagecode = iv_lang_code ).
        ELSEIF iv_s3_key_prefix IS NOT INITIAL.
          oo_result = lo_ply->startspeechsynthesistask(
            iv_engine = iv_engine
            iv_outputformat = iv_audio_format
            iv_outputs3bucketname = iv_s3_bucket
            iv_outputs3keyprefix = iv_s3_key_prefix
            iv_text = iv_text
            iv_voiceid = iv_voice_id ).
        ELSE.
          oo_result = lo_ply->startspeechsynthesistask(
            iv_engine = iv_engine
            iv_outputformat = iv_audio_format
            iv_outputs3bucketname = iv_s3_bucket
            iv_text = iv_text
            iv_voiceid = iv_voice_id ).
        ENDIF.
        MESSAGE 'Speech synthesis task started.' TYPE 'I'.
      CATCH /aws1/cx_plyinvalids3bucketex.
        MESSAGE 'Invalid S3 bucket.' TYPE 'E'.
      CATCH /aws1/cx_plyinvalidssmlex.
        MESSAGE 'Invalid SSML.' TYPE 'E'.
      CATCH /aws1/cx_plylexiconnotfoundex.
        MESSAGE 'Lexicon not found.' TYPE 'E'.
      CATCH /aws1/cx_plyservicefailureex.
        MESSAGE 'Service failure occurred.' TYPE 'E'.
      CATCH /aws1/cx_plytextlengthexcdex.
        MESSAGE 'Text length exceeded maximum.' TYPE 'E'.
    ENDTRY.

For API details, see StartSpeechSynthesisTask in Amazon SDK for SAP ABAP API reference.
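The ABAP example passes the optional language code and S3 key prefix only when they have values, which takes four branches. For comparison, the same rule can be sketched in Python by building the request parameters as a dictionary and leaving out empty optional values. The function name build_task_params is hypothetical; the dictionary keys are the StartSpeechSynthesisTask request parameter names.

```python
def build_task_params(engine, audio_format, s3_bucket, text, voice_id,
                      lang_code=None, s3_key_prefix=None):
    """Build StartSpeechSynthesisTask parameters, omitting empty optionals.

    Hypothetical helper for illustration only.
    """
    params = {
        "Engine": engine,
        "OutputFormat": audio_format,
        "OutputS3BucketName": s3_bucket,
        "Text": text,
        "VoiceId": voice_id,
    }
    optional = {
        "LanguageCode": lang_code,
        "OutputS3KeyPrefix": s3_key_prefix,
    }
    # Add only the optional parameters that were actually provided.
    params.update({name: value for name, value in optional.items() if value})
    return params


params = build_task_params(
    "standard", "mp3", "amzn-s3-demo-bucket", "Hello.", "Joanna",
    s3_key_prefix="audio/",
)
print(sorted(params))
```

The resulting dictionary could then be unpacked into a Boto3 call such as start_speech_synthesis_task(**params).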
For a complete list of Amazon SDK developer guides and code examples, see Using Amazon Polly with an Amazon SDK. This topic also includes information about getting started and details about previous SDK versions.