Getting started with Amazon Transcribe
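Before you can create transcriptions, complete the following prerequisites:
Sign up for an Amazon Web Services account
Install the Amazon CLI and SDKs (if you're using the Amazon Web Services Management Console for your transcriptions, you can skip this step)
Configure IAM credentials
Create an Amazon S3 bucket
Create an IAM policy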
Once you complete these prerequisites, you're ready to transcribe. Select your preferred transcription method from the following list to get started.
Tip
If you're new to Amazon Transcribe or would like to explore our features, we recommend using the Amazon Web Services Management Console.
Because streaming using HTTP/2 and WebSockets is more complicated than the other transcription methods, we recommend reviewing the Setting up a streaming transcription section before getting started with these methods. Note that we strongly recommend using an SDK for streaming transcriptions.
Signing up for an Amazon Web Services account
You can sign up for a free tier account.
Tip
When setting up your account, make note of your Amazon Web Services account ID because you need it to create IAM entities.
Installing the Amazon CLI and SDKs
To use the Amazon Transcribe API, you must first install the Amazon CLI. The current Amazon CLI is version 2. You can find installation instructions for Linux, Mac, Windows, and Docker in the Amazon Command Line Interface User Guide.
Once you have the Amazon CLI installed, you must configure it for your security credentials and Amazon Web Services Region.
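As a rough illustration of what that configuration looks like once it is stored, the sketch below parses text in the shared config file format that the CLI writes (typically `~/.aws/config`) and looks up the Region for a profile. The function name `region_for_profile`, the sample profile `dev`, and the Region values are hypothetical and used only for illustration.

```python
import configparser

# Sample text in the shared config file format. The profile names and
# Regions are placeholders, not values taken from this guide.
SAMPLE_CONFIG = """
[default]
region = us-west-2
output = json

[profile dev]
region = eu-west-1
"""


def region_for_profile(config_text, profile="default"):
    """Return the Region configured for a profile, or None if it is unset."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # The default profile is stored under "[default]"; named profiles are
    # stored under "[profile NAME]".
    section = "default" if profile == "default" else f"profile {profile}"
    return parser.get(section, "region", fallback=None)


print(region_for_profile(SAMPLE_CONFIG))
print(region_for_profile(SAMPLE_CONFIG, "dev"))
```

A lookup that returns `None` suggests the Region has not been configured for that profile yet.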
If you want to use Amazon Transcribe with an SDK, select your preferred language for installation instructions:
Configure IAM credentials
When you create an Amazon Web Services account, you begin with one sign-in identity that has complete access to all Amazon services and resources in your account. This identity is called the Amazon Web Services account root user and is accessed by signing in with the email address and password that you used to create the account.
We strongly recommend that you do not use the root user for your everyday tasks. Safeguard your root user credentials and use them to perform the tasks that only the root user can perform.
As a best practice, require users, including those who need administrator access, to use federation with an identity provider so that they access Amazon services with temporary credentials.
A federated identity is any user who accesses Amazon services by using credentials provided through an identity source. When federated identities access Amazon Web Services accounts, they assume roles, and the roles provide temporary credentials.
For centralized access management, we recommend that you use Amazon IAM Identity Center. You can create users and groups in IAM Identity Center, or you can connect and synchronize to a set of users and groups in your own identity source for use across all your Amazon Web Services accounts and applications. For more information, see Identity and Access Management for Amazon Transcribe.
To learn more about IAM best practices, refer to Security best practices in IAM.
Creating an Amazon S3 bucket
Amazon S3 is a secure object storage service. Amazon S3 stores your files (called objects) in containers (called buckets).
To run a batch transcription, you must first upload your media files into an Amazon S3 bucket. If you don't specify an Amazon S3 bucket for your transcription output, Amazon Transcribe puts your transcript in a temporary Amazon-managed Amazon S3 bucket. Transcription output in Amazon-managed buckets is automatically deleted after 90 days.
Learn how to Create your first S3 bucket and Upload an object to your bucket.
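Once a media file is in a bucket, a batch job refers to it by its S3 URI. The following is a minimal sketch, assuming the boto3 SDK's `start_transcription_job` operation. The helper names, the job name `my-first-job`, and the object key `my-media-file.flac` are hypothetical placeholders.

```python
def build_transcription_job_request(job_name, media_bucket, media_key,
                                    language_code, output_bucket=None):
    """Build keyword arguments for a batch transcription request."""
    request = {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        # The media file must already be uploaded to the bucket.
        "Media": {"MediaFileUri": f"s3://{media_bucket}/{media_key}"},
    }
    # Without an output bucket, the transcript goes to a temporary
    # service-managed bucket instead of one you own.
    if output_bucket is not None:
        request["OutputBucketName"] = output_bucket
    return request


def start_job(transcribe_client, request):
    """Submit the request; transcribe_client is assumed to be a boto3
    Transcribe client, for example boto3.client("transcribe")."""
    return transcribe_client.start_transcription_job(**request)


request = build_transcription_job_request(
    "my-first-job", "DOC-EXAMPLE-BUCKET", "my-media-file.flac", "en-US"
)
print(request)
```

Building the request separately from sending it makes it possible to inspect the S3 URI before any call is made.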
Creating an IAM policy
To manage access in Amazon, you must create policies and attach them to IAM identities (users, groups, or roles) or Amazon resources. A policy defines the permissions of the entity it is attached to. For example, a role can access a media file in your Amazon S3 bucket only if you've attached a policy to that role that grants it access. To restrict the role further, you can limit its access to a specific file within the bucket.
To learn more about using Amazon policies see:
For example policies you can use with Amazon Transcribe, see Amazon Transcribe identity-based policy examples. If you want to generate custom policies, consider using the Amazon Policy Generator.
You can add a policy using the Amazon Web Services Management Console, Amazon CLI, or Amazon SDK. For instructions, see Adding and removing IAM identity permissions.
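For the SDK route, a minimal sketch might look like the following. It assumes a boto3 IAM client and its `create_policy` operation, which takes the policy document as a JSON string. The function name `create_managed_policy` is hypothetical.

```python
import json


def create_managed_policy(iam_client, policy_name, policy_document):
    """Create a managed policy from a dict and return its ARN.

    iam_client is assumed to be a boto3 IAM client, for example
    boto3.client("iam"); policy_document is a plain Python dict.
    """
    response = iam_client.create_policy(
        PolicyName=policy_name,
        # The API expects the document serialized as a JSON string.
        PolicyDocument=json.dumps(policy_document),
    )
    return response["Policy"]["Arn"]


# Hypothetical usage, which requires boto3 and valid credentials:
#   import boto3
#   arn = create_managed_policy(boto3.client("iam"), "my-policy-name", document)
```

Creating a policy does not attach it to anything. Attaching the returned ARN to a role is a separate step, for example with the IAM client's `attach_role_policy` operation.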
Policies have the format:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "my-policy-name",
            "Effect": "Allow",
            "Action": [
                "service:action"
            ],
            "Resource": [
                "amazon-resource-name"
            ]
        }
    ]
}
Amazon Resource Names (ARNs) uniquely identify all Amazon resources, such as an Amazon S3 bucket. You can use ARNs in your policy to grant permissions for specific actions on specific resources. For example, if you want to grant read access to an Amazon S3 bucket and its sub-folders, you can add the following code to your policy's Statement section:
{
    "Effect": "Allow",
    "Action": [
        "s3:GetObject",
        "s3:ListBucket"
    ],
    "Resource": [
        "arn:aws:s3:::DOC-EXAMPLE-BUCKET",
        "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*"
    ]
}
Here's an example policy that grants Amazon Transcribe read (GetObject, ListBucket) and write (PutObject) permissions to an Amazon S3 bucket, DOC-EXAMPLE-BUCKET, and its sub-folders:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::DOC-EXAMPLE-BUCKET",
                "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::DOC-EXAMPLE-BUCKET",
                "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*"
            ]
        }
    ]
}