Sample commands to execute EMR Notebooks programmatically - Amazon EMR

Sample commands to execute EMR Notebooks programmatically


EMR Notebooks are available as EMR Studio Workspaces in the console. The Create Workspace button in the console lets you create new notebooks. To access or create Workspaces, EMR Notebooks users need additional IAM role permissions. For more information, see "Amazon EMR Notebooks are Amazon EMR Studio Workspaces in the console" and "Amazon EMR console".


You can run EMR notebooks from a script or from the command line by using the notebook execution APIs. Starting, stopping, listing, and describing EMR notebook executions outside of the Amazon console lets you control an EMR notebook programmatically. With a parameterized notebook cell, you can pass different parameter values to a notebook, which eliminates the need to create a copy of the notebook for each new set of parameter values. For more information, see Amazon EMR API actions.
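As a minimal sketch of passing different parameter values to one notebook, the following Python code builds one StartNotebookExecution request per parameter set. The helper names (`build_start_request`, `start_execution`), the IDs `e-EXAMPLE` and `j-EXAMPLE`, the file name `demo.ipynb`, and the `input_date` parameter are hypothetical; the request fields follow the Boto3 `start_notebook_execution` call, which takes `NotebookParams` as a JSON string.

```python
import json


def build_start_request(editor_id, notebook_path, cluster_id, service_role, params):
    """Build keyword arguments for the EMR StartNotebookExecution API."""
    return {
        "EditorId": editor_id,          # ID of the EMR notebook to run
        "RelativePath": notebook_path,  # notebook file path relative to the notebook's storage location
        "ExecutionEngine": {"Id": cluster_id, "Type": "EMR"},
        "ServiceRole": service_role,
        # NotebookParams is passed as a JSON-formatted string.
        "NotebookParams": json.dumps(params, sort_keys=True),
    }


def start_execution(request, region_name):
    """Start the execution. Requires boto3 and valid credentials."""
    import boto3  # imported lazily so the request builder works without boto3

    emr = boto3.client("emr", region_name=region_name)
    response = emr.start_notebook_execution(**request)
    return response["NotebookExecutionId"]


# Hypothetical values, for illustration only: one request per parameter set,
# all pointing at the same notebook file.
requests_to_run = [
    build_start_request(
        "e-EXAMPLE",
        "demo.ipynb",
        "j-EXAMPLE",
        "EMR_Notebooks_DefaultRole",
        {"input_date": day},
    )
    for day in ("2024-01-01", "2024-01-02")
]
```

Each entry in `requests_to_run` could then be passed to `start_execution` together with a Region name. Keeping the request construction separate from the API call makes the parameter handling easy to check without credentials.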

You can schedule or batch EMR notebook executions with Amazon CloudWatch Events and Amazon Lambda. For more information, see Using Amazon Lambda with Amazon CloudWatch Events.
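One way such a schedule might look is a Lambda function, invoked by a scheduled rule, that starts a notebook execution on each invocation. The sketch below is an assumption about how you could wire this up, not a prescribed setup: the environment variable names (`NOTEBOOK_ID`, `NOTEBOOK_PATH`, `CLUSTER_ID`, `SERVICE_ROLE`) and the `run_date` notebook parameter are hypothetical, and the optional `emr_client` argument exists only so the handler can be exercised without Amazon credentials.

```python
import json
import os


def lambda_handler(event, context, emr_client=None):
    """Start one notebook execution each time a scheduled rule invokes the function."""
    if emr_client is None:
        import boto3  # included in the Lambda Python runtime

        emr_client = boto3.client("emr")

    # Scheduled events carry an ISO 8601 "time" field; pass its date part to the notebook.
    run_date = event.get("time", "")[:10]

    response = emr_client.start_notebook_execution(
        EditorId=os.environ["NOTEBOOK_ID"],        # hypothetical variable names
        RelativePath=os.environ["NOTEBOOK_PATH"],
        ExecutionEngine={"Id": os.environ["CLUSTER_ID"], "Type": "EMR"},
        ServiceRole=os.environ.get("SERVICE_ROLE", "EMR_Notebooks_DefaultRole"),
        NotebookParams=json.dumps({"run_date": run_date}),
    )
    return {"NotebookExecutionId": response["NotebookExecutionId"]}
```

The function's execution role would need the permissions described under "Role permissions for programmatic execution" for the call to succeed.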

Role permissions for programmatic execution

To use programmatic execution with EMR Notebooks, you must configure user permissions with the following policies:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowExecutionActions",
            "Effect": "Allow",
            "Action": [
                "elasticmapreduce:StartNotebookExecution",
                "elasticmapreduce:DescribeNotebookExecution",
                "elasticmapreduce:ListNotebookExecutions"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowPassingServiceRole",
            "Effect": "Allow",
            "Action": [
                "iam:PassRole"
            ],
            "Resource": "arn:aws:iam::account-id:role/EMR_Notebooks_DefaultRole"
        }
    ]
}

When you programmatically execute EMR Notebooks on an Amazon EMR on EKS cluster, you must also add the following permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowRetrievingManagedEndpointCredentials",
            "Effect": "Allow",
            "Action": [
                "emr-containers:GetManagedEndpointSessionCredentials"
            ],
            "Resource": [
                "arn:aws:emr-containers:region:account-id:/virtualclusters/virtual-cluster-id/endpoints/managed-endpoint-id"
            ],
            "Condition": {
                "StringEquals": {
                    "emr-containers:ExecutionRoleArn": [
                        "arn:aws:iam::account-id:role/emr-on-eks-execution-role"
                    ]
                }
            }
        },
        {
            "Sid": "AllowDescribingManagedEndpoint",
            "Effect": "Allow",
            "Action": [
                "emr-containers:DescribeManagedEndpoint"
            ],
            "Resource": [
                "arn:aws:emr-containers:region:account-id:/virtualclusters/virtual-cluster-id/endpoints/managed-endpoint-id"
            ]
        }
    ]
}

Limitations with programmatic execution

  • A maximum of 100 concurrent executions is supported per Amazon Web Services Region per account.

  • An execution is terminated if it runs for more than 30 days.

  • Programmatic execution of notebooks isn't supported with Amazon EMR Serverless interactive applications.
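A script that submits many executions may want to check the concurrency limit above before starting another one. The following sketch counts executions that have not yet reached a final status. Which statuses count toward the limit is an assumption here (the list above does not say), and the helper names are hypothetical. The listing uses the Boto3 paginator for `list_notebook_executions`.

```python
MAX_CONCURRENT_EXECUTIONS = 100  # per Region per account, from the limitations list

# Assumption: any execution that has not reached a final status occupies a slot.
ACTIVE_STATUSES = {
    "START_PENDING", "STARTING", "RUNNING", "FINISHING",
    "FAILING", "STOP_PENDING", "STOPPING",
}


def count_active(executions):
    """Count execution summaries whose status is not yet final."""
    return sum(1 for e in executions if e.get("Status") in ACTIVE_STATUSES)


def has_capacity(executions, limit=MAX_CONCURRENT_EXECUTIONS):
    """Return True if at least one more execution fits under the limit."""
    return count_active(executions) < limit


def list_executions(region_name):
    """Fetch execution summaries. Requires boto3 and valid credentials."""
    import boto3  # imported lazily so the counting helpers work without boto3

    emr = boto3.client("emr", region_name=region_name)
    paginator = emr.get_paginator("list_notebook_executions")
    summaries = []
    for page in paginator.paginate():
        summaries.extend(page.get("NotebookExecutions", []))
    return summaries
```

A caller could run `has_capacity(list_executions(region))` before each start request and wait or back off when it returns False. The check is advisory only, because another client in the same account can start executions between the check and the start call.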

Examples of programmatic EMR notebook execution

The following sections provide several examples of programmatic EMR notebook execution with the Amazon CLI, Boto3 SDK (Python), and Ruby.
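As one illustration with the Boto3 SDK, the sketch below polls an execution until it reaches a final status and stops it if a caller-chosen timeout elapses. The function name, the default intervals, and the `"TIMED_OUT"` return value are hypothetical; the client methods `describe_notebook_execution` and `stop_notebook_execution` are the Boto3 EMR calls. Stopping an execution presumably requires a matching stop permission, which the policy shown earlier does not list.

```python
import time

# Final statuses reported by DescribeNotebookExecution.
TERMINAL_STATUSES = {"FINISHED", "FAILED", "STOPPED"}


def wait_for_execution(emr_client, execution_id, poll_seconds=30,
                       timeout_seconds=3600, sleep=time.sleep):
    """Poll until the execution reaches a final status; stop it on timeout."""
    waited = 0
    while True:
        execution = emr_client.describe_notebook_execution(
            NotebookExecutionId=execution_id
        )["NotebookExecution"]
        status = execution["Status"]
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            # Give up and ask EMR to stop the run.
            emr_client.stop_notebook_execution(NotebookExecutionId=execution_id)
            return "TIMED_OUT"  # hypothetical marker, not an EMR status
        sleep(poll_seconds)
        waited += poll_seconds
```

With credentials configured, a caller would create the client with `boto3.client("emr")` and pass the `NotebookExecutionId` returned when the execution was started. The `sleep` parameter is injectable so the loop can be exercised without waiting.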

You can also run parameterized notebooks as part of scheduled workflows with an orchestration tool such as Apache Airflow or Amazon Managed Workflows for Apache Airflow (MWAA). For more information, see Orchestrating analytics jobs on EMR Notebooks using MWAA in the Amazon Big Data Blog.