Create a long-lived Amazon EMR cluster and run several steps using an Amazon SDK
The following code example shows how to create a long-lived Amazon EMR cluster and run several steps.
- Python
- SDK for Python (Boto3)
Create a long-lived Amazon EMR cluster that uses Apache Spark to query
historical Amazon review data from the
Amazon Customer Reviews Dataset.
Run a job that finds top-rated products in specific categories whose product titles contain
given keywords. Job results are written to an Amazon Simple Storage Service (Amazon S3) bucket.
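The selection logic of the job described above can be illustrated on a handful of in-memory records. This is a minimal sketch in plain Python rather than Spark, and it is not the job script from the full example. The field names `product_category`, `product_title`, and `star_rating`, the function name `top_rated_products`, and the choice to rank by average star rating are all assumptions made for illustration.

```python
from collections import defaultdict


def top_rated_products(reviews, category, keyword, limit=5):
    """Rank products in one category whose titles contain a keyword.

    Hypothetical helper: ranking by average star rating is an assumption,
    not necessarily what the example's Spark job does.
    """
    totals = defaultdict(lambda: [0, 0])  # title -> [sum of stars, review count]
    for review in reviews:
        if review["product_category"] != category:
            continue
        # Case-insensitive keyword match on the product title.
        if keyword.lower() not in review["product_title"].lower():
            continue
        entry = totals[review["product_title"]]
        entry[0] += review["star_rating"]
        entry[1] += 1
    ranked = sorted(
        ((title, stars / count, count) for title, (stars, count) in totals.items()),
        # Highest average first, then most reviews, then title for stable output.
        key=lambda row: (-row[1], -row[2], row[0]),
    )
    return ranked[:limit]
```

In a Spark job the same filter, group, and sort would run over the full dataset, with the result written to the S3 bucket instead of returned as a list.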
- Create an Amazon S3 bucket and upload a job script.
- Create Amazon Identity and Access Management (IAM) roles.
- Create Amazon Elastic Compute Cloud (Amazon EC2) security groups.
- Create a long-lived cluster and run several job steps.
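The last step in the list above can be sketched with the Boto3 EMR client calls `run_job_flow` and `add_job_flow_steps`. This is a hedged outline, not the code from the full example: it assumes the bucket, roles, and security groups from the earlier steps already exist, and the helper names, instance types, instance count, and release label are illustrative assumptions. The client is passed in so the helpers can be exercised without AWS credentials.

```python
def build_spark_step(name, script_uri, script_args):
    """Return an EMR step definition that runs a PySpark script via spark-submit."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster",
                     script_uri, *script_args],
        },
    }


def build_cluster_request(name, log_uri, job_flow_role, service_role,
                          manager_sg_id, worker_sg_id):
    """Return keyword arguments for run_job_flow (values are assumptions)."""
    return {
        "Name": name,
        "LogUri": log_uri,
        "ReleaseLabel": "emr-5.30.1",  # assumed release label
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumed instance type
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": 3,
            # Keeping the cluster alive with no steps makes it long-lived.
            "KeepJobFlowAliveWhenNoSteps": True,
            "EmrManagedMasterSecurityGroup": manager_sg_id,
            "EmrManagedSlaveSecurityGroup": worker_sg_id,
        },
        "Applications": [{"Name": "Spark"}],
        "JobFlowRole": job_flow_role,  # instance profile for the EC2 instances
        "ServiceRole": service_role,
        "VisibleToAllUsers": True,
    }


def create_cluster_and_run_steps(emr_client, cluster_request, steps):
    """Start the cluster, then submit steps to it; return the cluster and step IDs."""
    cluster_id = emr_client.run_job_flow(**cluster_request)["JobFlowId"]
    response = emr_client.add_job_flow_steps(JobFlowId=cluster_id, Steps=steps)
    return cluster_id, response["StepIds"]
```

With real resources, `emr_client` would be `boto3.client("emr")`. Because the cluster stays alive when it has no steps, more steps can be submitted later with `add_job_flow_steps`, and the cluster must be terminated explicitly (for example with `terminate_job_flows`) to stop incurring charges.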
This example is best viewed on GitHub. For the complete source code
and instructions on how to set it up and run it, see the full example on
GitHub.
Services used in this example
- Amazon EC2
- Amazon EMR
- IAM
- Amazon S3
For a complete list of Amazon SDK developer guides and code examples, see
Using IAM with an Amazon SDK.
This topic also includes information about getting started and details about previous SDK versions.