Step 2: Launch an Amazon EMR cluster
In this step, you will configure and launch an Amazon EMR cluster. Hive and a storage handler for DynamoDB will already be installed on the cluster.
- Open the Amazon EMR console at https://console.amazonaws.cn/emr.
- Choose Create Cluster.
- On the Create Cluster - Quick Options page, do the following:
  - In Cluster name, type a name for your cluster (for example: My EMR cluster).
  - In EC2 key pair, choose the key pair you created earlier.
  Leave the other settings at their defaults.
- Choose Create cluster.
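If you prefer to script the launch, the console steps above correspond roughly to a single request to the Amazon EMR API. The following is a minimal sketch using boto3 (the SDK for Python), not a description of what the console does internally. The release label, instance type, and instance count are illustrative assumptions rather than the Quick Options defaults, and the role names assume the default EMR roles already exist in your account. The helper names build_cluster_request and launch_cluster are hypothetical.

```python
def build_cluster_request(cluster_name, key_pair_name, log_uri=None):
    """Build keyword arguments for the boto3 EMR client's run_job_flow call."""
    request = {
        "Name": cluster_name,
        # Assumption: any release that bundles Hive; choose one available to you.
        "ReleaseLabel": "emr-5.36.0",
        # Hive is the application this tutorial relies on.
        "Applications": [{"Name": "Hive"}],
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumption, not a console default
            "SlaveInstanceType": "m5.xlarge",   # assumption, not a console default
            "InstanceCount": 3,                 # assumption, not a console default
            "Ec2KeyName": key_pair_name,        # the key pair you created earlier
            # Keep the cluster alive without steps so it can sit in Waiting.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Assumption: the default EMR roles already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }
    if log_uri is not None:
        # Optional Amazon S3 location for cluster log files.
        request["LogUri"] = log_uri
    return request


def launch_cluster(cluster_name, key_pair_name, region_name, log_uri=None):
    """Send the request and return the new cluster's ID."""
    import boto3  # imported here so the builder above works without boto3

    emr = boto3.client("emr", region_name=region_name)
    response = emr.run_job_flow(
        **build_cluster_request(cluster_name, key_pair_name, log_uri)
    )
    return response["JobFlowId"]
```

Keeping the request builder separate from the network call lets you inspect or adjust the settings before anything is launched.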
It will take several minutes to launch your cluster. You can use the Cluster Details page in the Amazon EMR console to monitor its progress.
When the status changes to Waiting, the cluster is ready for use.
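The same check can be done programmatically by polling the cluster state until it reports WAITING. The sketch below is one possible approach, assuming boto3 is installed and credentials are configured. The function names, polling interval, and retry limit are illustrative choices, not values taken from this tutorial.

```python
import time

# States from which the cluster will never become ready.
FAILED_STATES = {"TERMINATING", "TERMINATED", "TERMINATED_WITH_ERRORS"}


def wait_until_waiting(get_state, poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Call get_state() repeatedly until it returns "WAITING".

    get_state is any zero-argument callable returning the cluster state string.
    Raises RuntimeError if the cluster terminates, TimeoutError if it never
    reaches WAITING within max_polls attempts.
    """
    for _ in range(max_polls):
        state = get_state()
        if state == "WAITING":
            return state
        if state in FAILED_STATES:
            raise RuntimeError(f"Cluster ended in state {state}")
        sleep(poll_seconds)
    raise TimeoutError("Cluster did not reach WAITING in time")


def emr_state_getter(cluster_id, region_name):
    """Return a callable that reads the cluster state through boto3."""
    import boto3  # imported here so wait_until_waiting works without boto3

    emr = boto3.client("emr", region_name=region_name)

    def get_state():
        response = emr.describe_cluster(ClusterId=cluster_id)
        return response["Cluster"]["Status"]["State"]

    return get_state
```

A call such as wait_until_waiting(emr_state_getter(cluster_id, region_name)) blocks until the cluster is ready, where cluster_id is the ID shown for your cluster in the console.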
Cluster log files and Amazon S3
An Amazon EMR cluster generates log files that contain information about the cluster status and debugging information. The default settings for Create Cluster - Quick Options include setting up Amazon EMR logging.
If one does not already exist, the Amazon Web Services Management Console creates an Amazon S3 bucket. The bucket name is aws-logs-account-id-region, where account-id is your Amazon account number and region is the region in which you launched the cluster (for example, aws-logs-123456789012-us-west-2).
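The naming pattern for the log bucket can be written as a small helper, which is handy when a script needs to find the bucket. This is a sketch only: the function name is hypothetical, and the 12-digit check reflects the standard account number format rather than anything stated in this tutorial.

```python
def default_log_bucket_name(account_id, region):
    """Return the log bucket name in the form aws-logs-<account-id>-<region>."""
    # Pass the account number as a string so leading zeros are preserved.
    account_id = str(account_id)
    if len(account_id) != 12 or not account_id.isdigit():
        raise ValueError("account_id must be a 12-digit string")
    return f"aws-logs-{account_id}-{region}"


# The example from the text above.
print(default_log_bucket_name("123456789012", "us-west-2"))
# prints: aws-logs-123456789012-us-west-2
```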
Note
You can use the Amazon S3 console to view the log files. For more information, see View Log Files in the Amazon EMR Management Guide.
You can use this bucket for purposes in addition to logging. For example, you can use the bucket as a location for storing a Hive script or as a destination when exporting data from Amazon DynamoDB to Amazon S3.
Next step
Step 3: Connect to the Leader node