Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).
Running jobs from the EMR Studio console
You can submit job runs to EMR Serverless applications and view the jobs from the
EMR Studio console. To create or navigate to your EMR Serverless application on the
EMR Studio console, follow the instructions in Getting started
from the console.
Submit a job
On the Submit job page, you can submit a job to an EMR Serverless
application as follows.
- Spark
  1. In the Name field, enter a name for your job run.
  2. In the Runtime role field, enter the name of the IAM role that your EMR Serverless application can assume for the job run. To learn more about runtime roles, see Job runtime roles for Amazon EMR Serverless.
  3. In the Script location field, enter the Amazon S3 location for the script or JAR that you want to run. For Spark jobs, the script can be a Python (.py) file or a JAR (.jar) file.
  4. If your script location is a JAR file, enter the class name that is the entry point for the job in the Main class field.
  5. (Optional) Enter values for the remaining fields.
     - Script arguments — Enter any arguments that you want to pass to your main JAR or Python script. Your code reads these parameters. Separate the arguments in the array with commas.
     - Spark properties — Expand the Spark properties section and enter any Spark configuration parameters in this field. If you specify Spark driver and executor sizes, you must account for memory overhead. Specify memory overhead values in the spark.driver.memoryOverhead and spark.executor.memoryOverhead properties. Memory overhead has a default value of 10% of container memory, with a minimum of 384 MB. The executor memory and the memory overhead together can't exceed the worker memory. For example, the maximum spark.executor.memory on a 30 GB worker is 27 GB.
     - Job configuration — Specify any job configuration in this field. You can use these job configurations to override the default configurations for applications.
     - Additional settings — Activate or deactivate the Amazon Glue Data Catalog as a metastore and modify application log settings. To learn more about metastore configurations, see Metastore configuration for EMR Serverless. To learn more about application logging options, see Storing logs.
     - Tags — Assign custom tags to the application.
  6. Choose Submit job.
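The memory-overhead arithmetic described above can be sketched in a few lines of Python. The 10% fraction and 384 MB minimum are the defaults stated above; the function name is illustrative, not part of any AWS API:

```python
def max_executor_memory_gb(worker_memory_gb, overhead_fraction=0.10, min_overhead_mb=384):
    """Largest whole-GB spark.executor.memory that fits on a worker once
    memory overhead (default 10% of container memory, minimum 384 MB)
    is added. Executor memory plus overhead can't exceed worker memory."""
    worker_mb = worker_memory_gb * 1024
    for executor_gb in range(worker_memory_gb, 0, -1):
        executor_mb = executor_gb * 1024
        overhead_mb = max(overhead_fraction * executor_mb, min_overhead_mb)
        if executor_mb + overhead_mb <= worker_mb:
            return executor_gb
    return 0

print(max_executor_memory_gb(30))  # -> 27, matching the 30 GB worker example above
```

This reproduces the worked example in the text: on a 30 GB worker, 27 GB of executor memory plus its 2.7 GB overhead fits, while 28 GB plus overhead does not.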
- Hive
  1. In the Name field, enter a name for your job run.
  2. In the Runtime role field, enter the name of the IAM role that your EMR Serverless application can assume for the job run.
  3. In the Script location field, enter the Amazon S3 location for the script that you want to run. For Hive jobs, the script must be a Hive (.sql) file.
  4. (Optional) Enter values for the remaining fields.
     - Initialization script location — Enter the location of the script that initializes tables before the Hive script runs.
     - Hive properties — Expand the Hive properties section and enter any Hive configuration parameters in this field.
     - Job configuration — Specify any job configuration in this field. You can use these job configurations to override the default configurations for applications. For Hive jobs, hive.exec.scratchdir and hive.metastore.warehouse.dir are required properties in the hive-site configuration.

       {
           "applicationConfiguration": [
               {
                   "classification": "hive-site",
                   "configurations": [],
                   "properties": {
                       "hive.exec.scratchdir": "s3://DOC-EXAMPLE_BUCKET/hive/scratch",
                       "hive.metastore.warehouse.dir": "s3://DOC-EXAMPLE_BUCKET/hive/warehouse"
                   }
               }
           ],
           "monitoringConfiguration": {}
       }

     - Additional settings — Activate or deactivate the Amazon Glue Data Catalog as a metastore and modify application log settings. To learn more about metastore configurations, see Metastore configuration for EMR Serverless. To learn more about application logging options, see Storing logs.
     - Tags — Assign any custom tags to the application.
  5. Choose Submit job.
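As a sketch of how a Hive submission like the one above maps onto the EMR Serverless StartJobRun API, the following builds the request body in Python. The application ID, role ARN, and S3 locations are placeholders; in practice you would pass a dictionary like this to an AWS SDK call such as boto3's start_job_run:

```python
def build_hive_job_run(application_id, role_arn, script_s3_uri, bucket):
    """Assemble a StartJobRun-style request for a Hive job. The required
    hive-site properties mirror the JSON example above; all IDs and S3
    locations here are placeholders, not real resources."""
    return {
        "applicationId": application_id,
        "executionRoleArn": role_arn,
        "jobDriver": {
            "hive": {
                "query": script_s3_uri,  # S3 location of the .sql file to run
            }
        },
        "configurationOverrides": {
            "applicationConfiguration": [
                {
                    "classification": "hive-site",
                    "properties": {
                        "hive.exec.scratchdir": f"s3://{bucket}/hive/scratch",
                        "hive.metastore.warehouse.dir": f"s3://{bucket}/hive/warehouse",
                    },
                }
            ],
        },
    }

# Placeholder values for illustration only.
request = build_hive_job_run(
    "00example123",
    "arn:aws:iam::111122223333:role/example-runtime-role",
    "s3://DOC-EXAMPLE_BUCKET/scripts/query.sql",
    "DOC-EXAMPLE_BUCKET",
)
```

The console's Script location, Runtime role, and Job configuration fields correspond to the query, executionRoleArn, and configurationOverrides entries of this request.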
View job runs
From the Job runs tab on an application’s
Details page, you can view job runs and perform the following actions
for job runs.
- Cancel job — To cancel a job run that is in the RUNNING state, choose this option. To learn more about job run transitions, see Job run states.
- Clone job — To clone a previous job run and resubmit it, choose this option.
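The Cancel job action is also available outside the console through the AWS CLI's emr-serverless cancel-job-run command. A small helper that assembles that invocation, with placeholder IDs:

```python
def cancel_job_run_command(application_id, job_run_id):
    """Build the AWS CLI invocation equivalent to the console's Cancel job
    action; run it with subprocess.run or paste it into a shell. Both IDs
    below are placeholders for your own application and job-run IDs."""
    return [
        "aws", "emr-serverless", "cancel-job-run",
        "--application-id", application_id,
        "--job-run-id", job_run_id,
    ]

print(" ".join(cancel_job_run_command("00example123", "00jobrun456")))
```

As noted above, only job runs in the RUNNING state can be canceled.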