Job monitoring and debugging

You can collect metrics about Amazon Glue jobs and visualize them on the Amazon Glue and Amazon CloudWatch consoles to identify and fix issues. Profiling your Amazon Glue jobs requires the following steps:

Enable metrics:
1. Enable the Job metrics option in the job definition. You can enable profiling in the Amazon Glue console or as a parameter to the job. For more information, see Defining job properties for Spark jobs or Using job parameters in Amazon Glue jobs.
2. Enable the Amazon Glue Observability metrics option in the job definition. You can enable Observability in the Amazon Glue console or as a parameter to the job. For more information, see Monitoring with Amazon Glue Observability metrics.
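
As a rough sketch of the "parameter to the job" route, both options can be passed as job arguments when a run is started. This assumes the special job parameters are named `--enable-metrics` and `--enable-observability-metrics` and that the run is started with boto3's `start_job_run`; the helper names `metrics_arguments` and `start_profiled_run` are hypothetical and exist only for illustration.

```python
def metrics_arguments(job_metrics=True, observability=True):
    """Build job arguments that turn on the two metric options.

    Assumption: the parameter names below match the Amazon Glue special
    job parameters; check them against the job parameters reference.
    """
    arguments = {}
    if job_metrics:
        arguments["--enable-metrics"] = "true"
    if observability:
        arguments["--enable-observability-metrics"] = "true"
    return arguments


def start_profiled_run(job_name, extra_arguments=None):
    """Start a job run with metrics enabled (needs boto3 and credentials)."""
    import boto3  # imported lazily so metrics_arguments works without boto3

    arguments = metrics_arguments()
    arguments.update(extra_arguments or {})
    glue = boto3.client("glue")
    response = glue.start_job_run(JobName=job_name, Arguments=arguments)
    return response["JobRunId"]
```

Keeping the argument construction separate from the API call makes it possible to reuse the same dictionary as default arguments in a job definition.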

Confirm that the job script initializes a GlueContext. For example, the following script snippet initializes a GlueContext and shows where profiled code is placed in the script. This general format is used in the debugging scenarios that follow.



import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
import time

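## @params: [JOB_NAME]
# Resolve the job name that Amazon Glue passes in at run time.
args = getResolvedOptions(sys.argv, ['JOB_NAME'])

# Initialize the GlueContext on top of a SparkContext, then start the job.
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)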

...
...
# code-to-profile
...
...


job.commit()
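
One optional way to make the profiled region easier to find later is to print timestamps around it, so that a spike in a metric can be lined up with the driver log stream. The following is a minimal sketch that uses only the Python standard library; `profiled_region` is a hypothetical helper, not part of the Amazon Glue libraries, and the `sum` call is a stand-in for the real code being profiled.

```python
import time
from contextlib import contextmanager


@contextmanager
def profiled_region(label):
    """Print start and end markers plus elapsed wall-clock time."""
    start = time.time()
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime(start))
    print(f"[profile] {label} started at {stamp}Z")
    try:
        yield
    finally:
        elapsed = time.time() - start
        print(f"[profile] {label} finished after {elapsed:.2f}s")


# Usage inside the job script, between job.init(...) and job.commit():
with profiled_region("code-to-profile"):
    total = sum(range(1000))  # stand-in for the code being profiled
```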

Run the job.
Visualize the metrics:
1. Visualize job metrics on the Amazon Glue console and identify abnormal metrics for the driver or an executor.
2. Check observability metrics on the Job run monitoring page, on the job run details page, or in Amazon CloudWatch. For more information, see Monitoring with Amazon Glue Observability metrics.
Narrow down the root cause using the identified metric.
Optionally, confirm the root cause using the log stream of the identified driver or job executor.
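
The visualize and narrow-down steps above can also be approximated in code by pulling a metric from CloudWatch and flagging unusual datapoints. The sketch below assumes the job metrics are published in the `Glue` namespace with `JobName`, `JobRunId`, and `Type` dimensions, and that `glue.driver.jvm.heap.usage` reports a fraction between 0 and 1. The function names and the 0.8 threshold are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta, timezone


def heap_usage_query(job_name, job_run_id="ALL", hours=1, now=None):
    """Build get_metric_statistics arguments for driver JVM heap usage."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "Glue",  # assumed namespace for Amazon Glue job metrics
        "MetricName": "glue.driver.jvm.heap.usage",
        "Dimensions": [
            {"Name": "JobName", "Value": job_name},
            {"Name": "JobRunId", "Value": job_run_id},
            {"Name": "Type", "Value": "gauge"},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 60,
        "Statistics": ["Maximum"],
    }


def abnormal_datapoints(datapoints, threshold=0.8):
    """Return datapoints whose Maximum exceeds the threshold, oldest first."""
    flagged = [p for p in datapoints if p["Maximum"] > threshold]
    return sorted(flagged, key=lambda p: p["Timestamp"])


def fetch_abnormal_heap_usage(job_name, job_run_id="ALL"):
    """Query CloudWatch for high heap usage (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    cloudwatch = boto3.client("cloudwatch")
    query = heap_usage_query(job_name, job_run_id)
    response = cloudwatch.get_metric_statistics(**query)
    return abnormal_datapoints(response["Datapoints"])
```

The timestamps of any flagged datapoints give a starting point for searching the log stream of the driver or executor in question.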

Use cases for Amazon Glue observability metrics

