Configure a job run to use Amazon CloudWatch Logs - Amazon EMR
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Configure a job run to use Amazon CloudWatch Logs

To monitor job progress and to troubleshoot failures, you must configure your jobs to send log information to Amazon S3, Amazon CloudWatch Logs, or both. This topic helps you get started using CloudWatch Logs on your jobs that are launched with Amazon EMR on EKS. For more information about CloudWatch Logs, see Monitoring Log Files in the Amazon CloudWatch User Guide.

CloudWatch Logs IAM policy

For your jobs to send log data to CloudWatch Logs, the following permissions must be included in the permissions policy for the job execution role. Replace my_log_group_name and my_log_stream_prefix with names of your CloudWatch log group and log stream names, respectively. Amazon EMR on EKS creates the log group and log stream if they do not exist as long as the execution role ARN has appropriate permissions.

{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "logs:CreateLogStream", "logs:DescribeLogGroups", "logs:DescribeLogStreams" ], "Resource": [ "arn:aws:logs:*:*:*" ] }, { "Effect": "Allow", "Action": [ "logs:PutLogEvents" ], "Resource": [ "arn:aws:logs:*:*:log-group:my_log_group_name:log-stream:my_log_stream_prefix/*" ] } ] }
Note

Amazon EMR on EKS can also create a log stream. If a log stream does not exist, the IAM policy should include the"logs:CreateLogGroup" permission.

After you've given your execution role the proper permissions, your application sends its log data to CloudWatch Logs when cloudWatchMonitoringConfiguration is passed in the monitoringConfiguration section of a start-job-run request, as shown in Managing job runs with the Amazon CLI.

In the StartJobRun API, log_group_name is the log group name for CloudWatch, and log_stream_prefix is the log stream name prefix for CloudWatch. You can view and search these logs in the Amazon Web Services Management Console.

  • Submitter logs - logGroup/logStreamPrefix/virtual-cluster-id/jobs/job-id/containers/pod-name/(stderr/stdout)

  • Driver logs - logGroup/logStreamPrefix/virtual-cluster-id/jobs/job-id/containers/spark-application-id/spark-job-id-driver/(stderrstdout)

  • Executor logs - logGroup/logStreamPrefix/virtual-cluster-id/jobs/job-id/containers/spark-application-id/executor-pod-name/(stderr/stdout)