View job queue status

After you create a job queue and submit jobs to it, you need a way to monitor their progress. You can use the Job details page to review, manage, and monitor your job queue.

View job queue information

From the Amazon Batch console, choose Job queues in the navigation pane, and then choose a job queue to view its details. On this page, you can review and manage the job queue and see additional information about its operations, such as the job queue snapshot, job state limits, environment order, tags, and the job queue’s JSON code.

Job queue details

This section provides an overview of the job queue and options for maintaining it. It also shows the job queue’s Amazon Resource Name (ARN).

To find this information through the Amazon Command Line Interface (Amazon CLI), use the DescribeJobQueues operation with the job queue name or the corresponding ARN.
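As a minimal sketch of the DescribeJobQueues lookup, the following shell function prints only the ARN of a queue. The helper name `job_queue_arn` is hypothetical, and the `--query` expression assumes that the response lists queues under `jobQueues`, each with a `jobQueueArn` field.

```shell
# Print the ARN of a job queue, given its name or ARN.
# job_queue_arn is a hypothetical helper name used for this sketch.
job_queue_arn() {
  aws batch describe-job-queues \
    --job-queues "$1" \
    --query 'jobQueues[0].jobQueueArn' \
    --output text
}

# Example call (requires configured credentials):
#   job_queue_arn my-sm-training-fifo-jq
```

Dropping the `--query` and `--output` options returns the full queue description instead of the ARN alone.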

Job queue snapshot

This section provides a static list of the first 100 RUNNABLE jobs in the queue. You can use the search field to narrow the list by searching for information from any column in the results section. The jobs in the snapshot results area are sorted according to the job queue’s run strategy. For first-in-first-out (FIFO) job queues, the ordering of the jobs is based on the submission time. For fair-share scheduling job queues, the ordering of the jobs is based on the job priority and share usage.

Because the results are a snapshot of the job queue, the list doesn’t update automatically. To update the list, choose the refresh button at the top of the section. Choose the job’s name hyperlink to navigate to Job details and view the job’s status and other related information.

To find this information through the Amazon CLI, use the GetJobQueueSnapshot operation along with the job queue name or the corresponding ARN.

aws batch get-job-queue-snapshot --job-queue my-sm-training-fifo-jq

Job state limits

Use this tab to review configuration information about the amount of time that a job can remain in a RUNNABLE state before it’s canceled.

To find this information through the Amazon CLI, use the DescribeJobQueues operation along with the job queue name or the corresponding ARN.
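The job state limits can be pulled out of the DescribeJobQueues response with a `--query` filter. This is a sketch under the assumption that the limits appear under `jobStateTimeLimitActions` on each queue; the helper name `job_state_limits` is hypothetical.

```shell
# Print the job state time limit actions configured on a job queue.
# job_state_limits is a hypothetical helper name used for this sketch.
job_state_limits() {
  aws batch describe-job-queues \
    --job-queues "$1" \
    --query 'jobQueues[0].jobStateTimeLimitActions' \
    --output json
}

# Example call (requires configured credentials):
#   job_state_limits my-sm-training-fifo-jq
```

If no limits are configured on the queue, the result is expected to be empty or `null`.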

Environment order

If your job queue runs in multiple environments, this tab provides their order and an overview.

To find this information through the Amazon CLI, use the DescribeJobQueues operation along with the job queue name or the corresponding ARN.
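A similar filter can show the environment order. The sketch below assumes that compute environments are listed under `computeEnvironmentOrder` and that queues for service jobs list theirs under `serviceEnvironmentOrder`; both field names and the helper name `job_queue_environment_order` should be treated as assumptions to check against the CLI reference.

```shell
# Print the environment order of a job queue as JSON.
# job_queue_environment_order is a hypothetical helper name.
# The two field names in the query are assumptions about the response shape.
job_queue_environment_order() {
  aws batch describe-job-queues \
    --job-queues "$1" \
    --query 'jobQueues[0].{compute: computeEnvironmentOrder, service: serviceEnvironmentOrder}' \
    --output json
}

# Example call (requires configured credentials):
#   job_queue_environment_order my-sm-training-fifo-jq
```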

Tags

Use this tab to review and manage tags that are associated with this job queue.

JSON

Use this tab to copy the JSON code that’s associated with this job queue. You can then reuse the JSON for Amazon CloudFormation templates and Amazon CLI scripts.

Monitor service jobs

You can monitor the status of service jobs in your job queue using several Amazon Batch commands. Service jobs are jobs that run on Amazon services such as SageMaker Training, where Amazon Batch provides scheduling and queuing capabilities while the target service handles job execution.

List service jobs by status

Use the ListServiceJobs operation to view service jobs in your queue filtered by status. Service jobs can have the following statuses:

  • SUBMITTED - Job has been submitted but not yet processed

  • PENDING - Job is pending and waiting for resources

  • RUNNABLE - Job is ready to run and waiting in the queue

  • STARTING - Job is being started

  • RUNNING - Job is currently running

  • SCHEDULED - Job has been submitted to the target service but not yet running

  • SUCCEEDED - Job completed successfully

  • FAILED - Job failed to complete

View running jobs in your queue:

aws batch list-service-jobs \
    --job-queue my-sm-training-fifo-jq \
    --job-status RUNNING

View jobs waiting in the queue:

aws batch list-service-jobs \
    --job-queue my-sm-training-fifo-jq \
    --job-status RUNNABLE

View jobs that have been submitted to SageMaker but are not yet running:

aws batch list-service-jobs \
    --job-queue my-sm-training-fifo-jq \
    --job-status SCHEDULED

View all succeeded jobs:

aws batch list-service-jobs \
    --job-queue my-sm-training-fifo-jq \
    --job-status SUCCEEDED

View failed jobs for troubleshooting:

aws batch list-service-jobs \
    --job-queue my-sm-training-fifo-jq \
    --job-status FAILED

Filter service jobs

You can filter service jobs by name using pattern matching. If a filter value ends with an asterisk (*), it matches any job name that begins with the string before the asterisk.

Find jobs with names starting with "training":

aws batch list-service-jobs \
    --job-queue my-sm-training-fifo-jq \
    --filters 'name=JOB_NAME,values=training*'

Find jobs with specific names:

aws batch list-service-jobs \
    --job-queue my-sm-training-fifo-jq \
    --filters name=JOB_NAME,values=my-training-job-1,my-training-job-2

Combine status and name filters:

aws batch list-service-jobs \
    --job-queue my-sm-training-fifo-jq \
    --job-status RUNNING \
    --filters 'name=JOB_NAME,values=production*'

Handle large result sets

When you have many service jobs, use pagination to manage the results effectively.

Limit the number of results returned:

aws batch list-service-jobs \
    --job-queue my-sm-training-fifo-jq \
    --max-results 10

Use the next token to get additional results:

aws batch list-service-jobs \
    --job-queue my-sm-training-fifo-jq \
    --max-results 10 \
    --next-token eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...

Get detailed service job information

Use the DescribeServiceJob operation to get comprehensive information about a specific service job, including its current status, service resource identifiers, and detailed attempt information.

View detailed information about a specific job:

aws batch describe-service-job \
    --job-id a4d6c728-8ee8-4c65-8e2a-9a5e8f4b7c3d

This command returns comprehensive information about the job, including:

  • Job ARN and current status

  • Service resource identifiers (such as SageMaker Training job ARN)

  • Scheduling priority and retry configuration

  • Service request payload containing the original service parameters

  • Detailed attempt information with start and stop times

  • Status messages from the target service
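When only a few of these fields are needed, a `--query` filter can trim the response. The sketch below prints the job status and the service resource identifier; it assumes that the status is returned in a top-level `status` field and reuses the `latestAttempt.serviceResourceId.value` path from the job details. The helper name `service_job_summary` is hypothetical.

```shell
# Print the status and service resource identifier of a service job as JSON.
# service_job_summary is a hypothetical helper name used for this sketch.
service_job_summary() {
  aws batch describe-service-job \
    --job-id "$1" \
    --query '{status: status, trainingJobArn: latestAttempt.serviceResourceId.value}' \
    --output json
}

# Example call (requires configured credentials):
#   service_job_summary a4d6c728-8ee8-4c65-8e2a-9a5e8f4b7c3d
```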

Monitor SageMaker Training jobs

When monitoring SageMaker Training jobs through Amazon Batch, you can access both Amazon Batch job information and the underlying SageMaker Training job details.

The service resource identifier in the job details contains the SageMaker Training job ARN:

{
  "latestAttempt": {
    "serviceResourceId": {
      "name": "TrainingJobArn",
      "value": "arn:aws:sagemaker:us-east-1:123456789012:training-job/my-training-job"
    }
  }
}

The training job name is the final segment of this ARN. Use the name to get additional details directly from SageMaker:

aws sagemaker describe-training-job \
    --training-job-name my-training-job
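Because `describe-training-job` takes a job name rather than an ARN, a script needs to derive the name from the ARN first. One way to do that, sketched here with a hypothetical helper named `training_job_name_from_arn`, is to keep everything after the last `/`:

```shell
# Print the training job name from a SageMaker Training job ARN.
# training_job_name_from_arn is a hypothetical helper name.
training_job_name_from_arn() {
  # ${1##*/} removes the longest prefix that ends in "/".
  printf '%s\n' "${1##*/}"
}

# Example call:
#   training_job_name_from_arn \
#     "arn:aws:sagemaker:us-east-1:123456789012:training-job/my-training-job"
```

The result can be passed straight to `--training-job-name`.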

Monitor job progress by checking both Amazon Batch status and SageMaker Training job status. The Amazon Batch job status shows the overall job lifecycle, while the SageMaker Training job status provides service-specific details about the training process.
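Checking both statuses can be scripted. The following is a sketch rather than a definitive implementation: the helper name `show_both_statuses` is hypothetical, and it assumes that the Batch status is in the `status` field, that the SageMaker status is in `TrainingJobStatus`, and that the CLI prints `None` in text output when a job has no attempt yet.

```shell
# Print the Batch status of a service job and, once an attempt exists,
# the status of the underlying SageMaker Training job.
# show_both_statuses is a hypothetical helper name used for this sketch.
# The parentheses run the body in a subshell, so its variables stay local.
show_both_statuses() (
  job_id="$1"
  batch_status="$(aws batch describe-service-job --job-id "$job_id" \
    --query 'status' --output text)"
  training_arn="$(aws batch describe-service-job --job-id "$job_id" \
    --query 'latestAttempt.serviceResourceId.value' --output text)"
  echo "Batch: $batch_status"

  # Assumption: with --output text, the CLI prints "None" for a missing value.
  if [ -n "$training_arn" ] && [ "$training_arn" != "None" ]; then
    # The training job name is the last "/"-separated segment of the ARN.
    training_status="$(aws sagemaker describe-training-job \
      --training-job-name "${training_arn##*/}" \
      --query 'TrainingJobStatus' --output text)"
    echo "SageMaker: $training_status"
  fi
)

# Example call (requires configured credentials):
#   show_both_statuses a4d6c728-8ee8-4c65-8e2a-9a5e8f4b7c3d
```

The sketch makes two Batch calls to keep the parsing simple; a single call with a combined query would also work.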

Terminate service jobs

Use the TerminateServiceJob operation to stop a running service job.

Terminate a specific service job:

aws batch terminate-service-job \
    --job-id a4d6c728-8ee8-4c65-8e2a-9a5e8f4b7c3d \
    --reason "Job terminated by user request"

When you terminate a service job, Amazon Batch stops the job and notifies the target service. For SageMaker Training jobs, this also stops the training job in SageMaker AI.