Amazon Batch on Amazon EKS - Amazon Batch

Amazon Batch on Amazon EKS

Amazon Batch simplifies your batch workloads on Amazon EKS clusters by providing managed batch capabilities. This includes queuing, dependency tracking, managed job retries and priorities, pod management, and node scaling. Amazon Batch can handle multiple Availability Zones and multiple Amazon EC2 instance types and sizes. Amazon Batch integrates several of the Amazon EC2 Spot best practices to run your workloads in a fault-tolerant manner, allowing for fewer interruptions. You can use Amazon Batch to run a handful of overnight jobs or millions of mission-critical jobs with confidence.

Amazon Batch is a managed service that orchestrates batch workloads in your Kubernetes clusters that are managed by Amazon Elastic Kubernetes Service (Amazon EKS). Amazon Batch conducts this orchestration external to your clusters using an “overlay” model. Because Amazon Batch is a managed service, there are no Kubernetes components (for example, Operators or Custom Resources) to install or manage in your cluster. Amazon Batch only needs your cluster to be configured with role-based access control (RBAC) permissions that allow Amazon Batch to communicate with the Kubernetes API server. Amazon Batch calls Kubernetes APIs to create, monitor, and delete Kubernetes pods and nodes.

Amazon Batch has built-in scaling logic that scales Kubernetes nodes based on job queue load and optimizes how capacity is allocated to jobs. When the job queue is empty, Amazon Batch scales down the nodes to the minimum capacity that you set, which by default is zero. Amazon Batch manages the full lifecycle of these nodes, and decorates the nodes with labels and taints. This way, other Kubernetes workloads aren't placed on the nodes managed by Amazon Batch. The exception is DaemonSets, which can target Amazon Batch nodes to provide monitoring and other functionality required for proper execution of the jobs. Additionally, Amazon Batch doesn't run jobs, specifically pods, on nodes in your cluster that it doesn't manage. This way, you can use separate scaling logic and services for other applications on the cluster.
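The effect of the taints can be illustrated with a small model of Kubernetes toleration matching. This is a minimal sketch, not Amazon Batch's implementation: the taint key `batch.amazonaws.com/batch-node` and the `NoSchedule` effect are assumptions made for illustration, and the helper names `tolerates` and `can_schedule` are hypothetical. It shows why an ordinary pod is kept off a tainted node while a DaemonSet pod that declares a matching toleration is not.

```python
from typing import Dict, List

# Assumed taint on a Batch-managed node (key and effect are illustrative).
BATCH_TAINT: Dict[str, str] = {
    "key": "batch.amazonaws.com/batch-node",
    "effect": "NoSchedule",
}


def tolerates(toleration: Dict[str, str], taint: Dict[str, str]) -> bool:
    """Return True if one toleration matches one taint (Kubernetes rules)."""
    # A toleration with no effect matches taints with any effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        # An empty key combined with Exists matches every taint key.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    # Default operator "Equal": key and value must both match.
    return (toleration.get("key") == taint["key"]
            and toleration.get("value", "") == taint.get("value", ""))


def can_schedule(pod_tolerations: List[Dict[str, str]],
                 node_taints: List[Dict[str, str]]) -> bool:
    """A pod fits a node only if every blocking taint is tolerated."""
    blocking = [t for t in node_taints if t["effect"] != "PreferNoSchedule"]
    return all(
        any(tolerates(tol, taint) for tol in pod_tolerations)
        for taint in blocking
    )


# An ordinary application pod declares no tolerations.
ordinary_pod: List[Dict[str, str]] = []
# A monitoring DaemonSet pod tolerates the assumed Batch taint.
daemonset_pod = [{"key": "batch.amazonaws.com/batch-node", "operator": "Exists"}]
```

With these definitions, `can_schedule(ordinary_pod, [BATCH_TAINT])` is false and `can_schedule(daemonset_pod, [BATCH_TAINT])` is true, which mirrors the separation described above.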

To submit jobs to Amazon Batch, you interact directly with the Amazon Batch API. Amazon Batch translates jobs into podspecs and then creates requests to place the pods on nodes managed by Amazon Batch in your Amazon EKS cluster. You can use tools such as kubectl to view running pods and nodes. When a pod has completed its execution, Amazon Batch deletes the pod it created to keep the load on the Kubernetes system low.
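Because pods are deleted after they finish, job state is usually easier to follow through the Batch API than through the cluster. The following is a minimal sketch of submitting a job and polling its status. It assumes `client` behaves like a boto3 Batch client (`boto3.client("batch")`), using the `submit_job` and `describe_jobs` operations; the function name `submit_and_wait`, the polling interval, and the injectable `sleep` parameter are illustrative choices, not part of the service.

```python
import time
from typing import Any, Callable

# Job statuses after which a Batch job no longer changes state.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


def submit_and_wait(client: Any, job_name: str, job_queue: str,
                    job_definition: str, poll_seconds: float = 10.0,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Submit one job and block until it succeeds or fails.

    Returns the final status string. `client` is assumed to expose the
    boto3 Batch operations submit_job and describe_jobs.
    """
    response = client.submit_job(
        jobName=job_name,
        jobQueue=job_queue,
        jobDefinition=job_definition,
    )
    job_id = response["jobId"]

    while True:
        # describe_jobs accepts a list of job IDs and returns one entry each.
        job = client.describe_jobs(jobs=[job_id])["jobs"][0]
        if job["status"] in TERMINAL_STATUSES:
            return job["status"]
        sleep(poll_seconds)
```

A caller might run `submit_and_wait(boto3.client("batch"), "nightly-report", "my-eks-queue", "my-eks-job-def")`, where all three names are placeholders for resources that already exist in the account.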

You can get started by connecting a valid Amazon EKS cluster with Amazon Batch. Then attach an Amazon Batch job queue to it, and register an Amazon EKS job definition using podspec-equivalent attributes. Finally, submit jobs using the SubmitJob API operation, referencing the job definition. For more information, see Getting started with Amazon Batch on Amazon EKS.
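The four steps above can be sketched as a sequence of Batch API calls. This is an illustrative outline under stated assumptions, not the official procedure: `client` is assumed to behave like `boto3.client("batch")`, every resource name beginning with `example-` is hypothetical, and the instance types, vCPU limits, container image, and command are placeholder values. It also omits prerequisites such as preparing the cluster's RBAC permissions and waiting for the compute environment to become valid before the queue is created.

```python
from typing import Any, List


def get_started(client: Any, cluster_arn: str, namespace: str,
                subnets: List[str], security_group_ids: List[str],
                instance_role: str) -> str:
    """Run the getting-started steps in order and return the job ID."""
    # Step 1: connect the EKS cluster through a managed compute environment.
    client.create_compute_environment(
        computeEnvironmentName="example-eks-ce",  # hypothetical name
        type="MANAGED",
        state="ENABLED",
        eksConfiguration={
            "eksClusterArn": cluster_arn,
            "kubernetesNamespace": namespace,
        },
        computeResources={
            "type": "EC2",
            "allocationStrategy": "BEST_FIT_PROGRESSIVE",
            "minvCpus": 0,   # scale to zero when the queue is empty
            "maxvCpus": 16,  # placeholder upper bound
            "instanceTypes": ["m5"],
            "subnets": subnets,
            "securityGroupIds": security_group_ids,
            "instanceRole": instance_role,
        },
    )

    # Step 2: attach a job queue to the compute environment.
    # (A real script would likely wait for the environment to be valid first.)
    client.create_job_queue(
        jobQueueName="example-eks-queue",  # hypothetical name
        state="ENABLED",
        priority=10,
        computeEnvironmentOrder=[
            {"order": 1, "computeEnvironment": "example-eks-ce"},
        ],
    )

    # Step 3: register a job definition with podspec-equivalent attributes.
    job_def = client.register_job_definition(
        jobDefinitionName="example-eks-sleep",  # hypothetical name
        type="container",
        eksProperties={
            "podProperties": {
                "hostNetwork": True,
                "containers": [{
                    "image": "public.ecr.aws/amazonlinux/amazonlinux:2",
                    "command": ["sleep", "60"],
                    "resources": {"limits": {"cpu": "1", "memory": "1024Mi"}},
                }],
            },
        },
    )

    # Step 4: submit a job that references the job definition.
    job = client.submit_job(
        jobName="example-eks-job",  # hypothetical name
        jobQueue="example-eks-queue",
        jobDefinition=job_def["jobDefinitionName"],
    )
    return job["jobId"]
```

Passing the client in as a parameter keeps the sketch independent of credentials and Region configuration, so the call sequence can be exercised against a stub before it is pointed at a real account.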