Multi-node parallel jobs on Amazon EKS
You can use Amazon Batch on Amazon Elastic Kubernetes Service (Amazon EKS) to run multi-node parallel (MNP) jobs, also known as gang scheduling, on your managed Kubernetes clusters. This option is commonly used for large, tightly coupled, high-performance jobs that cannot run on a single Amazon Elastic Compute Cloud (Amazon EC2) instance. For more information, see Multi-node parallel jobs.
You can use this feature to run Kubernetes-specific high-performance computing applications, large language model training, and other artificial intelligence (AI) and machine learning (ML) jobs on Amazon EKS managed clusters.
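The term gang scheduling used above refers to all-or-nothing placement: either every node of the job gets capacity at the same time, or none of them start. The following is a minimal conceptual sketch of that idea, not the scheduler that Amazon Batch or Kubernetes uses. The function name `gang_schedule`, the `free_slots` mapping, and the host names are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Optional


def gang_schedule(free_slots: Dict[str, int], nodes_needed: int) -> Optional[List[str]]:
    """Return one host name per job node, or None if the whole gang cannot be placed.

    free_slots maps a hypothetical host name to the number of job nodes it can hold.
    """
    if nodes_needed <= 0:
        raise ValueError("nodes_needed must be positive")

    placement: List[str] = []
    for host, slots in free_slots.items():
        # Take as many slots from this host as the job still needs.
        take = min(slots, nodes_needed - len(placement))
        placement.extend([host] * take)
        if len(placement) == nodes_needed:
            return placement

    # Not enough capacity for every node, so place nothing at all.
    return None


if __name__ == "__main__":
    capacity = {"host-a": 2, "host-b": 1}  # hypothetical cluster capacity
    print(gang_schedule(capacity, 3))  # all three nodes fit
    print(gang_schedule(capacity, 4))  # one node short, so nothing is placed
```

The key design point is that a partial placement is never returned. A tightly coupled job whose nodes must communicate with one another cannot make progress if only some of its nodes are running.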