Get started with ML - Amazon EKS

Get started with ML

To start running Machine Learning workloads on EKS, choose one of these prescriptive patterns to quickly set up an EKS cluster along with the ML software and hardware you need. Most of these patterns are based on Terraform blueprints that are available from the Data on Amazon EKS site. Before you begin, here are a few things to keep in mind:

  • GPU or Neuron instances are required to run these procedures. If these instances are unavailable, the procedures can fail during cluster creation or node autoscaling.

  • Neuron instances (Trainium- and Inferentia-based instances, which use the Neuron SDK) can cost less and are often more readily available than NVIDIA GPU instances. So, when your workloads permit it, we recommend that you consider using Neuron for your Machine Learning workloads (see Welcome to Amazon Neuron).

  • Some of the getting started experiences here require that you get data via your own Hugging Face account.
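
Because these procedures depend on GPU or Neuron instances, it can help to check which Availability Zones offer the instance types you plan to use before you create a cluster. The following is a minimal sketch, not part of the blueprints, that assumes you have boto3 installed and AWS credentials configured. The function names are hypothetical helpers, and any instance types or Region you pass in are your own choices. Note that an offering only shows where an instance type is sold; it does not guarantee that capacity is available when you launch.

```python
from collections import defaultdict


def zones_by_instance_type(offerings):
    """Group EC2 instance type offerings by instance type.

    `offerings` is a list of dicts shaped like the `InstanceTypeOfferings`
    entries returned by the EC2 DescribeInstanceTypeOfferings API.
    Returns a dict mapping each instance type to a sorted list of locations.
    """
    zones = defaultdict(set)
    for offering in offerings:
        zones[offering["InstanceType"]].add(offering["Location"])
    return {itype: sorted(locs) for itype, locs in zones.items()}


def fetch_offerings(region, instance_types):
    """Ask EC2 which Availability Zones in `region` offer the given types.

    Hypothetical helper: requires boto3 and valid AWS credentials.
    """
    import boto3  # imported lazily so the grouping helper works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    paginator = ec2.get_paginator("describe_instance_type_offerings")
    offerings = []
    for page in paginator.paginate(
        LocationType="availability-zone",
        Filters=[{"Name": "instance-type", "Values": list(instance_types)}],
    ):
        offerings.extend(page["InstanceTypeOfferings"])
    return offerings
```

For example, `zones_by_instance_type(fetch_offerings("us-west-2", ["g5.xlarge", "trn1.2xlarge"]))` would return the Availability Zones that offer each of those types; the Region and instance types here are placeholders. An instance type that is missing from the result is not offered in that Region.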

To get started, choose from the following patterns, which are designed to help you set up infrastructure to run your Machine Learning workloads:

Continuing with ML on EKS

Besides choosing from the blueprints described on this page, you can proceed through the ML on EKS documentation in other ways. For example, you can:

To improve your work with ML on EKS, refer to the following:

  • Prepare for ML – Learn how to prepare for ML on EKS with features like custom AMIs and GPU reservations. See Prepare for ML clusters.