Setting up multiple controller nodes for a SageMaker HyperPod Slurm cluster - Amazon SageMaker AI
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

This topic explains how to configure multiple controller (head) nodes in a SageMaker HyperPod Slurm cluster using lifecycle scripts. Before you start, review the prerequisites listed in Prerequisites for using SageMaker HyperPod and familiarize yourself with the lifecycle scripts in Customizing SageMaker HyperPod clusters using lifecycle scripts. The instructions in this topic use Amazon CLI commands in an Amazon Linux environment. Note that the environment variables used in these commands are available only in the current session unless you explicitly preserve them.
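Because shell variables disappear when the session ends, one possible way to preserve them is to write them to a file and reload that file in a later session. The following is a minimal sketch, not part of the official procedure. The variable name `CLUSTER_REGION`, its value, and the file path `$HOME/.hyperpod_env` are hypothetical and chosen only for illustration.

```shell
# Hypothetical variable and value, used only for illustration.
export CLUSTER_REGION="us-west-2"

# File that stores the saved variables. The default path is an assumption;
# set ENV_FILE beforehand to use a different location.
ENV_FILE="${ENV_FILE:-$HOME/.hyperpod_env}"

# Append an export statement for the variable to the file.
printf 'export CLUSTER_REGION="%s"\n' "$CLUSTER_REGION" >> "$ENV_FILE"

# In a new session, reload the saved variables from the file.
. "$ENV_FILE"
```

If you want the variables restored automatically, you could add the final `. "$ENV_FILE"` line to your shell startup file, such as `~/.bashrc`.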