Create a SageMaker HyperPod cluster on training plans using the SageMaker API, or Amazon CLI - Amazon SageMaker AI

To use SageMaker training plans for your Amazon SageMaker HyperPod cluster, specify the ARN of the training plan you want to use in the TrainingPlanArn parameter of the ClusterInstanceGroupSpecification when calling the CreateCluster API operation.

Ensure that the subnet in the Availability Zone (AZ) of your training plan is included in the VpcConfig of your cluster configuration. You can retrieve the AvailabilityZone of a training plan from the response of a DescribeTrainingPlan API call.
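
The lookup described above could be scripted roughly as follows. This is a sketch, not an official procedure: it assumes the Availability Zone is reported under ReservedCapacitySummaries in the describe-training-plan output, and the function names, plan name, and VPC ID are hypothetical placeholders.

```shell
# Print the Availability Zone(s) backing a training plan.
# Assumption: the AZ is exposed as ReservedCapacitySummaries[].AvailabilityZone.
plan_azs() {
    aws sagemaker describe-training-plan \
        --training-plan-name "$1" \
        --query 'ReservedCapacitySummaries[].AvailabilityZone' \
        --output text
}

# List the subnet IDs of a VPC ($1) that sit in a given Availability Zone ($2).
subnets_in_az() {
    aws ec2 describe-subnets \
        --filters "Name=vpc-id,Values=$1" "Name=availability-zone,Values=$2" \
        --query 'Subnets[].SubnetId' \
        --output text
}

# Example usage (hypothetical names):
#   az=$(plan_azs my-training-plan)
#   subnets_in_az vpc-0123456789abcdef0 "$az"
```

Any subnet ID printed by the second function is a candidate for the subnet list in the cluster's VPC configuration.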

The following sample creates a new SageMaker HyperPod cluster and assigns a training plan to an instance group through the --instance-groups option of the create-cluster Amazon CLI command.

# Create a cluster
aws sagemaker create-cluster \
    --cluster-name cluster-name \
    --instance-groups '[
        {
            "InstanceCount": 1,
            "InstanceGroupName": "controller-nodes",
            "InstanceType": "ml.t3.xlarge",
            "LifeCycleConfig": {"SourceS3Uri": "source_s3_uri", "OnCreate": "on_create.sh"},
            "ExecutionRole": "arn:aws:iam::customer_account_id:role/execution_role",
            "ThreadsPerCore": 1
        },
        {
            "InstanceCount": 2,
            "InstanceGroupName": "worker-nodes",
            "InstanceType": "ml.p4d.24xlarge",
            "LifeCycleConfig": {"SourceS3Uri": "source_s3_uri", "OnCreate": "on_create.sh"},
            "ExecutionRole": "arn:aws:iam::customer_account_id:role/execution_role",
            "ThreadsPerCore": 1,
            "TrainingPlanArn": "training_plan_arn"
        }
    ]'

For information about how to create a HyperPod cluster using the Amazon CLI, see create-cluster.
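
To also pass the subnet that matches the plan's Availability Zone, the call can be wrapped as in the sketch below. It assumes the instance groups are stored in a local JSON file and that --vpc-config accepts the CLI shorthand shown. The function name and all argument values are hypothetical.

```shell
# Create a cluster with an explicit VPC configuration.
#   $1 cluster name
#   $2 path to a JSON file holding the instance group list
#   $3 security group ID
#   $4 subnet ID in the training plan's Availability Zone
create_cluster_with_vpc() {
    aws sagemaker create-cluster \
        --cluster-name "$1" \
        --instance-groups "file://$2" \
        --vpc-config "SecurityGroupIds=$3,Subnets=$4"
}

# Example usage (hypothetical values):
#   create_cluster_with_vpc my-cluster instance-groups.json sg-0123456789abcdef0 subnet-0123456789abcdef0
```

Keeping the instance groups in a file avoids shell quoting problems with inline JSON.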

After creating the cluster, you can verify that your instance group was properly assigned capacity from the training plan by calling the DescribeCluster API.

aws sagemaker describe-cluster --cluster-name cluster-name
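
To narrow the output to the fields relevant here, a --query filter along these lines may help. It is a sketch that assumes each instance group in the response carries TrainingPlanArn and TrainingPlanStatus fields. The function name is hypothetical.

```shell
# Show each instance group's name, node counts, and training plan fields.
# Assumption: DescribeCluster reports TrainingPlanArn and TrainingPlanStatus
# per instance group; inspect the full output if these come back empty.
show_training_plan_assignment() {
    aws sagemaker describe-cluster \
        --cluster-name "$1" \
        --query 'InstanceGroups[].{Group:InstanceGroupName,Current:CurrentCount,Target:TargetCount,Plan:TrainingPlanArn,PlanStatus:TrainingPlanStatus}' \
        --output table
}

# Example usage (hypothetical name):
#   show_training_plan_assignment cluster-name
```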