UpdateCluster
Updates a SageMaker HyperPod cluster.
Request Syntax
{
   "AutoScaling": {
      "AutoScalerType": "string",
      "Mode": "string"
   },
   "ClusterName": "string",
   "ClusterRole": "string",
   "InstanceGroups": [
      {
         "ExecutionRole": "string",
         "ImageId": "string",
         "InstanceCount": number,
         "InstanceGroupName": "string",
         "InstanceStorageConfigs": [
            { ... }
         ],
         "InstanceType": "string",
         "LifeCycleConfig": {
            "OnCreate": "string",
            "SourceS3Uri": "string"
         },
         "OnStartDeepHealthChecks": [ "string" ],
         "OverrideVpcConfig": {
            "SecurityGroupIds": [ "string" ],
            "Subnets": [ "string" ]
         },
         "ScheduledUpdateConfig": {
            "DeploymentConfig": {
               "AutoRollbackConfiguration": [
                  {
                     "AlarmName": "string"
                  }
               ],
               "RollingUpdatePolicy": {
                  "MaximumBatchSize": {
                     "Type": "string",
                     "Value": number
                  },
                  "RollbackMaximumBatchSize": {
                     "Type": "string",
                     "Value": number
                  }
               },
               "WaitIntervalInSeconds": number
            },
            "ScheduleExpression": "string"
         },
         "ThreadsPerCore": number,
         "TrainingPlanArn": "string"
      }
   ],
   "InstanceGroupsToDelete": [ "string" ],
   "NodeRecovery": "string",
   "RestrictedInstanceGroups": [
      {
         "EnvironmentConfig": {
            "FSxLustreConfig": {
               "PerUnitStorageThroughput": number,
               "SizeInGiB": number
            }
         },
         "ExecutionRole": "string",
         "InstanceCount": number,
         "InstanceGroupName": "string",
         "InstanceStorageConfigs": [
            { ... }
         ],
         "InstanceType": "string",
         "OnStartDeepHealthChecks": [ "string" ],
         "OverrideVpcConfig": {
            "SecurityGroupIds": [ "string" ],
            "Subnets": [ "string" ]
         },
         "ScheduledUpdateConfig": {
            "DeploymentConfig": {
               "AutoRollbackConfiguration": [
                  {
                     "AlarmName": "string"
                  }
               ],
               "RollingUpdatePolicy": {
                  "MaximumBatchSize": {
                     "Type": "string",
                     "Value": number
                  },
                  "RollbackMaximumBatchSize": {
                     "Type": "string",
                     "Value": number
                  }
               },
               "WaitIntervalInSeconds": number
            },
            "ScheduleExpression": "string"
         },
         "ThreadsPerCore": number,
         "TrainingPlanArn": "string"
      }
   ],
   "TieredStorageConfig": {
      "InstanceMemoryAllocationPercentage": number,
      "Mode": "string"
   }
}
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- AutoScaling
-
Updates the autoscaling configuration for the cluster. Use this parameter to enable or disable automatic node scaling.
Type: ClusterAutoScalingConfig object
Required: No
- ClusterName
-
Specify the name of the SageMaker HyperPod cluster you want to update.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 256.
Pattern:
(arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:cluster/[a-z0-9]{12})|([a-zA-Z0-9](-*[a-zA-Z0-9]){0,62})
Required: Yes
- ClusterRole
-
The Amazon Resource Name (ARN) of the IAM role that HyperPod assumes for cluster autoscaling operations. Cannot be updated while autoscaling is enabled.
Type: String
Length Constraints: Minimum length of 20. Maximum length of 2048.
Pattern:
arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+
Required: No
- InstanceGroups
-
Specify the instance groups to update.
Type: Array of ClusterInstanceGroupSpecification objects
Array Members: Minimum number of 1 item. Maximum number of 100 items.
Required: No
- InstanceGroupsToDelete
-
Specify the names of the instance groups to delete. Use a single , as the separator between multiple names.
Type: Array of strings
Array Members: Minimum number of 0 items. Maximum number of 100 items.
Length Constraints: Minimum length of 1. Maximum length of 63.
Pattern:
[a-zA-Z0-9](-*[a-zA-Z0-9])*
Required: No
- NodeRecovery
-
The node recovery mode to be applied to the SageMaker HyperPod cluster.
Type: String
Valid Values: Automatic | None
Required: No
- RestrictedInstanceGroups
-
The specialized instance groups for training models like Amazon Nova to be created in the SageMaker HyperPod cluster.
Type: Array of ClusterRestrictedInstanceGroupSpecification objects
Array Members: Minimum number of 1 item. Maximum number of 100 items.
Required: No
- TieredStorageConfig
-
Updates the configuration for managed tier checkpointing on the HyperPod cluster. For example, you can enable or disable the feature and modify the percentage of cluster memory allocated for checkpoint storage.
Type: ClusterTieredStorageConfig object
Required: No
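The constraints listed above can be checked locally before a request is sent. The following Python code is a minimal sketch, not an official client: the helper names `build_update_cluster_request` and `update_cluster` are hypothetical, and the sketch assumes the AWS SDK for Python (boto3) exposes this action as `update_cluster` on the `sagemaker` client. Only a few parameters are covered, and the patterns and limits are copied from the descriptions above.

```python
import re

# Patterns and limits copied from the parameter descriptions above.
CLUSTER_NAME_RE = re.compile(
    r"(arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:cluster/[a-z0-9]{12})"
    r"|([a-zA-Z0-9](-*[a-zA-Z0-9]){0,62})"
)
GROUP_NAME_RE = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")
NODE_RECOVERY_VALUES = {"Automatic", "None"}


def build_update_cluster_request(cluster_name, instance_groups=None,
                                 instance_groups_to_delete=None,
                                 node_recovery=None):
    """Return a dict of UpdateCluster parameters after basic local checks.

    Hypothetical helper: it validates only the constraints documented above
    and does not contact the service.
    """
    if len(cluster_name) > 256 or not CLUSTER_NAME_RE.fullmatch(cluster_name):
        raise ValueError(f"invalid ClusterName: {cluster_name!r}")
    request = {"ClusterName": cluster_name}

    if instance_groups is not None:
        if not 1 <= len(instance_groups) <= 100:
            raise ValueError("InstanceGroups must contain 1 to 100 items")
        request["InstanceGroups"] = list(instance_groups)

    if instance_groups_to_delete is not None:
        if len(instance_groups_to_delete) > 100:
            raise ValueError("InstanceGroupsToDelete allows at most 100 items")
        for name in instance_groups_to_delete:
            if not 1 <= len(name) <= 63 or not GROUP_NAME_RE.fullmatch(name):
                raise ValueError(f"invalid instance group name: {name!r}")
        request["InstanceGroupsToDelete"] = list(instance_groups_to_delete)

    if node_recovery is not None:
        if node_recovery not in NODE_RECOVERY_VALUES:
            raise ValueError("NodeRecovery must be 'Automatic' or 'None'")
        request["NodeRecovery"] = node_recovery

    return request


def update_cluster(request, client=None):
    """Send the request and return the ClusterArn from the response."""
    if client is None:
        import boto3  # imported lazily so the builder works without the SDK
        client = boto3.client("sagemaker")
    return client.update_cluster(**request)["ClusterArn"]
```

In this sketch, `InstanceGroupsToDelete` is passed as a list of names, which matches the array shown in the request syntax. The service remains the authority on validation; local checks only catch obvious mistakes early.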
Response Syntax
{
"ClusterArn": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- ClusterArn
-
The Amazon Resource Name (ARN) of the updated SageMaker HyperPod cluster.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 256.
Pattern:
arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:cluster/[a-z0-9]{12}
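The returned ARN follows the pattern shown above, so its components can be extracted with named groups. This is an illustrative sketch: the function name `parse_cluster_arn` and the group names are hypothetical, and the sample ARN in the usage note is made up.

```python
import re

# Same pattern as documented for ClusterArn, with named groups added.
CLUSTER_ARN_RE = re.compile(
    r"arn:(?P<partition>aws[a-z\-]*):sagemaker:(?P<region>[a-z0-9\-]*)"
    r":(?P<account_id>[0-9]{12}):cluster/(?P<cluster_id>[a-z0-9]{12})"
)


def parse_cluster_arn(arn):
    """Split a HyperPod cluster ARN into its parts, or raise ValueError."""
    match = CLUSTER_ARN_RE.fullmatch(arn)
    if match is None:
        raise ValueError(f"not a cluster ARN: {arn!r}")
    return match.groupdict()
```

For example, `parse_cluster_arn("arn:aws:sagemaker:us-west-2:123456789012:cluster/abcdefghij12")["cluster_id"]` evaluates to `"abcdefghij12"`.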
Errors
For information about the errors that are common to all actions, see Common Errors.
- ConflictException
-
There was a conflict when you attempted to modify a SageMaker entity such as an Experiment or Artifact.
HTTP Status Code: 400
- ResourceLimitExceeded
-
You have exceeded a SageMaker resource limit. For example, you might have too many training jobs created.
HTTP Status Code: 400
- ResourceNotFound
-
The resource being accessed was not found.
HTTP Status Code: 400
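All three errors above share HTTP status code 400, so a caller has to distinguish them by error code. The sketch below assumes the SDK raises an exception with a `response` dictionary whose `Error.Code` matches the names listed above, which is how botocore's `ClientError` behaves. The helper `try_update_cluster` and the hint texts are hypothetical suggestions, not documented service guidance.

```python
# Error codes from the list above, mapped to illustrative follow-up hints.
# The hint wording is an assumption made for this sketch.
ERROR_HINTS = {
    "ConflictException": "another modification conflicts with this one; retry later",
    "ResourceLimitExceeded": "a resource limit was exceeded; review account quotas",
    "ResourceNotFound": "the cluster was not found; check the name or ARN",
}


def try_update_cluster(client, **params):
    """Call client.update_cluster and return (cluster_arn, hint).

    Exactly one element of the tuple is None. Exceptions that do not carry
    one of the documented error codes are re-raised unchanged.
    """
    try:
        return client.update_cluster(**params)["ClusterArn"], None
    except Exception as exc:
        # botocore's ClientError exposes the service error via exc.response;
        # checking the attribute avoids importing botocore in this sketch.
        response = getattr(exc, "response", None)
        if not isinstance(response, dict):
            raise
        code = response.get("Error", {}).get("Code", "")
        if code not in ERROR_HINTS:
            raise
        return None, ERROR_HINTS[code]
```

Production code would more commonly catch `botocore.exceptions.ClientError` directly; the broad `except` here only keeps the example free of SDK imports.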
See Also
For more information about using this API in one of the language-specific Amazon SDKs, see the following: