DescribeCluster
Retrieves information about a SageMaker HyperPod cluster.
Request Syntax
{
"ClusterName": "string"
}
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- ClusterName
-
The string name or the Amazon Resource Name (ARN) of the SageMaker HyperPod cluster.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 256.
Pattern:
(arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:cluster/[a-z0-9]{12})|([a-zA-Z0-9](-*[a-zA-Z0-9]){0,62})
Required: Yes
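The length and pattern constraints above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the API: the helper name `is_valid_cluster_identifier` is hypothetical, and it assumes the service applies the pattern to the whole string.

```python
import re

# Pattern and maximum length copied from the ClusterName request parameter.
# Assumption: the pattern must match the entire value, so fullmatch is used.
CLUSTER_NAME_OR_ARN = re.compile(
    r"(arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:cluster/[a-z0-9]{12})"
    r"|([a-zA-Z0-9](-*[a-zA-Z0-9]){0,62})"
)
MAX_LENGTH = 256


def is_valid_cluster_identifier(value: str) -> bool:
    """Return True if value fits the documented length and pattern constraints."""
    if len(value) > MAX_LENGTH:
        return False
    return CLUSTER_NAME_OR_ARN.fullmatch(value) is not None
```

Although the stated minimum length is 0, an empty string does not satisfy the pattern, so this sketch rejects it.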
Response Syntax
{
"AutoScaling": {
"AutoScalerType": "string",
"FailureMessage": "string",
"Mode": "string",
"Status": "string"
},
"ClusterArn": "string",
"ClusterName": "string",
"ClusterRole": "string",
"ClusterStatus": "string",
"CreationTime": number,
"FailureMessage": "string",
"InstanceGroups": [
{
"CurrentCount": number,
"CurrentImageId": "string",
"DesiredImageId": "string",
"ExecutionRole": "string",
"InstanceGroupName": "string",
"InstanceStorageConfigs": [
{ ... }
],
"InstanceType": "string",
"LifeCycleConfig": {
"OnCreate": "string",
"SourceS3Uri": "string"
},
"OnStartDeepHealthChecks": [ "string" ],
"OverrideVpcConfig": {
"SecurityGroupIds": [ "string" ],
"Subnets": [ "string" ]
},
"ScheduledUpdateConfig": {
"DeploymentConfig": {
"AutoRollbackConfiguration": [
{
"AlarmName": "string"
}
],
"RollingUpdatePolicy": {
"MaximumBatchSize": {
"Type": "string",
"Value": number
},
"RollbackMaximumBatchSize": {
"Type": "string",
"Value": number
}
},
"WaitIntervalInSeconds": number
},
"ScheduleExpression": "string"
},
"Status": "string",
"TargetCount": number,
"ThreadsPerCore": number,
"TrainingPlanArn": "string",
"TrainingPlanStatus": "string"
}
],
"NodeProvisioningMode": "string",
"NodeRecovery": "string",
"Orchestrator": {
"Eks": {
"ClusterArn": "string"
}
},
"RestrictedInstanceGroups": [
{
"CurrentCount": number,
"EnvironmentConfig": {
"FSxLustreConfig": {
"PerUnitStorageThroughput": number,
"SizeInGiB": number
},
"S3OutputPath": "string"
},
"ExecutionRole": "string",
"InstanceGroupName": "string",
"InstanceStorageConfigs": [
{ ... }
],
"InstanceType": "string",
"OnStartDeepHealthChecks": [ "string" ],
"OverrideVpcConfig": {
"SecurityGroupIds": [ "string" ],
"Subnets": [ "string" ]
},
"ScheduledUpdateConfig": {
"DeploymentConfig": {
"AutoRollbackConfiguration": [
{
"AlarmName": "string"
}
],
"RollingUpdatePolicy": {
"MaximumBatchSize": {
"Type": "string",
"Value": number
},
"RollbackMaximumBatchSize": {
"Type": "string",
"Value": number
}
},
"WaitIntervalInSeconds": number
},
"ScheduleExpression": "string"
},
"Status": "string",
"TargetCount": number,
"ThreadsPerCore": number,
"TrainingPlanArn": "string",
"TrainingPlanStatus": "string"
}
],
"TieredStorageConfig": {
"InstanceMemoryAllocationPercentage": number,
"Mode": "string"
},
"VpcConfig": {
"SecurityGroupIds": [ "string" ],
"Subnets": [ "string" ]
}
}
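As an illustration of how the response structure above might be consumed, the sketch below collects the name, current count, and target count of every instance group. It operates on an already-parsed response dictionary; the function name `instance_group_progress` is hypothetical, and treating missing keys as empty or zero is an assumption made for the example.

```python
from typing import Any, Dict, List, Tuple


def instance_group_progress(response: Dict[str, Any]) -> List[Tuple[str, int, int]]:
    """Return (InstanceGroupName, CurrentCount, TargetCount) for each group.

    Both InstanceGroups and RestrictedInstanceGroups are read, because the
    response syntax shows these three fields in both arrays.
    """
    rows = []
    for key in ("InstanceGroups", "RestrictedInstanceGroups"):
        # Assumption: an absent array is treated the same as an empty one.
        for group in response.get(key, []):
            rows.append(
                (
                    group.get("InstanceGroupName", ""),
                    group.get("CurrentCount", 0),
                    group.get("TargetCount", 0),
                )
            )
    return rows
```

A group whose `CurrentCount` is lower than its `TargetCount` has not yet reached its requested size.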
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- AutoScaling
-
The current autoscaling configuration and status for the autoscaler.
Type: ClusterAutoScalingConfigOutput object
- ClusterArn
-
The Amazon Resource Name (ARN) of the SageMaker HyperPod cluster.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 256.
Pattern:
arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:cluster/[a-z0-9]{12}
- ClusterName
-
The name of the SageMaker HyperPod cluster.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 63.
Pattern:
[a-zA-Z0-9](-*[a-zA-Z0-9])*
- ClusterRole
-
The Amazon Resource Name (ARN) of the IAM role that HyperPod uses for cluster autoscaling operations.
Type: String
Length Constraints: Minimum length of 20. Maximum length of 2048.
Pattern:
arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+
- ClusterStatus
-
The status of the SageMaker HyperPod cluster.
Type: String
Valid Values:
Creating | Deleting | Failed | InService | RollingBack | SystemUpdating | Updating
- CreationTime
-
The time when the SageMaker HyperPod cluster was created.
Type: Timestamp
- FailureMessage
-
The failure message of the SageMaker HyperPod cluster.
Type: String
- InstanceGroups
-
The instance groups of the SageMaker HyperPod cluster.
Type: Array of ClusterInstanceGroupDetails objects
- NodeProvisioningMode
-
The mode used for provisioning nodes in the cluster.
Type: String
Valid Values:
Continuous
- NodeRecovery
-
The node recovery mode configured for the SageMaker HyperPod cluster.
Type: String
Valid Values:
Automatic | None
- Orchestrator
-
The type of orchestrator used for the SageMaker HyperPod cluster.
Type: ClusterOrchestrator object
- RestrictedInstanceGroups
-
The specialized instance groups for training models like Amazon Nova to be created in the SageMaker HyperPod cluster.
Type: Array of ClusterRestrictedInstanceGroupDetails objects
- TieredStorageConfig
-
The current configuration for managed tier checkpointing on the HyperPod cluster. For example, this shows whether the feature is enabled and the percentage of cluster memory allocated for checkpoint storage.
Type: ClusterTieredStorageConfig object
- VpcConfig
-
Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC.
Type: VpcConfig object
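A common use of the `ClusterStatus` element is to poll until the cluster stops changing. The sketch below is one way to do that under stated assumptions: it treats `InService` and `Failed` as settled states, which this page does not state explicitly, and the names `wait_until_settled` and `describe` are hypothetical. The `describe` argument is any callable that returns a parsed DescribeCluster response.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these are the statuses in which the cluster is no longer
# transitioning. All other valid values are treated as in progress.
SETTLED_STATUSES = {"InService", "Failed"}


def wait_until_settled(
    describe: Callable[[], Dict[str, Any]],
    delay_seconds: float = 30.0,
    max_attempts: int = 40,
) -> Dict[str, Any]:
    """Call describe() repeatedly until ClusterStatus is a settled value."""
    for attempt in range(max_attempts):
        response = describe()
        if response["ClusterStatus"] in SETTLED_STATUSES:
            return response
        # Sleep between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)
    raise TimeoutError("cluster did not reach a settled status in time")
```

With the AWS SDK for Python, `describe` could be something like `lambda: client.describe_cluster(ClusterName="my-cluster")`, where `client` is a SageMaker client and the cluster name is a placeholder.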
Errors
For information about the errors that are common to all actions, see Common Errors.
- ResourceNotFound
-
The resource being accessed was not found.
HTTP Status Code: 400
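Callers often want to distinguish this error from other failures. The sketch below assumes the exception object carries a `response` dictionary with the error code under `Error` and `Code`, which is the shape used by botocore's `ClientError`; the helper name `is_resource_not_found` is hypothetical, and the exact code string returned by the service is assumed to match the error name above.

```python
def is_resource_not_found(error: Exception) -> bool:
    """Return True if error looks like a ResourceNotFound service error."""
    # Assumption: the exception exposes a botocore-style `response` dict.
    response = getattr(error, "response", None)
    if not isinstance(response, dict):
        return False
    return response.get("Error", {}).get("Code") == "ResourceNotFound"
```

Exceptions without a `response` dictionary, or with a different error code, return `False` and can be re-raised by the caller.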
See Also
For more information about using this API in one of the language-specific Amazon SDKs, see the following: