

# SageMaker HyperPod cluster resiliency
<a name="sagemaker-hyperpod-resiliency-slurm"></a>

When orchestrated with Slurm, SageMaker HyperPod provides the following cluster resiliency features.

**Topics**
+ [Health monitoring agent](sagemaker-hyperpod-resiliency-slurm-cluster-health-check.md)
+ [Deep health checks](sagemaker-hyperpod-resiliency-slurm-deep-health-checks.md)
+ [Automatic node recovery and auto-resume](sagemaker-hyperpod-resiliency-slurm-auto-resume.md)
+ [Manually replace or reboot a node using Slurm](sagemaker-hyperpod-resiliency-slurm-replace-faulty-instance.md)