Supported Frameworks and Algorithms - Amazon SageMaker

Supported Frameworks and Algorithms

The following table shows SageMaker machine learning frameworks and algorithms supported by Debugger.

SageMaker-supported frameworks and algorithms    Debugging output tensors
---------------------------------------------    ------------------------
TensorFlow                                       Amazon TensorFlow deep learning containers 1.15.4 or later
PyTorch                                          Amazon PyTorch deep learning containers 1.5.0 or later
MXNet                                            Amazon MXNet deep learning containers 1.6.0 or later
XGBoost                                          1.0-1, 1.2-1, 1.3-1
SageMaker generic estimator                      Custom training containers (available for TensorFlow, PyTorch, MXNet, and XGBoost with manual hook registration)

  • Debugging output tensors – Track and debug tensors from your training job, such as model weights, gradients, biases, and scalar values. Supported frameworks and algorithms are Apache MXNet, TensorFlow, PyTorch, and XGBoost.

    Important

    For the TensorFlow framework with Keras, SageMaker Debugger deprecates zero code change support for debugging models built with the tf.keras modules of TensorFlow 2.6 and later. This is due to breaking changes announced in the TensorFlow 2.6.0 release notes. For instructions on how to update your training script, see Adapt Your TensorFlow Training Script.

    Important

    Starting with PyTorch v1.12.0, SageMaker Debugger deprecates zero code change support for debugging models.

    This is due to breaking changes that cause SageMaker Debugger to interfere with the torch.jit functionality. For instructions on how to update your training script, see Adapt Your PyTorch Training Script.
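The two notes above amount to a version cutoff per framework. As a minimal sketch (the function name and the framework keys are illustrative and not part of any SageMaker API), a training pipeline could check whether a given framework version falls under the zero code change deprecation:

```python
# Illustrative helper, not part of the SageMaker SDK or sagemaker-debugger.
# Cutoffs are taken from the two "Important" notes above.
ZERO_CODE_CHANGE_CUTOFF = {
    "tensorflow-keras": (2, 6),   # tf.keras models on TensorFlow 2.6 and later
    "pytorch": (1, 12, 0),        # PyTorch v1.12.0 and later
}


def parse_version(text):
    """Turn a dotted version such as '2.6.0' into (2, 6, 0) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def zero_code_change_deprecated(framework, version):
    """Return True if the notes above say the training script must be adapted."""
    cutoff = ZERO_CODE_CHANGE_CUTOFF.get(framework)
    if cutoff is None:
        # The notes above mention no deprecation for this framework.
        return False
    return parse_version(version) >= cutoff
```

A result of True means the training script needs the manual changes described in the linked "Adapt Your ... Training Script" topics.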

If the framework or algorithm that you want to train and debug is not listed in the table, go to the Amazon Discussion Forum and leave feedback on SageMaker Debugger.
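To check a framework and version against the table programmatically, the listed minimums can be encoded as data. The following is a rough sketch with hypothetical names; it assumes framework versions are plain dotted numbers and covers only the prebuilt container rows of the table, not custom training containers.

```python
# Illustrative lookup built from the table above; names are hypothetical.
MIN_CONTAINER_VERSION = {
    "tensorflow": (1, 15, 4),
    "pytorch": (1, 5, 0),
    "mxnet": (1, 6, 0),
}
XGBOOST_VERSIONS = {"1.0-1", "1.2-1", "1.3-1"}


def is_listed(framework, version):
    """Return True if the framework and version appear in the table above."""
    name = framework.lower()
    if name == "xgboost":
        # XGBoost versions are listed explicitly rather than as a minimum.
        return version in XGBOOST_VERSIONS
    minimum = MIN_CONTAINER_VERSION.get(name)
    if minimum is None:
        return False  # framework is not in the table
    parsed = tuple(int(part) for part in version.split("."))
    return parsed >= minimum
```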

Amazon Web Services Regions

Amazon SageMaker Debugger is available in all Regions where Amazon SageMaker is in service, except the following Region:

  • Asia Pacific (Jakarta): ap-southeast-3

To find out whether Amazon SageMaker is in service in your Amazon Web Services Region, see Amazon Regional Services.
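The availability rule above can be expressed as a small guard. This sketch assumes the caller supplies the set of Regions where SageMaker is in service (this page does not list them); the function name is hypothetical.

```python
# Region excluded from Debugger availability, as stated above.
DEBUGGER_UNSUPPORTED_REGIONS = frozenset({"ap-southeast-3"})  # Asia Pacific (Jakarta)


def debugger_available(region, sagemaker_regions):
    """Return True if Debugger is available in `region`.

    `sagemaker_regions` is a caller-supplied collection of Regions where
    SageMaker itself is in service (an assumption of this sketch).
    """
    return region in sagemaker_regions and region not in DEBUGGER_UNSUPPORTED_REGIONS
```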

Use Debugger with Custom Training Containers

Bring your own training containers to SageMaker and use Debugger to gain insight into your training jobs. The monitoring and debugging features help you optimize your model as it trains on Amazon EC2 instances.

For more information about how to build your training container with the sagemaker-debugger client library, push it to the Amazon Elastic Container Registry (Amazon ECR), and monitor and debug, see Use Debugger with Custom Training Containers.
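Inside a custom container, the hook has to be registered manually in the training script. The following is a minimal sketch for PyTorch, assuming the sagemaker-debugger (smdebug) package is installed in the container and that the job was launched with a Debugger hook configuration; the wrapper function name is illustrative.

```python
def register_debugger_hook(model, loss_fn):
    """Attach a Debugger hook to a PyTorch model and its loss module."""
    # Imported lazily so this file can still be imported where smdebug
    # is not installed (for example, on a development machine).
    import smdebug.pytorch as smd

    # Build the hook from the JSON configuration that SageMaker places in
    # the training container when Debugger is enabled for the job.
    hook = smd.Hook.create_from_json_file()
    hook.register_module(model)   # save tensors from the model
    hook.register_loss(loss_fn)   # save loss values; expects a torch.nn loss module
    return hook
```

Call the function once, after the model and loss are constructed and before the training loop starts.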

Debugger Open-Source GitHub Repositories

Debugger APIs are provided through the SageMaker Python SDK and are designed to construct Debugger hook and rule configurations for the SageMaker CreateTrainingJob and DescribeTrainingJob API operations. The sagemaker-debugger client library provides tools to register hooks and to access the saved training data through its trial feature. It supports the machine learning frameworks TensorFlow, PyTorch, MXNet, and XGBoost on Python 3.6 and later.
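To illustrate what the hook and rule configurations look like at the API level, here is a sketch of the Debugger-related fields of a CreateTrainingJob request, built as a plain dictionary. The helper name is hypothetical, and the S3 path and rule evaluator image URI are placeholders the caller must supply; the collection name, save interval, and rule shown are example choices, not recommendations.

```python
import json


def debugger_request_fields(s3_output_path, rule_image_uri):
    """Build the Debugger portion of a CreateTrainingJob request (sketch)."""
    return {
        "DebugHookConfig": {
            # Where the hook uploads saved tensors.
            "S3OutputPath": s3_output_path,
            "CollectionConfigurations": [
                {
                    "CollectionName": "gradients",
                    # Parameter values are passed as strings.
                    "CollectionParameters": {"save_interval": "500"},
                }
            ],
        },
        "DebugRuleConfigurations": [
            {
                "RuleConfigurationName": "VanishingGradient",
                # Region-specific rule container image; placeholder here.
                "RuleEvaluatorImage": rule_image_uri,
                "RuleParameters": {"rule_to_invoke": "VanishingGradient"},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    fields = debugger_request_fields("s3://example-bucket/debugger-output",
                                     "<rule-evaluator-image-uri>")
    print(json.dumps(fields, indent=2))
```

These fields would be merged into the rest of the CreateTrainingJob request; the SageMaker Python SDK constructs equivalent structures for you.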

For direct resources about the Debugger and sagemaker-debugger API operations, see the following links:

If you use the SDK for Java to conduct SageMaker training jobs and want to configure Debugger APIs, see the following references: