Configuring SageMaker Debugger to save tensors

Tensors are data collections of updated parameters from the backward and forward pass of each training iteration. SageMaker Debugger collects the output tensors to analyze the state of a training job. SageMaker Debugger's CollectionConfig and DebuggerHookConfig API operations provide methods for grouping tensors into collections and saving them to a target S3 bucket. The following topics show how to use the CollectionConfig and DebuggerHookConfig API operations, followed by examples of how to use the Debugger hook to save, access, and visualize output tensors.

While constructing a SageMaker AI estimator, activate SageMaker Debugger by specifying the debugger_hook_config parameter. The following topics include examples of how to set up the debugger_hook_config using the CollectionConfig and DebuggerHookConfig API operations to pull tensors out of your training jobs and save them.
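As a rough illustration of the setup described above, the following sketch groups tensors with CollectionConfig, wraps the collections in a DebuggerHookConfig, and passes the result to an estimator through the debugger_hook_config parameter. It assumes the SageMaker Python SDK is installed. The collection names, the save_interval value, the instance type, and the build_estimator helper are illustrative choices, not values taken from this page.

```python
from sagemaker.debugger import CollectionConfig, DebuggerHookConfig
from sagemaker.estimator import Estimator

# Group the tensors to save. The collection names and the save interval
# below are illustrative assumptions; parameter values are passed as strings.
collection_configs = [
    CollectionConfig(name="weights"),
    CollectionConfig(name="losses", parameters={"save_interval": "500"}),
]

# No s3_output_path is given here, so Debugger falls back to the default
# S3 bucket described in the note on this page.
debugger_hook_config = DebuggerHookConfig(collection_configs=collection_configs)


def build_estimator(image_uri: str, role: str) -> Estimator:
    """Hypothetical helper: returns an estimator with Debugger activated.

    Calling this requires AWS credentials and a configured region.
    """
    return Estimator(
        image_uri=image_uri,
        role=role,
        instance_count=1,
        instance_type="ml.m5.xlarge",  # assumed instance type
        debugger_hook_config=debugger_hook_config,
    )
```

To write tensors somewhere other than the default bucket, pass an S3 URI as the s3_output_path argument of DebuggerHookConfig.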

Note

After it is properly configured and activated, SageMaker Debugger saves the output tensors in a default S3 bucket unless you specify otherwise. The format of the default S3 bucket URI is s3://amzn-s3-demo-bucket-sagemaker-<region>-<12digit_account_id>/<training-job-name>/debug-output/.
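To make the URI format in the note concrete, here is a small sketch that fills in the placeholders. The helper name default_debug_output_uri is hypothetical and not part of any SDK. The bucket prefix is copied verbatim from the placeholder format shown in the note, so treat the result as an illustration of the pattern rather than a guaranteed bucket name.

```python
def default_debug_output_uri(region: str, account_id: str, training_job_name: str) -> str:
    """Fill in the default Debugger output URI pattern from the note above.

    Hypothetical helper for illustration only; the bucket prefix mirrors
    the placeholder format used in the documentation.
    """
    # The pattern expects a 12-digit AWS account ID.
    if not (account_id.isascii() and account_id.isdigit() and len(account_id) == 12):
        raise ValueError("account_id must be a 12-digit string")
    bucket = f"amzn-s3-demo-bucket-sagemaker-{region}-{account_id}"
    return f"s3://{bucket}/{training_job_name}/debug-output/"


# Example values are made up for illustration.
uri = default_debug_output_uri("us-west-2", "111122223333", "my-training-job")
```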

Topics

Launch training jobs with Debugger using the SageMaker Python SDK

Configure Tensor Collections