Troubleshooting your Docker containers
The following are common errors that you might run into when using Docker containers with SageMaker. Each error is followed by a solution to the error.
-
Error: SageMaker has lost the Docker daemon.
To fix this error, restart Docker using the following command.
sudo service docker restart
-
Error: The
/tmp
directory of your Docker container has run out of space.Docker containers use the
/
and/tmp
partitions to store code. These partitions can fill up easily when using large code modules in local mode. The SageMaker Python SDK supports specifying a custom temp directory for your local mode root directory to avoid this issue.To specify the custom temp directory in the Amazon Elastic Block Store volume storage, create a file at the following path
~/.sagemaker/config.yaml
and add the following configuration. The directory that you specify ascontainer_root
must already exist. The SageMaker Python SDK will not try to create it.local: container_root: /home/ec2-user/SageMaker/temp
With this configuration, local mode uses the
/temp
directory and not the default/tmp
directory. -
Low space errors on SageMaker notebook instances
A Docker container that runs on SageMaker notebook instances uses the root Amazon EBS volume of the notebook instance by default. To resolve low space errors, provide the path of the Amazon EBS volume attached to the notebook instance as part of the volume parameter of Docker commands.
docker run -v
EBS-volume-path
:container-path