Dockerfile specifications - Amazon SageMaker

Dockerfile specifications

The image that you specify in your Dockerfile must match the specifications in the following sections to create the image successfully.

Running the image

  • Entrypoint – We recommend embedding the entry point into the image using the Docker CMD or ENTRYPOINT instruction. You can also configure ContainerEntrypoint and ContainerArguments, which are passed to the container at runtime.

  • EnvVariables – With Studio, you can configure ContainerEnvironment variables that are made available to a container. If a variable that you configure has the same name as an environment variable set by SageMaker, the SageMaker value overwrites it. The platform's environment variables are usually namespaced with the AWS_ and SAGEMAKER_ prefixes so that platform settings take priority.

    The following are the environment variables:

    • AWS_REGION

    • AWS_DEFAULT_REGION

    • AWS_CONTAINER_CREDENTIALS_RELATIVE_URI

    • SAGEMAKER_SPACE_NAME
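
As a rough illustration of how code running inside the container might read these platform-provided values, the following Python sketch collects them from the process environment. The helper name platform_environment is hypothetical, and the sketch assumes the variables are exposed under the upper-case names AWS_REGION, AWS_DEFAULT_REGION, AWS_CONTAINER_CREDENTIALS_RELATIVE_URI, and SAGEMAKER_SPACE_NAME. Variables that are not set are returned as None.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names assumed for this sketch; adjust them if your
# environment exposes the platform values under different names.
PLATFORM_VARIABLES = (
    "AWS_REGION",
    "AWS_DEFAULT_REGION",
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "SAGEMAKER_SPACE_NAME",
)


def platform_environment(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Optional[str]]:
    """Return the platform variables found in env (defaults to os.environ)."""
    source = os.environ if env is None else env
    return {name: source.get(name) for name in PLATFORM_VARIABLES}


if __name__ == "__main__":
    for name, value in platform_environment().items():
        print(f"{name}={value if value is not None else '<not set>'}")
```

Because the platform values take priority, a check like this can help confirm which value a process actually sees when a ContainerEnvironment variable uses the same name.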

Specifications for the user and file system

  • WorkingDirectory – The Amazon EBS volume for your space is mounted on the path /home/sagemaker-user. You can't change the mount path. Use the WORKDIR instruction to set the working directory of your image to a folder within /home/sagemaker-user.

  • UID – The user ID of the Docker container. UID=1000 is a supported value. You can add sudo access to your users. The IDs are remapped to prevent a process running in the container from having more privileges than necessary.

  • GID – The group ID of the Docker container. GID=100 is a supported value. You can add sudo access to your users. The IDs are remapped to prevent a process running in the container from having more privileges than necessary.

  • Metadata directories – The /opt/.sagemakerinternal and /opt/ml directories are used by Amazon. The metadata file in /opt/ml contains metadata about resources such as DomainId.

    Use the following command to show the contents of the metadata file:

    cat /opt/ml/metadata/resource-metadata.json

    The output resembles the following:

    {"AppType":"JupyterLab","DomainId":"example-domain-id","UserProfileName":"example-user-profile-name","ResourceArn":"arn:aws:sagemaker:Amazon Web Services Region:111122223333:app/domain-ID/user-ID/JupyterLab/default","ResourceName":"default","AppImageVersion":"current"}

  • Logging directories – The /var/logs/studio directory is reserved for the logs of JupyterLab and the extensions associated with it. We recommend that you don't use this directory when creating your image.
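
The user and file system requirements above can be sketched as a small self-check that runs inside a Linux container. This is only an illustration: the function names check_user_and_workdir and load_resource_metadata are hypothetical, and the script simply compares the running process against the values listed in this section (UID 1000, GID 100, a working directory under /home/sagemaker-user) and reads the metadata file if it exists.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict, List

HOME_MOUNT = Path("/home/sagemaker-user")  # Amazon EBS mount path from this section
METADATA_FILE = Path("/opt/ml/metadata/resource-metadata.json")
SUPPORTED_UID = 1000
SUPPORTED_GID = 100


def check_user_and_workdir(uid: int, gid: int, workdir: Path) -> List[str]:
    """Return human-readable mismatches; an empty list means none were found."""
    problems = []
    if uid != SUPPORTED_UID:
        problems.append(f"UID is {uid}; the supported value is {SUPPORTED_UID}")
    if gid != SUPPORTED_GID:
        problems.append(f"GID is {gid}; the supported value is {SUPPORTED_GID}")
    # The working directory should be the mount path itself or a folder inside it.
    if workdir != HOME_MOUNT and HOME_MOUNT not in workdir.parents:
        problems.append(f"working directory {workdir} is outside {HOME_MOUNT}")
    return problems


def load_resource_metadata(path: Path = METADATA_FILE) -> Dict[str, Any]:
    """Parse the resource metadata JSON file into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # os.getuid and os.getgid are available on Linux, which this sketch assumes.
    for problem in check_user_and_workdir(os.getuid(), os.getgid(), Path.cwd()):
        print("WARNING:", problem)
    if METADATA_FILE.exists():
        print("DomainId:", load_resource_metadata().get("DomainId"))
```

Running the script as the image's default user gives a quick indication of whether the USER and WORKDIR instructions in the Dockerfile produce the expected result.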

Health check and URL for applications

  • Base URL – The base URL for the bring your own image (BYOI) application must be jupyterlab/default. You can have only one application, and it must always be named default.

  • HealthCheck API – The HostAgent uses the HealthCheck API at port 8888 to check the health of the JupyterLab application. The endpoint for the health check is jupyterlab/default/api/status.

  • Authentication – To enable authentication for your users, turn off Jupyter's token-based or password-based authentication and allow all origins.
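
To try the health check locally before registering an image, a probe along the following lines may be useful. It is a minimal sketch, not the HostAgent's actual implementation: it assumes the container's port 8888 is reachable over plain HTTP on localhost, and the function names health_check_url and is_healthy are hypothetical.

```python
import urllib.error
import urllib.request

BASE_URL = "jupyterlab/default"  # required base URL from this section
HEALTH_CHECK_PORT = 8888


def health_check_url(host: str = "localhost", port: int = HEALTH_CHECK_PORT) -> str:
    """Build the status endpoint URL under the required base URL."""
    return f"http://{host}:{port}/{BASE_URL}/api/status"


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if the status endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection failures, timeouts, and non-2xx responses.
        return False


if __name__ == "__main__":
    url = health_check_url()
    print(url, "->", "healthy" if is_healthy(url) else "unreachable")
```

If the probe fails while JupyterLab is running, the base URL or the token setting passed to jupyter lab is a likely place to look first.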

The following is a sample Amazon Linux 2 Dockerfile that meets the preceding specifications:

FROM public.ecr.aws/amazonlinux/amazonlinux:2

ARG NB_USER="sagemaker-user"
ARG NB_UID="1000"
ARG NB_GID="100"

# Install Python, create the user with the supported UID and GID, and install JupyterLab.
RUN yum install --assumeyes python3 shadow-utils && \
    useradd --create-home --shell /bin/bash --gid "${NB_GID}" --uid ${NB_UID} ${NB_USER} && \
    yum clean all && \
    python3 -m pip install jupyterlab

RUN python3 -m pip install --upgrade pip

RUN python3 -m pip install --upgrade urllib3==1.26.6

USER ${NB_UID}

# Serve JupyterLab on port 8888 under the required base URL,
# with token authentication turned off and all origins allowed.
CMD jupyter lab --ip 0.0.0.0 --port 8888 \
    --ServerApp.base_url="/jupyterlab/default" \
    --ServerApp.token='' \
    --ServerApp.allow_origin='*'

The following is a sample Amazon SageMaker Distribution Dockerfile that meets the preceding specifications:

FROM public.ecr.aws/sagemaker/sagemaker-distribution:latest-cpu

ARG NB_USER="sagemaker-user"
ARG NB_UID=1000
ARG NB_GID=100

ENV MAMBA_USER=$NB_USER

# Switch to root to install additional packages.
USER root

RUN apt-get update
RUN micromamba install sagemaker-inference --freeze-installed --yes --channel conda-forge --name base

# Switch back to the non-root user before starting JupyterLab.
USER $MAMBA_USER

# Serve JupyterLab on port 8888 under the required base URL,
# with token authentication turned off and all origins allowed.
ENTRYPOINT ["jupyter-lab"]
CMD ["--ServerApp.ip=0.0.0.0", "--ServerApp.port=8888", "--ServerApp.allow_origin=*", "--ServerApp.token=''", "--ServerApp.base_url=/jupyterlab/default"]