Deploy a Compiled Model Using the Console - Amazon SageMaker

Deploy a Compiled Model Using the Console

If the model was compiled using Amazon SDK for Python (Boto3), the Amazon CLI, or the Amazon SageMaker console, you must first satisfy the prerequisites. Then follow the steps below to create and deploy a SageMaker Neo-compiled model using the SageMaker console.

Deploy the Model

After you have satisfied the prerequisites, use the following steps to deploy a model compiled with Neo:

  1. Choose Models, and then choose Create model from the Inference group. On the Create model page, complete the Model name and IAM role fields and, if needed, the optional VPC field.

    Create Neo model for inference
  2. To add information about the container used to deploy your model, choose Add container, then choose Next. Complete the Container input options, Location of inference code image, and Location of model artifacts fields, and optionally the Container host name and Environment variables fields.

    Create Neo model for inference
  3. To deploy Neo-compiled models, choose the following:

    • Container input options: Choose Provide model artifacts and inference image.

    • Location of inference code image: Choose the inference image URI from Neo Inference Container Images, depending on the Amazon Region and kind of application.

    • Location of model artifacts: Enter the Amazon S3 bucket URI of the compiled model artifact generated by the Neo compilation API.

    • Environment variables:

      • Leave this field blank for SageMaker XGBoost.

      • If you trained your model using SageMaker, specify the environment variable SAGEMAKER_SUBMIT_DIRECTORY as the Amazon S3 bucket URI that contains the training script.

      • If you did not train your model using SageMaker, specify the following environment variables:

        Key                           Values for MXNet and PyTorch  Values for TensorFlow
        SAGEMAKER_SUBMIT_DIRECTORY    /opt/ml/model/code            /opt/ml/model/code
        SAGEMAKER_REGION              <your region>                 <your region>
        MMS_DEFAULT_RESPONSE_TIMEOUT  500                           Leave this field blank for TF
  4. Confirm that the information for the containers is accurate, and then choose Create model. On the Create model landing page, choose Create endpoint.

    Create Model landing page
  5. On the Create and configure endpoint page, specify the Endpoint name. For Attach endpoint configuration, choose Create a new endpoint configuration.

    Neo console create and configure endpoint UI.
  6. On the New endpoint configuration page, specify the Endpoint configuration name.

    Neo console new endpoint configuration UI.
  7. Choose Edit next to the name of the model and specify the correct Instance type on the Edit Production Variant page. The Instance type value must match the one specified in your compilation job.

    Neo console new endpoint configuration UI.
  8. Choose Save.

  9. On the New endpoint configuration page, choose Create endpoint configuration, and then choose Create endpoint.
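
For readers who want to script the same flow, the console steps above can be approximated with Amazon SDK for Python (Boto3). The following is a minimal sketch, not an official procedure: the helper names (neo_environment, build_requests, deploy), the "-config" and "-endpoint" name suffixes, and the "AllTraffic" variant name are assumptions made for illustration. The request fields follow the Boto3 SageMaker client operations create_model, create_endpoint_config, and create_endpoint. Supply your own role ARN, inference image URI, and model artifact URI.

```python
from typing import Any, Dict, Optional


def neo_environment(framework: str, region: str,
                    script_s3_uri: Optional[str] = None) -> Dict[str, str]:
    """Pick container environment variables using the rules from step 3.

    Hypothetical helper: pass script_s3_uri only if the model was trained
    with SageMaker and the training script lives in Amazon S3.
    """
    fw = framework.lower()
    if fw == "xgboost":
        return {}  # leave the field blank for SageMaker XGBoost
    if script_s3_uri is not None:
        return {"SAGEMAKER_SUBMIT_DIRECTORY": script_s3_uri}
    env = {
        "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
        "SAGEMAKER_REGION": region,
    }
    if fw in ("mxnet", "pytorch"):
        env["MMS_DEFAULT_RESPONSE_TIMEOUT"] = "500"  # left blank for TensorFlow
    return env


def build_requests(name: str, role_arn: str, image_uri: str,
                   model_data_url: str, instance_type: str,
                   environment: Dict[str, str]) -> Dict[str, Dict[str, Any]]:
    """Build the three request payloads that mirror steps 1 through 9."""
    config_name = f"{name}-config"  # assumed naming scheme
    return {
        "create_model": {
            "ModelName": name,
            "ExecutionRoleArn": role_arn,
            "PrimaryContainer": {
                "Image": image_uri,              # Neo inference image URI
                "ModelDataUrl": model_data_url,  # compiled model artifact in S3
                "Environment": environment,
            },
        },
        "create_endpoint_config": {
            "EndpointConfigName": config_name,
            "ProductionVariants": [{
                "VariantName": "AllTraffic",
                "ModelName": name,
                # Must match the instance type used in the compilation job.
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
            }],
        },
        "create_endpoint": {
            "EndpointName": f"{name}-endpoint",
            "EndpointConfigName": config_name,
        },
    }


def deploy(requests: Dict[str, Dict[str, Any]], client: Any = None) -> None:
    """Send the payloads in order: model, endpoint configuration, endpoint."""
    if client is None:
        import boto3  # imported lazily so the builders work without Boto3
        client = boto3.client("sagemaker")
    client.create_model(**requests["create_model"])
    client.create_endpoint_config(**requests["create_endpoint_config"])
    client.create_endpoint(**requests["create_endpoint"])
```

Calling deploy(build_requests(...)) without a client argument creates real resources in your account, so review the payloads first. Passing a stub object as client is one way to dry-run the call order.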