Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Deploy a Compiled Model Using the Amazon CLI

If your model was compiled using the Amazon SDK for Python (Boto3), the Amazon CLI, or the Amazon SageMaker console, you must first satisfy the prerequisites. Then follow the steps below to create and deploy a SageMaker Neo-compiled model using the Amazon CLI.

Deploy the Model

After you have satisfied the prerequisites, use the create-model, create-endpoint-config, and create-endpoint Amazon CLI commands. The following steps explain how to use these commands to deploy a model compiled with Neo:

Create a Model

From Neo Inference Container Images, select the inference image URI, and then use the create-model API to create a SageMaker model. You can do this in two steps:

  1. Create a create_model.json file. Within the file, specify the name of the model, the image URI, the path to the model.tar.gz file in your Amazon S3 bucket, and your SageMaker execution role:

    {
      "ModelName": "insert model name",
      "PrimaryContainer": {
        "Image": "insert the ECR Image URI",
        "ModelDataUrl": "insert S3 archive URL",
        "Environment": { "See details below" }
      },
      "ExecutionRoleArn": "ARN for AmazonSageMaker-ExecutionRole"
    }

    If you trained your model using SageMaker, specify the following environment variable:

    "Environment": {
      "SAGEMAKER_SUBMIT_DIRECTORY": "[Full S3 path for *.tar.gz file containing the training script]"
    }

    If you did not train your model using SageMaker, specify the following environment variables:

    MXNet and PyTorch

    "Environment": {
      "SAGEMAKER_PROGRAM": "inference.py",
      "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
      "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
      "SAGEMAKER_REGION": "insert your region",
      "MMS_DEFAULT_RESPONSE_TIMEOUT": "500"
    }

    TensorFlow

    "Environment": {
      "SAGEMAKER_PROGRAM": "inference.py",
      "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
      "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
      "SAGEMAKER_REGION": "insert your region"
    }
    Note

    The AmazonSageMakerFullAccess and AmazonS3ReadOnlyAccess policies must be attached to the AmazonSageMaker-ExecutionRole IAM role.

  2. Run the following command:

    aws sagemaker create-model --cli-input-json file://create_model.json

    For the full syntax of the create-model API, see create-model.
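The create_model.json file from step 1 can also be generated programmatically, which avoids JSON syntax errors. The following Python sketch writes the file for a PyTorch model that was not trained on SageMaker; the model name, image URI, S3 path, Region, and role ARN shown are placeholder assumptions that you must replace with your own values:

```python
import json

# All values below are placeholders -- substitute your own model name,
# Neo inference container image URI, compiled model artifact location,
# Region, and execution role ARN.
model = {
    "ModelName": "my-neo-model",
    "PrimaryContainer": {
        "Image": "<ECR image URI from Neo Inference Container Images>",
        "ModelDataUrl": "s3://my-bucket/model.tar.gz",
        # Environment variables for a PyTorch model not trained on SageMaker
        "Environment": {
            "SAGEMAKER_PROGRAM": "inference.py",
            "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
            "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
            "SAGEMAKER_REGION": "us-west-2",
            "MMS_DEFAULT_RESPONSE_TIMEOUT": "500",
        },
    },
    "ExecutionRoleArn": "arn:aws:iam::111122223333:role/AmazonSageMaker-ExecutionRole",
}

# Write the CLI input file used in step 2.
with open("create_model.json", "w") as f:
    json.dump(model, f, indent=2)
```

Pass the resulting file to the command from step 2, `aws sagemaker create-model --cli-input-json file://create_model.json`.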

Create an Endpoint Configuration

After creating a SageMaker model, create the endpoint configuration using the create-endpoint-config API. To do this, create a JSON file with your endpoint configuration specifications, setting ModelName to the name of the model you created in the previous step. For example, you can use the following code template and save it as create_config.json:

{
  "EndpointConfigName": "<provide your endpoint config name>",
  "ProductionVariants": [
    {
      "VariantName": "<provide your variant name>",
      "ModelName": "my-sagemaker-model",
      "InitialInstanceCount": 1,
      "InstanceType": "<provide your instance type here>",
      "InitialVariantWeight": 1.0
    }
  ]
}

Now run the following Amazon CLI command to create your endpoint configuration:

aws sagemaker create-endpoint-config --cli-input-json file://create_config.json

For the full syntax of the create-endpoint-config API, see create-endpoint-config.
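As with the model definition, the endpoint configuration file can be generated with a short script. The sketch below writes create_config.json; the configuration name, variant name, and instance type are placeholder assumptions, and ModelName must match the model you created with create-model:

```python
import json

# Placeholder names and instance type -- adjust for your deployment.
# "ModelName" must match the name used in the create-model step.
config = {
    "EndpointConfigName": "my-neo-endpoint-config",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "my-neo-model",
            "InitialInstanceCount": 1,
            "InstanceType": "ml.c5.xlarge",
            "InitialVariantWeight": 1.0,
        }
    ],
}

# Write the CLI input file for create-endpoint-config.
with open("create_config.json", "w") as f:
    json.dump(config, f, indent=2)
```

Then run `aws sagemaker create-endpoint-config --cli-input-json file://create_config.json` as shown above.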

Create an Endpoint

After you have created your endpoint configuration, create an endpoint using the create-endpoint API:

aws sagemaker create-endpoint --endpoint-name '<provide your endpoint name>' --endpoint-config-name '<insert your endpoint config name>'

For the full syntax of the create-endpoint API, see create-endpoint.
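Note that create-endpoint returns before the endpoint is ready; provisioning can take several minutes. As a minimal sketch, the helper below composes the follow-up CLI commands for checking on an endpoint (the endpoint name is a placeholder, and the second command assumes the Amazon CLI's built-in endpoint-in-service waiter, which blocks until the endpoint reaches the InService status):

```python
def follow_up_commands(endpoint_name: str) -> list[str]:
    """Compose CLI commands to check on an endpoint after create-endpoint.

    The first command prints the current EndpointStatus; the second blocks
    until the endpoint is in service.
    """
    return [
        f"aws sagemaker describe-endpoint --endpoint-name {endpoint_name} "
        "--query EndpointStatus",
        f"aws sagemaker wait endpoint-in-service --endpoint-name {endpoint_name}",
    ]

# Placeholder endpoint name -- replace with the name you passed to create-endpoint.
for cmd in follow_up_commands("my-neo-endpoint"):
    print(cmd)
```

Once the status is InService, the endpoint is ready to serve inference requests.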