Invoke a multi-container endpoint with direct invocation - Amazon SageMaker AI

Invoke a multi-container endpoint with direct invocation

SageMaker AI multi-container endpoints let you deploy multiple containers, each serving a different model, on a single SageMaker AI endpoint. You can host up to 15 different inference containers on a single endpoint. With direct invocation, you can send a request to a specific inference container hosted on a multi-container endpoint.
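As a rough illustration of how such an endpoint's model might be defined, the sketch below assembles the arguments for the SageMaker `create_model` call with `InferenceExecutionConfig` set to `Direct` mode and one `ContainerHostname` per container. The helper name `build_direct_model_request`, the tuple layout of its input, and all model names, image URIs, and S3 paths are hypothetical placeholders, not values from this guide.

```python
MAX_CONTAINERS = 15  # per-endpoint container limit mentioned in this topic


def build_direct_model_request(model_name, role_arn, containers):
    """Build keyword arguments for create_model in direct-invocation mode.

    `containers` is a list of (hostname, image_uri, model_data_url) tuples.
    This helper is an illustrative assumption, not part of any AWS SDK.
    """
    if not 1 <= len(containers) <= MAX_CONTAINERS:
        raise ValueError(f"expected 1 to {MAX_CONTAINERS} containers")

    hostnames = [hostname for hostname, _, _ in containers]
    if len(set(hostnames)) != len(hostnames):
        # Each hostname is what TargetContainerHostname refers to later,
        # so duplicates would make the target ambiguous.
        raise ValueError("container hostnames must be unique")

    return {
        "ModelName": model_name,
        "ExecutionRoleArn": role_arn,
        "Containers": [
            {
                "ContainerHostname": hostname,
                "Image": image_uri,
                "ModelDataUrl": model_data_url,
            }
            for hostname, image_uri, model_data_url in containers
        ],
        # 'Direct' lets each request name the container that should serve it.
        "InferenceExecutionConfig": {"Mode": "Direct"},
    }
```

The resulting dictionary could then be passed as `boto3.client('sagemaker').create_model(**request)`, assuming valid credentials, an execution role, and real image and model locations.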

To invoke a multi-container endpoint with direct invocation, call invoke_endpoint as you would for any other endpoint, and use the TargetContainerHostname parameter to specify which container you want to invoke.

The following example directly invokes the secondContainer of a multi-container endpoint to get a prediction.

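    import boto3

    # Client for the SageMaker AI runtime (inference) API
    runtime_sm_client = boto3.Session().client('sagemaker-runtime')

    # Route the request to the container whose hostname is 'secondContainer'.
    # body is the CSV payload to score; it must be defined before this call.
    response = runtime_sm_client.invoke_endpoint(
        EndpointName='my-endpoint',
        ContentType='text/csv',
        TargetContainerHostname='secondContainer',
        Body=body)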
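The call above can be wrapped in a small helper that also reads the prediction out of the response. This is a minimal sketch: the function name `invoke_container` and its parameters are hypothetical, and it assumes the container returns UTF-8 text. The `Body` field of the `invoke_endpoint` response is a streaming object, so the sketch calls `read()` on it.

```python
def invoke_container(runtime_client, endpoint_name, container_hostname,
                     payload, content_type="text/csv"):
    """Send `payload` to one container of a multi-container endpoint.

    `runtime_client` is expected to be a 'sagemaker-runtime' client.
    Returns the response body decoded as text (an assumption about the
    container's output format).
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType=content_type,
        TargetContainerHostname=container_hostname,
        Body=payload,
    )
    # The response body is a stream; read() returns the raw bytes.
    return response["Body"].read().decode("utf-8")
```

With the client from the example above, a call might look like `invoke_container(runtime_sm_client, 'my-endpoint', 'secondContainer', body)`.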

For each direct invocation request to a multi-container endpoint, only the container whose hostname matches TargetContainerHostname processes the request. You get a validation error if you do any of the following:

  • Specify a TargetContainerHostname that does not exist in the endpoint

  • Do not specify a value for TargetContainerHostname in a request to an endpoint configured for direct invocation

  • Specify a value for TargetContainerHostname in a request to an endpoint that is not configured for direct invocation
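The three conditions above can be checked on the client side before a request is sent. The following is only a sketch of such a pre-check, not the validation the service itself performs: the function name `precheck_target` and its messages are hypothetical, and it assumes you already know the model's execution mode and container hostnames (for example, from the `InferenceExecutionConfig` and `Containers` fields of a `describe_model` response).

```python
def precheck_target(mode, container_hostnames, target_hostname):
    """Return a description of the problem, or None if the request looks valid.

    mode: the model's inference execution mode, e.g. 'Direct' or 'Serial'
          (None if the model has no InferenceExecutionConfig).
    container_hostnames: hostnames of the containers in the endpoint.
    target_hostname: intended TargetContainerHostname, or None if omitted.
    """
    direct = mode == "Direct"

    if direct and not target_hostname:
        return "TargetContainerHostname is required for direct invocation"
    if not direct and target_hostname:
        return "endpoint is not configured for direct invocation"
    if direct and target_hostname not in container_hostnames:
        return f"no container named {target_hostname!r} in the endpoint"
    return None
```

A caller could skip the `invoke_endpoint` call and report the returned message whenever the result is not `None`.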