
Inference recommendations

An inference recommendation job runs a set of load tests on recommended instance types or on a serverless endpoint. The job reports performance metrics from those load tests, which use the sample data you provided when you registered the model version.

Note

Before you create an Inference Recommender recommendation job, make sure you have satisfied the Prerequisites for using Amazon SageMaker Inference Recommender.

The following demonstrates how to use Amazon SageMaker Inference Recommender to create an inference recommendation based on your model type using the Amazon SDK for Python (Boto3), the Amazon CLI, Amazon SageMaker Studio Classic, and the SageMaker AI console.
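
As a rough orientation for the SDK for Python (Boto3) path, the following is a minimal sketch, not the definitive procedure. It assumes your model version is already registered as a model package and that the prerequisites are in place. The helper names `build_default_job_request` and `run_recommendation_job`, the polling interval, and every value passed in are hypothetical. The client is passed in as a parameter so that the helpers can be exercised without Amazon credentials; in practice it would typically be `boto3.client("sagemaker")`.

```python
import time


def build_default_job_request(job_name, role_arn, model_package_version_arn):
    """Assemble keyword arguments for create_inference_recommendations_job.

    All three arguments are placeholders that you supply; this helper is
    illustrative and not part of the SDK.
    """
    return {
        "JobName": job_name,
        # "Default" asks for instance recommendations.
        "JobType": "Default",
        "RoleArn": role_arn,
        "InputConfig": {"ModelPackageVersionArn": model_package_version_arn},
    }


def run_recommendation_job(client, request, poll_seconds=60):
    """Start the job, then poll until it reaches a terminal status.

    `client` is expected to behave like boto3.client("sagemaker").
    """
    client.create_inference_recommendations_job(**request)
    while True:
        response = client.describe_inference_recommendations_job(
            JobName=request["JobName"]
        )
        if response["Status"] in ("COMPLETED", "FAILED", "STOPPED"):
            return response
        time.sleep(poll_seconds)


# Example usage (requires boto3 and valid credentials; values are placeholders):
#   import boto3
#   request = build_default_job_request(
#       "my-recommendation-job", "<execution-role-arn>", "<model-package-version-arn>"
#   )
#   result = run_recommendation_job(boto3.client("sagemaker"), request)
#   print(result["Status"], result.get("InferenceRecommendations", []))
```

Separating request construction from the API call keeps the parameters easy to inspect before anything is submitted. When the job completes, the describe response is where the recommendations and their load-test metrics would be read from.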