
Get an inference recommendation

Inference recommendation jobs run a set of load tests on recommended instance types or on a serverless endpoint, and report performance metrics from those tests. The load tests use the sample data you provided during model version registration.

Note

Before you create an Inference Recommender recommendation job, make sure you have satisfied the Prerequisites.

The following demonstrates how to use Amazon SageMaker Inference Recommender to create an inference recommendation based on your model type, using the Amazon SDK for Python (Boto3), the Amazon CLI, Amazon SageMaker Studio Classic, or the SageMaker console.

Create an inference recommendation

Create an inference recommendation programmatically using the Amazon SDK for Python (Boto3) or the Amazon CLI, or interactively using Studio Classic or the SageMaker console. Specify a job name for your inference recommendation, an Amazon IAM role ARN, an input configuration, and either the model package ARN from when you registered your model with the model registry, or your model name and the ContainerConfig dictionary from when you created your model in the Prerequisites section.

Amazon SDK for Python (Boto3)

Use the CreateInferenceRecommendationsJob API to start an inference recommendation job. Set the JobType field to 'Default' for inference recommendation jobs. In addition, provide the following:

  • The Amazon Resource Name (ARN) of an IAM role that enables Inference Recommender to perform tasks on your behalf. Define this for the RoleArn field.

  • A model package ARN or model name. Inference Recommender accepts either a model package ARN or a model name as input, but not both. Specify one of the following:

    • The ARN of the versioned model package you created when you registered your model with SageMaker model registry. Define this for ModelPackageVersionArn in the InputConfig field.

    • The name of the model you created. Define this for ModelName in the InputConfig field. Also, provide the ContainerConfig dictionary, which includes the required fields that need to be provided with the model name. Define this for ContainerConfig in the InputConfig field. In the ContainerConfig, you can also optionally specify the SupportedEndpointType field as either RealTime or Serverless. If you specify this field, Inference Recommender returns recommendations for only that endpoint type. If you don't specify this field, Inference Recommender returns recommendations for both endpoint types.

  • A name for your Inference Recommender recommendation job for the JobName field. The Inference Recommender job name must be unique within the Amazon Region and within your Amazon account.

Import the Amazon SDK for Python (Boto3) package and create a SageMaker client object using the client class. If you followed the steps in the Prerequisites section, specify only one of the following:

  • Option 1: If you would like to create an inference recommendations job with a model package ARN, then store the model package ARN in a variable named model_package_arn.

  • Option 2: If you would like to create an inference recommendations job with a model name and ContainerConfig, store the model name in a variable named model_name and the ContainerConfig dictionary in a variable named container_config. An illustrative sketch of this dictionary follows.
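For reference, a ContainerConfig dictionary for a PyTorch image-classification model might look like the following sketch. The field values are illustrative only (they mirror the Amazon CLI example later on this page); replace them with values that describe your own model:

# An illustrative ContainerConfig dictionary. Every value below is an
# example; replace each one with the details of your own model and payload.
container_config = {
    'Domain': 'COMPUTER_VISION',
    'Framework': 'PYTORCH',
    'FrameworkVersion': '1.7.1',
    'NearestModelName': 'resnet18',
    'PayloadConfig': {
        'SamplePayloadUrl': 's3://<bucket>/<payload_s3_key>',
        'SupportedContentTypes': ['image/jpeg']
    },
    'DataInputConfig': '[[1,3,256,256]]',
    'Task': 'IMAGE_CLASSIFICATION'
}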

# Create a low-level SageMaker service client.
import boto3

aws_region = '<INSERT>'
sagemaker_client = boto3.client('sagemaker', region_name=aws_region)

# Provide only one of model package ARN or model name, not both.
# Provide your model package ARN that was created when you registered your
# model with Model Registry
model_package_arn = '<INSERT>'

## Uncomment if you would like to create an inference recommendations job with a
## model name instead of a model package ARN, and comment out model_package_arn above

## Provide your model name
# model_name = '<INSERT>'

## Provide your container config
# container_config = '<INSERT>'

# Provide a unique job name for SageMaker Inference Recommender job
job_name = '<INSERT>'

# Inference Recommender job type. Set to Default to get an initial recommendation
job_type = 'Default'

# Provide an IAM Role that gives SageMaker Inference Recommender permission to
# access AWS services
role_arn = 'arn:aws:iam::<account>:role/*'

sagemaker_client.create_inference_recommendations_job(
    JobName = job_name,
    JobType = job_type,
    RoleArn = role_arn,
    # Provide only one of model package ARN or model name, not both.
    # If you would like to create an inference recommendations job with a model name,
    # uncomment ModelName and ContainerConfig, and comment out ModelPackageVersionArn.
    InputConfig = {
        'ModelPackageVersionArn': model_package_arn
        # 'ModelName': model_name,
        # 'ContainerConfig': container_config
    }
)

See the Amazon SageMaker API Reference Guide for a full list of optional and required arguments you can pass to CreateInferenceRecommendationsJob.

Amazon CLI

Use the create-inference-recommendations-job API to start an inference recommendation job. Set the job-type field to 'Default' for inference recommendation jobs. In addition, provide the following:

  • The Amazon Resource Name (ARN) of an IAM role that enables Amazon SageMaker Inference Recommender to perform tasks on your behalf. Define this for the role-arn field.

  • A model package ARN or model name. Inference Recommender accepts either a model package ARN or a model name as input, but not both. Specify one of the following:

    • The ARN of the versioned model package you created when you registered your model with Model Registry. Define this for ModelPackageVersionArn in the input-config field.

    • The name of the model you created. Define this for ModelName in the input-config field. Also, provide the ContainerConfig dictionary, which includes the required fields that need to be provided with the model name. Define this for ContainerConfig in the input-config field. In the ContainerConfig, you can also optionally specify the SupportedEndpointType field as either RealTime or Serverless. If you specify this field, Inference Recommender returns recommendations for only that endpoint type. If you don't specify this field, Inference Recommender returns recommendations for both endpoint types.

  • A name for your Inference Recommender recommendation job for the job-name field. The Inference Recommender job name must be unique within the Amazon Region and within your Amazon account.

To create an inference recommendation job with a model package ARN, use the following example:

aws sagemaker create-inference-recommendations-job \
    --region <region> \
    --job-name <job_name> \
    --job-type Default \
    --role-arn arn:aws:iam::<account>:role/* \
    --input-config "{
        \"ModelPackageVersionArn\": \"arn:aws:sagemaker:<region>:<account>:model-package/<resource-id>\"
    }"

To create an inference recommendation job with a model name and ContainerConfig, use the following example. The example uses the SupportedEndpointType field to specify that only real-time inference recommendations are returned:

aws sagemaker create-inference-recommendations-job \
    --region <region> \
    --job-name <job_name> \
    --job-type Default \
    --role-arn arn:aws:iam::<account>:role/* \
    --input-config "{
        \"ModelName\": \"model-name\",
        \"ContainerConfig\": {
            \"Domain\": \"COMPUTER_VISION\",
            \"Framework\": \"PYTORCH\",
            \"FrameworkVersion\": \"1.7.1\",
            \"NearestModelName\": \"resnet18\",
            \"PayloadConfig\": {
                \"SamplePayloadUrl\": \"s3://{bucket}/{payload_s3_key}\",
                \"SupportedContentTypes\": [\"image/jpeg\"]
            },
            \"SupportedEndpointType\": \"RealTime\",
            \"DataInputConfig\": \"[[1,3,256,256]]\",
            \"Task\": \"IMAGE_CLASSIFICATION\"
        }
    }"
Amazon SageMaker Studio Classic

Create an inference recommendation job in Studio Classic.

  1. In your Studio Classic application, choose the Home icon.

  2. In the left sidebar of Studio Classic, choose Models.

  3. Choose Model Registry from the dropdown list to display models you have registered with the model registry.

    The left panel displays a list of model groups. The list includes all the model groups registered with the model registry in your account, including models registered outside of Studio Classic.

  4. Select the name of your model group. When you select your model group, the right pane of Studio Classic displays column headings such as Versions and Settings.

    If you have one or more model packages within your model group, you see a list of those model packages within the Versions column.

  5. Choose the Inference recommender column.

  6. Choose an IAM role that grants Inference Recommender permission to access Amazon services. You can create a role and attach the AmazonSageMakerFullAccess IAM managed policy to accomplish this. Or you can let Studio Classic create a role for you.

  7. Choose Get recommendations.

    The inference recommendation job can take up to 45 minutes.

    Warning

    Do not close this tab. If you close this tab, you cancel the instance recommendation job.

SageMaker console

Create an instance recommendation job through the SageMaker console by doing the following:

  1. Go to the SageMaker console at https://console.amazonaws.cn/sagemaker/.

  2. In the left navigation pane, choose Inference, and then choose Inference recommender.

  3. On the Inference recommender jobs page, choose Create job.

  4. For Step 1: Model configuration, do the following:

    1. For Job type, choose Default recommender job.

    2. If you’re using a model registered in the SageMaker model registry, then turn on the Choose a model from the model registry toggle and do the following:

      1. From the Model group dropdown list, choose the model group in SageMaker model registry where your model is located.

      2. From the Model version dropdown list, choose the desired version of your model.

    3. If you’re using a model that you’ve created in SageMaker, then turn off the Choose a model from the model registry toggle and do the following:

      1. For the Model name field, enter the name of your SageMaker model.

    4. From the IAM role dropdown list, you can select an existing Amazon IAM role that has the necessary permissions to create an instance recommendation job. Alternatively, if you don’t have an existing role, you can choose Create a new role to open the role creation pop-up, and SageMaker adds the necessary permissions to the new role that you create.

    5. For S3 bucket for benchmarking payload, enter the Amazon S3 path to your sample payload archive, which should contain sample payload files that Inference Recommender uses to benchmark your model on different instance types.

    6. For Payload content type, enter the MIME types of your sample payload data.

    7. (Optional) If you turned off the Choose a model from the model registry toggle and specified a SageMaker model, then for Container configuration, do the following:

      1. For the Domain dropdown list, select the machine learning domain of the model, such as computer vision, natural language processing, or machine learning.

      2. For the Framework dropdown list, select the framework of your container, such as TensorFlow or XGBoost.

      3. For Framework version, enter the framework version of your container image.

      4. For the Nearest model name dropdown list, select the pre-trained model that most closely matches your own.

      5. For the Task dropdown list, select the machine learning task that the model accomplishes, such as image classification or regression.

    8. (Optional) For Model compilation using SageMaker Neo, you can configure the recommendation job for a model that you’ve compiled using SageMaker Neo. For Data input configuration, enter the correct input data shape for your model in a format similar to {'input':[1,1024,1024,3]}.

    9. Choose Next.

  5. For Step 2: Instances and environment parameters, do the following:

    1. (Optional) For Select instances for benchmarking, you can select up to 8 instance types that you want to benchmark. If you don’t select any instances, Inference Recommender considers all instance types.

    2. Choose Next.

  6. For Step 3: Job parameters, do the following:

    1. (Optional) For the Job name field, enter a name for your instance recommendation job. When you create the job, SageMaker appends a timestamp to the end of this name.

    2. (Optional) For the Job description field, enter a description for the job.

    3. (Optional) For the Encryption key dropdown list, choose an Amazon KMS key by name or enter its ARN to encrypt your data.

    4. (Optional) For Max test duration (s), enter the maximum number of seconds you want each test to run for.

    5. (Optional) For Max invocations per minute, enter the maximum number of requests per minute the endpoint can reach before stopping the recommendation job. After reaching this limit, SageMaker ends the job.

      6. (Optional) For P99 Model latency threshold (ms), enter the threshold for the 99th percentile of model latency, in milliseconds.

    7. Choose Next.

  7. For Step 4: Review job, review your configurations and then choose Submit.

Get your inference recommendation job results

Collect the results of your inference recommendation job programmatically with Amazon SDK for Python (Boto3), the Amazon CLI, Studio Classic, or the SageMaker console.

Amazon SDK for Python (Boto3)

Once an inference recommendation job is complete, you can use DescribeInferenceRecommendationsJob to get the job details and recommendations. Provide the job name that you used when you created the inference recommendation job.

job_name = '<INSERT>'

response = sagemaker_client.describe_inference_recommendations_job(
    JobName=job_name
)

Print the job status from the response object. The previous code sample stored the response in a variable named response.

print(response['Status'])
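A recommendation job can take up to 45 minutes to finish. If you would rather block until the job reaches a terminal state before reading the results, a minimal polling sketch (reusing the sagemaker_client and job_name variables from above) might look like the following:

import time

# Poll the job until it reaches a terminal state.
while True:
    response = sagemaker_client.describe_inference_recommendations_job(
        JobName=job_name
    )
    status = response['Status']
    if status in ('COMPLETED', 'FAILED', 'STOPPED'):
        break
    time.sleep(60)  # wait a minute between checks

print(status)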

The describe_inference_recommendations_job call returns a JSON response similar to the following example. Note that this example shows the recommended instance types for real-time inference (for an example showing serverless inference recommendations, see the example after this one).

{
    'JobName': 'job-name',
    'JobDescription': 'job-description',
    'JobType': 'Default',
    'JobArn': 'arn:aws:sagemaker:region:account-id:inference-recommendations-job/resource-id',
    'Status': 'COMPLETED',
    'CreationTime': datetime.datetime(2021, 10, 26, 20, 4, 57, 627000, tzinfo=tzlocal()),
    'LastModifiedTime': datetime.datetime(2021, 10, 26, 20, 25, 1, 997000, tzinfo=tzlocal()),
    'InputConfig': {
        'ModelPackageVersionArn': 'arn:aws:sagemaker:region:account-id:model-package/resource-id',
        'JobDurationInSeconds': 0
    },
    'InferenceRecommendations': [{
        'Metrics': {
            'CostPerHour': 0.20399999618530273,
            'CostPerInference': 5.246913588052848e-06,
            'MaximumInvocations': 648,
            'ModelLatency': 263596
        },
        'EndpointConfiguration': {
            'EndpointName': 'endpoint-name',
            'VariantName': 'variant-name',
            'InstanceType': 'ml.c5.xlarge',
            'InitialInstanceCount': 1
        },
        'ModelConfiguration': {
            'Compiled': False,
            'EnvironmentParameters': []
        }
    }, {
        'Metrics': {
            'CostPerHour': 0.11500000208616257,
            'CostPerInference': 2.92620870823157e-06,
            'MaximumInvocations': 655,
            'ModelLatency': 826019
        },
        'EndpointConfiguration': {
            'EndpointName': 'endpoint-name',
            'VariantName': 'variant-name',
            'InstanceType': 'ml.c5d.large',
            'InitialInstanceCount': 1
        },
        'ModelConfiguration': {
            'Compiled': False,
            'EnvironmentParameters': []
        }
    }, {
        'Metrics': {
            'CostPerHour': 0.11500000208616257,
            'CostPerInference': 3.3625731248321244e-06,
            'MaximumInvocations': 570,
            'ModelLatency': 1085446
        },
        'EndpointConfiguration': {
            'EndpointName': 'endpoint-name',
            'VariantName': 'variant-name',
            'InstanceType': 'ml.m5.large',
            'InitialInstanceCount': 1
        },
        'ModelConfiguration': {
            'Compiled': False,
            'EnvironmentParameters': []
        }
    }],
    'ResponseMetadata': {
        'RequestId': 'request-id',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': 'x-amzn-requestid',
            'content-type': 'content-type',
            'content-length': '1685',
            'date': 'Tue, 26 Oct 2021 20:31:10 GMT'
        },
        'RetryAttempts': 0
    }
}

The first few lines provide information about the inference recommendation job itself. This includes the job name, role ARN, and creation and last modified times.

The InferenceRecommendations dictionary contains a list of Inference Recommender inference recommendations.

The EndpointConfiguration nested dictionary contains the instance type (InstanceType) recommendation along with the endpoint and variant name (a deployed Amazon machine learning model) that was used during the recommendation job. You can use the endpoint and variant name for monitoring in Amazon CloudWatch. See Monitor Amazon SageMaker with Amazon CloudWatch for more information.

The Metrics nested dictionary contains information about the estimated cost per hour (CostPerHour) for your real-time endpoint in US dollars, the estimated cost per inference (CostPerInference) in US dollars for your real-time endpoint, the expected maximum number of InvokeEndpoint requests per minute sent to the endpoint (MaxInvocations), and the model latency (ModelLatency), which is the interval of time (in microseconds) that your model took to respond to SageMaker. Model latency includes the local communication time taken to send the request and to fetch the response from the model's container, plus the time taken to complete the inference in the container.
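Because InferenceRecommendations is an ordinary list, you can also select a recommendation programmatically rather than by eye. For example, a minimal sketch that picks the entry with the lowest estimated cost per inference from the response variable used above (sorting by ModelLatency or MaximumInvocations works the same way):

# Pick the recommendation with the lowest estimated cost per inference.
recommendations = response['InferenceRecommendations']
cheapest = min(recommendations, key=lambda r: r['Metrics']['CostPerInference'])

endpoint_config = cheapest['EndpointConfiguration']
print(endpoint_config['InstanceType'], endpoint_config['InitialInstanceCount'])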

The following example shows the InferenceRecommendations part of the response for an inference recommendations job configured to return serverless inference recommendations:

"InferenceRecommendations": [ { "EndpointConfiguration": { "EndpointName": "value", "InitialInstanceCount": value, "InstanceType": "value", "VariantName": "value", "ServerlessConfig": { "MaxConcurrency": value, "MemorySizeInMb": value } }, "InvocationEndTime": value, "InvocationStartTime": value, "Metrics": { "CostPerHour": value, "CostPerInference": value, "CpuUtilization": value, "MaxInvocations": value, "MemoryUtilization": value, "ModelLatency": value, "ModelSetupTime": value }, "ModelConfiguration": { "Compiled": "False", "EnvironmentParameters": [], "InferenceSpecificationName": "value" }, "RecommendationId": "value" } ]

You can interpret the recommendations for serverless inference similarly to the results for real-time inference, with the exception of the ServerlessConfig, which tells you the metrics returned for a serverless endpoint with the given MemorySizeInMB and when MaxConcurrency = 1. To increase the throughput possible on the endpoint, increase the value of MaxConcurrency linearly. For example, if the inference recommendation shows MaxInvocations as 1000, then increasing MaxConcurrency to 2 would support 2000 MaxInvocations. Note that this is true only up to a certain point, which can vary based on your model and code. Serverless recommendations also measure the ModelSetupTime metric, which measures (in microseconds) the time it takes to launch compute resources on a serverless endpoint. For more information about setting up serverless endpoints, see the Serverless Inference documentation.
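If your job returned serverless recommendations, you can read the suggested endpoint settings from the same structure. A minimal sketch, again assuming the response variable from above holds a completed job with serverless results:

# Read the recommended serverless settings from the first recommendation.
recommendation = response['InferenceRecommendations'][0]
serverless_config = recommendation['EndpointConfiguration']['ServerlessConfig']
print(serverless_config['MemorySizeInMb'], serverless_config['MaxConcurrency'])
print(recommendation['Metrics']['ModelSetupTime'])  # endpoint setup time, in microseconds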

Amazon CLI

Once an inference recommendation job is complete, you can use describe-inference-recommendations-job to get the job details and recommended instance types. Provide the job name that you used when you created the inference recommendation job.

aws sagemaker describe-inference-recommendations-job \
    --job-name <job-name> \
    --region <aws-region>

The JSON response should resemble the following example. Note that this example shows the recommended instance types for real-time inference (for an example showing serverless inference recommendations, see the example after this one).

{
    'JobName': 'job-name',
    'JobDescription': 'job-description',
    'JobType': 'Default',
    'JobArn': 'arn:aws:sagemaker:region:account-id:inference-recommendations-job/resource-id',
    'Status': 'COMPLETED',
    'CreationTime': datetime.datetime(2021, 10, 26, 20, 4, 57, 627000, tzinfo=tzlocal()),
    'LastModifiedTime': datetime.datetime(2021, 10, 26, 20, 25, 1, 997000, tzinfo=tzlocal()),
    'InputConfig': {
        'ModelPackageVersionArn': 'arn:aws:sagemaker:region:account-id:model-package/resource-id',
        'JobDurationInSeconds': 0
    },
    'InferenceRecommendations': [{
        'Metrics': {
            'CostPerHour': 0.20399999618530273,
            'CostPerInference': 5.246913588052848e-06,
            'MaximumInvocations': 648,
            'ModelLatency': 263596
        },
        'EndpointConfiguration': {
            'EndpointName': 'endpoint-name',
            'VariantName': 'variant-name',
            'InstanceType': 'ml.c5.xlarge',
            'InitialInstanceCount': 1
        },
        'ModelConfiguration': {
            'Compiled': False,
            'EnvironmentParameters': []
        }
    }, {
        'Metrics': {
            'CostPerHour': 0.11500000208616257,
            'CostPerInference': 2.92620870823157e-06,
            'MaximumInvocations': 655,
            'ModelLatency': 826019
        },
        'EndpointConfiguration': {
            'EndpointName': 'endpoint-name',
            'VariantName': 'variant-name',
            'InstanceType': 'ml.c5d.large',
            'InitialInstanceCount': 1
        },
        'ModelConfiguration': {
            'Compiled': False,
            'EnvironmentParameters': []
        }
    }, {
        'Metrics': {
            'CostPerHour': 0.11500000208616257,
            'CostPerInference': 3.3625731248321244e-06,
            'MaximumInvocations': 570,
            'ModelLatency': 1085446
        },
        'EndpointConfiguration': {
            'EndpointName': 'endpoint-name',
            'VariantName': 'variant-name',
            'InstanceType': 'ml.m5.large',
            'InitialInstanceCount': 1
        },
        'ModelConfiguration': {
            'Compiled': False,
            'EnvironmentParameters': []
        }
    }],
    'ResponseMetadata': {
        'RequestId': 'request-id',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {
            'x-amzn-requestid': 'x-amzn-requestid',
            'content-type': 'content-type',
            'content-length': '1685',
            'date': 'Tue, 26 Oct 2021 20:31:10 GMT'
        },
        'RetryAttempts': 0
    }
}

The first few lines provide information about the inference recommendation job itself. This includes the job name, role ARN, and creation and last modified times.

The InferenceRecommendations dictionary contains a list of Inference Recommender inference recommendations.

The EndpointConfiguration nested dictionary contains the instance type (InstanceType) recommendation along with the endpoint and variant name (a deployed Amazon machine learning model) used during the recommendation job. You can use the endpoint and variant name for monitoring in Amazon CloudWatch. See Monitor Amazon SageMaker with Amazon CloudWatch for more information.

The Metrics nested dictionary contains information about the estimated cost per hour (CostPerHour) for your real-time endpoint in US dollars, the estimated cost per inference (CostPerInference) in US dollars for your real-time endpoint, the expected maximum number of InvokeEndpoint requests per minute sent to the endpoint (MaxInvocations), and the model latency (ModelLatency), which is the interval of time (in microseconds) that your model took to respond to SageMaker. Model latency includes the local communication time taken to send the request and to fetch the response from the model's container, plus the time taken to complete the inference in the container.

The following example shows the InferenceRecommendations part of the response for an inference recommendations job configured to return serverless inference recommendations:

"InferenceRecommendations": [ { "EndpointConfiguration": { "EndpointName": "value", "InitialInstanceCount": value, "InstanceType": "value", "VariantName": "value", "ServerlessConfig": { "MaxConcurrency": value, "MemorySizeInMb": value } }, "InvocationEndTime": value, "InvocationStartTime": value, "Metrics": { "CostPerHour": value, "CostPerInference": value, "CpuUtilization": value, "MaxInvocations": value, "MemoryUtilization": value, "ModelLatency": value, "ModelSetupTime": value }, "ModelConfiguration": { "Compiled": "False", "EnvironmentParameters": [], "InferenceSpecificationName": "value" }, "RecommendationId": "value" } ]

You can interpret the recommendations for serverless inference similarly to the results for real-time inference, with the exception of the ServerlessConfig, which tells you the metrics returned for a serverless endpoint with the given MemorySizeInMB and when MaxConcurrency = 1. To increase the throughput possible on the endpoint, increase the value of MaxConcurrency linearly. For example, if the inference recommendation shows MaxInvocations as 1000, then increasing MaxConcurrency to 2 would support 2000 MaxInvocations. Note that this is true only up to a certain point, which can vary based on your model and code. Serverless recommendations also measure the ModelSetupTime metric, which measures (in microseconds) the time it takes to launch compute resources on a serverless endpoint. For more information about setting up serverless endpoints, see the Serverless Inference documentation.

Amazon SageMaker Studio Classic

The inference recommendations appear in a new Inference recommendations tab within Studio Classic. It can take up to 45 minutes for the results to show up. This tab contains Results and Details column headings.

The Details column provides information about the inference recommendation job, such as the name of the inference recommendation, when the job was created (Creation time), and more. It also provides Settings information, such as the maximum number of invocations that occurred per minute and information about the Amazon Resource Names used.

The Results column provides a Deployment goals and SageMaker recommendations window in which you can adjust the order in which the results are displayed, based on what matters most for your deployment. There are three dropdown menus that you can use to set the level of importance of Cost, Latency, and Throughput for your use case. For each goal (cost, latency, and throughput), you can set the level of importance: Lowest Importance, Low Importance, Moderate importance, High importance, or Highest importance.

Based on your selections of importance for each goal, Inference Recommender displays its top recommendation in the SageMaker recommendation field on the right of the panel, along with the estimated cost per hour and per inference request. It also provides information about the expected model latency, maximum number of invocations, and the number of instances. For serverless recommendations, you can see the ideal values for the maximum concurrency and endpoint memory size.

In addition to the top recommendation displayed, you can also see the same information displayed for all instances that Inference Recommender tested in the All runs section.

SageMaker console

You can view your instance recommendation jobs in the SageMaker console by doing the following:

  1. Go to the SageMaker console at https://console.amazonaws.cn/sagemaker/.

  2. In the left navigation pane, choose Inference, and then choose Inference recommender.

  3. On the Inference recommender jobs page, choose the name of your inference recommendation job.

On the details page for your job, you can view the Inference recommendations, which are the instance types SageMaker recommends for your model, as shown in the following screenshot.

Screenshot of the inference recommendations list on the job details page in the SageMaker console.

In this section, you can compare the instance types by various factors such as Model latency, Cost per hour, Cost per inference, and Invocations per minute.

On this page, you can also view the configurations you specified for your job. In the Monitor section, you can view the Amazon CloudWatch metrics that were logged for each instance type. To learn more about interpreting these metrics, see Interpret results.

For more information about interpreting the results of your recommendation job, see Interpret recommendation results.

Stop your inference recommendation

You might want to stop a running job if you started it by mistake or no longer need it. Stop your Inference Recommender inference recommendation jobs programmatically with the StopInferenceRecommendationsJob API or the Amazon CLI, or interactively with Studio Classic or the SageMaker console.

Amazon SDK for Python (Boto3)

Specify the name of the inference recommendation job for the JobName field:

sagemaker_client.stop_inference_recommendations_job(
    JobName='<INSERT>'
)
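Stopping is asynchronous: the job passes through a Stopping status before it reaches a terminal state. If you want to wait for the transition to finish, a minimal sketch (reusing the sagemaker_client from earlier and your job name):

import time

# Wait until the job leaves the STOPPING state.
job_name = '<INSERT>'
while True:
    status = sagemaker_client.describe_inference_recommendations_job(
        JobName=job_name
    )['Status']
    if status != 'STOPPING':
        break
    time.sleep(10)

print(status)  # STOPPED if the job was stopped successfully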
Amazon CLI

Specify the job name of the inference recommendation job for the job-name flag:

aws sagemaker stop-inference-recommendations-job --job-name <job-name>
Amazon SageMaker Studio Classic

To stop your Inference Recommender inference recommendation job in Studio Classic, close the tab in which you initiated the recommendation.

SageMaker console

To stop your instance recommendation job through the SageMaker console, do the following:

  1. Go to the SageMaker console at https://console.amazonaws.cn/sagemaker/.

  2. In the left navigation pane, choose Inference, and then choose Inference recommender.

  3. On the Inference recommender jobs page, select your instance recommendation job.

  4. Choose Stop job.

  5. In the dialog box that pops up, choose Confirm.

After stopping your job, the job’s Status should change to Stopping.