InferenceMetrics

The metrics measured for an existing endpoint that are compared in an Inference Recommender job.

Contents

MaxInvocations

The expected maximum number of requests per minute for the instance.

Type: Integer

Required: Yes

ModelLatency

The expected model latency at maximum invocations per minute for the instance.

Type: Integer

Required: Yes
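
The following is a minimal sketch, using the AWS SDK for Python (Boto3), of how these fields might be read from a DescribeInferenceRecommendationsJob response. The job name is a placeholder, and this assumes the job benchmarked an existing endpoint so that EndpointPerformances is populated.

import boto3

sagemaker = boto3.client("sagemaker")

# Describe a completed Inference Recommender job; "my-recommender-job" is a
# hypothetical job name used here for illustration only.
response = sagemaker.describe_inference_recommendations_job(
    JobName="my-recommender-job"
)

# Each EndpointPerformances entry carries an InferenceMetrics object under
# the "Metrics" key.
for performance in response.get("EndpointPerformances", []):
    metrics = performance["Metrics"]
    print("MaxInvocations:", metrics["MaxInvocations"])
    print("ModelLatency:", metrics["ModelLatency"])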

See Also

For more information about using this API in one of the language-specific Amazon SDKs, see the following: