RecommendationMetrics

The metrics of recommendations.

CostPerHour

Defines the cost per hour for the instance.

Type: Float

Required: No

CostPerInference

Defines the cost per inference for the instance.

Type: Float

Required: No

CpuUtilization

The expected CPU utilization at maximum invocations per minute for the instance.

NaN indicates that the value is not available.

Type: Float

Valid Range: Minimum value of 0.0.

Required: No

MaxInvocations

The expected maximum number of requests per minute for the instance.

Type: Integer

Required: No

MemoryUtilization

The expected memory utilization at maximum invocations per minute for the instance.

NaN indicates that the value is not available.

Type: Float

Valid Range: Minimum value of 0.0.

Required: No

ModelLatency

The expected model latency at maximum invocations per minute for the instance.

Type: Integer

Required: No

ModelSetupTime

The time it takes to launch new compute resources for a serverless endpoint. The time can vary depending on the model size, how long it takes to download the model, and the start-up time of the container.

NaN indicates that the value is not available.

Type: Integer

Valid Range: Minimum value of 0.

Required: No
