ProductionVariantManagedInstanceScaling - Amazon SageMaker

ProductionVariantManagedInstanceScaling

Settings that bound the number of instances that the endpoint provisions as it scales up or down to accommodate traffic.

Contents

MaxInstanceCount

The maximum number of instances that the endpoint can provision when it scales up to accommodate an increase in traffic.

Type: Integer

Valid Range: Minimum value of 1.

Required: No

MinInstanceCount

The minimum number of instances that the endpoint must retain when it scales down to accommodate a decrease in traffic.

Type: Integer

Valid Range: Minimum value of 1.

Required: No

Status

Indicates whether managed instance scaling is enabled.

Type: String

Valid Values: ENABLED | DISABLED

Required: No
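
Example

The following is a minimal sketch, not an official sample, of how the three fields above might be assembled and checked on the client side before being sent in a request. The helper name build_managed_instance_scaling is hypothetical. The check that MinInstanceCount does not exceed MaxInstanceCount is an assumption added for illustration; this reference states only that each value has a minimum of 1.

```python
from typing import Optional

# Valid values for Status, as listed in this reference.
VALID_STATUSES = ("ENABLED", "DISABLED")


def build_managed_instance_scaling(
    status: Optional[str] = None,
    min_instance_count: Optional[int] = None,
    max_instance_count: Optional[int] = None,
) -> dict:
    """Build a ProductionVariantManagedInstanceScaling-shaped dict.

    Hypothetical helper: all three fields are optional (Required: No),
    so any field left as None is omitted from the result.
    """
    config = {}

    if status is not None:
        if status not in VALID_STATUSES:
            raise ValueError(f"Status must be one of {VALID_STATUSES}, got {status!r}")
        config["Status"] = status

    for name, value in (
        ("MinInstanceCount", min_instance_count),
        ("MaxInstanceCount", max_instance_count),
    ):
        if value is None:
            continue
        # Type: Integer, Valid Range: Minimum value of 1.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be an integer >= 1, got {value!r}")
        config[name] = value

    # Assumption (not stated in this reference): a minimum above the
    # maximum is treated as a configuration mistake.
    if (
        min_instance_count is not None
        and max_instance_count is not None
        and min_instance_count > max_instance_count
    ):
        raise ValueError("MinInstanceCount must not exceed MaxInstanceCount")

    return config


scaling = build_managed_instance_scaling("ENABLED", 1, 4)
print(scaling)
```

The resulting dictionary has the same shape as this data type and would typically be supplied wherever a request accepts a ProductionVariantManagedInstanceScaling value, for example as part of a production variant definition in an endpoint configuration.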

See Also

For more information about using this API in one of the language-specific Amazon SDKs, see the following: