InferenceComponentCapacitySize - Amazon SageMaker

InferenceComponentCapacitySize

Specifies the type and size of the endpoint capacity to activate for a rolling deployment or a rollback strategy. You can specify your batches as either of the following:

  • A count of inference component copies

  • The overall percentage of your fleet

For a rollback strategy, if you don't specify the fields in this object, or if you set the Value parameter to 100%, then SageMaker AI uses a blue/green rollback strategy and rolls all traffic back to the blue fleet.
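
The rollback rule above can be expressed as a small check. This is only a sketch under stated assumptions: the function name uses_blue_green_rollback is hypothetical, the object is modeled as a plain dict carrying the Type and Value fields, an omitted object is represented as None or an empty dict, and "100%" is read as a Type of CAPACITY_PERCENT with a Value of 100.

```python
from typing import Optional


def uses_blue_green_rollback(capacity_size: Optional[dict]) -> bool:
    """Return True when the documented rule implies a blue/green rollback.

    Hypothetical helper, not part of any SDK.
    Assumption: an unspecified object is passed as None or an empty dict.
    """
    # Case 1: the fields of the object were not specified at all.
    if not capacity_size:
        return True
    # Case 2: the rollback batch covers 100% of capacity.
    return (
        capacity_size.get("Type") == "CAPACITY_PERCENT"
        and capacity_size.get("Value") == 100
    )


if __name__ == "__main__":
    print(uses_blue_green_rollback(None))
    print(uses_blue_green_rollback({"Type": "CAPACITY_PERCENT", "Value": 100}))
    print(uses_blue_green_rollback({"Type": "COPY_COUNT", "Value": 2}))
```

Under this reading, any smaller rollback batch, whether a copy count or a percentage below 100, does not trigger the blue/green behavior.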

Contents

Type

Specifies the endpoint capacity type.

  • COPY_COUNT: The endpoint activates based on the number of inference component copies.

  • CAPACITY_PERCENT: The endpoint activates based on the specified percentage of capacity.

Type: String

Valid Values: COPY_COUNT | CAPACITY_PERCENT

Required: Yes

Value

Defines the capacity size, either as a number of inference component copies or a capacity percentage.

Type: Integer

Valid Range: Minimum value of 1.

Required: Yes
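
As an illustration of the two fields together, the following sketch builds the object as a plain dict and checks it against the constraints listed above before it is placed in a request. The helper name make_capacity_size is hypothetical and not part of any SDK. The upper bound of 100 for CAPACITY_PERCENT is an assumption; this page documents only the minimum value of 1.

```python
VALID_TYPES = ("COPY_COUNT", "CAPACITY_PERCENT")


def make_capacity_size(capacity_type: str, value: int) -> dict:
    """Build an InferenceComponentCapacitySize-shaped dict (hypothetical helper)."""
    # Type must be one of the documented valid values.
    if capacity_type not in VALID_TYPES:
        raise ValueError(f"Type must be one of {VALID_TYPES}, got {capacity_type!r}")
    # Value is documented as an Integer; reject bool, which is an int subclass.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("Value must be an integer")
    # Documented valid range: minimum value of 1.
    if value < 1:
        raise ValueError("Value must be at least 1")
    # Assumption (not stated on this page): a percentage cannot exceed 100.
    if capacity_type == "CAPACITY_PERCENT" and value > 100:
        raise ValueError("CAPACITY_PERCENT is assumed to be at most 100")
    return {"Type": capacity_type, "Value": value}


if __name__ == "__main__":
    print(make_capacity_size("COPY_COUNT", 2))
    print(make_capacity_size("CAPACITY_PERCENT", 25))
```

Validating locally is a design choice for faster feedback; the service remains the authority on what it accepts.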

See Also

For more information about using this API in one of the language-specific Amazon SDKs, see the following: