You are viewing documentation for version 2 of the AWS SDK for Ruby. Version 3 documentation can be found here.

Class: Aws::SageMaker::Types::ProductionVariant

Inherits: Struct

Object → Struct → Aws::SageMaker::Types::ProductionVariant

Overview

Note:

When passing ProductionVariant as input to an Aws::Client method, you can use a vanilla Hash:

{
  variant_name: "VariantName", # required
  model_name: "ModelName", # required
  initial_instance_count: 1, # required
  instance_type: "ml.t2.medium", # required, accepts ml.t2.medium, ml.t2.large, ml.t2.xlarge, ml.t2.2xlarge, ml.m4.xlarge, ml.m4.2xlarge, ml.m4.4xlarge, ml.m4.10xlarge, ml.m4.16xlarge, ml.m5.large, ml.m5.xlarge, ml.m5.2xlarge, ml.m5.4xlarge, ml.m5.12xlarge, ml.m5.24xlarge, ml.m5d.large, ml.m5d.xlarge, ml.m5d.2xlarge, ml.m5d.4xlarge, ml.m5d.12xlarge, ml.m5d.24xlarge, ml.c4.large, ml.c4.xlarge, ml.c4.2xlarge, ml.c4.4xlarge, ml.c4.8xlarge, ml.p2.xlarge, ml.p2.8xlarge, ml.p2.16xlarge, ml.p3.2xlarge, ml.p3.8xlarge, ml.p3.16xlarge, ml.c5.large, ml.c5.xlarge, ml.c5.2xlarge, ml.c5.4xlarge, ml.c5.9xlarge, ml.c5.18xlarge, ml.c5d.large, ml.c5d.xlarge, ml.c5d.2xlarge, ml.c5d.4xlarge, ml.c5d.9xlarge, ml.c5d.18xlarge, ml.g4dn.xlarge, ml.g4dn.2xlarge, ml.g4dn.4xlarge, ml.g4dn.8xlarge, ml.g4dn.12xlarge, ml.g4dn.16xlarge, ml.r5.large, ml.r5.xlarge, ml.r5.2xlarge, ml.r5.4xlarge, ml.r5.12xlarge, ml.r5.24xlarge, ml.r5d.large, ml.r5d.xlarge, ml.r5d.2xlarge, ml.r5d.4xlarge, ml.r5d.12xlarge, ml.r5d.24xlarge, ml.inf1.xlarge, ml.inf1.2xlarge, ml.inf1.6xlarge, ml.inf1.24xlarge
  initial_variant_weight: 1.0,
  accelerator_type: "ml.eia1.medium", # accepts ml.eia1.medium, ml.eia1.large, ml.eia1.xlarge, ml.eia2.medium, ml.eia2.large, ml.eia2.xlarge
}

Identifies a model that you want to host and the resources to deploy for hosting it. If you are deploying multiple models, tell Amazon SageMaker how to distribute traffic among the models by specifying variant weights.
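As an illustration of the hash form described above, the following sketch builds two variants and checks that each carries the keys marked `# required`. The helper `production_variant`, the constant `REQUIRED_KEYS`, and all names, weights, and the region are hypothetical placeholders, not part of the SDK. The commented-out call assumes the array is passed as the `production_variants` parameter of `Aws::SageMaker::Client#create_endpoint_config`.

```ruby
# Keys that the hash format marks as "# required".
REQUIRED_KEYS = %i[variant_name model_name initial_instance_count instance_type].freeze

# Hypothetical helper (not part of the SDK): returns the hash unchanged,
# or raises early if a required key is missing.
def production_variant(attrs)
  missing = REQUIRED_KEYS.reject { |key| attrs.key?(key) }
  raise ArgumentError, "missing required keys: #{missing.join(', ')}" unless missing.empty?

  attrs
end

variants = [
  production_variant(
    variant_name: "variant-a",        # placeholder
    model_name: "model-a",            # placeholder; must name an existing model
    initial_instance_count: 1,
    instance_type: "ml.m5.large",
    initial_variant_weight: 3.0
  ),
  production_variant(
    variant_name: "variant-b",        # placeholder
    model_name: "model-b",            # placeholder
    initial_instance_count: 1,
    instance_type: "ml.m5.large",
    initial_variant_weight: 1.0
  )
]

# With the SDK loaded and credentials configured, the array would typically
# be passed along these lines (placeholders throughout):
#
#   client = Aws::SageMaker::Client.new(region: "us-east-1")
#   client.create_endpoint_config(
#     endpoint_config_name: "example-config",
#     production_variants: variants
#   )
```

Checking required keys locally is only a convenience; the service performs its own validation of the request.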

Instance Attribute Summary

#accelerator_type ⇒ String
The size of the Elastic Inference (EI) instance to use for the production variant.
#initial_instance_count ⇒ Integer
Number of instances to launch initially.
#initial_variant_weight ⇒ Float
Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.
#instance_type ⇒ String
The ML compute instance type.
#model_name ⇒ String
The name of the model that you want to host.
#variant_name ⇒ String
The name of the production variant.

Instance Attribute Details

#accelerator_type ⇒ `String`

The size of the Elastic Inference (EI) instance to use for the production variant. EI instances provide on-demand GPU computing for inference. For more information, see Using Elastic Inference in Amazon SageMaker.

Returns:

(String): The size of the Elastic Inference (EI) instance to use for the production variant.

#initial_instance_count ⇒ `Integer`

Number of instances to launch initially.

Returns:

(Integer): Number of instances to launch initially.

#initial_variant_weight ⇒ `Float`

Determines initial traffic distribution among all of the models that you specify in the endpoint configuration. The traffic to a production variant is determined by the ratio of the VariantWeight to the sum of all VariantWeight values across all ProductionVariants. If unspecified, it defaults to 1.0.
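The ratio described above can be worked through with a small sketch. The helper `traffic_shares` is hypothetical and not part of the SDK; it only reproduces the arithmetic (each weight divided by the sum of all weights) and treats a missing weight as the documented default of 1.0.

```ruby
# Hypothetical helper: maps each variant name to its share of traffic,
# computed as weight / sum of all weights.
def traffic_shares(variants)
  weights = variants.to_h do |variant|
    # A variant without initial_variant_weight uses the documented default, 1.0.
    [variant[:variant_name], variant.fetch(:initial_variant_weight, 1.0).to_f]
  end

  total = weights.values.sum
  raise ArgumentError, "variant weights must sum to a positive number" unless total.positive?

  weights.transform_values { |weight| weight / total }
end

shares = traffic_shares([
  { variant_name: "variant-a", initial_variant_weight: 3.0 },
  { variant_name: "variant-b" } # weight omitted, so 1.0 is assumed
])

# variant-a receives 3.0 / 4.0 = 0.75 of the traffic, variant-b 1.0 / 4.0 = 0.25.
shares.each { |name, share| puts "#{name}: #{share}" }
```

Because only the ratio matters, weights of 3.0 and 1.0 give the same split as 75 and 25.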

Returns:

(Float): Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.

#instance_type ⇒ `String`

The ML compute instance type.

Possible values:

ml.t2.medium
ml.t2.large
ml.t2.xlarge
ml.t2.2xlarge
ml.m4.xlarge
ml.m4.2xlarge
ml.m4.4xlarge
ml.m4.10xlarge
ml.m4.16xlarge
ml.m5.large
ml.m5.xlarge
ml.m5.2xlarge
ml.m5.4xlarge
ml.m5.12xlarge
ml.m5.24xlarge
ml.m5d.large
ml.m5d.xlarge
ml.m5d.2xlarge
ml.m5d.4xlarge
ml.m5d.12xlarge
ml.m5d.24xlarge
ml.c4.large
ml.c4.xlarge
ml.c4.2xlarge
ml.c4.4xlarge
ml.c4.8xlarge
ml.p2.xlarge
ml.p2.8xlarge
ml.p2.16xlarge
ml.p3.2xlarge
ml.p3.8xlarge
ml.p3.16xlarge
ml.c5.large
ml.c5.xlarge
ml.c5.2xlarge
ml.c5.4xlarge
ml.c5.9xlarge
ml.c5.18xlarge
ml.c5d.large
ml.c5d.xlarge
ml.c5d.2xlarge
ml.c5d.4xlarge
ml.c5d.9xlarge
ml.c5d.18xlarge
ml.g4dn.xlarge
ml.g4dn.2xlarge
ml.g4dn.4xlarge
ml.g4dn.8xlarge
ml.g4dn.12xlarge
ml.g4dn.16xlarge
ml.r5.large
ml.r5.xlarge
ml.r5.2xlarge
ml.r5.4xlarge
ml.r5.12xlarge
ml.r5.24xlarge
ml.r5d.large
ml.r5d.xlarge
ml.r5d.2xlarge
ml.r5d.4xlarge
ml.r5d.12xlarge
ml.r5d.24xlarge
ml.inf1.xlarge
ml.inf1.2xlarge
ml.inf1.6xlarge
ml.inf1.24xlarge

Returns:

(String): The ML compute instance type.

#model_name ⇒ `String`

The name of the model that you want to host. This is the name that you specified when creating the model.

Returns:

(String): The name of the model that you want to host.

#variant_name ⇒ `String`

The name of the production variant.

Returns:

(String): The name of the production variant.
