Class InvocationsScalingProps
(experimental) Properties for enabling SageMaker Endpoint utilization tracking.
Inheritance
System.Object
InvocationsScalingProps
Namespace: Amazon.CDK.AWS.Sagemaker.Alpha
Assembly: Amazon.CDK.AWS.Sagemaker.Alpha.dll
Syntax (csharp)
public class InvocationsScalingProps : Object, IInvocationsScalingProps, IBaseTargetTrackingProps
Syntax (vb)
Public Class InvocationsScalingProps
Inherits Object
Implements IInvocationsScalingProps, IBaseTargetTrackingProps
Remarks
Stability: Experimental
Examples
using Amazon.CDK.AWS.Sagemaker.Alpha;

// Placeholder: the model is assumed to be defined elsewhere in the stack.
Model model;

var variantName = "my-variant";
var endpointConfig = new EndpointConfig(this, "EndpointConfig", new EndpointConfigProps {
    InstanceProductionVariants = new [] { new InstanceProductionVariantProps {
        Model = model,
        VariantName = variantName
    } }
});
var endpoint = new Endpoint(this, "Endpoint", new EndpointProps { EndpointConfig = endpointConfig });

// Enable auto scaling on the variant, then add a policy that tracks invocations per instance.
var productionVariant = endpoint.FindInstanceProductionVariant(variantName);
var instanceCount = productionVariant.AutoScaleInstanceCount(new EnableScalingProps {
    MaxCapacity = 3
});
instanceCount.ScaleOnInvocations("LimitRPS", new InvocationsScalingProps {
    MaxRequestsPerSecond = 30
});
Synopsis
Constructors
InvocationsScalingProps() |
Properties
DisableScaleIn | Indicates whether scale in by the target tracking policy is disabled. |
MaxRequestsPerSecond | (experimental) Maximum requests per second (RPS) per instance, used to calculate the target SageMaker variant invocations per instance. |
PolicyName | A name for the scaling policy. |
SafetyFactor | (experimental) Safety factor used to calculate the target SageMaker variant invocations per instance. |
ScaleInCooldown | Period after a scale in activity completes before another scale in activity can start. |
ScaleOutCooldown | Period after a scale out activity completes before another scale out activity can start. |
Constructors
InvocationsScalingProps()
public InvocationsScalingProps()
Properties
DisableScaleIn
Indicates whether scale in by the target tracking policy is disabled.
public Nullable<bool> DisableScaleIn { get; set; }
Property Value
System.Nullable<System.Boolean>
Remarks
If the value is true, scale in is disabled and the target tracking policy won't remove capacity from the scalable resource. Otherwise, scale in is enabled and the target tracking policy can remove capacity from the scalable resource.
Default: false
MaxRequestsPerSecond
(experimental) Maximum requests per second (RPS) per instance, used to calculate the target SageMaker variant invocations per instance.
public double MaxRequestsPerSecond { get; set; }
Property Value
System.Double
Remarks
More documentation available here: https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-scaling-loadtest.html
Stability: Experimental
PolicyName
A name for the scaling policy.
public string PolicyName { get; set; }
Property Value
System.String
Remarks
Default: An automatically generated name.
SafetyFactor
(experimental) Safety factor used to calculate the target SageMaker variant invocations per instance.
public Nullable<double> SafetyFactor { get; set; }
Property Value
System.Nullable<System.Double>
Remarks
More documentation available here: https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-scaling-loadtest.html
Default: 0.5
Stability: Experimental
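How MaxRequestsPerSecond and SafetyFactor combine can be sketched in a few lines. The linked AWS load-testing guide gives the per-minute target for the invocations-per-instance metric as the maximum RPS multiplied by the safety factor and by 60. The helper below is a hypothetical illustration of that arithmetic, not part of the CDK API. The range check on the safety factor is an assumption, and the construct's own validation may differ.

```csharp
using System;

// Hypothetical helper for illustration only; this class is not part of the CDK.
public static class InvocationTarget
{
    // Returns the per-minute invocation target for a single instance,
    // following the formula in the AWS load-testing guide:
    //   target = maxRequestsPerSecond * safetyFactor * 60
    public static double PerInstancePerMinute(double maxRequestsPerSecond, double safetyFactor = 0.5)
    {
        // Assumption: a useful safety factor lies in the range (0, 1].
        if (safetyFactor <= 0.0 || safetyFactor > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(safetyFactor), "Expected a value in (0, 1].");
        }

        return maxRequestsPerSecond * safetyFactor * 60.0;
    }
}
```

With MaxRequestsPerSecond = 30 and the default SafetyFactor of 0.5, InvocationTarget.PerInstancePerMinute(30) returns 900, that is, 900 invocations per instance per minute.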
ScaleInCooldown
Period after a scale in activity completes before another scale in activity can start.
public Duration ScaleInCooldown { get; set; }
Property Value
Duration
Remarks
Default: Duration.Seconds(300) for the following scalable targets: ECS services, Spot Fleet requests, EMR clusters, AppStream 2.0 fleets, Aurora DB clusters, Amazon SageMaker endpoint variants, and custom resources. Duration.Seconds(0) for all other scalable targets: DynamoDB tables, DynamoDB global secondary indexes, Amazon Comprehend document classification endpoints, and Lambda provisioned concurrency.
ScaleOutCooldown
Period after a scale out activity completes before another scale out activity can start.
public Duration ScaleOutCooldown { get; set; }
Property Value
Duration
Remarks
Default: Duration.Seconds(300) for the following scalable targets: ECS services, Spot Fleet requests, EMR clusters, AppStream 2.0 fleets, Aurora DB clusters, Amazon SageMaker endpoint variants, and custom resources. Duration.Seconds(0) for all other scalable targets: DynamoDB tables, DynamoDB global secondary indexes, Amazon Comprehend document classification endpoints, and Lambda provisioned concurrency.
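As a rough sketch of how the optional properties fit together, the fragment below sets every property documented on this page in a single ScaleOnInvocations call. It is meant to sit inside a stack, like the snippet in the Examples section. The policy name and the cooldown values are arbitrary illustrations, and instanceCount is assumed to be the ScalableInstanceCount returned by AutoScaleInstanceCount.

```csharp
using Amazon.CDK;
using Amazon.CDK.AWS.Sagemaker.Alpha;

// Assumption: obtained from productionVariant.AutoScaleInstanceCount(...),
// as in the Examples section.
ScalableInstanceCount instanceCount;

instanceCount.ScaleOnInvocations("LimitRPS", new InvocationsScalingProps {
    MaxRequestsPerSecond = 30,               // required
    SafetyFactor = 0.5,                      // same as the default
    PolicyName = "limit-rps-policy",         // hypothetical name; omit to get a generated one
    ScaleInCooldown = Duration.Minutes(10),  // illustrative: remove capacity slowly
    ScaleOutCooldown = Duration.Minutes(2),  // illustrative: add capacity quickly
    DisableScaleIn = false                   // same as the default; scale in stays enabled
});
```

A longer scale-in cooldown than scale-out cooldown is one common choice when under-provisioning is costlier than briefly running extra instances. Whether it suits a given endpoint depends on its traffic pattern.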