Amazon Comprehend and Application Auto Scaling
You can scale Amazon Comprehend document classification and entity recognizer endpoints using target tracking scaling policies and scheduled scaling.
Use the following information to help you integrate Amazon Comprehend with Application Auto Scaling.
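For orientation, the two scaling mechanisms mentioned above might look roughly like this when expressed as request parameters for the SDK for Python (boto3). This is a minimal sketch under stated assumptions: the helper names, policy name, action name, target value, and schedule are all hypothetical, and the predefined metric type `ComprehendInferenceUtilization` should be checked against the Application Auto Scaling API reference before use.

```python
from typing import Any, Dict

SERVICE_NAMESPACE = "comprehend"


def build_target_tracking_policy(
    resource_id: str,
    scalable_dimension: str,
    target_value: float,
    policy_name: str = "comprehend-inference-utilization",  # hypothetical name
) -> Dict[str, Any]:
    """Build keyword arguments for a PutScalingPolicy request (target tracking)."""
    return {
        "PolicyName": policy_name,
        "ServiceNamespace": SERVICE_NAMESPACE,
        "ResourceId": resource_id,
        "ScalableDimension": scalable_dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_value),
            "PredefinedMetricSpecification": {
                # Assumed predefined metric for Comprehend endpoints; verify before use.
                "PredefinedMetricType": "ComprehendInferenceUtilization",
            },
        },
    }


def build_scheduled_action(
    resource_id: str,
    scalable_dimension: str,
    schedule: str,
    min_capacity: int,
    max_capacity: int,
    action_name: str = "comprehend-scheduled-capacity",  # hypothetical name
) -> Dict[str, Any]:
    """Build keyword arguments for a PutScheduledAction request."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    return {
        "ServiceNamespace": SERVICE_NAMESPACE,
        "ScheduledActionName": action_name,
        "ResourceId": resource_id,
        "ScalableDimension": scalable_dimension,
        "Schedule": schedule,  # for example "cron(0 9 * * ? *)"
        "ScalableTargetAction": {
            "MinCapacity": min_capacity,
            "MaxCapacity": max_capacity,
        },
    }
```

With a boto3 `application-autoscaling` client, these dictionaries would be passed as `client.put_scaling_policy(**policy)` and `client.put_scheduled_action(**action)`. Both calls require that the endpoint has already been registered as a scalable target.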
Service-linked role created for Amazon Comprehend
The following service-linked role is automatically created in your Amazon Web Services account when you register Amazon Comprehend resources as scalable targets with Application Auto Scaling. This role allows Application Auto Scaling to perform supported operations within your account. For more information, see Service-linked roles for Application Auto Scaling.
- AWSServiceRoleForApplicationAutoScaling_ComprehendEndpoint
Service principal used by the service-linked role
The service-linked role in the previous section can be assumed only by the service principal authorized by the trust relationships defined for the role. The service-linked role used by Application Auto Scaling grants access to the following service principal:
- comprehend.application-autoscaling.amazonaws.com
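One way to confirm the trust relationship is to inspect the role's trust policy document and list the service principals it allows to assume the role. The following Python sketch does this for a policy that has already been loaded as a dictionary. The function name `trusted_service_principals` is hypothetical, and the sketch assumes the standard IAM policy layout (`Statement`, `Effect`, `Action`, `Principal.Service`).

```python
from typing import Any, Dict, Set


def trusted_service_principals(trust_policy: Dict[str, Any]) -> Set[str]:
    """Return the service principals allowed to call sts:AssumeRole."""
    principals: Set[str] = set()
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = statement.get("Principal")
        if not isinstance(principal, dict):  # skip wildcard principals such as "*"
            continue
        services = principal.get("Service", [])
        if isinstance(services, str):
            services = [services]
        principals.update(services)
    return principals
```

With boto3, the trust policy is typically available as `iam.get_role(RoleName=...)["Role"]["AssumeRolePolicyDocument"]`. For the role above, the returned set would be expected to contain only `comprehend.application-autoscaling.amazonaws.com`.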
Registering Amazon Comprehend resources as scalable targets with Application Auto Scaling
Application Auto Scaling requires a scalable target before you can create scaling policies or scheduled actions for an Amazon Comprehend document classification or entity recognizer endpoint. A scalable target is a resource that Application Auto Scaling can scale out and scale in. Scalable targets are uniquely identified by the combination of resource ID, scalable dimension, and namespace.
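As a rough sketch of how those three identifiers fit together for Comprehend endpoints, the following Python helper derives the namespace, resource ID, and scalable dimension from an endpoint ARN. The names `ScalableTargetKey` and `scalable_target_key` and the ARN parsing are illustrative assumptions, not part of any Amazon SDK. The namespace and dimension strings are the ones used in the CLI examples in this topic.

```python
from typing import NamedTuple


class ScalableTargetKey(NamedTuple):
    service_namespace: str
    resource_id: str
    scalable_dimension: str


# Endpoint types that appear in this topic's CLI examples.
_ENDPOINT_TYPES = ("document-classifier-endpoint", "entity-recognizer-endpoint")


def scalable_target_key(endpoint_arn: str) -> ScalableTargetKey:
    """Build the identifying triple for a Comprehend endpoint ARN.

    Hypothetical helper: assumes the ARN's resource part has the form
    "<endpoint-type>/<name>".
    """
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = endpoint_arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "comprehend":
        raise ValueError(f"not a Comprehend ARN: {endpoint_arn!r}")
    endpoint_type = parts[5].split("/", 1)[0]
    if endpoint_type not in _ENDPOINT_TYPES:
        raise ValueError(f"unsupported endpoint type: {endpoint_type!r}")
    return ScalableTargetKey(
        service_namespace="comprehend",
        resource_id=endpoint_arn,  # the endpoint ARN is used as the resource ID
        scalable_dimension=f"comprehend:{endpoint_type}:DesiredInferenceUnits",
    )
```

Two registrations that produce the same triple refer to the same scalable target, so a helper like this can serve as a dictionary key when tracking which endpoints have been registered.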
To configure auto scaling using the Amazon CLI or one of the Amazon SDKs, you can use the following options:
- Amazon CLI:
Call the register-scalable-target command for a document classification endpoint. The following example registers the desired number of inference units to be used by the model for a document classifier endpoint using the endpoint's ARN, with a minimum capacity of one inference unit and a maximum capacity of three inference units.
    aws application-autoscaling register-scalable-target \
        --service-namespace comprehend \
        --scalable-dimension comprehend:document-classifier-endpoint:DesiredInferenceUnits \
        --resource-id arn:aws-cn:comprehend:us-west-2:123456789012:document-classifier-endpoint/EXAMPLE \
        --min-capacity 1 \
        --max-capacity 3
If successful, this command returns the ARN of the scalable target.
    {
        "ScalableTargetARN": "arn:aws-cn:application-autoscaling:region:account-id:scalable-target/1234abcd56ab78cd901ef1234567890ab123"
    }

Call the register-scalable-target command for an entity recognizer endpoint. The following example registers the desired number of inference units to be used by the model for an entity recognizer using the endpoint's ARN, with a minimum capacity of one inference unit and a maximum capacity of three inference units.

    aws application-autoscaling register-scalable-target \
        --service-namespace comprehend \
        --scalable-dimension comprehend:entity-recognizer-endpoint:DesiredInferenceUnits \
        --resource-id arn:aws-cn:comprehend:us-west-2:123456789012:entity-recognizer-endpoint/EXAMPLE \
        --min-capacity 1 \
        --max-capacity 3
If successful, this command returns the ARN of the scalable target.
    {
        "ScalableTargetARN": "arn:aws-cn:application-autoscaling:region:account-id:scalable-target/1234abcd56ab78cd901ef1234567890ab123"
    }

- Amazon SDK:
Call the RegisterScalableTarget operation and provide ResourceId, ScalableDimension, ServiceNamespace, MinCapacity, and MaxCapacity as parameters.
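The SDK call could be sketched in Python with boto3 as follows. This is an illustration under assumptions rather than a definitive implementation: the helper names `build_register_request` and `register_endpoint` are hypothetical, and the region and credentials are assumed to come from the caller's environment. boto3 is imported inside the function so that the request can be built and inspected without the SDK installed.

```python
from typing import Any, Dict


def build_register_request(
    resource_id: str,
    scalable_dimension: str,
    min_capacity: int,
    max_capacity: int,
) -> Dict[str, Any]:
    """Assemble the five RegisterScalableTarget parameters for a Comprehend endpoint."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    return {
        "ServiceNamespace": "comprehend",
        "ResourceId": resource_id,  # the endpoint ARN
        "ScalableDimension": scalable_dimension,
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


def register_endpoint(request: Dict[str, Any], region_name: str) -> Dict[str, Any]:
    """Send the request with boto3 and return the raw response."""
    import boto3  # imported lazily; requires boto3 and valid credentials

    client = boto3.client("application-autoscaling", region_name=region_name)
    return client.register_scalable_target(**request)


if __name__ == "__main__":
    # Mirrors the document classifier CLI example: one to three inference units.
    request = build_register_request(
        "arn:aws-cn:comprehend:us-west-2:123456789012:document-classifier-endpoint/EXAMPLE",
        "comprehend:document-classifier-endpoint:DesiredInferenceUnits",
        min_capacity=1,
        max_capacity=3,
    )
    print(request)
```

For an entity recognizer endpoint, pass the entity recognizer endpoint ARN and the `comprehend:entity-recognizer-endpoint:DesiredInferenceUnits` dimension instead.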
Related resources
If you are just getting started with Application Auto Scaling, you can find additional useful information about scaling your Amazon Comprehend resources in the following documentation:
Auto scaling with endpoints in the Amazon Comprehend Developer Guide