
Auto-scaling the number of replicas in an Amazon Neptune DB cluster

You can use Neptune auto-scaling to automatically adjust the number of Neptune replicas in a DB cluster, to meet your connectivity and workload requirements. Auto-scaling lets your Neptune DB cluster handle sudden increases in workload, and then, when the workload decreases, auto-scaling removes unnecessary replicas so you aren't paying for unused capacity.

You can use auto-scaling only with a Neptune DB cluster that already has one primary writer instance and at least one read-replica instance (see Amazon Neptune DB Clusters and Instances). Also, all read-replica instances in the cluster must be in an available state. If any read-replica is in a state other than available, Neptune auto-scaling does nothing until every read-replica in the cluster is available.

See Create a DB cluster if you need to create a new cluster.

Using the Amazon CLI, you define and apply a scaling policy to the DB cluster. The scaling policy specifies the following auto-scaling parameters:

  • The minimum and maximum number of replicas to have in the cluster.

  • A so-called cool-down interval between replica additions or deletions.

  • The CloudWatch metric and the metric trigger value for scaling out (adding replicas) or scaling in (removing replicas).

Using the Amazon CLI, you can also edit or delete your auto-scaling policy.

When the CloudWatch metric you're using reaches the threshold you specified in your policy, as long as your DB cluster doesn't already have the maximum number of replicas that you set, Neptune creates a new replica using the same instance type as the DB cluster's primary instance.

When the metric shows that the workload has decreased sufficiently, Neptune auto-scaling removes unnecessary replicas so that you don't pay for capacity you aren't using.

Note

Neptune auto-scaling only removes replicas that it created. It does not remove pre-existing replicas.
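
The behavior described above can be pictured with a deliberately simplified shell sketch. It is not the actual Application Auto Scaling algorithm, which is more conservative about scaling in. The function name scaling_decision, the whole-percent CPU values, and the replica counts are all assumptions made for illustration.

```shell
# Simplified decision model (NOT the real Application Auto Scaling algorithm).
# $1 = average reader CPU (whole percent), $2 = target (whole percent),
# $3 = replica count tracked by the policy, $4 = minimum, $5 = maximum.
scaling_decision() {
  if [ "$1" -gt "$2" ] && [ "$3" -lt "$5" ]; then
    echo "scale-out"      # above target and below the maximum: add a replica
  elif [ "$1" -lt "$2" ] && [ "$3" -gt "$4" ]; then
    echo "scale-in"       # below target and above the minimum: remove one
  else
    echo "no-change"      # on target, or already at a capacity bound
  fi
}

scaling_decision 85 60 2 1 8    # busy cluster with room to grow
scaling_decision 85 60 8 1 8    # busy, but already at the maximum
scaling_decision 20 60 1 1 8    # idle, but already at the minimum
```

The point of the sketch is only that the minimum and maximum act as hard bounds on whatever the metric suggests.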

Using the neptune_autoscaling_config DB cluster parameter, you can also specify the instance type of the new read-replicas that Neptune auto-scaling creates, the maintenance windows for those read-replicas, and tags to be associated with each of the new read-replicas. You provide these configuration settings in a JSON string as the value of the neptune_autoscaling_config parameter, like this:

"{ \"tags\": [ { \"key\" : \"reader tag-0 key\", \"value\" : \"reader tag-0 value\" }, { \"key\" : \"reader tag-1 key\", \"value\" : \"reader tag-1 value\" } ], \"maintenanceWindow\" : \"wed:12:03-wed:12:33\", \"dbInstanceClass\" : \"db.r5.xlarge\" }"

Note that every quotation mark inside the JSON string must be escaped with a backslash character (\). Whitespace between JSON tokens is optional, as usual.

Any of the three configuration settings not specified in the neptune_autoscaling_config parameter are copied from the configuration of the DB cluster's primary writer instance.
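
Escaping every quotation mark by hand is error-prone, so one option is to write ordinary JSON and let the shell add the backslashes. The following is a minimal sketch; the helper name escape_autoscaling_config is hypothetical, and the snippet only prepares the string. It does not set the parameter on your DB cluster parameter group.

```shell
# Hypothetical helper: wrap plain JSON in double quotes and put a backslash
# before each inner quotation mark, the form shown for neptune_autoscaling_config.
escape_autoscaling_config() {
  # sed turns every " into \" ; the outer printf adds the surrounding quotes.
  printf '"%s"\n' "$(printf '%s' "$1" | sed 's/"/\\"/g')"
}

# Plain JSON using the three supported settings (values from the example above).
config='{"tags":[{"key":"reader tag-0 key","value":"reader tag-0 value"}],"maintenanceWindow":"wed:12:03-wed:12:33","dbInstanceClass":"db.r5.xlarge"}'

escape_autoscaling_config "$config"
```

Leaving out a setting from the plain JSON, such as maintenanceWindow, simply means that setting is taken from the primary writer instance, as described above.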

When auto-scaling adds a new read-replica instance, it prefixes the DB instance ID with autoscaled-reader (for example, autoscaled-reader-7r7t7z3lbd-20210828). It also adds a tag to every read-replica that it creates, with the key autoscaled-reader and a value of TRUE. You can see this tag on the Tags tab of the DB instance detail page in the Amazon Web Services Management Console.

"key" : "autoscaled-reader", "value" : "TRUE"

The promotion tier of all the read-replica instances created by auto-scaling is the lowest priority, which is 15 by default. This means that during a failover, any replica with a higher priority, such as one that was created manually, would be promoted first. See Fault tolerance for a Neptune DB cluster.
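
Because replicas created by auto-scaling carry the autoscaled-reader prefix, a script can tell them apart from manually created replicas by looking at the instance ID. The sketch below assumes you already have a list of instance IDs; the function name and the second ID are hypothetical. A manually created instance could in principle use the same prefix, so the autoscaled-reader tag is the more reliable signal when it is available.

```shell
# Return success (0) if the DB instance ID carries the prefix that
# Neptune auto-scaling gives to the replicas it creates.
is_autoscaled_reader() {
  case "$1" in
    autoscaled-reader*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical instance IDs; only the first follows the documented pattern.
for id in autoscaled-reader-7r7t7z3lbd-20210828 my-manual-replica-1; do
  if is_autoscaled_reader "$id"; then
    echo "$id: created by auto-scaling"
  else
    echo "$id: pre-existing or manually created"
  fi
done
```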

Neptune auto-scaling is implemented using Application Auto Scaling with a target tracking scaling policy that uses a Neptune CPUUtilization CloudWatch metric as a predefined metric.

How to enable auto-scaling for Amazon Neptune

Enabling auto-scaling for a Neptune DB cluster involves three steps:

1. Register your DB cluster with Application Auto Scaling

The first step in enabling auto-scaling for a Neptune DB cluster is to register the cluster with Application Auto Scaling, using the Amazon CLI or one of the Application Auto Scaling SDKs. The cluster must already have one primary instance and at least one read-replica instance.

For example, to register a cluster to be auto-scaled with between one and eight additional replicas, you could use the Amazon CLI register-scalable-target command as follows:

aws application-autoscaling register-scalable-target \
  --service-namespace neptune \
  --resource-id cluster:(your DB cluster name) \
  --scalable-dimension neptune:cluster:ReadReplicaCount \
  --min-capacity 1 \
  --max-capacity 8

This is equivalent to using the RegisterScalableTarget Application Auto Scaling API operation.

The Amazon CLI register-scalable-target command takes the following parameters:

  • --service-namespace   –   Set to neptune.

    This parameter is equivalent to the ServiceNamespace parameter in the Application Auto Scaling API.

  • --resource-id   –   Set this to the resource identifier for your Neptune DB cluster. The resource type is cluster, which is followed by a colon (':'), and then the name of your DB cluster.

    This parameter is equivalent to the ResourceId parameter in the Application Auto Scaling API.

  • --scalable-dimension   –   The scalable dimension in this case is the number of replica instances in the DB cluster, so you set this parameter to neptune:cluster:ReadReplicaCount.

    This parameter is equivalent to the ScalableDimension parameter in the Application Auto Scaling API.

  • --min-capacity   –   The minimum number of reader DB replica instances to be managed by Application Auto Scaling. This value can be as low as 0 and can be no larger than the value of the --max-capacity parameter.

    This parameter is equivalent to the MinCapacity parameter in the Application Auto Scaling API.

  • --max-capacity   –   The maximum number of reader DB replica instances to be managed by Application Auto Scaling. This value can be no smaller than the value of the --min-capacity parameter, and no larger than 15 minus the number of pre-existing replicas. A DB cluster can't have more than 15 replicas in total, so the maximum number of replicas managed by auto-scaling plus the number of pre-existing replicas can't exceed that limit.

    The max-capacity Amazon CLI parameter is equivalent to the MaxCapacity parameter in the Application Auto Scaling API.
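
The capacity rules above reduce to simple arithmetic, which can be checked before calling register-scalable-target. The following is a minimal sketch; the function name check_capacity and its messages are hypothetical, and it only restates the limits given in this section.

```shell
# Hypothetical pre-check for --min-capacity and --max-capacity.
# $1 = min-capacity, $2 = max-capacity, $3 = number of pre-existing replicas.
check_capacity() {
  limit=$((15 - $3))      # replicas that auto-scaling may still manage
  if [ "$1" -lt 0 ] || [ "$1" -gt "$2" ]; then
    echo "min-capacity must be between 0 and max-capacity"
    return 1
  fi
  if [ "$2" -gt "$limit" ]; then
    echo "max-capacity must not exceed $limit"
    return 1
  fi
  echo "ok"
}

check_capacity 1 8 2             # 8 is within 15 - 2 = 13, so this passes
check_capacity 1 14 2 || true    # 14 is above 13, so this is rejected
```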

When you register your DB cluster, Application Auto Scaling creates an AWSServiceRoleForApplicationAutoScaling_NeptuneCluster service-linked role. For more information, see Service-linked roles for Application Auto Scaling in the Application Auto Scaling User Guide.
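
To confirm that the registration took effect, one option is to list the scalable targets for the cluster with the describe-scalable-targets command. The wrapper below is a sketch: the function name show_scalable_target is hypothetical, and running it requires the CLI to be configured with credentials for your account.

```shell
# Hypothetical wrapper: show the scalable target registered for a cluster.
# $1 = DB cluster name.
show_scalable_target() {
  aws application-autoscaling describe-scalable-targets \
    --service-namespace neptune \
    --resource-ids "cluster:$1"
}

# Usage (needs configured credentials):
#   show_scalable_target my-neptune-cluster
```

The output should include the minimum and maximum capacity that you registered.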

2. Define an autoscaling policy to use with your DB cluster

A target-tracking scaling policy is defined as a JSON text object that can also be saved in a text file. For Neptune, this policy can currently use only the Neptune CPUUtilization CloudWatch metric, through a predefined metric named NeptuneReaderAverageCPUUtilization.

Here is an example target tracking scaling configuration policy for Neptune:

{
  "PredefinedMetricSpecification": {
    "PredefinedMetricType": "NeptuneReaderAverageCPUUtilization"
  },
  "TargetValue": 60.0,
  "ScaleOutCooldown": 600,
  "ScaleInCooldown": 600
}

The TargetValue element here contains the percentage of CPU utilization above which auto-scaling scales out (that is, adds more replicas) and below which it scales in (that is, deletes replicas). In this case, the target percentage that triggers scaling is 60.0%.

The ScaleInCooldown element specifies the amount of time, in seconds, after a scale-in activity completes before another scale-in can start. The default is 300 seconds. Here, the value of 600 specifies that at least ten minutes must elapse between the completion of one replica deletion and the start of another one.

The ScaleOutCooldown element specifies the amount of time, in seconds, after a scale-out activity completes before another scale-out can start. The default is 300 seconds. Here, the value of 600 specifies that at least ten minutes must elapse between the completion of one replica addition and the start of another one.
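
The effect of a cooldown can be modeled as a simple time comparison. This is a simplified sketch rather than the service's actual scheduling logic; the function name cooldown_elapsed and the timestamps are hypothetical, with times given in epoch seconds.

```shell
# Simplified cooldown model: succeed if at least $3 seconds have passed
# between the end of the last activity ($1) and the current time ($2).
cooldown_elapsed() {
  [ $(( $2 - $1 )) -ge "$3" ]
}

last_done=1000      # hypothetical completion time of the last scale-out
cooldown=600        # ScaleOutCooldown from the example policy

# Only 300 seconds have elapsed at time 1300.
if cooldown_elapsed "$last_done" 1300 "$cooldown"; then echo "may scale"; else echo "wait"; fi
# Exactly 600 seconds have elapsed at time 1600.
if cooldown_elapsed "$last_done" 1600 "$cooldown"; then echo "may scale"; else echo "wait"; fi
```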

The DisableScaleIn element is a Boolean that, if present and set to true, disables scale-in entirely, meaning that auto-scaling may add replicas but will never remove any. By default, scale-in is enabled, and DisableScaleIn is false.
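
As a sketch of how the policy might be saved to a text file, the snippet below writes the example configuration with a shell here-document and states DisableScaleIn explicitly. The file name neptune-scaling-policy.json is an assumption; any path works as long as you pass the same one when you apply the policy.

```shell
# Write a target tracking policy to a file. The file name is hypothetical;
# the values mirror the example above, with DisableScaleIn spelled out.
policy_file="neptune-scaling-policy.json"

cat > "$policy_file" <<'EOF'
{
  "PredefinedMetricSpecification": {
    "PredefinedMetricType": "NeptuneReaderAverageCPUUtilization"
  },
  "TargetValue": 60.0,
  "ScaleOutCooldown": 600,
  "ScaleInCooldown": 600,
  "DisableScaleIn": false
}
EOF

echo "wrote $policy_file"
```

Setting DisableScaleIn to true in this file would let auto-scaling add replicas without ever removing them.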

3. Apply the autoscaling policy to your DB cluster

After registering your Neptune DB cluster with Application Auto Scaling and defining a JSON scaling policy in a text file, apply the scaling policy to the registered DB cluster. You can use the Amazon CLI put-scaling-policy command to do this, with parameters like the following:

aws application-autoscaling put-scaling-policy \
  --policy-name (name of the scaling policy) \
  --policy-type TargetTrackingScaling \
  --resource-id cluster:(name of your Neptune DB cluster) \
  --service-namespace neptune \
  --scalable-dimension neptune:cluster:ReadReplicaCount \
  --target-tracking-scaling-policy-configuration file://(path to the JSON configuration file)

When you have applied the auto-scaling policy, auto-scaling is enabled on your DB cluster.

You can also use the Amazon CLI put-scaling-policy command to update an existing auto-scaling policy.

See also PutScalingPolicy in the Application Auto Scaling API Reference.
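
After applying or updating a policy, you may want to check what is attached to the cluster. One way is the describe-scaling-policies command, sketched here in a wrapper; the function name show_scaling_policies is hypothetical, and running it requires configured credentials.

```shell
# Hypothetical wrapper: list the scaling policies attached to a cluster.
# $1 = DB cluster name.
show_scaling_policies() {
  aws application-autoscaling describe-scaling-policies \
    --service-namespace neptune \
    --resource-id "cluster:$1"
}

# Usage (needs configured credentials):
#   show_scaling_policies my-neptune-cluster
```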

Removing auto-scaling from a Neptune DB cluster

To remove auto-scaling from a Neptune DB cluster, use the Amazon CLI delete-scaling-policy and deregister-scalable-target commands.
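
The two commands can be combined as in the following sketch, which deletes the scaling policy and then deregisters the cluster. The wrapper name remove_autoscaling and its argument order are assumptions; the resource ID and scalable dimension follow the same format used when registering the cluster.

```shell
# Hypothetical wrapper: $1 = DB cluster name, $2 = scaling policy name.
remove_autoscaling() {
  # Delete the scaling policy first, then deregister the scalable target.
  aws application-autoscaling delete-scaling-policy \
    --policy-name "$2" \
    --service-namespace neptune \
    --resource-id "cluster:$1" \
    --scalable-dimension neptune:cluster:ReadReplicaCount &&
  aws application-autoscaling deregister-scalable-target \
    --service-namespace neptune \
    --resource-id "cluster:$1" \
    --scalable-dimension neptune:cluster:ReadReplicaCount
}

# Usage (needs configured credentials):
#   remove_autoscaling my-neptune-cluster my-scaling-policy
```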