
Advanced predictive scaling policy using custom metrics

In a predictive scaling policy, you can use predefined or custom metrics. Custom metrics are useful when the predefined metrics (CPU, network I/O, and Application Load Balancer request count) do not sufficiently describe your application load.

When creating a predictive scaling policy with custom metrics, you can specify other CloudWatch metrics provided by Amazon, or you can specify metrics that you define and publish yourself. You can also use metric math to aggregate and transform existing metrics into a new time series that Amazon doesn't automatically track. Combining values in your data, for example by calculating new sums or averages, is called aggregating; the resulting data is called an aggregate.
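
For example, a metric math query in a metric specification can sum two request-count metrics into a single load time series. The following sketch uses the `MetricDataQueries` structure from the CloudWatch `GetMetricData` API; the metric names, namespace, and Auto Scaling group name are placeholders for your own values:

```json
"MetricDataQueries": [
    {
        "Id": "m1",
        "MetricStat": {
            "Metric": {
                "MetricName": "RequestsFrontEnd",
                "Namespace": "MyNamespace",
                "Dimensions": [
                    { "Name": "AutoScalingGroupName", "Value": "my-asg" }
                ]
            },
            "Stat": "Sum"
        },
        "ReturnData": false
    },
    {
        "Id": "m2",
        "MetricStat": {
            "Metric": {
                "MetricName": "RequestsBackEnd",
                "Namespace": "MyNamespace",
                "Dimensions": [
                    { "Name": "AutoScalingGroupName", "Value": "my-asg" }
                ]
            },
            "Stat": "Sum"
        },
        "ReturnData": false
    },
    {
        "Id": "total_requests",
        "Expression": "m1 + m2",
        "ReturnData": true
    }
]
```

Only the query with `ReturnData` set to `true` (the aggregate) provides the time series for the policy; the other queries supply inputs to the expression.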

The following section contains best practices and examples of how to construct the JSON structure for the policy.

Best practices

The following best practices can help you use custom metrics more effectively:

  • For the load metric specification, the most useful metric is one that represents the load on the Auto Scaling group as a whole, regardless of the group's capacity.

  • For the scaling metric specification, the most useful metric to scale by is an average throughput or utilization per instance metric.

  • The scaling metric must be inversely proportional to capacity. That is, if the number of instances in the Auto Scaling group increases, the scaling metric should decrease by roughly the same proportion. To ensure that predictive scaling behaves as expected, the load metric and the scaling metric must also correlate strongly with each other.

  • The target utilization must match the type of scaling metric. For a policy configuration that uses CPU utilization, this is a target percentage. For a policy configuration that uses throughput, such as the number of requests or messages, this is the target number of requests or messages per instance during any one-minute interval.

  • If these recommendations are not followed, the forecasted future values of the time series will probably be incorrect. To validate that the data is correct, you can view the forecasted values in the Amazon EC2 Auto Scaling console. Alternatively, after you create your predictive scaling policy, inspect the LoadForecast and CapacityForecast objects returned by a call to the GetPredictiveScalingForecast API.

  • We strongly recommend that you configure predictive scaling in forecast only mode so that you can evaluate the forecast before predictive scaling starts actively scaling capacity.
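
Putting these practices together, the following is a sketch of a predictive scaling policy configuration that pairs a custom load metric (an hourly sum) with a custom per-instance scaling metric (an average), running in forecast only mode. The metric names, namespace, target value, and Auto Scaling group name are illustrative placeholders:

```json
{
    "Mode": "ForecastOnly",
    "MetricSpecifications": [
        {
            "TargetValue": 50,
            "CustomizedLoadMetricSpecification": {
                "MetricDataQueries": [
                    {
                        "Id": "load_sum",
                        "MetricStat": {
                            "Metric": {
                                "MetricName": "MyRequestCount",
                                "Namespace": "MyNamespace",
                                "Dimensions": [
                                    { "Name": "AutoScalingGroupName", "Value": "my-asg" }
                                ]
                            },
                            "Stat": "Sum"
                        }
                    }
                ]
            },
            "CustomizedScalingMetricSpecification": {
                "MetricDataQueries": [
                    {
                        "Id": "requests_per_instance",
                        "MetricStat": {
                            "Metric": {
                                "MetricName": "MyRequestCountPerInstance",
                                "Namespace": "MyNamespace",
                                "Dimensions": [
                                    { "Name": "AutoScalingGroupName", "Value": "my-asg" }
                                ]
                            },
                            "Stat": "Average"
                        }
                    }
                ]
            }
        }
    ]
}
```

After you have evaluated the forecast, you can change `Mode` to `ForecastAndScale` to let the policy act on its predictions.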

Prerequisites

To add custom metrics to your predictive scaling policy, you must have cloudwatch:GetMetricData permissions.
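
As a minimal sketch, an IAM policy statement granting this permission might look like the following; attach it to the IAM identity that creates or manages the scaling policy:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "cloudwatch:GetMetricData",
            "Resource": "*"
        }
    ]
}
```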

To specify your own metrics instead of the metrics that Amazon provides, you must first publish your metrics to CloudWatch. For more information, see Publishing custom metrics in the Amazon CloudWatch User Guide.

If you publish your own metrics, make sure to publish the data points at least once every five minutes. Amazon EC2 Auto Scaling retrieves the data points from CloudWatch based on the length of the period that it needs. For example, the load metric specification uses hourly metrics to measure the load on your application. CloudWatch aggregates all of your published data points whose timestamps fall within a given one-hour period into a single data value for that hour.
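
As an illustration, you can publish data points with the AWS CLI `put-metric-data` command, passing a file of metric data entries via `--metric-data file://data.json`. The metric name, dimension value, timestamp, and value below are placeholders:

```json
[
    {
        "MetricName": "MyRequestCount",
        "Dimensions": [
            { "Name": "AutoScalingGroupName", "Value": "my-asg" }
        ],
        "Timestamp": "2025-06-01T12:00:00Z",
        "Value": 350.0,
        "Unit": "Count"
    }
]
```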

Limitations

  • You can query data points of up to 10 metrics in one metric specification.

  • For the purposes of this limit, one expression counts as one metric.