Control the instances Amazon ECS terminates - Amazon Elastic Container Service
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Control the instances Amazon ECS terminates

Important

You must turn on Auto Scaling instance scale-in protection on the Auto Scaling group to use the managed termination protection feature of cluster auto scaling.

Managed termination protection allows cluster auto scaling to control which instances are terminated. When you used managed termination protection, Amazon ECS only terminates EC2 instances that don't have any running Amazon ECS tasks. Tasks that are run by a service that uses the DAEMON scheduling strategy are ignored and an instance can be terminated by cluster auto scaling even when the instance is running these tasks. This is because all of the instances in the cluster are running these tasks.

Amazon ECS first turns on the instance scale-in protection option for the EC2 instances in the Auto Scaling group. Then, Amazon ECS places the tasks on the instances. When all non-daemon tasks are stopped on an instance, Amazon ECS initiates the scale-in process and turns off scale-in protection for the EC2 instance. The Auto Scaling group can then terminate the instance.

Auto Scaling instance scale-in protection controls which EC2 instances can be terminated by Auto Scaling. Instances with the scale-in feature turned on can't be terminated during the scale-in process. For more information about Auto Scaling instance scale-in protection, see Using instance scale-in protection in the Amazon EC2 Auto Scaling User Guide.

You can set the targetCapacity percentage so that you have spare capacity. This helps future tasks launch more quickly because the Auto Scaling group does not have to launch more instances. Amazon ECS uses the target capacity value to manage the CloudWatch metric that the service creates. Amazon ECS manages the CloudWatch metric. The Auto Scaling group is treated as a steady state so that no scaling action is required. The values can be from 0-100%. For example, to configure Amazon ECS to keep 10% free capacity on top of that used by Amazon ECS tasks, set the target capacity value to 90%. Consider the following when setting the targetCapacity value on a capacity provider.

  • A targetCapacity value of less than 100% represents the amount of free capacity (Amazon EC2 instances) that need to be present in the cluster. Free capacity means that there are no running tasks.

  • Placement constraints such as Availability Zones, without additional binpack forces Amazon ECS to eventually run one task for each instance, which might not be the desired behavior.

You must turn on Auto Scaling instance scale-in protection on the Auto Scaling group to use managed termination protection. If you don't turn on scale-in protection, then turning on managed termination protection can lead to undesirable behavior. For example, you may have instances stuck in draining state. For more information, see Using instance scale-in protection in the Amazon EC2 Auto Scaling User Guide.

When you use termination protection with a capacity provider, don't perform any manual actions, like detaching the instance, on the Auto Scaling group associated with the capacity provider. Manual actions can break the scale-in operation of the capacity provider. If you detach an instance from the Auto Scaling group, you need to also deregister the detached instance from the Amazon ECS cluster.

Managed scale-out behavior

When you have Auto Scaling group capacity providers that use managed scaling, Amazon ECS estimates the optimal number of instances to add to your cluster and uses the value to determine how many instances to request.

Amazon ECS selects a capacity provider for each task by following the capacity provider strategy from the service, standalone task, or the cluster default. Amazon ECS follows the rest of these steps for a single capacity provider.

Tasks without a capacity provider strategy are ignored by capacity providers. A pending task that doesn't have a capacity provider strategy won't cause any capacity provider to scale out. Tasks or services can't set a capacity provider strategy if that task or service sets a launch type.

The following describes the scale-out behavior in more detail.

  • Group all of the provisioning tasks for this capacity provider so that each group has the same exact resource requirements.

  • When you use multiple instance types in an Auto Scaling group, the instance types in the Auto Scaling group are sorted by their parameters. These parameters include vCPU, memory, elastic network interfaces (ENIs), ports, and GPUs. The smallest and the largest instance types for each parameter are selected. For more information about how to choose the instance type, see Amazon EC2 container instances for Amazon ECS.

    Important

    If a group of tasks have resource requirements that are greater than the smallest instance type in the Auto Scaling group, then that group of tasks can’t run with this capacity provider. The capacity provider doesn’t scale the Auto Scaling group. The tasks remain in the PROVISIONING state.

    To prevent tasks from staying in the PROVISIONING state, we recommend that you create separate Auto Scaling groups and capacity providers for different minimum resource requirements. When you run tasks or create services, only add capacity providers to the capacity provider strategy that can run the task on the smallest instance type in the Auto Scaling group. For other parameters, you can use placement constraints

  • For each group of tasks, Amazon ECS calculates the number of instances that are required to run the unplaced tasks. This calculation uses a binpack strategy. This strategy accounts for the vCPU, memory, elastic network interfaces (ENI), ports, and GPUs requirements of the tasks. It also accounts for the resource availability of the Amazon EC2 instances. The values for the largest instance types are treated as the maximum calculated instance count. The values for the smallest instance type are used as protection. If the smallest instance type can't run at least one instance of the task, the calculation considers the task as not compatible. As a result, the task is excluded from scale-out calculation. When all the tasks aren't compatible with the smallest instance type, cluster auto scaling stops and the CapacityProviderReservation value remains at the targetCapacity value.

  • Amazon ECS publishes the CapacityProviderReservation metric to CloudWatch with respect to the minimumScalingStepSize if either of the following is the case.

    • The maximum calculated instance count is less than the minimum scaling step size.

    • The lower value of either the maximumScalingStepSize or the maximum calculated instance count.

  • CloudWatch alarms use the CapacityProviderReservation metric for capacity providers. When the CapacityProviderReservation metric is greater than the targetCapacity value, alarms also increase the DesiredCapacity of the Auto Scaling group. The targetCapacity value is a capacity provider setting that's sent to the CloudWatch alarm during the cluster auto scaling activation phase.

    The default targetCapacity is 100%.

  • The Auto Scaling group launches additional EC2 instances. To prevent over-provisioning, Auto Scaling makes sure that recently launched EC2 instance capacity is stabilized before it launches new instances. Auto Scaling checks if all existing instances have passed the instanceWarmupPeriod (now minus the instance launch time). The scale-out is blocked for instances that are within the instanceWarmupPeriod.

    The default number of seconds for a newly launched instance to warm up is 300.

For more information, see Deep dive on Amazon ECS cluster auto scaling.

Scale-out considerations

Consider the following for the scale-out process:

  • Although there are multiple placement constraints, we recommend that you only use the distinctInstance task placement constraint. This prevents the scale-out process from stopping because you're using a placement constraint that's not compatible with the sampled instances.

  • Managed scaling works best if your Auto Scaling group uses the same or similar instance types.

  • When a scale-out process is required and there are no currently running container instances, Amazon ECS always scales-out to two instances initially, and then performs additional scale-out or scale-in processes. Any additional scale-out waits for the instance warmup period. For scale-in processes, Amazon ECS waits 15 minutes after a scale-out process before starting scale-in processes at all times.

  • The second scale-out step needs to wait until the instanceWarmupPeriod expires, which might affect the overall scale limit. If you need to reduce this time, make sure that instanceWarmupPeriod is large enough for the EC2 instance to launch and start the Amazon ECS agent (which prevents over provisioning).

  • Cluster auto scaling supports Launch Configuration, Launch Templates, and multiple instance types in the capacity provider Auto Scaling group. You can also use attribute-based instance type selection without multiple instances types.

  • When using an Auto Scaling group with On-Demand instances and multiple instance types or Spot Instances, place the larger instance types higher in the priority list and don't specify a weight. Specifying a weight isn't supported at this time. For more information, see Auto Scaling groups with multiple instance types in the Amazon Auto Scaling User Guide.

  • Amazon ECS then launch either the minimumScalingStepSize, if the maximum calculated instance count is less than the minimum scaling step size, or the lower of either the maximumScalingStepSize or the maximum calculated instance count value.

  • If an Amazon ECS service or run-task launches a task and the capacity provider container instances don't have enough resources to start the task, then Amazon ECS limits the number of tasks with this status for each cluster and prevents any tasks from exceeding this limit. For more information, see Amazon ECS service quotas.

Managed scale-in behavior

Amazon ECS monitors container instances for each capacity provider within a cluster. When a container instance isn't running any tasks, the container instance is considered empty and Amazon ECS starts the scale-in process.

CloudWatch scale-in alarms require 15 data points (15 minutes) before the scale-in process for the Auto Scaling group starts. After the scale-in process starts until Amazon ECS needs to reduce the number of registered container instances, the Auto Scaling group sets the DesireCapacity value to be greater than one instance and less than 50% each minute.

When Amazon ECS requests a scale-out (when CapacityProviderReservation is greater than 100) while a scale-in process is in progress, the scale-in process is stopped and starts from the beginning if required.

The following describes the scale-in behavior in more detail:

  1. Amazon ECS calculates the number of container instances that are empty. A container instance is considered empty even when daemon tasks are running.

  2. Amazon ECS sets the CapacityProviderReservation value to a number between 0-100 that uses the following formula to represent the ratio of how big the Auto Scaling group needs to be relative to how big it actually is, expressed as a percentage. Then, Amazon ECS publishes the metric to CloudWatch. For more information about how the metric is calculated, see Deep Dive on Amazon ECS Cluster Auto Scaling

    CapacityProviderReservation = (number of instances needed) / (number of running instances) x 100
  3. The CapacityProviderReservation metric generates a CloudWatch alarm. This alarm updates the DesiredCapacity value for the Auto Scaling group. Then, one of the following actions occurs:

    • If you don't use capacity provider managed termination, the Auto Scaling group selects EC2 instances using the Auto Scaling group termination policy and terminates the instances until the number of EC2 instances reaches the DesiredCapacity. The container instances are then deregistered from the cluster.

    • If all the container instances use managed termination protection, Amazon ECS removes the scale-in protection on the container instances that are empty. The Auto Scaling group will then be able to terminate the EC2 instances. The container instances are then deregistered from the cluster.