Handle Amazon ECS service quotas and API throttling limits - Amazon Elastic Container Service

Handle Amazon ECS service quotas and API throttling limits

Amazon ECS is integrated with several Amazon Web Services services, including Elastic Load Balancing, Amazon Cloud Map, and Amazon EC2. With this tight integration, Amazon ECS includes several features such as service load balancing, Service Connect, task networking, and cluster auto scaling. Amazon ECS and the other Amazon Web Services services that it integrates with all maintain service quotas and API rate limits to ensure consistent performance and utilization. These service quotas also prevent the accidental provisioning of more resources than needed and protect against malicious actions that might increase your bill.

By familiarizing yourself with your service quotas and the Amazon API rate limits, you can plan for scaling your workloads without worrying about unexpected performance degradation. For more information, see Request throttling for the Amazon ECS API.

When scaling your workloads on Amazon ECS, we recommend that you consider the following service quotas.

  • Amazon Fargate has quotas that limit the number of concurrent running tasks in each Amazon Web Services Region. There are quotas for both On-Demand and Fargate Spot tasks on Amazon ECS. Each service quota also includes any Amazon EKS pods that you run on Fargate.

  • For tasks that run on Amazon EC2 instances, the maximum number of Amazon EC2 instances that you can register for each cluster is 5,000. If you use Amazon ECS cluster auto scaling with an Auto Scaling group capacity provider, or if you manage Amazon EC2 instances for your cluster on your own, this quota might become a deployment bottleneck. If you require more capacity, you can create more clusters or request a service quota increase.

  • If you use Amazon ECS cluster auto scaling with an Auto Scaling group capacity provider, consider the Tasks in the PROVISIONING state per cluster quota when you scale your services. This quota is the maximum number of tasks in the PROVISIONING state for each cluster for which capacity providers can increase capacity. When you launch a large number of tasks at the same time, you can easily reach this quota. One example is if you simultaneously deploy tens of services, each with hundreds of tasks. When this happens and the cluster has insufficient capacity, the capacity provider needs to launch new container instances to place the tasks. While the capacity provider is launching additional Amazon EC2 instances, the Amazon ECS service scheduler will likely continue to launch tasks in parallel. However, this activity might be throttled because of insufficient cluster capacity. The Amazon ECS service scheduler implements a back-off and exponential throttling strategy for retrying task placement as new container instances are launched. As a result, you might experience slower deployment or scale-out times. To avoid this situation, plan your service deployments in one of the following ways: either deploy large numbers of tasks in a way that doesn't require increasing cluster capacity, or keep spare cluster capacity for new task launches.
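One way to avoid launching too many tasks at once is to deploy services in waves. The following is a minimal sketch of that idea, not an Amazon ECS feature. The function name plan_deployment_waves, the service names, and the max_tasks_per_wave budget are all hypothetical; choose a budget that stays below the Tasks in the PROVISIONING state per cluster quota shown for your account in the Service Quotas console.

```python
from typing import Dict, List


def plan_deployment_waves(
    service_task_counts: Dict[str, int],
    max_tasks_per_wave: int,
) -> List[List[str]]:
    """Group services into waves so that each wave launches at most
    max_tasks_per_wave tasks in total.

    max_tasks_per_wave is an illustrative budget chosen by you, not a
    value read from Amazon ECS. A single service that is larger than
    the budget is placed in a wave of its own.
    """
    if max_tasks_per_wave <= 0:
        raise ValueError("max_tasks_per_wave must be positive")

    waves: List[List[str]] = []
    current: List[str] = []
    current_total = 0

    # Place the largest services first; ties are broken by name so the
    # result is deterministic.
    ordered = sorted(service_task_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, count in ordered:
        # Close the current wave if adding this service would exceed the budget.
        if current and current_total + count > max_tasks_per_wave:
            waves.append(current)
            current, current_total = [], 0
        current.append(name)
        current_total += count

    if current:
        waves.append(current)
    return waves


# Hypothetical services and desired task counts.
planned = {"orders": 300, "payments": 300, "search": 100, "web": 50}
for number, wave in enumerate(plan_deployment_waves(planned, 500), start=1):
    print(f"wave {number}: {wave}")
```

Deploy one wave, wait for its services to reach a steady state, and then start the next wave. This trades a longer total rollout for fewer tasks waiting in the PROVISIONING state at any one time.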

In addition to the Amazon ECS service quotas, consider the service quotas for the other Amazon Web Services services that are integrated with Amazon ECS when you scale your workloads.

Elastic Load Balancing

You can configure your Amazon ECS services to use Elastic Load Balancing to distribute traffic evenly across the tasks. For more information and recommended best practices for how to choose a load balancer, see Use load balancing to distribute Amazon ECS service traffic.

Elastic Load Balancing service quotas

When you scale your workloads, consider the following Elastic Load Balancing service quotas. Most Elastic Load Balancing service quotas are adjustable, and you can request an increase in the Service Quotas console.

Application Load Balancer

When you use an Application Load Balancer, depending on your use case, you might need to request a quota increase for:

  • The Targets per Application Load Balancer quota, which is the number of targets behind your Application Load Balancer.

  • The Targets per Target Group per Region quota, which is the number of targets behind your target groups.

For more information, see Quotas for your Application Load Balancers in the User Guide for Application Load Balancers.

Network Load Balancer

There are stricter limitations on the number of targets you can register with a Network Load Balancer. When using a Network Load Balancer, you will often want to enable cross-zone support, which comes with additional scaling limitations on the Targets per Availability Zone Per Network Load Balancer quota, which is the maximum number of targets per Availability Zone for each Network Load Balancer. For more information, see Quotas for your Network Load Balancers in the User Guide for Network Load Balancers.

Elastic Load Balancing API throttling

When you configure an Amazon ECS service to use a load balancer, the target group health checks must pass before the service is considered healthy. To perform these health checks, Amazon ECS invokes Elastic Load Balancing API operations on your behalf. If you have a large number of services configured with load balancers in your account, you might experience slower service deployments because of potential throttling, specifically for the RegisterTargets, DeregisterTargets, and DescribeTargetHealth Elastic Load Balancing API operations. When throttling occurs, throttling errors appear in your Amazon ECS service event messages.

If you experience Elastic Load Balancing API throttling, you can contact Amazon Web Services Support for guidance on how to increase your Elastic Load Balancing API throttling limits. For more information about monitoring and troubleshooting such throttling errors, see Handle Amazon ECS throttling issues.

Elastic network interfaces

When your tasks use the awsvpc network mode, Amazon ECS provisions a unique elastic network interface (ENI) for each task. When your Amazon ECS services use an Elastic Load Balancing load balancer, these network interfaces are also registered as targets to the appropriate target group defined in the service.

Elastic network interface service quotas

When you run tasks that use the awsvpc network mode, a unique elastic network interface is attached to each task. If those tasks must be reached over the internet, assign a public IP address to the elastic network interface for those tasks. When you scale your Amazon ECS workloads, consider these two important quotas:

  • The Network interfaces per Region quota which is the maximum number of network interfaces in an Amazon Web Services Region for your account.

  • The Elastic IP addresses per Region quota which is the maximum number of elastic IP addresses in an Amazon Web Services Region.

Both of these service quotas are adjustable, and you can request an increase for them in the Service Quotas console. For more information, see Amazon VPC service quotas in the Amazon Virtual Private Cloud User Guide.
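Because each task that uses the awsvpc network mode consumes one network interface, you can estimate ahead of a scale-out whether the Network interfaces per Region quota leaves enough room. The following is a rough sketch under stated assumptions: the function name and all numbers are hypothetical, and you would read the real quota value and current usage from your own account.

```python
def eni_headroom(quota: int, interfaces_in_use: int, planned_new_tasks: int) -> int:
    """Return the number of network interfaces left after a planned scale-out.

    Assumes one network interface per new awsvpc task. A negative result
    means that the scale-out would exceed the quota.
    """
    return quota - interfaces_in_use - planned_new_tasks


# Hypothetical values for illustration only.
remaining = eni_headroom(quota=5000, interfaces_in_use=4200, planned_new_tasks=1000)
if remaining < 0:
    print(f"Request a quota increase of at least {-remaining} network interfaces")
else:
    print(f"{remaining} network interfaces remain after scaling out")
```

Other resources in the Region, such as load balancers and Amazon EC2 instances, also use network interfaces, so treat the result as an upper bound.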

For Amazon ECS workloads that are hosted on Amazon EC2 instances, when running tasks that use the awsvpc network mode, consider the Maximum network interfaces service quota, which is the maximum number of network interfaces for each Amazon EC2 instance. This quota limits the number of tasks that you can place on an instance. You cannot adjust the quota and it's not available in the Service Quotas console. For more information, see IP addresses per network interface per instance type in the Amazon EC2 User Guide.

Although you can't change the number of network interfaces that can be attached to an Amazon EC2 instance, you can use the elastic network interface trunking feature to increase the number of available network interfaces. For example, by default a c5.large instance can have up to three network interfaces. The primary network interface for the instance counts as one. So, you can attach an additional two network interfaces to the instance. Because each task that uses the awsvpc network mode requires a network interface, you can typically only run two such tasks on this instance type. This can lead to under-utilization of your cluster capacity. If you enable elastic network interface trunking, you can increase the network interface density to place a larger number of tasks on each instance. With trunking turned on, a c5.large instance can have up to 12 network interfaces. The instance has the primary network interface and Amazon ECS creates and attaches a "trunk" network interface to the instance. As a result, with this configuration you can run 10 tasks on the instance instead of the default two tasks. For more information, see Increasing Amazon ECS Linux container instance network interfaces.
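The arithmetic in the c5.large example can be written as a short calculation. This is only an illustrative sketch: the function name is hypothetical, and the interface limits passed in are the figures quoted above, so check the limits for your own instance type.

```python
def awsvpc_task_capacity(max_network_interfaces: int, trunking_enabled: bool = False) -> int:
    """Return how many awsvpc tasks fit on one instance.

    The primary network interface is always reserved for the instance.
    With trunking turned on, the trunk network interface is reserved too.
    """
    reserved = 2 if trunking_enabled else 1
    return max(max_network_interfaces - reserved, 0)


# c5.large without trunking: 3 interfaces, 1 reserved for the primary.
print(awsvpc_task_capacity(3))                          # 2
# c5.large with trunking: 12 interfaces, primary and trunk reserved.
print(awsvpc_task_capacity(12, trunking_enabled=True))  # 10
```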

Elastic network interface API throttling

When you run tasks that use the awsvpc network mode, Amazon ECS relies on the following Amazon EC2 APIs. Each of these APIs has different API throttles. For more information, see Request throttling for the Amazon EC2 API in the Amazon EC2 API Reference.

  • CreateNetworkInterface

  • AttachNetworkInterface

  • DetachNetworkInterface

  • DeleteNetworkInterface

  • DescribeNetworkInterfaces

  • DescribeVpcs

  • DescribeSubnets

  • DescribeSecurityGroups

  • DescribeInstances

If the Amazon EC2 API calls are throttled during the elastic network interface provisioning workflows, the Amazon ECS service scheduler automatically retries with exponential back-offs. These automatic retries might sometimes delay task launches, which results in slower deployment speeds. When API throttling occurs, you will see the message "Operations are being throttled. Will try again later." in your service event messages. If you consistently reach Amazon EC2 API throttles, you can contact Amazon Web Services Support for guidance on how to increase your API throttling limits. For more information about monitoring and troubleshooting throttling errors, see Handle Amazon ECS throttling issues.
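To show why retries with exponential back-off slow down task launches, the following sketch implements a generic retry loop. It is not the Amazon ECS service scheduler's implementation. ThrottledError, call_with_backoff, and the delay values are hypothetical, and the random jitter is a common companion technique that the text above does not mention.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Hypothetical error that stands in for an API throttling response."""


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on ThrottledError with exponential back-off.

    The wait ceiling doubles after each failed attempt, up to max_delay.
    The actual wait is a random value between 0 and the ceiling so that
    many callers don't retry at the same instant.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the throttling error.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))
    raise ValueError("max_attempts must be at least 1")
```

With these illustrative defaults, the wait ceilings for the four possible retries are 0.5, 1, 2, and 4 seconds, which is how repeated throttling turns into a visibly slower deployment.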

Amazon Cloud Map

Amazon ECS service discovery and Service Connect use Amazon Cloud Map APIs to manage namespaces for your Amazon ECS services. If your services have a large number of tasks, consider the following recommendations.

Amazon Cloud Map service quotas

When Amazon ECS services are configured to use service discovery or Service Connect, the Tasks per service quota, which is the maximum number of tasks for the service, is affected by the Amazon Cloud Map Instances per service quota, which is the maximum number of instances for that service. In particular, the Amazon Cloud Map service quota limits the number of tasks that you can run to at most 1,000 tasks per service. You cannot change the Amazon Cloud Map quota. For more information, see Amazon Cloud Map service quotas.
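The interaction between the two quotas amounts to taking the smaller value. The following sketch makes that explicit. The function name is hypothetical, and the Tasks per service value in the example is a placeholder, so use the value shown for your account in the Service Quotas console.

```python
CLOUD_MAP_INSTANCES_PER_SERVICE = 1000  # Amazon Cloud Map quota, not adjustable.


def effective_max_tasks(ecs_tasks_per_service: int, uses_cloud_map: bool) -> int:
    """Return the largest task count that a single service can reach.

    Services that use service discovery or Service Connect register every
    task as an Amazon Cloud Map instance, so the lower quota applies.
    """
    if uses_cloud_map:
        return min(ecs_tasks_per_service, CLOUD_MAP_INSTANCES_PER_SERVICE)
    return ecs_tasks_per_service


# 5000 is a placeholder for your account's Tasks per service quota.
print(effective_max_tasks(5000, uses_cloud_map=True))   # 1000
print(effective_max_tasks(5000, uses_cloud_map=False))  # 5000
```

If a workload needs more tasks than this limit, one option is to split it across several Amazon ECS services.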

Amazon Cloud Map API throttling

Amazon ECS calls the ListInstances, GetInstancesHealthStatus, RegisterInstance, and DeregisterInstance Amazon Cloud Map APIs on your behalf. They help with service discovery and perform health checks when you launch a task. When multiple services that use service discovery with a large number of tasks are deployed at the same time, the calls can exceed the Amazon Cloud Map API throttling limits. When this happens, you will likely see the message "Operations are being throttled. Will try again later." in your Amazon ECS service event messages, along with slower deployment and task launch speeds. Amazon Cloud Map doesn't document throttling limits for these APIs. If you experience throttling from these APIs, you can contact Amazon Web Services Support for guidance on increasing your API throttling limits. For more recommendations about monitoring and troubleshooting such throttling errors, see Handle Amazon ECS throttling issues.