
Zonal autoshift in Amazon Route 53 Application Recovery Controller

With zonal autoshift, you authorize Amazon to shift away resource traffic for an application from an Availability Zone during events, on your behalf, to help reduce time to recovery. Amazon starts an autoshift when internal telemetry indicates that there is an Availability Zone impairment that could potentially impact customers. When Amazon starts an autoshift, application traffic to resources that you've configured for zonal autoshift starts shifting away from the Availability Zone.

Be aware that Route 53 ARC does not inspect the health of individual resources. Amazon starts an autoshift when Amazon telemetry detects that there is an Availability Zone impairment that could potentially impact customers. In some cases, traffic might be shifted away for resources that are not experiencing impact.

With zonal autoshift, you also authorize Amazon to shift resource traffic for an application away from an Availability Zone, on your behalf, for regular practice runs. Practice runs are required for zonal autoshift. They regularly test that your application can operate normally without one Availability Zone by starting zonal shifts that move traffic for a resource away from that Availability Zone, which helps you confirm that shifting traffic away during an autoshift is safe for your application. Practice runs take place about once a week and report an outcome, such as SUCCEEDED or FAILED, to help you understand whether the application operates as expected.

Important

Before you configure practice runs or enable zonal autoshift, we strongly recommend that you pre-scale your application resource capacity in all Availability Zones in the Region where your application resources are deployed. You should not rely on scaling on demand when an autoshift or practice run starts. Zonal autoshift, including practice runs, works independently of auto scaling and does not wait for scaling actions to complete. Relying on auto scaling instead of pre-scaling can lengthen the time that your application takes to recover.

If you use auto scaling to handle regular cycles of traffic, we strongly recommend that you set the minimum capacity of your auto scaling configuration high enough that your application continues operating normally with the loss of an Availability Zone.
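The sizing arithmetic behind this recommendation can be sketched as follows. This is a minimal illustration, assuming capacity is measured in identical instances and that traffic spreads evenly across the Availability Zones that remain after a shift. The function names and the example workload are hypothetical.

```python
import math


def min_instances_per_az(peak_instances_needed, az_count):
    """Instances each AZ must run so the remaining AZs can carry peak load alone."""
    if az_count < 2:
        raise ValueError("need at least two Availability Zones to survive losing one")
    if peak_instances_needed < 0:
        raise ValueError("peak_instances_needed must not be negative")
    surviving_azs = az_count - 1
    # Round up: a fractional instance still requires a whole one.
    return math.ceil(peak_instances_needed / surviving_azs)


def min_total_capacity(peak_instances_needed, az_count):
    """Minimum capacity across all AZs that tolerates the loss of any one AZ."""
    return min_instances_per_az(peak_instances_needed, az_count) * az_count


if __name__ == "__main__":
    # Hypothetical workload: 12 instances serve peak traffic, deployed in 3 AZs.
    print(min_instances_per_az(12, 3))  # 6 per AZ
    print(min_total_capacity(12, 3))    # 18 in total
```

In this example, pre-scaling for the loss of one of three Availability Zones means running 18 instances rather than the 12 that peak traffic alone would require.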

If you plan to enable zonal autoshift or configure practice runs, after you pre-scale your application resource capacity, test that your application can operate normally without one Availability Zone. To test this, start a zonal shift to move traffic for a resource away from an Availability Zone.
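As a rough sketch of starting such a test shift programmatically, the following assumes the boto3 `arc-zonal-shift` client and its `start_zonal_shift` operation. The ARN, the Availability Zone ID, the 30-minute expiry, and the helper names are placeholders chosen for illustration, not values from this guide.

```python
def build_zonal_shift_request(resource_arn, away_from_az_id,
                              expires_in="30m",
                              comment="Pre-autoshift capacity test"):
    """Build keyword arguments for a StartZonalShift call."""
    if not resource_arn.startswith("arn:"):
        raise ValueError("resource_arn must be an ARN")
    return {
        "resourceIdentifier": resource_arn,
        "awayFrom": away_from_az_id,   # AZ ID to move traffic away from
        "expiresIn": expires_in,       # the shift is temporary and expires
        "comment": comment,
    }


def start_test_shift(client, resource_arn, away_from_az_id):
    """Start a zonal shift and return its ID.

    `client` is assumed to behave like boto3.client("arc-zonal-shift").
    """
    request = build_zonal_shift_request(resource_arn, away_from_az_id)
    response = client.start_zonal_shift(**request)
    return response["zonalShiftId"]


# Usage (requires boto3 and credentials; identifiers below are placeholders):
#   import boto3
#   client = boto3.client("arc-zonal-shift")
#   shift_id = start_test_shift(
#       client,
#       "arn:aws-cn:elasticloadbalancing:cn-north-1:111122223333:"
#       "loadbalancer/app/example/0123456789abcdef",
#       "cnn1-az1",
#   )
```

Keeping the request builder separate from the call makes the parameters easy to review before any traffic is moved.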

To ensure your tests with zonal shift are effective, it's important to validate that traffic drains as expected from the AZ you shift away from. Both Application Load Balancers and Network Load Balancers provide per-AZ metrics in Amazon CloudWatch that you can use to monitor this. Depending on how long a service and its clients reuse connections, traffic might continue to reach the AZ that you have shifted away from for longer than you expect. To learn more, see Limit the time that clients stay connected to your endpoints.
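One way to evaluate drain time from per-AZ metrics is sketched below. It assumes you have already retrieved per-minute request counts for each AZ from CloudWatch (for example, the `RequestCount` metric with the `AvailabilityZone` dimension for an Application Load Balancer). The 1% threshold, the function names, and the sample data are illustrative assumptions.

```python
def az_traffic_share(requests_by_az, az):
    """Fraction of total requests in one sample that reached the given AZ."""
    total = sum(requests_by_az.values())
    if total == 0:
        return 0.0
    return requests_by_az.get(az, 0) / total


def minutes_until_drained(samples, az, threshold=0.01):
    """Return the first minute from which the AZ's share stays at or below threshold.

    `samples` is a chronological list of {az_id: request_count} dicts, one per
    minute, starting when the zonal shift began. Returns None if the AZ never
    settles below the threshold.
    """
    drained_at = None
    for minute, sample in enumerate(samples):
        if az_traffic_share(sample, az) <= threshold:
            if drained_at is None:
                drained_at = minute
        else:
            # Traffic came back (for example, over reused connections): reset.
            drained_at = None
    return drained_at


if __name__ == "__main__":
    # Made-up per-minute request counts after shifting away from "az1".
    samples = [
        {"az1": 100, "az2": 100, "az3": 100},
        {"az1": 40, "az2": 130, "az3": 130},
        {"az1": 5, "az2": 150, "az3": 145},
        {"az1": 0, "az2": 150, "az3": 150},
        {"az1": 0, "az2": 150, "az3": 150},
    ]
    print(minutes_until_drained(samples, "az1"))  # 3
```

A result that is much larger than expected can indicate long-lived client connections that keep sending traffic to the shifted AZ.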

After you verify, by starting and evaluating a zonal shift, that your application can continue operating normally with traffic shifted away from an Availability Zone, the regular practice runs that Route 53 ARC performs help you to confirm, on an ongoing basis, that you have enough capacity for an autoshift.

In addition to enabling zonal autoshift for a load balancer resource in the Route 53 ARC console, you have the option to instead enable zonal autoshift for a specific load balancer in the Amazon EC2 console. To learn more about enabling zonal autoshift with Elastic Load Balancing, see Zonal shift in the Elastic Load Balancing User Guide.

Autoshifts and practice run zonal shifts are temporary. With autoshifts, when the affected Availability Zone recovers, Amazon stops shifting traffic for resources away from the Availability Zone. Application traffic for customers returns to all Availability Zones in the Region. With a practice run, traffic is shifted away from an Availability Zone for a single resource for about 30 minutes, and then shifted back to all Availability Zones in the Region.

You can configure Amazon EventBridge notifications to alert you about autoshifts and practice runs. For more information, see Using Route 53 ARC with Amazon EventBridge.
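As a hedged sketch of what such a rule's event pattern might look like, the following builds a pattern and checks sample events against it locally. The source value `aws.arc-zonal-shift` and the detail-type string `Autoshift In Progress` are assumptions here; confirm the exact values in Using Route 53 ARC with Amazon EventBridge. The matcher is a simplified stand-in for top-level exact matching only, not the full EventBridge pattern language.

```python
import json

# Assumed event source; verify against the EventBridge section of this guide.
EVENT_SOURCE = "aws.arc-zonal-shift"


def build_event_pattern(detail_types=None):
    """Build an event pattern dict, optionally restricted to some detail types."""
    pattern = {"source": [EVENT_SOURCE]}
    if detail_types:
        pattern["detail-type"] = list(detail_types)
    return pattern


def matches(pattern, event):
    """Simplified local check: each pattern field must list the event's value."""
    return all(event.get(field) in allowed for field, allowed in pattern.items())


if __name__ == "__main__":
    # "Autoshift In Progress" is an assumed detail type used for illustration.
    pattern = build_event_pattern(["Autoshift In Progress"])
    print(json.dumps(pattern, indent=2))
    sample_event = {"source": EVENT_SOURCE, "detail-type": "Autoshift In Progress"}
    print(matches(pattern, sample_event))  # True
```

The JSON printed by this sketch is the shape that an EventBridge rule expects for its event pattern.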

About zonal autoshift

Zonal autoshift is a capability where Amazon shifts application resource traffic away from an Availability Zone, on your behalf. Amazon starts an autoshift when internal telemetry indicates that there is an Availability Zone impairment that could potentially impact customers. The internal telemetry incorporates metrics from several sources, including the Amazon network, and the Amazon EC2 and Elastic Load Balancing services.

You can enable zonal autoshift for Network Load Balancers and Application Load Balancers, provided that cross-zone load balancing is turned off.

When you deploy and run Amazon applications on load balancers in multiple (typically three) AZs in a Region, and you pre-scale to support static stability, Amazon can quickly recover customer applications in an AZ by shifting traffic away with an autoshift. By shifting away resource traffic to other AZs in the Region, Amazon can reduce the duration and severity of potential impact caused by power outages, hardware or software issues in an AZ, or other impairments.

When Amazon begins an autoshift for a load balancing resource, Route 53 ARC sets Amazon Route 53 health checks to unhealthy for the corresponding IP addresses for the load balancer resource, so that traffic for the resource is no longer directed to the AZ. When Amazon determines that the AZ is ready for application traffic to return, Route 53 ARC restores the Route 53 health checks, and the original zonal IP addresses are restored.
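The effect of this mechanism on DNS answers can be modeled with a small simulation. This is a conceptual sketch only: the AZ names and IP addresses are made up, and the function models the outcome described above rather than any actual Route 53 implementation.

```python
def resolvable_ips(zonal_ips, shifted_away):
    """Return the IP addresses still returned for the load balancer.

    `zonal_ips` maps each AZ to the load balancer's zonal IP addresses.
    `shifted_away` is the set of AZs whose health checks are set to unhealthy.
    """
    return [
        ip
        for az, ips in sorted(zonal_ips.items())
        if az not in shifted_away
        for ip in ips
    ]


if __name__ == "__main__":
    # Hypothetical zonal IP addresses (documentation address range).
    zonal_ips = {
        "az1": ["192.0.2.10"],
        "az2": ["192.0.2.20"],
        "az3": ["192.0.2.30"],
    }
    # During an autoshift away from az1, its address is no longer returned.
    print(resolvable_ips(zonal_ips, {"az1"}))  # ['192.0.2.20', '192.0.2.30']
    # After recovery, the original zonal addresses are all returned again.
    print(resolvable_ips(zonal_ips, set()))  # ['192.0.2.10', '192.0.2.20', '192.0.2.30']
```

Because the shift works through DNS answers, clients that hold open connections or cached addresses can keep reaching the shifted AZ for a while.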

When you enable zonal autoshift for a resource, you must also configure a practice run for the resource. Amazon performs practice runs about once a week, for about 30 minutes each, to help you make sure that you have enough capacity to run your application without one of the Availability Zones in the Region.
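The two-step setup, configuring a practice run and then enabling autoshift, might look like the following when scripted. This sketch assumes the boto3 `arc-zonal-shift` client operations `create_practice_run_configuration` and `update_zonal_autoshift_configuration`. The outcome alarm is a CloudWatch alarm that you supply for monitoring practice runs; the ARNs and helper names are placeholders.

```python
def practice_run_request(resource_arn, outcome_alarm_arn):
    """Keyword arguments for creating a practice run configuration."""
    return {
        "resourceIdentifier": resource_arn,
        "outcomeAlarms": [
            {"type": "CLOUDWATCH", "alarmIdentifier": outcome_alarm_arn},
        ],
    }


def autoshift_request(resource_arn, enabled=True):
    """Keyword arguments for turning zonal autoshift on or off."""
    return {
        "resourceIdentifier": resource_arn,
        "zonalAutoshiftStatus": "ENABLED" if enabled else "DISABLED",
    }


def enable_autoshift_with_practice_run(client, resource_arn, outcome_alarm_arn):
    """Configure the required practice run first, then enable autoshift.

    `client` is assumed to behave like boto3.client("arc-zonal-shift").
    """
    client.create_practice_run_configuration(
        **practice_run_request(resource_arn, outcome_alarm_arn)
    )
    client.update_zonal_autoshift_configuration(**autoshift_request(resource_arn))
```

The practice run configuration is created before autoshift is enabled because practice runs are required for zonal autoshift.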

As with zonal shift, there are a few specific scenarios where zonal autoshift does not shift traffic away from the AZ. For example, if the load balancer target groups in the AZs don't have any instances, or if all of the instances are unhealthy, then the load balancer is in a fail-open state and traffic can't be shifted away from one of the AZs.
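The fail-open condition in this example can be expressed as a simple check. This is a minimal sketch of the rule as stated above, assuming you already have the health status of each registered target per AZ; the function name and data shape are hypothetical.

```python
def is_fail_open(target_health_by_az):
    """Return True when no healthy targets exist in any AZ.

    `target_health_by_az` maps each AZ to a list of booleans, one per
    registered target, where True means the target is healthy. Both cases
    from the text are covered: no targets at all, or all targets unhealthy.
    """
    all_targets = [
        healthy
        for targets in target_health_by_az.values()
        for healthy in targets
    ]
    return not any(all_targets)


if __name__ == "__main__":
    print(is_fail_open({"az1": [], "az2": [], "az3": []}))          # True
    print(is_fail_open({"az1": [False], "az2": [False, False]}))    # True
    print(is_fail_open({"az1": [False], "az2": [True]}))            # False
```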

To learn more about zonal autoshift, see Zonal autoshift in Amazon Route 53 Application Recovery Controller.