Configure simplified automatic recovery on an Amazon EC2 instance
Important
This section describes how to proactively configure recovery mechanisms on an EC2 instance. These recovery mechanisms are designed to restore instance availability when Amazon detects an underlying hardware or software issue that causes a system status check to fail. If you are currently experiencing problems accessing your instance, see Troubleshoot EC2 instances.
If Amazon detects that an instance is unavailable due to an underlying hardware or software issue, simplified automatic recovery can automatically restore instance availability by moving the instance from the host with the underlying issue to a different host.
If simplified automatic recovery occurs, Amazon sends one of the following events to your Amazon Health Dashboard, depending on the outcome:
- Success event: AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_SUCCESS
- Failure event: AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_FAILURE
To be notified of these events, you can configure notifications. For more information, see Creating your first notification configuration in Amazon User Notifications in the Amazon User Notifications User Guide. You can also use Amazon EventBridge rules to monitor simplified automatic recovery events.
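As an illustration of the EventBridge option, the following Python sketch builds an event pattern that matches both event type codes listed above. The envelope fields (`source` set to `aws.health`, `detail-type` set to `AWS Health Event`, and the code carried in `detail.eventTypeCode`) are assumptions about how Health events are delivered to EventBridge, and the rule name is hypothetical. Verify both against the events you actually receive before relying on the rule.

```python
import json

# Event type codes quoted from this section.
RECOVERY_EVENT_CODES = [
    "AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_SUCCESS",
    "AWS_EC2_SIMPLIFIED_AUTO_RECOVERY_FAILURE",
]


def build_recovery_event_pattern():
    """Return an EventBridge event pattern matching both recovery outcomes."""
    # Assumption: Health events use this source and detail-type, and expose
    # the event type code as detail.eventTypeCode.
    return {
        "source": ["aws.health"],
        "detail-type": ["AWS Health Event"],
        "detail": {
            "service": ["EC2"],
            "eventTypeCode": RECOVERY_EVENT_CODES,
        },
    }


def create_recovery_rule(rule_name="ec2-simplified-auto-recovery"):
    """Create the rule. The rule name is a hypothetical example."""
    import boto3  # imported lazily so the pattern can be built without the SDK

    events = boto3.client("events")
    return events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(build_recovery_event_pattern()),
        State="ENABLED",
    )


# Print the pattern so it can be reviewed or pasted into a rule definition.
print(json.dumps(build_recovery_event_pattern(), indent=2))
```

The rule only matches events. To receive a notification you would still attach a target, such as an SNS topic, to the rule.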
Simplified automatic recovery is enabled by default on all supported instances during instance launch. However, it can only operate if an instance is in the running state, there are no service events listed in the Amazon Health Dashboard, and there is available capacity for the instance type. In some situations, such as significant outages, capacity constraints might cause recovery attempts to fail. For more information, see Troubleshoot simplified automatic recovery failures.
You can disable simplified automatic recovery during or after launch, and re-enable it later if required.
Warning
When Amazon recovers your instance due to an underlying hardware or software issue, be aware of the following consequences: data stored in volatile memory (RAM) will be lost and the operating system’s uptime will start over from zero. To help protect against data loss, we recommend that you regularly create backups of valuable data. For more information about backup and recovery best practices for EC2 instances, see Best practices for Amazon EC2.
Automatic instance recovery mechanisms are designed for individual instances. For guidance on building a resilient system, see Build a resilient system.
Requirements for enabling simplified automatic recovery
Simplified automatic recovery can be enabled on instances that meet the following criteria:
- Instance types
  - General purpose: A1 | M3 | M4 | M5 | M5a | M5n | M5zn | M6a | M6g | M6i | M6in | M7a | M7g | M7i | M7i-flex | M8g | T1 | T2 | T3 | T3a | T4g
  - Compute optimized: C3 | C4 | C5 | C5a | C5n | C6a | C6g | C6gn | C6i | C6in | C7a | C7g | C7gn | C7i | C7i-flex | C8g
  - Memory optimized: R3 | R4 | R5 | R5a | R5b | R5n | R6a | R6g | R6i | R6in | R7a | R7g | R7i | R7iz | R8g | u-3tb1 | u-6tb1 | u-9tb1 | u-12tb1 | u-18tb1 | u-24tb1 | u7i-6tb | u7i-8tb | u7i-12tb | u7in-16tb | u7in-24tb | u7in-32tb | u7inh-32tb | X1 | X1e | X2iezn | X8g
  - Accelerated computing: G3 | G3s | G5g | Inf1 | P2 | P3 | VT1
  - High-performance computing: Hpc6a | Hpc7a | Hpc7g
- Tenancy
  - Shared
  - Dedicated Instance. For more information, see Amazon EC2 Dedicated Instances.
Limitations
Simplified automatic recovery is not supported for instances with the following characteristics:
- Instance size: metal instances
- Tenancy: Dedicated Host. For Dedicated Hosts, use Dedicated Host Auto Recovery instead.
- Storage: Instances with instance store volumes
- Networking: Instances using an Elastic Fabric Adapter
- Auto Scaling: Instances that are part of an Auto Scaling group
- Maintenance: Instances currently undergoing a scheduled maintenance event
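The requirements and limitations above can be combined into a single pre-flight check. The following Python sketch is illustrative only: the `InstanceProfile` class and `ineligibility_reasons` function are hypothetical helpers rather than part of any AWS SDK, and the family list is copied from this page, so it goes stale when the page changes. The tenancy values `default`, `dedicated`, and `host` are the EC2 API names for Shared, Dedicated Instance, and Dedicated Host.

```python
from dataclasses import dataclass

# Supported instance families, copied from the list on this page (lowercased).
SUPPORTED_FAMILIES = set("""
a1 m3 m4 m5 m5a m5n m5zn m6a m6g m6i m6in m7a m7g m7i m7i-flex m8g
t1 t2 t3 t3a t4g
c3 c4 c5 c5a c5n c6a c6g c6gn c6i c6in c7a c7g c7gn c7i c7i-flex c8g
r3 r4 r5 r5a r5b r5n r6a r6g r6i r6in r7a r7g r7i r7iz r8g
u-3tb1 u-6tb1 u-9tb1 u-12tb1 u-18tb1 u-24tb1
u7i-6tb u7i-8tb u7i-12tb u7in-16tb u7in-24tb u7in-32tb u7inh-32tb
x1 x1e x2iezn x8g
g3 g3s g5g inf1 p2 p3 vt1
hpc6a hpc7a hpc7g
""".split())


@dataclass
class InstanceProfile:
    """Hypothetical summary of the facts needed for the check."""
    instance_type: str                      # for example "m5.large"
    tenancy: str = "default"                # "default", "dedicated", or "host"
    has_instance_store_volumes: bool = False
    uses_efa: bool = False
    in_auto_scaling_group: bool = False
    under_scheduled_maintenance: bool = False


def ineligibility_reasons(profile):
    """Return the reasons an instance is not supported; empty if supported."""
    # "m7i-flex.large" splits into family "m7i-flex" and size "large".
    family, _, size = profile.instance_type.lower().partition(".")
    reasons = []
    if family not in SUPPORTED_FAMILIES:
        reasons.append(f"instance family {family!r} is not in the supported list")
    if size.startswith("metal"):
        reasons.append("metal instance sizes are not supported")
    if profile.tenancy == "host":
        reasons.append("Dedicated Host tenancy: use Dedicated Host Auto Recovery")
    elif profile.tenancy not in ("default", "dedicated"):
        reasons.append(f"unrecognized tenancy {profile.tenancy!r}")
    if profile.has_instance_store_volumes:
        reasons.append("instance has instance store volumes")
    if profile.uses_efa:
        reasons.append("instance uses an Elastic Fabric Adapter")
    if profile.in_auto_scaling_group:
        reasons.append("instance is part of an Auto Scaling group")
    if profile.under_scheduled_maintenance:
        reasons.append("instance is undergoing a scheduled maintenance event")
    return reasons


print(ineligibility_reasons(InstanceProfile("m5.large")))
print(ineligibility_reasons(InstanceProfile("c5.metal", tenancy="host")))
```

Returning a list of reasons rather than a single boolean makes it easier to report every blocking condition at once.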
Configure simplified automatic recovery
Simplified automatic recovery is enabled by default when you launch a supported instance. You can set the automatic recovery behavior to disabled during or after launching the instance. The default configuration doesn't enable simplified automatic recovery for an unsupported instance.
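As a minimal sketch of how this setting might be changed programmatically, the following Python code uses the EC2 `ModifyInstanceMaintenanceOptions` operation, which boto3 exposes as `modify_instance_maintenance_options`, with `AutoRecovery` set to `disabled` or `default`. The helper names and the instance ID in the comments are hypothetical. The client is passed in as a parameter so that the logic can be exercised without AWS credentials.

```python
def auto_recovery_value(enabled):
    """Map a boolean to the AutoRecovery setting: 'default' or 'disabled'."""
    return "default" if enabled else "disabled"


def set_auto_recovery(ec2_client, instance_id, enabled):
    """Change the setting on an existing instance (after launch)."""
    return ec2_client.modify_instance_maintenance_options(
        InstanceId=instance_id,
        AutoRecovery=auto_recovery_value(enabled),
    )


def launch_maintenance_options(enabled):
    """Build the MaintenanceOptions argument for run_instances (at launch)."""
    return {"AutoRecovery": auto_recovery_value(enabled)}


# Example usage (requires boto3 and credentials; the instance ID is made up):
#   import boto3
#   ec2 = boto3.client("ec2")
#   set_auto_recovery(ec2, "i-0123456789abcdef0", enabled=False)
#   ec2.run_instances(..., MaintenanceOptions=launch_maintenance_options(False))
```

Setting the value back to `default` restores the default behavior, which enables simplified automatic recovery only for supported instances.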
Troubleshoot simplified automatic recovery failures
If simplified automatic recovery fails to recover your instance, consider the following issues:
- Amazon service events are running
  Simplified automatic recovery does not operate during service events in the Amazon Health Dashboard. You might not receive recovery failure notifications for such events. For the latest service availability information, see the Service health status page.
- Insufficient capacity
  There is temporarily insufficient replacement hardware to migrate the instance.
- Maximum daily recovery attempts reached
  The instance has reached the maximum daily allowance for recovery attempts. Your instance might subsequently be retired if automatic recovery fails and a hardware degradation is determined to be the root cause of the original failed system status check.
If the instance’s system status check failure persists despite multiple recovery attempts, see Troubleshoot instances with failed status checks for additional guidance.
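To confirm whether the system status check is still failing after recovery attempts, you could inspect the output of the EC2 `DescribeInstanceStatus` operation. The following Python sketch assumes the response shape returned by boto3's `describe_instance_status`, where each entry in `InstanceStatuses` carries `SystemStatus.Status` and a failed check is reported as `impaired`. The helper name and the instance ID in the comment are hypothetical.

```python
def impaired_system_checks(response):
    """Return the IDs of instances whose system status check is 'impaired'."""
    return [
        status["InstanceId"]
        for status in response.get("InstanceStatuses", [])
        if status.get("SystemStatus", {}).get("Status") == "impaired"
    ]


# Example usage (requires boto3 and credentials; the instance ID is made up):
#   import boto3
#   ec2 = boto3.client("ec2")
#   response = ec2.describe_instance_status(
#       InstanceIds=["i-0123456789abcdef0"],
#       IncludeAllInstances=True,  # also report instances that are not running
#   )
#   print(impaired_system_checks(response))
```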