Restore testing
Restore testing, a feature offered by Amazon Backup, provides automated and periodic evaluation of restore viability, as well as the ability to monitor restore job duration times.
Contents
- Overview
- Restore testing compared with restore process
- Restore testing management
- Create a restore testing plan
- Update a restore testing plan
- View existing restore testing plans
- View restore testing jobs
- Delete a restore testing plan
- Audit restore testing
- Restore testing quotas and parameters
- Restore testing failure troubleshooting
- Restore testing inferred metadata
- Restore testing validation
Overview
First, you create a restore testing plan where you provide a name for your plan, the frequency for your restore tests, and the target start time. Then, you assign the resources you want to include in your plan. You then choose to include specific or random recovery points in your test. Amazon Backup intelligently infers the metadata that will be needed for your restore job to be successful.
When the scheduled time in your plan arrives, Amazon Backup starts restore jobs based on your plan and monitors the time taken to complete the restore.
After the restore test plan completes its run, you can use the results to show compliance for organizational or governance requirements such as the successful completion of restore test scenarios or the restore job completion time.
Optionally, you can use Restore testing validation to confirm the restore test results.
Once the optional validation completes or the validation window closes, Amazon Backup deletes the resources involved in the restore test, in accordance with service SLAs.
At the end of the testing process, you can view the results and the completion time of the tests.
Restore testing compared with restore process
Restore testing runs restore jobs in the same way as on-demand restores and uses the same recovery points (backups) as an on-demand restore. You will see calls to StartRestoreJob in CloudTrail (if opted-in) for each job started by restore testing.
However, there are a few differences between the operation of a scheduled restore test and an on-demand restore operation:
| | Restore testing | Restore |
|---|---|---|
| Account | Recommended best practice is to designate an account to be used for restore tests. | You can restore resources from an account. |
| Amazon Backup Audit Manager | Can turn on a control to confirm if a restore test meets specified restore objectives. | |
| Cadence | Periodically, as part of a scheduled plan. | On demand. |
| Resources | The resource types you can assign to your testing plan include: Aurora, Amazon DocumentDB, Amazon DynamoDB, Amazon EBS, Amazon EC2, Amazon EFS, Amazon FSx (Lustre, ONTAP, OpenZFS, Windows), Amazon Neptune, Amazon RDS, and Amazon S3. | All resources can be restored. |
| Results | When a restore testing job completes, the restored resource is deleted once the restore testing validation window ends. | Once the restore job is completed, the restored version of the resource remains. |
| Tags | For resource types which support tag on restore, testing applies tags on restore. | Tags are optional for supported resources. |
Restore testing management
You can create, view, update, or delete a restore testing plan in the Amazon Backup console. You can also use the Amazon CLI; the commands are prefixed with aws backup.
Data deletion
When a restore test is finished, Amazon Backup begins deleting the resources involved in the test. This deletion is not instantaneous. Each resource has an underlying configuration that determines how those resources are stored and lifecycled. For example, if Amazon S3 buckets are part of the restore test, lifecycle rules are added to the bucket. It can take up to several days for the rules to execute and for the bucket and its objects to be fully deleted, but you are charged for these resources only until the day the lifecycle rule initiates (by default, 1 day). Speed of deletion depends on the resource type.
Resources that are part of a restore testing plan contain a tag called awsbackup-restore-test. If a user removes this tag, Amazon Backup cannot delete the resource at the end of the testing period and the user will have to delete it manually instead.
To check why resources may not have been deleted as expected, you can search through failed jobs in the console or use the command line interface to call the API request DescribeRestoreJob to retrieve deletion status messages.
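As a rough illustration of the DescribeRestoreJob check described above, the following Python sketch reads the deletion status from a response. It assumes a client object that behaves like boto3's Backup client, and it assumes the response carries `DeletionStatus` and `DeletionStatusMessage` keys; confirm both against the API reference before relying on them. The function name `deletion_status` is hypothetical.

```python
from typing import Any, Optional, Tuple


def deletion_status(backup_client: Any, restore_job_id: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (DeletionStatus, DeletionStatusMessage) for one restore job.

    backup_client is expected to behave like boto3.client("backup").
    The two response keys are assumptions about the DescribeRestoreJob
    response shape; .get() returns None if either key is absent.
    """
    response = backup_client.describe_restore_job(RestoreJobId=restore_job_id)
    return response.get("DeletionStatus"), response.get("DeletionStatusMessage")
```

A job whose status comes back as failed, together with its message, is the starting point for deciding whether the restored resource needs manual cleanup.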
Backup plans (non-restore testing plans) ignore resources created by restore testing (those with tag awsbackup-restore-test or a name starting with awsbackup-restore-test).
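The same two markers can be used in your own cleanup or inventory scripts to recognize resources left behind by a restore test. The sketch below is only an illustration: the function name and the shape of the inputs (a resource name and a dictionary of tag keys to values) are assumptions, not part of the service.

```python
from typing import Mapping

# Marker used both as a tag key and as a resource-name prefix.
RESTORE_TEST_MARKER = "awsbackup-restore-test"


def is_restore_test_resource(name: str, tags: Mapping[str, str]) -> bool:
    """True if the resource carries the restore-testing tag or name prefix."""
    return RESTORE_TEST_MARKER in tags or name.startswith(RESTORE_TEST_MARKER)
```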
Cost control
Restore testing has a cost per restore test. Depending on what resources are included in your restore testing plan, the restore jobs that are part of the plan may also have a cost. For details, see Amazon Backup Pricing.
When you set up a restore testing plan for the first time, you may find it beneficial to include a minimum number of resource types and protected resources to familiarize yourself with the feature, the process, and the average costs involved. You can update a plan after its creation to add more resource types and protected resources.
Create a restore testing plan
A restore testing plan has two parts: plan creation and assigning resources.
When you use the console, these parts are sequential. In the first part, you set the name, frequency, and start times. During the second part you assign resources to your testing plan.
When using Amazon CLI and API, first use create-restore-testing-plan. After you receive a successful response and the plan has been created, then use create-restore-testing-selection for each resource type to include in your plan.
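The two-step order above can be sketched in Python by building the two request payloads separately. This is a minimal sketch, not a reference: the helper names are hypothetical, and the field names (`RestoreTestingPlanName`, `ScheduleExpression`, `StartWindowHours`, `RecoveryPointSelection`, `ProtectedResourceType`, and so on) reflect my understanding of the API request shapes and should be verified against the API reference. The default values are arbitrary examples.

```python
from typing import Any, Dict, Iterable


def build_plan_request(name: str, schedule_expression: str,
                       start_window_hours: int = 24,
                       selection_window_days: int = 30,
                       algorithm: str = "LATEST_WITHIN_WINDOW") -> Dict[str, Any]:
    """Step 1: payload for create-restore-testing-plan (field names assumed)."""
    return {
        "RestoreTestingPlan": {
            "RestoreTestingPlanName": name,
            "ScheduleExpression": schedule_expression,
            "StartWindowHours": start_window_hours,
            "RecoveryPointSelection": {
                "Algorithm": algorithm,              # or "RANDOM_WITHIN_WINDOW"
                "IncludeVaults": ["*"],              # assumed wildcard for all vaults
                "RecoveryPointTypes": ["SNAPSHOT"],
                "SelectionWindowDays": selection_window_days,
            },
        }
    }


def build_selection_request(plan_name: str, selection_name: str,
                            resource_type: str, iam_role_arn: str,
                            resource_arns: Iterable[str]) -> Dict[str, Any]:
    """Step 2: payload for create-restore-testing-selection, one per resource type."""
    return {
        "RestoreTestingPlanName": plan_name,
        "RestoreTestingSelection": {
            "RestoreTestingSelectionName": selection_name,
            "ProtectedResourceType": resource_type,
            "IamRoleArn": iam_role_arn,
            "ProtectedResourceArns": list(resource_arns),
        },
    }
```

With boto3, these dictionaries would presumably be passed as keyword arguments to `create_restore_testing_plan` and then `create_restore_testing_selection`, sending the second request only after the first succeeds.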
When you create a restore testing plan, we create a service-linked role for you. For more information, see Using roles for restore testing.
Recovery point determination
Each time a testing plan runs (according to the frequency and start time you specified), one eligible recovery point per protected resource in selection is restored by the restore test. If no recovery points for a resource meet the recovery point selection criteria, that resource will not be included in the test.
A recovery point for a protected resource in a testing selection is eligible if it meets the criteria for the specified time frame and included vaults in the restore testing plan.
A protected resource is selected if the restore testing selection includes the resource type and if either of the following conditions is true:
- The resource ARN is specified in that selection; or,
- The tag conditions on that selection match the tags on the latest recovery point for the resource
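The selection rule above can be written out as a small predicate. The sketch below is an illustration under stated assumptions: the `Selection` class and function names are hypothetical, ARNs are compared by exact match only, all tag conditions must hold for a tag match, and a selection with no tag conditions is treated as matching nothing by tags.

```python
from dataclasses import dataclass, field
from typing import List, Mapping, Tuple


@dataclass
class Selection:
    resource_type: str
    resource_arns: List[str] = field(default_factory=list)
    string_equals: List[Tuple[str, str]] = field(default_factory=list)
    string_not_equals: List[Tuple[str, str]] = field(default_factory=list)


def tags_match(selection: Selection, tags: Mapping[str, str]) -> bool:
    """Assumed semantics: every StringEquals and StringNotEquals condition must hold."""
    if not selection.string_equals and not selection.string_not_equals:
        return False  # no tag conditions configured, so nothing matches by tags
    equals_ok = all(tags.get(k) == v for k, v in selection.string_equals)
    not_equals_ok = all(tags.get(k) != v for k, v in selection.string_not_equals)
    return equals_ok and not_equals_ok


def is_selected(selection: Selection, resource_type: str, resource_arn: str,
                latest_recovery_point_tags: Mapping[str, str]) -> bool:
    """Resource type must match, then either the ARN is listed or the tags match."""
    if resource_type != selection.resource_type:
        return False
    return (resource_arn in selection.resource_arns
            or tags_match(selection, latest_recovery_point_tags))
```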
Update a restore testing plan
You can update parts of your restore testing plan and the resource selections within it through the console or Amazon CLI.
View existing restore testing plans
View restore testing jobs
Delete a restore testing plan
Audit restore testing
Restore testing integrates with Amazon Backup Audit Manager to help you evaluate if a restored resource completed within your target restore time.
For more information, see Restore time for resources meet target control in Amazon Backup Audit Manager controls and remediation.
Restore testing quotas and parameters
- 100 restore testing plans
- 50 tags can be added to each restore testing plan
- 30 selections per plan
- 30 protected resource ARNs per selection
- 30 protected resource conditions per selection (including those within both StringEquals and StringNotEquals)
- 30 vault selectors per selection
- Max selection window days: 365 days
- Start window hours: Min: 1 hour; Max: 168 hours (7 days)
- Max plan name length: 50 characters
- Max selection name length: 50 characters
Additional information regarding limits can be viewed at Amazon Backup quotas.
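Several of the limits listed above can be checked locally before a plan is submitted. The following sketch covers a subset of them (name lengths, start window, selection window, selections per plan, ARNs and conditions per selection). The function name and the dictionary keys used to describe a selection (`name`, `arns`, `string_equals`, `string_not_equals`) are hypothetical.

```python
from typing import Any, Dict, List

# Limits copied from the quota list in this section.
MAX_SELECTIONS_PER_PLAN = 30
MAX_ARNS_PER_SELECTION = 30
MAX_CONDITIONS_PER_SELECTION = 30
MAX_SELECTION_WINDOW_DAYS = 365
MIN_START_WINDOW_HOURS = 1
MAX_START_WINDOW_HOURS = 168
MAX_NAME_LENGTH = 50


def plan_violations(plan_name: str, start_window_hours: int,
                    selection_window_days: int,
                    selections: List[Dict[str, Any]]) -> List[str]:
    """Return a human-readable message for each limit the plan exceeds."""
    problems = []
    if len(plan_name) > MAX_NAME_LENGTH:
        problems.append("plan name exceeds 50 characters")
    if not MIN_START_WINDOW_HOURS <= start_window_hours <= MAX_START_WINDOW_HOURS:
        problems.append("start window must be between 1 and 168 hours")
    if selection_window_days > MAX_SELECTION_WINDOW_DAYS:
        problems.append("selection window exceeds 365 days")
    if len(selections) > MAX_SELECTIONS_PER_PLAN:
        problems.append("more than 30 selections in the plan")
    for sel in selections:
        name = sel["name"]
        if len(name) > MAX_NAME_LENGTH:
            problems.append(f"selection name {name!r} exceeds 50 characters")
        if len(sel.get("arns", [])) > MAX_ARNS_PER_SELECTION:
            problems.append(f"selection {name!r} has more than 30 ARNs")
        # StringEquals and StringNotEquals conditions count toward one shared limit.
        conditions = len(sel.get("string_equals", [])) + len(sel.get("string_not_equals", []))
        if conditions > MAX_CONDITIONS_PER_SELECTION:
            problems.append(f"selection {name!r} has more than 30 conditions")
    return problems
```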
Restore testing failure troubleshooting
If you have restore testing jobs with a restore status of Failed, the following reasons can help you determine the cause and remedy.
Error messages can be viewed in the Amazon Backup console in the job status details page or by using the CLI commands list-restore-jobs-by-protected-resource or list-restore-jobs.
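When working from the output of list-restore-jobs rather than the console, a short helper can pull out just the failed jobs and their messages. This sketch assumes each job summary is a dictionary with `RestoreJobId`, `Status`, and `StatusMessage` keys, as I understand the response shape; the function name is hypothetical.

```python
from typing import Any, Dict, Iterable


def failed_job_messages(restore_jobs: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Map each failed job's ID to its status message (empty string if missing)."""
    return {
        job["RestoreJobId"]: job.get("StatusMessage", "")
        for job in restore_jobs
        if job.get("Status") == "FAILED"
    }
```

The returned messages can then be compared against the errors listed in this section.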
- Error: No default VPC for this user. GroupName is only supported for EC2-Classic and default VPC.
  Solution 1: Update your restore testing selection and override the parameter SubnetId. The Amazon Backup console displays this parameter as "Subnet".
  Solution 2: Recreate the default VPC.
  Resource types affected: Amazon EC2
- Error: No subnets found for the default VPC [vpc]. Please specify a subnet.
  Solution 1: Update your restore testing selection and override the SubnetId restore parameter. The Amazon Backup console displays this parameter as "Subnet".
  Solution 2: Create a default subnet in the default VPC.
  Resource types affected: Amazon EC2
- Error: No default subnet detected in VPC. Please contact Amazon Web Services Support to recreate default Subnets.
  Solution 1: Update your restore testing selection and override the DBSubnetGroupName restore parameter. The Amazon Backup console displays this parameter as "Subnet group".
  Solution 2: Create a default subnet in the default VPC.
  Resource types affected: Amazon Aurora, Amazon DocumentDB, Amazon RDS, Neptune
- Error: IAM Role cannot be assumed by Amazon Backup.
  Solution: The restore role must be assumable by Amazon Backup. Either update the role's trust policy in IAM to allow it to be assumed by "backup.amazonaws.com" or update your restore testing selection to use a role that is assumable by Amazon Backup.
  Resource types affected: all
- Error: Access denied to KMS key. or The specified Amazon KMS key ARN does not exist, is not enabled or you do not have permissions to access it.
  Solution: Verify the following:
  - The restore role has access to the Amazon KMS key used to encrypt your backups and, if applicable, the KMS key used to encrypt the restored resource.
  - The resource policies on the above KMS key(s) allow the restore role to access them.
  If the above conditions are not yet met, configure the restore role and the resource policies for appropriate access. Then, run the restore testing job again.
  Resource types affected: all
- Errors: User or ARN is not authorized to perform action on resource because no identity based policy allows the action. Access denied performing s3:CreateBucket on awsbackup-restore-test-xxxxxx.
  Solution: The restore role does not have adequate permissions. Update the permissions in IAM for the restore role.
  Resource types affected: all
- Errors: User or ARN is not authorized to perform action on resource because no resource-based policy allows the action. User ARN is not authorized to perform action on resource with an explicit deny in a resource based policy.
  Solution: The restore role does not have adequate access to the resource specified in the message. Update the resource policy on the resource mentioned.
  Resource types affected: all
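For the "IAM Role cannot be assumed by Amazon Backup" error in the list above, it can help to inspect the role's trust policy document directly. The sketch below checks whether a trust policy, given as a parsed JSON dictionary, contains an Allow statement for `sts:AssumeRole` with the `backup.amazonaws.com` service principal. It is a simplified check under stated assumptions: it ignores Condition blocks, wildcards, and Deny statements, and the function name is hypothetical.

```python
from typing import Any, Dict, List

BACKUP_SERVICE_PRINCIPAL = "backup.amazonaws.com"


def _as_list(value: Any) -> List[Any]:
    """IAM policy fields may hold a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def allows_backup_service(trust_policy: Dict[str, Any]) -> bool:
    """True if some Allow statement lets the Backup service principal assume the role."""
    for statement in _as_list(trust_policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in _as_list(statement.get("Action")):
            continue
        principal = statement.get("Principal")
        if not isinstance(principal, dict):
            continue  # for example "*"; not handled by this simplified check
        if BACKUP_SERVICE_PRINCIPAL in _as_list(principal.get("Service")):
            return True
    return False
```

A policy that passes this check would contain a statement such as `{"Effect": "Allow", "Principal": {"Service": "backup.amazonaws.com"}, "Action": "sts:AssumeRole"}`.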