
Replicating existing objects with S3 Batch Replication

S3 Batch Replication lets you replicate objects that existed before a replication configuration was in place, objects that have previously been replicated, and objects that have failed replication. It does this through a Batch Operations job. This differs from live replication, which continuously and automatically replicates new objects across Amazon S3 buckets. To get started with Batch Replication, you can do either of the following:

  • Initiate Batch Replication for a new replication rule or destination – You can create a one-time Batch Replication job through the Amazon Web Services Management Console when you create the first rule in a new replication configuration or when you add a new destination to an existing configuration.

  • Initiate Batch Replication for an existing replication configuration – You can create a new Batch Replication job using S3 Batch Operations through the Amazon SDKs, Amazon Command Line Interface (Amazon CLI), or the Amazon S3 console.

When the Batch Replication job finishes, you receive a completion report. For more information about how to use the report to examine the job, see Tracking job status and completion reports.

S3 Batch Replication considerations

  • Your source bucket must have an existing replication configuration. To enable replication, see Setting up replication and Walkthroughs: Examples for configuring replication.

  • If you have S3 Lifecycle configured for your bucket, we recommend disabling your Lifecycle rules while the Batch Replication job is active. Doing so helps ensure parity between the source and destination buckets. Otherwise, the buckets could diverge, and the destination bucket would not be an exact replica of the source bucket. Consider the following scenario:

    • Your source bucket has multiple versions of an object and a delete marker.

    • Your source and destination buckets have a lifecycle configuration to remove expired delete markers.

    In this scenario, Batch Replication might replicate the delete marker to the destination bucket before it replicates the object versions. The delete marker could then be treated as expired and removed from the destination bucket before the object versions are copied.

  • The Amazon Identity and Access Management (IAM) role that you specify to run the Batch Operations job must have permissions to perform the underlying Batch Replication operation. For more information about creating IAM roles, see Configuring IAM policies for Batch Replication.

  • Batch Replication requires a manifest, which Amazon S3 can generate for you. The generated manifest must be stored in the same Amazon Web Services Region as the source bucket. If you choose not to generate the manifest, you can supply an Amazon S3 Inventory report or a CSV file that lists the objects that you want to replicate.

  • Batch Replication does not support re-replicating objects that were deleted from the destination bucket by specifying the version ID of the object. To re-replicate these objects, you can copy the source objects in place with a Batch Copy job. Copying those objects in place creates new versions of the objects in the source bucket and automatically initiates replication to the destination. Deleting and recreating the destination bucket does not initiate replication.

    For more information about Batch Copy, see Examples that use Batch Operations to copy objects.

  • If you’re using a replication rule on the S3 bucket, make sure to update your replication configuration so that the IAM role attached to the replication rule has the permissions needed to replicate objects. The IAM role must have permissions to perform the S3 action on both the source and destination buckets.

  • If you submit multiple Batch Replication jobs for the same bucket within a short time frame, S3 will run those jobs concurrently.

  • If you submit multiple Batch Replication jobs for two different buckets, be aware that S3 might not run all jobs concurrently. If you exceed the number of Batch Replication jobs that can run at one time on your account, S3 will pause the lower priority jobs to work on the higher priority ones. After the higher priority items have completed, any paused jobs will become active again.

  • Batch Replication is not supported for objects stored in the S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive storage classes.

  • To batch replicate S3 Intelligent-Tiering objects stored in the Archive Access or Deep Archive Access storage tier, you must first initiate a restore request and wait until the objects are moved to the Frequent Access tier.
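
The last two considerations can be checked per object before a job is submitted. The following is a minimal sketch, not an official tool. It assumes each object is described by a dictionary shaped like an S3 HeadObject response, in which `StorageClass` is `GLACIER` for S3 Glacier Flexible Retrieval and `DEEP_ARCHIVE` for S3 Glacier Deep Archive, and `ArchiveStatus` is set for archived S3 Intelligent-Tiering objects. The function name and return labels are hypothetical.

```python
# Storage classes that Batch Replication does not support (S3 API values).
UNSUPPORTED_STORAGE_CLASSES = {"GLACIER", "DEEP_ARCHIVE"}

# S3 Intelligent-Tiering archive tiers that need a restore request first.
ARCHIVED_TIERS = {"ARCHIVE_ACCESS", "DEEP_ARCHIVE_ACCESS"}


def batch_replication_eligibility(head):
    """Classify an object described by a HeadObject-style dict.

    Returns one of the hypothetical labels "unsupported",
    "restore_first", or "eligible".
    """
    # Assumption: a missing StorageClass means S3 Standard.
    storage_class = head.get("StorageClass", "STANDARD")
    if storage_class in UNSUPPORTED_STORAGE_CLASSES:
        return "unsupported"
    if (storage_class == "INTELLIGENT_TIERING"
            and head.get("ArchiveStatus") in ARCHIVED_TIERS):
        return "restore_first"
    return "eligible"
```

This check covers only storage class and archive tier. An object labeled "eligible" here must still match the replication configuration to be replicated.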

Specifying a manifest for a Batch Replication job

A manifest is an Amazon S3 object that contains the object keys that you want Amazon S3 to act upon. To create a Batch Replication job, you must either supply a user-generated manifest or have Amazon S3 generate a manifest based on your replication configuration.

If you supply a user-generated manifest, it must be in the form of an Amazon S3 Inventory report or a CSV file. If the objects in your manifest are in a versioned bucket, you must specify the version IDs for the objects. Only the object version whose version ID is specified in the manifest is replicated. To learn more about specifying a manifest, see Specifying a manifest.
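
As an illustration of a user-generated CSV manifest for a versioned bucket, the following minimal sketch writes one `bucket,key,versionId` row per object version. The function name, bucket name, and version IDs are hypothetical, and the sketch assumes that object keys are URL-encoded in CSV manifests, as S3 Batch Operations expects.

```python
import csv
import io
from urllib.parse import quote


def build_csv_manifest(bucket, objects):
    """Return CSV manifest text with one bucket,key,versionId row per object.

    `objects` is an iterable of (key, version_id) pairs.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for key, version_id in objects:
        if not version_id:
            # Versioned buckets need an explicit version ID for every row.
            raise ValueError(f"missing version ID for key {key!r}")
        # Assumption: keys are URL-encoded, with "/" left unescaped.
        writer.writerow([bucket, quote(key, safe="/"), version_id])
    return buffer.getvalue()


manifest = build_csv_manifest(
    "amzn-s3-demo-source-bucket",
    [("photos/cat 1.jpg", "exampleVersionId1")],
)
```

Rejecting rows without a version ID is a design choice for this sketch. It reflects the requirement above that manifests for versioned buckets name the exact version to replicate.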

If you choose to have Amazon S3 generate a manifest file on your behalf, the objects listed use the same source bucket, prefix, and tags as all of the replication configurations of the source bucket. With a generated manifest, Amazon S3 replicates all eligible versions of your objects.

Note

If you choose to have the manifest generated, it must be stored in the same Amazon Web Services Region as the source bucket.
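
To show what a job with a generated manifest might look like, the following minimal sketch builds the keyword arguments for the S3 Control `CreateJob` operation without calling Amazon S3. The function name, the priority value, and all ARNs are hypothetical. The field names follow the `CreateJob` request shape as understood here, so verify them against the API reference before use.

```python
import uuid


def build_batch_replication_request(account_id, source_bucket_arn, role_arn,
                                    manifest_bucket_arn, report_bucket_arn):
    """Build keyword arguments for an S3 Control CreateJob call."""
    return {
        "AccountId": account_id,
        "Operation": {"S3ReplicateObject": {}},  # the Batch Replication operation
        "Priority": 1,  # hypothetical priority
        "RoleArn": role_arn,
        "ConfirmationRequired": False,
        "ClientRequestToken": str(uuid.uuid4()),  # idempotency token
        "ManifestGenerator": {
            "S3JobManifestGenerator": {
                "ExpectedBucketOwner": account_id,
                "SourceBucket": source_bucket_arn,
                "EnableManifestOutput": True,
                # Must be in the same Region as the source bucket.
                "ManifestOutputLocation": {
                    "Bucket": manifest_bucket_arn,
                    "ManifestFormat": "S3InventoryReport_CSV_20211130",
                },
                "Filter": {"EligibleForReplication": True},
            }
        },
        "Report": {
            "Enabled": True,
            "Bucket": report_bucket_arn,
            "Format": "Report_CSV_20180820",
            "ReportScope": "AllTasks",
        },
    }


request = build_batch_replication_request(
    account_id="111122223333",
    source_bucket_arn="arn:aws-cn:s3:::amzn-s3-demo-source-bucket",
    role_arn="arn:aws-cn:iam::111122223333:role/example-batch-replication-role",
    manifest_bucket_arn="arn:aws-cn:s3:::amzn-s3-demo-manifest-bucket",
    report_bucket_arn="arn:aws-cn:s3:::amzn-s3-demo-report-bucket",
)
```

Assuming boto3 is installed and credentials are configured, the result could be passed as `boto3.client("s3control").create_job(**request)`. The `aws-cn` partition in the example ARNs is an assumption for the China Regions.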

Filters for a Batch Replication job

When creating your Batch Replication job, you can optionally specify additional filters, such as object creation date and replication status, to reduce the scope of the job.

You can filter objects to replicate based on the ObjectReplicationStatuses value, by providing one or more of the following values:

  • "NONE" – Indicates that Amazon S3 has never attempted to replicate the object before.

  • "FAILED" – Indicates that Amazon S3 has attempted, but failed to replicate the object before.

  • "COMPLETED" – Indicates that Amazon S3 has successfully replicated the object before.

  • "REPLICA" – Indicates that this is a replica object that Amazon S3 replicated from another source.

For more information about replication statuses, see Getting replication status information.

If you do not filter based on replication status, Batch Operations attempts to replicate every eligible object. Depending on your goal, you might set ObjectReplicationStatuses to one of the following values:

  • If you want to replicate only existing objects that have never been replicated, only include "NONE".

  • If you want to retry replicating only objects that previously failed to replicate, only include "FAILED".

  • If you want to both replicate existing objects and retry replicating objects that previously failed to replicate, include both "NONE" and "FAILED".

  • If you want to back-fill a destination bucket with objects that have been replicated to another destination, include "COMPLETED".

  • If you want to replicate objects that are themselves replicas from another source, include "REPLICA".
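
The goals above can be expressed as a small lookup that produces the filter portion of a job request. This is a minimal sketch. The goal names and the `build_filter` helper are hypothetical, and the `EligibleForReplication`, `CreatedAfter`, and `CreatedBefore` keys are assumed from the S3 Control manifest generator filter.

```python
from datetime import datetime, timezone

VALID_STATUSES = {"NONE", "FAILED", "COMPLETED", "REPLICA"}

# Hypothetical goal names mapped to the statuses described above.
GOAL_TO_STATUSES = {
    "never_replicated": ["NONE"],
    "retry_failed": ["FAILED"],
    "new_and_failed": ["NONE", "FAILED"],
    "backfill_destination": ["COMPLETED"],
    "replicate_replicas": ["REPLICA"],
}


def build_filter(statuses=None, created_after=None, created_before=None):
    """Build a manifest generator filter dict for a Batch Replication job."""
    unknown = set(statuses or []) - VALID_STATUSES
    if unknown:
        raise ValueError(f"unknown replication statuses: {sorted(unknown)}")
    job_filter = {"EligibleForReplication": True}
    # With no statuses, every eligible object stays in scope.
    if statuses:
        job_filter["ObjectReplicationStatuses"] = list(statuses)
    if created_after is not None:
        job_filter["CreatedAfter"] = created_after
    if created_before is not None:
        job_filter["CreatedBefore"] = created_before
    return job_filter


# Example: retry failures and pick up never-replicated objects from 2024 onward.
example_filter = build_filter(
    GOAL_TO_STATUSES["new_and_failed"],
    created_after=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

Validating the status values locally is a convenience so that a typo fails before the job request is sent.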

Batch Replication completion report

When you create a Batch Replication job, you can request a CSV completion report. This report shows objects, replication success or failure codes, outputs, and descriptions. For more information about job tracking and completion reports, see Completion reports.

For a list of replication failure codes and descriptions, see Amazon S3 replication failure reasons.
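
A downloaded completion report can be summarized with a few lines of code. The following minimal sketch relies on assumptions rather than a confirmed format: the report has no header row, its first four columns are bucket, key, version ID, and task status, and the remaining columns carry codes and messages. The sample rows, the status strings, and the function name are hypothetical.

```python
import csv
import io
from collections import Counter


def summarize_report(report_text):
    """Count tasks per status and collect the rows that did not succeed."""
    counts = Counter()
    not_succeeded = []
    for row in csv.reader(io.StringIO(report_text)):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        # Assumed layout: bucket, key, version ID, task status, then details.
        _bucket, key, version_id, status = row[:4]
        counts[status] += 1
        if status != "succeeded":
            not_succeeded.append((key, version_id, row[4:]))
    return counts, not_succeeded


sample = (
    "amzn-s3-demo-source-bucket,a.txt,v1,succeeded,200,,\n"
    "amzn-s3-demo-source-bucket,b.txt,v2,failed,400,SampleFailureCode,sample message\n"
)
counts, not_succeeded = summarize_report(sample)
```

The trailing columns are kept as a raw list so that they can be matched against the replication failure reasons without this sketch guessing their meaning.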

Getting started with Batch Replication

To learn more about how to use Batch Replication, see Tutorial: Replicating existing objects in your Amazon S3 buckets with S3 Batch Replication.