How Multi-Region Replication works in Amazon Keyspaces

This section provides an overview of how Amazon Keyspaces Multi-Region Replication works. For more information about pricing, see Amazon Keyspaces (for Apache Cassandra) pricing.

Amazon Keyspaces Multi-Region Replication implements a data resiliency architecture that distributes your data across independent and geographically distributed Amazon Web Services Regions. It uses active-active replication, which provides local low latency with each Region being able to perform reads and writes in isolation.

When you create an Amazon Keyspaces multi-Region keyspace, you can select up to five additional Regions to replicate the data to. Each table you create in a multi-Region keyspace consists of multiple replica tables (one per Region) that Amazon Keyspaces treats as a single unit.
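As a rough illustration of the Region selection described above, the following Python sketch builds a CQL statement that lists each replication Region. The statement shape (NetworkTopologyStrategy with one entry per Region and a value of '3') reflects our understanding of the Amazon Keyspaces CQL syntax and should be checked against the CQL language reference. The helper name, keyspace name, and Regions are hypothetical.

```python
MAX_REGIONS = 6  # one Region plus up to five additional replication Regions


def build_create_keyspace_cql(keyspace: str, regions: list[str]) -> str:
    """Return a CREATE KEYSPACE statement that lists each replication Region."""
    unique = list(dict.fromkeys(regions))  # drop duplicates, keep order
    if not 1 <= len(unique) <= MAX_REGIONS:
        raise ValueError(f"expected 1 to {MAX_REGIONS} Regions, got {len(unique)}")
    # Assumption: every Region is listed with the value '3'.
    region_map = ", ".join(f"'{region}': '3'" for region in unique)
    return (
        f"CREATE KEYSPACE {keyspace} WITH REPLICATION = "
        f"{{'class': 'NetworkTopologyStrategy', {region_map}}};"
    )


print(build_create_keyspace_cql("mykeyspace", ["us-east-1", "eu-west-1"]))
```

The generated statement can then be run through cqlsh or a Cassandra client driver connected to one of the listed Regions.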

Every replica has the same table name and the same primary key schema. When an application writes data to a local table in one Region, the data is durably written using the LOCAL_QUORUM consistency level. Amazon Keyspaces automatically replicates the data asynchronously to the other replication Regions. The replication lag across Regions is typically less than one second and doesn't impact your application’s performance or throughput.

After the data is written, you can read it from the multi-Region table in another replication Region with the LOCAL_ONE or LOCAL_QUORUM consistency level. For more information about supported configurations and features, see Amazon Keyspaces Multi-Region Replication usage notes.

Multi-Region Replication conflict resolution

Amazon Keyspaces Multi-Region Replication is fully managed, which means that you don't have to perform replication tasks such as regularly running repair operations to clean up data synchronization issues. Amazon Keyspaces monitors data consistency between tables in different Amazon Web Services Regions, detects and repairs conflicts, and synchronizes replicas automatically.

Amazon Keyspaces uses the last writer wins method of data reconciliation. With this conflict resolution mechanism, all of the Regions in a multi-Region keyspace agree on the latest update and converge toward a state in which they all have identical data. The reconciliation process has no impact on application performance. To support conflict resolution, client-side timestamps are automatically turned on for multi-Region tables and can't be turned off. For more information, see Working with client-side timestamps in Amazon Keyspaces.
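The idea behind last writer wins can be sketched in a few lines of Python. This is a conceptual model only, not the service's internal implementation. The Version class, the microsecond timestamp field, and the tie-breaking rule on Region name are assumptions made for illustration.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Version:
    value: str
    timestamp_us: int  # write timestamp in microseconds
    region: str        # Region that accepted the write


def last_writer_wins(a: Version, b: Version) -> Version:
    """Return the version with the later timestamp.

    Assumption: ties are broken by Region name so that every replica picks
    the same winner. The service's actual tie-breaking rule is not described
    in this section.
    """
    return max(a, b, key=lambda v: (v.timestamp_us, v.region))


def converge(replicas: dict[str, Version]) -> dict[str, Version]:
    """Give every replica the winning version, as reconciliation would."""
    winner = reduce(last_writer_wins, replicas.values())
    return {region: winner for region in replicas}


replicas = {
    "us-east-1": Version("blue", 1_700_000_000_000_001, "us-east-1"),
    "eu-west-1": Version("green", 1_700_000_000_000_005, "eu-west-1"),
}
print(converge(replicas))
```

Because the comparison depends only on the data carried with each write, every Region reaches the same result regardless of the order in which updates arrive.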

Multi-Region Replication disaster recovery

With Amazon Keyspaces Multi-Region Replication, writes are replicated asynchronously to every replication Region. In the rare event of a single Region degradation or failure, Multi-Region Replication helps you recover with little to no impact on your application. Disaster recovery is typically measured using values for recovery time objective (RTO) and recovery point objective (RPO).

Recovery time objective – The time it takes a system to return to a working state after a disaster. RTO measures how much downtime your workload can tolerate. For disaster recovery plans that use Multi-Region Replication to fail over to an unaffected Region, the RTO can be nearly zero. The RTO is limited by how quickly your application can detect the failure condition and redirect traffic to another Region.

Recovery point objective – The amount of data that can be lost (measured in time). For disaster recovery plans that use Multi-Region Replication to fail over to an unaffected Region, the RPO is typically single-digit seconds. The RPO is limited by replication latency to the failover target replica.
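The two limits described above can be expressed as simple arithmetic. The following Python sketch assumes that detection and redirection happen one after the other. The function name and all input values are hypothetical placeholders, not measurements.

```python
def recovery_estimates(detect_s: float, redirect_s: float,
                       replication_lag_s: float) -> dict[str, float]:
    """Estimate RTO and RPO in seconds for a failover to another Region.

    RTO is bounded by failure detection plus traffic redirection.
    RPO is bounded by the replication lag to the failover target replica.
    """
    return {
        "rto_s": detect_s + redirect_s,
        "rpo_s": replication_lag_s,
    }


# Hypothetical inputs: 30 s to detect, 60 s to redirect, 1 s replication lag.
print(recovery_estimates(30.0, 60.0, 1.0))
```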

In the event of a Regional failure or degradation, you don't need to promote a secondary Region or perform database failover procedures because replication in Amazon Keyspaces is active-active. Instead, you can use Amazon Route 53 to route your application to the nearest healthy Region. To learn more about Route 53, see What is Amazon Route 53?.

If a single Amazon Web Services Region becomes isolated or degraded, your application can use Route 53 to redirect traffic to a different Region and perform reads and writes against the replica table there. You can also apply custom business logic to determine when to redirect requests to other Regions. For example, you can make your application aware of the endpoints that are available in each replication Region.
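One way to make an application aware of multiple endpoints is to keep an ordered list of Regions and pick the first one that passes a health check. The Python sketch below illustrates this under stated assumptions. The endpoint pattern and port are our understanding of the standard service endpoints and may differ in your partition. The function names and the health check callable are hypothetical.

```python
from typing import Callable, Iterable, Optional

KEYSPACES_PORT = 9142  # assumption: TLS port of the service endpoint


def endpoint_for(region: str) -> str:
    """Build a Regional endpoint name (assumed pattern, verify before use)."""
    return f"cassandra.{region}.amazonaws.com"


def pick_region(preferred: Iterable[str],
                is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first Region in preference order that passes the check."""
    for region in preferred:
        if is_healthy(region):
            return region
    return None  # no healthy Region found; the caller decides what to do


# Hypothetical health state: the nearest Region is degraded.
degraded = {"us-east-1"}
region = pick_region(["us-east-1", "us-west-2", "eu-west-1"],
                     lambda r: r not in degraded)
print(region, endpoint_for(region), KEYSPACES_PORT)
```

Because replication is active-active, the application can read and write through the selected endpoint right away, with no promotion step.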

When the Region comes back online, Amazon Keyspaces resumes propagating any pending writes from that Region to the replica tables in other Regions. It also resumes propagating writes from other replica tables to the Region that is now back online.

IAM permissions required to create multi-Region keyspaces and tables

To successfully create multi-Region keyspaces and tables, the IAM principal needs to be able to create a service-linked role. This service-linked role is a unique type of IAM role that is predefined by Amazon Keyspaces. It includes all the permissions that Amazon Keyspaces requires to perform actions on your behalf. For more information about the service-linked role, see Using roles for Amazon Keyspaces Multi-Region Replication.

To create the service-linked role required by Multi-Region Replication, the policy for the IAM principal requires the following elements:

  • iam:CreateServiceLinkedRole – The action the principal can perform.

  • arn:aws:iam::*:role/aws-service-role/replication.cassandra.amazonaws.com/AWSServiceRoleForKeyspacesReplication – The resource that the action can be performed on.

  • "iam:AWSServiceName": "replication.cassandra.amazonaws.com" – The condition that limits the role to this service. The only Amazon service that this role can be attached to is Amazon Keyspaces.

The following is an example of a policy statement that grants a principal the minimum permissions required to create multi-Region keyspaces and tables.

{
    "Effect": "Allow",
    "Action": "iam:CreateServiceLinkedRole",
    "Resource": "arn:aws:iam::*:role/aws-service-role/replication.cassandra.amazonaws.com/AWSServiceRoleForKeyspacesReplication",
    "Condition": {
        "StringLike": {
            "iam:AWSServiceName": "replication.cassandra.amazonaws.com"
        }
    }
}
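To attach the statement above to a principal, it typically has to be wrapped in a complete policy document. The following Python sketch shows one way to generate such a document. The "Version" and "Statement" wrapper follows the standard IAM policy layout, and the function name is hypothetical.

```python
import json

SERVICE = "replication.cassandra.amazonaws.com"
ROLE_ARN = (
    "arn:aws:iam::*:role/aws-service-role/"
    + SERVICE
    + "/AWSServiceRoleForKeyspacesReplication"
)


def build_policy() -> dict:
    """Wrap the service-linked role statement in a full policy document."""
    return {
        "Version": "2012-10-17",  # standard IAM policy language version
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "iam:CreateServiceLinkedRole",
                "Resource": ROLE_ARN,
                "Condition": {
                    "StringLike": {"iam:AWSServiceName": SERVICE}
                },
            }
        ],
    }


print(json.dumps(build_policy(), indent=2))
```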

For additional IAM permissions for multi-Region keyspaces and tables, see Actions, resources, and condition keys for Amazon Keyspaces (for Apache Cassandra) in the Service Authorization Reference.

Multi-Region Replication and integration with point-in-time recovery (PITR)

Point-in-time recovery is supported in multi-Region tables. To successfully restore a multi-Region table with PITR, the following conditions have to be met.

  • The source and the target table must be configured as multi-Region tables.

  • The replication Regions for the keyspace of the source table and for the keyspace of the target table must be the same.

You can run the restore statement from any of the Regions that the source table is available in. Amazon Keyspaces automatically restores the target table in each Region. For more information about PITR, see How point-in-time recovery works in Amazon Keyspaces.
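The two restore conditions can be checked before issuing a restore. The Python sketch below is illustrative only. It assumes that a table counts as multi-Region when its keyspace replicates to at least two Regions, and the function name and Region lists are hypothetical.

```python
def can_restore_multi_region(source_regions: list[str],
                             target_regions: list[str]) -> bool:
    """Check the PITR conditions for multi-Region tables.

    Assumption: a keyspace with two or more replication Regions is treated
    as multi-Region.
    """
    source, target = set(source_regions), set(target_regions)
    both_multi_region = len(source) >= 2 and len(target) >= 2
    same_regions = source == target  # order of Regions does not matter
    return both_multi_region and same_regions


print(can_restore_multi_region(["us-east-1", "eu-west-1"],
                               ["eu-west-1", "us-east-1"]))
```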

Multi-Region Replication and integration with Amazon services

You can monitor replication performance between tables in different Amazon Web Services Regions by using Amazon CloudWatch metrics. The following metric provides continuous monitoring of multi-Region keyspaces.

  • ReplicationLatency – This metric measures the time it took to replicate updates, inserts, or deletes from one replica table to another replica table in a multi-Region keyspace.

For more information about how to monitor CloudWatch metrics, see Monitoring Amazon Keyspaces with Amazon CloudWatch.
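As a minimal sketch of how ReplicationLatency datapoints might be evaluated, the following Python function flags datapoints whose average exceeds a threshold. It assumes the datapoints have already been retrieved from CloudWatch and use the "Timestamp" and "Average" keys of a metric statistics response. The function name, threshold, and sample values are hypothetical, and the unit of the values should be confirmed in the metric documentation.

```python
from datetime import datetime, timezone


def latency_breaches(datapoints: list[dict], threshold: float) -> list[dict]:
    """Return datapoints whose Average exceeds the threshold, oldest first."""
    breaches = [point for point in datapoints if point["Average"] > threshold]
    return sorted(breaches, key=lambda point: point["Timestamp"])


# Hypothetical sample data in the same unit as the threshold.
samples = [
    {"Timestamp": datetime(2024, 1, 1, 0, 2, tzinfo=timezone.utc), "Average": 2500.0},
    {"Timestamp": datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc), "Average": 400.0},
    {"Timestamp": datetime(2024, 1, 1, 0, 1, tzinfo=timezone.utc), "Average": 1800.0},
]
for point in latency_breaches(samples, threshold=1000.0):
    print(point["Timestamp"].isoformat(), point["Average"])
```

A check like this could feed an alert when replication lag approaches the RPO that your disaster recovery plan assumes.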