Best practices for Amazon RDS blue/green deployments

The following are best practices for blue/green deployments.

General best practices for blue/green deployments

Consider the following general best practices when you create a blue/green deployment.

  • Thoroughly test the DB instances in the green environment before switching over.

  • Keep your databases in the green environment read-only. Enable write operations on the green environment only with caution, because they can cause replication conflicts and can leave unintended data in the production databases after switchover.

  • If you use a blue/green deployment to implement schema changes, make only replication-compatible changes.

    For example, you can add new columns at the end of a table without disrupting replication from the blue deployment to the green deployment. However, schema changes such as renaming columns or tables break replication to the green deployment.

    For more information about replication-compatible changes, see Replication with Differing Table Definitions on Source and Replica in the MySQL documentation and Restrictions in the PostgreSQL logical replication documentation.

    Note

    This limitation doesn't apply to RDS for PostgreSQL blue/green deployments that use physical replication. For more information, see RDS for PostgreSQL limitations for blue/green deployments with physical replication.

  • After you create the blue/green deployment, handle lazy loading if necessary. Make sure data loading is complete before switching over. For more information, see Lazy loading and storage initialization for blue/green deployments.

  • When you switch over a blue/green deployment, follow the switchover best practices. For more information, see Switchover best practices.

RDS for MySQL best practices for blue/green deployments

Consider the following best practices when you create a blue/green deployment from an RDS for MySQL DB instance.

  • Avoid using non-transactional storage engines, such as MyISAM, that aren't optimized for replication.

  • Optimize read replicas and the green environment for binary log replication. Before you create your blue/green deployment, enable GTID-based, parallel, and crash-safe replication, if your DB engine supports them, to help ensure data consistency and durability. For more information, see Using GTID-based replication.

  • If the green environment experiences replica lag, consider the following:

    • Temporarily set the innodb_flush_log_at_trx_commit parameter to 2 in the green DB parameter group. After replication catches up, revert to the default value of 1 before switchover. If an unexpected shutdown or crash occurs with the temporary parameter value, rebuild the green environment to avoid undetected data corruption.

    • To reduce write latency and improve replication throughput, temporarily change green Multi-AZ DB instances to Single-AZ DB instances. Re-enable Multi-AZ right before switchover.
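
The temporary parameter change described in the list above could be scripted roughly as follows. This is a minimal sketch, not an official procedure. It assumes that the green environment uses a custom DB parameter group that you're allowed to modify, and that the client passed in is a boto3 RDS client. The helper names and the parameter group name used in the note below are hypothetical.

```python
def flush_log_parameter(value):
    """Build the Parameters payload for innodb_flush_log_at_trx_commit."""
    if value not in (1, 2):
        raise ValueError("this sketch only handles 1 (default) and 2 (relaxed)")
    return [{
        "ParameterName": "innodb_flush_log_at_trx_commit",
        "ParameterValue": str(value),
        # Assumption: the parameter is dynamic, so it can apply without a reboot.
        "ApplyMethod": "immediate",
    }]


def apply_flush_setting(rds_client, parameter_group_name, value):
    """Apply the setting to the green DB parameter group through the RDS API."""
    return rds_client.modify_db_parameter_group(
        DBParameterGroupName=parameter_group_name,
        Parameters=flush_log_parameter(value),
    )
```

With a boto3 client created by `boto3.client("rds")`, you might call `apply_flush_setting(client, "green-mysql-params", 2)` while the green environment catches up, and then call it again with `1` before switchover. Keeping the payload builder separate from the API call makes the revert step hard to forget and easy to test without AWS credentials.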

RDS for PostgreSQL best practices for blue/green deployments

The following sections describe best practices for blue/green deployments that you create from an RDS for PostgreSQL DB instance.

RDS for PostgreSQL general best practices for blue/green deployments

Consider the following general best practices when you create a blue/green deployment from an RDS for PostgreSQL DB instance.

  • Update all of your PostgreSQL extensions to the latest version before you create a blue/green deployment. For more information, see Upgrading PostgreSQL extensions in RDS for PostgreSQL databases.

  • Long-running transactions can cause significant replica lag. To reduce replica lag, consider doing the following:

    • Where possible, delay long-running transactions until after the green environment catches up to the blue environment.

    • Postpone bulk operations on the blue environment until after the green environment catches up to the blue environment.

    • Initiate a manual vacuum freeze operation on busy tables prior to creating the blue/green deployment.

    • For PostgreSQL version 12 and higher, disable the index_cleanup parameter on large or busy tables to increase the rate of normal maintenance on blue databases. For more information, see Vacuuming a table as quickly as possible.

      Note

      Regularly skipping index cleanup during vacuuming can lead to index bloat, which might degrade scan performance. As a best practice, use this approach only while using a blue/green deployment. Once the deployment is complete, we recommend resuming regular index maintenance and cleanup.

  • Slow replication can cause senders and receivers to restart often, which delays synchronization. To ensure that they remain active, disable timeouts by setting the wal_sender_timeout parameter to 0 in the blue environment, and the wal_receiver_timeout parameter to 0 in the green environment.

  • To prevent write-ahead log (WAL) segments from being removed from the blue environment, set the wal_keep_segments parameter to 15625 for PostgreSQL version 13 and lower. For version 14 and higher, set the wal_keep_size parameter to 1 TiB, if there's enough free storage space.
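
The vacuum suggestions in the list above can be illustrated with a short sketch that builds the corresponding SQL statements. It assumes that you run the resulting statements yourself through a PostgreSQL client. The helper names and the `public.orders` table in the note below are hypothetical. `VACUUM (FREEZE)` and `VACUUM (INDEX_CLEANUP FALSE)` are standard PostgreSQL syntax, and the latter requires version 12 or higher.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def vacuum_freeze_sql(schema, table):
    """Manual vacuum freeze, intended to run before creating the deployment."""
    return f"VACUUM (FREEZE, VERBOSE) {quote_ident(schema)}.{quote_ident(table)};"


def vacuum_skip_index_cleanup_sql(schema, table):
    """Vacuum that skips index cleanup (PostgreSQL 12 and higher only)."""
    return (
        "VACUUM (INDEX_CLEANUP FALSE, VERBOSE) "
        f"{quote_ident(schema)}.{quote_ident(table)};"
    )
```

For example, `vacuum_freeze_sql("public", "orders")` produces `VACUUM (FREEZE, VERBOSE) "public"."orders";`. Because skipping index cleanup can lead to index bloat, the second statement is meant only for the lifetime of the blue/green deployment.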

RDS for PostgreSQL best practices for blue/green deployments with physical replication

With physical replication, Amazon RDS creates a read replica of the source DB instance. For related parameters, monitoring, tuning, and troubleshooting, see Working with read replicas for Amazon RDS for PostgreSQL.

For an explanation of when blue/green deployments use physical replication instead of logical replication, see PostgreSQL replication methods for blue/green deployments.

RDS for PostgreSQL best practices for blue/green deployments with logical replication

Consider the following best practices when you create a blue/green deployment that uses logical replication. For an explanation of when blue/green deployments use logical replication instead of physical replication, see PostgreSQL replication methods for blue/green deployments.

  • If your database has sufficient freeable memory, increase the value of the logical_decoding_work_mem DB parameter in the blue environment. Doing so lets more logical decoding happen in memory instead of spilling to disk. For more information, see the PostgreSQL documentation.

    • You can monitor transaction overflow being written to disk using the ReplicationSlotDiskUsage CloudWatch metric. This metric offers insights into the disk usage of replication slots, helping identify when transaction data exceeds memory capacity and is stored on disk. You can monitor freeable memory with the FreeableMemory CloudWatch metric. For more information, see Amazon CloudWatch instance-level metrics for Amazon RDS.

    • In RDS for PostgreSQL version 14 and higher, you can monitor the size of logical overflow files using the pg_stat_replication_slots system view.

  • If you're using the aws_s3 extension, give the green DB instance access to Amazon S3 through an IAM role after the green environment is created. This allows the import and export commands to continue functioning after switchover. For instructions, see Setting up access to an Amazon S3 bucket.

  • Review the performance of your UPDATE and DELETE statements and evaluate whether creating an index on the column used in the WHERE clause can optimize these queries. This can enhance performance when the operations are replayed in the green environment.

  • If you're using triggers, make sure they don't interfere with the creation, update, and deletion of pg_catalog.pg_publication, pg_catalog.pg_subscription, and pg_catalog.pg_replication_slots objects whose names start with 'rds'.

  • If you specify a higher engine version for the green environment, run the ANALYZE operation on all databases to refresh the pg_statistic table. Optimizer statistics aren't transferred during a major version upgrade, so you must regenerate all statistics to avoid performance issues. For additional best practices during major version upgrades, see How to perform a major version upgrade for RDS for PostgreSQL.

  • Avoid configuring triggers as ENABLE REPLICA or ENABLE ALWAYS if the trigger is used on the source to manipulate data. Otherwise, the replication system propagates the changes and also executes the trigger in the green environment, which leads to duplicated data.
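
Two of the checks in the list above, watching for decoding spill and reviewing trigger firing modes, can be sketched as catalog queries plus small helpers that interpret the rows. This is an illustrative sketch that assumes you run the queries with your own PostgreSQL client and pass the fetched rows to the helpers. The helper names are hypothetical. The `pg_stat_replication_slots` view is available in PostgreSQL 14 and higher, and the `tgenabled` codes come from the `pg_trigger` catalog.

```python
# Per-slot spill statistics (PostgreSQL 14 and higher).
SPILL_STATS_SQL = """
SELECT slot_name, spill_txns, spill_count, spill_bytes
FROM pg_stat_replication_slots;
"""

# User-defined triggers and their firing mode.
TRIGGER_MODES_SQL = """
SELECT tgrelid::regclass AS table_name, tgname, tgenabled
FROM pg_trigger
WHERE NOT tgisinternal;
"""

# pg_trigger.tgenabled codes: O = origin and local (default), D = disabled,
# R = ENABLE REPLICA, A = ENABLE ALWAYS.
FIRES_DURING_REPLAY = {"R", "A"}


def slots_spilling(rows):
    """Return names of slots that have spilled decoded data to disk."""
    return [slot for slot, _txns, _count, spill_bytes in rows if spill_bytes > 0]


def triggers_to_review(rows):
    """Return (table, trigger) pairs that fire while changes are replayed."""
    return [(table, name) for table, name, mode in rows if mode in FIRES_DURING_REPLAY]
```

Slots that `slots_spilling` reports are candidates for a larger logical_decoding_work_mem value. Triggers that `triggers_to_review` reports are not necessarily wrong, but each one deserves a check for whether it manipulates data on the source.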