Enable transaction logs - Amazon Data Firehose

Firehose supports databases as a source in all Amazon Web Services Regions except China Regions, Amazon GovCloud (US) Regions, and Asia Pacific (Malaysia). This feature is in preview and is subject to change. Do not use it for your production workloads.

Enable transaction logs

Transaction logs record all database changes, such as INSERT, UPDATE, and DELETE, in the order in which they are committed to the database. Firehose reads the transaction logs and replicates the changes to Apache Iceberg Tables. You must enable the transaction logs if you haven't already. The following sections show how to enable transaction logs for various MySQL and PostgreSQL databases.

MySQL

Self-managed MySQL running on EC2
  • Check whether the log-bin option is enabled:

    mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::" FROM performance_schema.global_variables WHERE variable_name='log_bin';
  • For databases running on EC2, if the binlog is OFF, add the following properties to the configuration file for the MySQL server. For more information on how to set the parameters, see the MySQL documentation on binlog.

    # The variable to query is called server_id, e.g.
    # SELECT variable_value FROM performance_schema.global_variables WHERE variable_name='server_id';
    server-id                  = 223344
    log_bin                    = mysql-bin
    binlog_format              = ROW
    binlog_row_image           = FULL
    binlog_expire_logs_seconds = 864000
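
    After the server restarts with the new configuration, you can read the settings back to confirm them. The following is a minimal sketch, assuming MySQL 8.0 or later (where binlog_expire_logs_seconds is available) and a user with sufficient privileges to read performance_schema and list binary logs:

    ```sql
    -- Read back the binary log settings from the configuration file.
    SELECT variable_name, variable_value
    FROM performance_schema.global_variables
    WHERE variable_name IN ('log_bin', 'server_id', 'binlog_format',
                            'binlog_row_image', 'binlog_expire_logs_seconds');

    -- List the binary log files that the server currently retains.
    SHOW BINARY LOGS;
    ```

    With the properties shown above, log_bin should report ON, binlog_format ROW, and binlog_row_image FULL. The value 864000 for binlog_expire_logs_seconds corresponds to 10 days of retention.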
RDS MySQL
  • If binary logging is not enabled, then enable it with the steps outlined in Configuring RDS for MySQL binary logging.

  • Set the MySQL binary logging format to ROW format.

  • Set the binlog retention period to at least 72 hours. To increase the binlog retention period, refer to the RDS documentation. By default, the retention period is NULL, so you must set it to a non-zero value.
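
    One way to set the retention period is through the stored procedures that RDS for MySQL provides. The following is a sketch rather than a complete procedure. It assumes you are connected as the master user, and it uses 72 only because that is the minimum stated above:

    ```sql
    -- Show the current configuration; 'binlog retention hours' is NULL by default.
    CALL mysql.rds_show_configuration;

    -- Retain binary logs for 72 hours.
    CALL mysql.rds_set_configuration('binlog retention hours', 72);

    -- Confirm the new value.
    CALL mysql.rds_show_configuration;
    ```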

Aurora MySQL
  • If binary logging is not enabled, then enable it for Aurora MySQL with the steps in Configuring Aurora MySQL binary logging.

  • Set the MySQL binary logging format to ROW format.

  • Set the binlog retention period to at least 72 hours. To increase the binlog retention period, refer to Setting and showing binary log configuration. By default, the retention period is NULL, so you must set it to a non-zero value.

PostgreSQL

Self-managed PostgreSQL running on EC2
  • Set wal_level to logical for the self-managed PostgreSQL server.

  • Configure additional WAL retention settings in postgresql.conf:

    • PostgreSQL 12 – wal_keep_segments = <int>

    • PostgreSQL 13+ – wal_keep_size = <int>
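
    As an alternative to editing postgresql.conf by hand, the same settings can be applied from a SQL session with ALTER SYSTEM, which writes them to postgresql.auto.conf. The following is a minimal sketch, assuming a superuser connection and PostgreSQL 13 or later. The '1GB' value is a placeholder for illustration, not a recommendation from this guide:

    ```sql
    -- Check the current WAL level; logical decoding requires 'logical'.
    SHOW wal_level;

    -- Persist the WAL level. This setting takes effect only after a server restart.
    ALTER SYSTEM SET wal_level = 'logical';

    -- PostgreSQL 13+: keep extra WAL available (placeholder size).
    -- On PostgreSQL 12, set wal_keep_segments to a number of segments instead.
    ALTER SYSTEM SET wal_keep_size = '1GB';

    -- wal_keep_size can be applied with a configuration reload.
    SELECT pg_reload_conf();
    ```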

RDS and Aurora PostgreSQL
  • Enable logical replication (write-ahead logging) through RDS, and configure the WAL retention settings. For more information, see Logical decoding on a read replica.
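
    On RDS and Aurora PostgreSQL, logical replication is typically turned on by setting the rds.logical_replication parameter to 1 in the DB parameter group (or DB cluster parameter group for Aurora) and then rebooting. That detail is not stated above, so treat it as an assumption to verify against the linked documentation. Once the change is applied, a query along these lines can confirm it:

    ```sql
    -- After the reboot, wal_level should report 'logical'
    -- and rds.logical_replication should report 'on'.
    SELECT name, setting
    FROM pg_settings
    WHERE name IN ('wal_level', 'rds.logical_replication');
    ```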