Tutorial: Set up high availability for Amazon IoT Greengrass V2 with Pacemaker
This tutorial shows you how to set up Amazon IoT Greengrass V2 in a high availability (HA) configuration using Pacemaker.
Important
This tutorial uses Amazon Elastic Compute Cloud instances to demonstrate the setup. You can deploy the Amazon IoT Greengrass V2 and Pacemaker integration to achieve high availability on a cluster of any device type, as long as the devices can communicate with one another.
This tutorial includes the following setups:
- Active/Passive Amazon IoT Greengrass V2 service – Run Amazon IoT Greengrass V2 as a systemd service managed by Pacemaker with DRBD-replicated storage. Only one instance runs Amazon IoT Greengrass V2 at a time, and Pacemaker handles failover to a standby instance.
- Active/Passive load balancer – Run HAProxy as a Pacemaker-managed resource with its configuration stored on DRBD-replicated storage. Pacemaker fails over the load balancer to a standby instance if the primary goes down.
- Active/Active Amazon IoT Greengrass V2 component – Monitor an Amazon IoT Greengrass V2 component across all instances using a custom OCF (Open Cluster Framework) resource agent. Pacemaker detects component failures and triggers recovery without full instance failover.
Each setup (numbered 1 through 3 in the order listed above) is standalone, mutually exclusive, and assumes a fresh start from the prerequisites. Setups 1 and 2 each repurpose the single DRBD resource for their own needs. Setup 3 (Active/Active) does not use DRBD: skip the DRBD prerequisite steps and install Amazon IoT Greengrass V2 to a local path on each instance instead.
In Setups 1 and 2, you create a highly available cluster of Amazon IoT Greengrass V2 devices. The cluster contains a primary instance, which is the instance that is currently active and running the managed services (such as Amazon IoT Greengrass V2 or HAProxy), and one or more standby instances, which are idle and waiting to take over if the primary fails. Pacemaker automatically promotes one of the standby instances to primary during failover. In Setup 3 (Active/Active), all instances run the service simultaneously, and Pacemaker handles per-instance recovery rather than failover promotion.
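The difference between the two recovery models described above can be illustrated with a small Python sketch. This is a toy model only, not Pacemaker's actual scheduling logic: the `Node` class, the function names, and the node names are all hypothetical and exist purely to contrast failover promotion (Setups 1 and 2) with per-instance recovery (Setup 3).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical cluster member used only for this illustration."""
    name: str
    up: bool = True        # whether the instance itself is reachable
    running: bool = False  # whether the managed service runs on it


def reconcile_active_passive(nodes: List[Node]) -> Optional[str]:
    """Keep the service on exactly one reachable node; return its name."""
    # If the current primary is still reachable, nothing changes.
    for node in nodes:
        if node.running and node.up:
            return node.name
    # Otherwise stop the service everywhere and promote the first
    # reachable standby to primary (failover promotion).
    for node in nodes:
        node.running = False
    for node in nodes:
        if node.up:
            node.running = True
            return node.name
    return None  # no reachable node is left to promote


def recover_active_active(nodes: List[Node]) -> List[str]:
    """Restart the service in place on each reachable node where it stopped.

    Returns the names of the nodes that were recovered. No node is
    promoted or demoted (per-instance recovery).
    """
    recovered = []
    for node in nodes:
        if node.up and not node.running:
            node.running = True
            recovered.append(node.name)
    return recovered


if __name__ == "__main__":
    # Active/Passive: the primary fails and a standby takes over.
    cluster = [Node("node-a"), Node("node-b")]
    print("primary:", reconcile_active_passive(cluster))
    cluster[0].up = False
    print("primary after failure:", reconcile_active_passive(cluster))

    # Active/Active: one component stops and is restarted on the same node.
    peers = [Node("node-a", running=True), Node("node-b", running=True)]
    peers[1].running = False
    print("recovered in place:", recover_active_active(peers))
```

In the Active/Passive case the service moves to a different node, whereas in the Active/Active case it is restarted where it stopped and the other instances are unaffected.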