Simulating a cluster network failure - SAP NetWeaver on Amazon

Description – simulate a network failure to test the cluster behavior in case of a split brain.

Run node – can be run on any node (this test case uses the secondary node).

Run steps

[root@hahost02 ~]# iptables -A INPUT -s x.x.x.x -j DROP; iptables -A OUTPUT -d x.x.x.x -j DROP

Note: x.x.x.x is the IP address of hahost01.
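Rules added with iptables -A stay in the running rule set until they are deleted or the node reboots, so it can help to keep the matching delete commands at hand before starting the test. The following is a minimal sketch that only prints the commands and does not execute them. The function name peer_rule_cmds and the address 192.0.2.10 (a documentation-range placeholder for x.x.x.x) are assumptions for illustration, not part of the original procedure.

```shell
#!/usr/bin/env bash
# Sketch: print the iptables commands that block (-A) or unblock (-D) a peer.
# Nothing is executed here; review the output, then run it as root if desired.

peer_rule_cmds() {
  local action="$1" peer_ip="$2" flag
  case "$action" in
    block)   flag="-A" ;;   # append the DROP rules
    unblock) flag="-D" ;;   # delete the same rules again
    *) echo "usage: peer_rule_cmds block|unblock <peer-ip>" >&2; return 1 ;;
  esac
  echo "iptables $flag INPUT -s $peer_ip -j DROP"
  echo "iptables $flag OUTPUT -d $peer_ip -j DROP"
}

# PEER_IP is a hypothetical placeholder; replace it with the IP of hahost01.
PEER_IP="192.0.2.10"
peer_rule_cmds block "$PEER_IP"
peer_rule_cmds unblock "$PEER_IP"
```

Because -D takes the same rule specification as -A, the unblock commands remove exactly the rules that the block commands added.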

Expected result – the cluster detects the network failure and fences one of the nodes to avoid a split-brain situation.

[root@hahost02 ~]# pcs status
Cluster name: rhelha
Stack: corosync
Current DC: hahost02 (version 1.1.19-8.el7_6.5-c3c624ea3d) - partition with quorum
Last updated: Wed May 12 21:07:32 2021
Last change: Wed May 12 20:37:27 2021 by root via crm_resource on hahost02
2 nodes configured
7 resources configured
Online: [ hahost02 ]
OFFLINE: [ hahost01 ]
Full list of resources:
 clusterfence (stonith:fence_aws): Started hahost02
 Resource Group: rsc_ASCS00_group
     rsc_fs_ascs00 (ocf::heartbeat:Filesystem): Started hahost02
     rsc_vip_ascs00 (ocf::heartbeat:aws-vpc-move-ip): Started hahost02
     rsc_ascs00 (ocf::heartbeat:SAPInstance): Started hahost02
 Resource Group: rsc_ERS10_group
     rsc_fs_ers10 (ocf::heartbeat:Filesystem): Started hahost02
     rsc_vip_ers10 (ocf::heartbeat:aws-vpc-move-ip): Started hahost02
     rsc_ers10 (ocf::heartbeat:SAPInstance): Started hahost02
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
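The expected result can also be checked from a script by reading the node lists out of the pcs status output. This is a minimal sketch, assuming the node lines look like "Online: [ hahost02 ]" as in the output above. The function name nodes_in_state is hypothetical, and other pcs versions may format these lines differently.

```shell
#!/usr/bin/env bash
# Sketch: list the nodes that "pcs status" reports under a given label.
# Reads the status text from standard input and prints one node per line.

nodes_in_state() {
  # $1 is the label exactly as printed, for example Online or OFFLINE.
  awk -v label="$1:" '
    {
      for (i = 1; i < NF; i++) {
        # Look for "<label>: [" and print every name up to the closing "]".
        if ($i == label && $(i + 1) == "[") {
          for (j = i + 2; j <= NF && $j != "]"; j++) print $j
        }
      }
    }'
}

# Usage on a cluster node (run as root):
#   pcs status | nodes_in_state OFFLINE
```

After a successful fence in this test case, the OFFLINE list would be expected to contain exactly one of the two nodes.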

Recovery procedure

  • Log in to the Amazon Web Services Management Console and start the fenced node.

  • The cluster will move the ERS to the fenced node when the instance is up.
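To confirm that the ERS has moved once the fenced node is back, the node on which a resource is running can be read from the pcs status output. This is a minimal sketch, assuming resource lines of the form "rsc_ers10 (ocf::heartbeat:SAPInstance): Started hahost01" as in the output shown earlier. The function name resource_node is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the node on which a given cluster resource is started.
# Reads "pcs status" text from standard input.

resource_node() {
  # $1 is the resource name, for example rsc_ers10.
  awk -v rsc="$1" '
    {
      for (i = 1; i + 3 <= NF; i++) {
        # Expected layout: <name> (<agent>): Started <node>
        if ($i == rsc && $(i + 2) == "Started") print $(i + 3)
      }
    }'
}

# Usage on a cluster node (run as root):
#   pcs status | resource_node rsc_ers10
```

An empty result means the resource is not reported as started anywhere, or the line format differs from the one assumed here.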