Simulating a cluster network failure - SAP NetWeaver on Amazon

Simulating a cluster network failure

Description – simulate a network failure between the cluster nodes to test how the cluster behaves in a split-brain scenario.

Run node – can be run on any node (this test case uses the secondary node).

Run steps

[root@hahost02 ~]# iptables -A INPUT -s x.x.x.x -j DROP; iptables -A OUTPUT -d x.x.x.x -j DROP

Note: x.x.x.x is the IP address of hahost01.
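The run step can be wrapped in a small helper so that the peer address is supplied once and the commands can be reviewed before they are executed. This is a minimal sketch, not part of the original procedure: the function names `peer_drop_rules` and `run_rules`, the `DRY_RUN` variable, and the example address 192.0.2.10 are all hypothetical.

```shell
# Hypothetical helper: print the two DROP rules for one peer address.
# action is A (append, as in the run step) or D (delete the same rule).
peer_drop_rules() {
    local action="$1"
    local peer_ip="$2"
    echo "iptables -${action} INPUT -s ${peer_ip} -j DROP"
    echo "iptables -${action} OUTPUT -d ${peer_ip} -j DROP"
}

# Hypothetical helper: read commands on stdin, one per line.
# Dry run by default: the commands are printed, not executed.
# Set DRY_RUN=0 and run as root on a test cluster to apply them.
run_rules() {
    local line
    while IFS= read -r line; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run: ${line}"
        else
            $line   # unquoted on purpose so the line splits into command and arguments
        fi
    done
}
```

For example, `peer_drop_rules A 192.0.2.10 | run_rules` prints the two commands from the run step with the placeholder address filled in. Calling the helper with `D` instead of `A` prints the matching delete commands, which may be useful for removing the rules after the test if they are still in place; the original procedure does not describe that step.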

Expected result – the cluster detects the network failure and fences one of the nodes to avoid a split-brain situation. In the output below, hahost01 has been fenced and all resources are running on hahost02.

[root@hahost02 ~]# pcs status
Cluster name: rhelha
Stack: corosync
Current DC: hahost02 (version 1.1.19-8.el7_6.5-c3c624ea3d) - partition with quorum
Last updated: Wed May 12 21:07:32 2021
Last change: Wed May 12 20:37:27 2021 by root via crm_resource on hahost02

2 nodes configured
7 resources configured

Online: [ hahost02 ]
OFFLINE: [ hahost01 ]

Full list of resources:

 clusterfence   (stonith:fence_aws):    Started hahost02
 Resource Group: rsc_ASCS00_group
     rsc_fs_ascs00      (ocf::heartbeat:Filesystem):    Started hahost02
     rsc_vip_ascs00     (ocf::heartbeat:aws-vpc-move-ip):       Started hahost02
     rsc_ascs00 (ocf::heartbeat:SAPInstance):   Started hahost02
 Resource Group: rsc_ERS10_group
     rsc_fs_ers10       (ocf::heartbeat:Filesystem):    Started hahost02
     rsc_vip_ers10      (ocf::heartbeat:aws-vpc-move-ip):       Started hahost02
     rsc_ers10  (ocf::heartbeat:SAPInstance):   Started hahost02

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
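To check the expected result without reading the full status by eye, the node lists can be extracted from the `pcs status` text. The following is a rough sketch that assumes the `Online: [ ... ]` and `OFFLINE: [ ... ]` lines are formatted as in the output above; other Pacemaker versions may print them differently. The function names `nodes_in_state` and `node_is_offline` are hypothetical.

```shell
# Hypothetical helper: read `pcs status` text on stdin and print the node
# names listed on the "Online: [ ... ]" or "OFFLINE: [ ... ]" line,
# one name per line. The argument is the label without the colon.
nodes_in_state() {
    awk -v label="$1:" '$1 == label { for (i = 3; i < NF; i++) print $i }'
}

# Hypothetical helper: succeed if the named node appears in the OFFLINE list.
node_is_offline() {
    nodes_in_state OFFLINE | grep -qx "$1"
}
```

A possible use after running the test is `pcs status | node_is_offline hahost01 && echo "hahost01 is offline"`. Note that an offline node is only an indication of fencing; the cluster logs remain the authoritative record.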

Recovery procedure

  • Log in to the Amazon Web Services Management Console and start the fenced node.

  • The cluster will move the ERS to the fenced node when the instance is up.
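Once the fenced instance is running again, the resource placement can be checked from the same `pcs status` output. This is a minimal sketch that assumes resource lines end in `Started <node>`, as in the status output shown earlier; the function name `resource_node` is hypothetical.

```shell
# Hypothetical helper: read `pcs status` text on stdin and print the node
# on which the named resource is reported as Started.
# Prints nothing if the resource is not listed or is not started.
resource_node() {
    awk -v rsc="$1" 'NF >= 3 && $1 == rsc && $(NF-1) == "Started" { print $NF }'
}
```

For example, `pcs status | resource_node rsc_ers10` prints the node that currently hosts the ERS instance, which can be compared against the node that was restarted.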