Test 7: Simulate a cluster network failure - SAP HANA on AWS



Description: Simulate a network failure to test how the cluster behaves in a split-brain scenario.

Run node: Can be run on any node. In this test case, it is run on node B.

Run steps

  • Drop all traffic to and from node A using the following command:

    iptables -A INPUT -s <<Primary IP address of Node A>> -j DROP; iptables -A OUTPUT -d <<Primary IP address of Node A>> -j DROP

    sechana:~ # crm status
    Stack: corosync
    Current DC: prihana (version 1.1.18+20180430.b12c320f5-3.24.1-b12c320f5) - partition with quorum
    Last updated: Fri Jan 22 02:16:28 2021
    Last change: Fri Jan 22 02:16:27 2021 by root via crm_attribute on sechana

    2 nodes configured
    6 resources configured

    Online: [ prihana sechana ]

    Full list of resources:

    res_AWS_STONITH (stonith:external/ec2): Started prihana
    res_AWS_IP (ocf::suse:aws-vpc-move-ip): Started sechana
    Clone Set: cln_SAPHanaTopology_HDB_HDB00 [rsc_SAPHanaTopology_HDB_HDB00]
        Started: [ prihana sechana ]
    Master/Slave Set: msl_SAPHana_HDB_HDB00 [rsc_SAPHana_HDB_HDB00]
        Masters: [ prihana ]
        Slaves: [ sechana ]

    sechana:~ # iptables -A INPUT -s 11.0.1.132 -j DROP; iptables -A OUTPUT -d 11.0.1.132 -j DROP
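The blocking step can be wrapped in a small helper so that the same rules can be removed again later. This is a minimal sketch, not part of the original procedure: the function names `run`, `block_peer`, and `unblock_peer` and the `DRY_RUN` variable are hypothetical, and removing the rules with `iptables -D` is an assumption, since the source does not list it as a step. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: block or unblock all traffic to and from a peer node.
# DRY_RUN=1 (default) prints the iptables commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"          # show the command only
    else
        "$@"               # execute the command (requires root)
    fi
}

block_peer() {
    local peer_ip="$1"
    # Append DROP rules for packets from and to the peer.
    run iptables -A INPUT -s "$peer_ip" -j DROP
    run iptables -A OUTPUT -d "$peer_ip" -j DROP
}

unblock_peer() {
    local peer_ip="$1"
    # Delete the same two rules again (assumed cleanup, not in the source).
    run iptables -D INPUT -s "$peer_ip" -j DROP
    run iptables -D OUTPUT -d "$peer_ip" -j DROP
}
```

For example, `block_peer 11.0.1.132` prints the two rules used in this test, and `DRY_RUN=0 block_peer 11.0.1.132` run as root would apply them.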

Expected result

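  • The cluster detects the network failure and fences node 1. The secondary SAP HANA database (on node 2) takes over as the primary database, and no split-brain situation occurs.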

    sechana:~ # crm status
    Stack: corosync
    Current DC: prihana (version 1.1.18+20180430.b12c320f5-3.24.1-b12c320f5) - partition with quorum
    Last updated: Fri Jan 22 17:08:09 2021
    Last change: Fri Jan 22 17:07:46 2021 by root via crm_attribute on sechana

    2 nodes configured
    6 resources configured

    Online: [ prihana sechana ]

    Full list of resources:

    res_AWS_STONITH (stonith:external/ec2): Started prihana
    res_AWS_IP (ocf::suse:aws-vpc-move-ip): Started sechana
    Clone Set: cln_SAPHanaTopology_HDB_HDB00 [rsc_SAPHanaTopology_HDB_HDB00]
        rsc_SAPHanaTopology_HDB_HDB00 (ocf::suse:SAPHanaTopology): Started prihana (Monitoring)
        Started: [ sechana ]
    Master/Slave Set: msl_SAPHana_HDB_HDB00 [rsc_SAPHana_HDB_HDB00]
        Masters: [ sechana ]
        Stopped: [ prihana ]

    Failed Actions:
    * rsc_SAPHanaTopology_HDB_HDB00_monitor_10000 on prihana 'unknown error' (1): call=317, status=Timed Out, exitreason='',
        last-rc-change='Fri Jan 22 16:58:19 2021', queued=0ms, exec=300001ms
    * rsc_SAPHana_HDB_HDB00_start_0 on prihana 'unknown error' (1): call=28, status=Timed Out, exitreason='',
        last-rc-change='Fri Jan 22 02:40:38 2021', queued=0ms, exec=3600001ms
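To check the takeover without reading the whole status output, the "Masters:" line can be extracted with a short filter. This is a sketch only: `current_masters` is a hypothetical function name, and it assumes that `crm status` prints a line of the form `Masters: [ <node> ]` as in the sample output for this Pacemaker version; other versions may label this line differently.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read `crm status` text on stdin and print the node
# name(s) listed between the brackets on the "Masters:" line.
current_masters() {
    awk '/Masters:/ {
        gsub(/.*\[ */, "")    # drop everything up to and including "[ "
        gsub(/ *\].*/, "")    # drop " ]" and anything after it
        print
    }'
}
```

A possible use is `crm status | current_masters`, which after a successful takeover in this test would be expected to report the former secondary node.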

Recovery procedure

  • Clean up the cluster "failed actions".
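One way to clean up the failed actions is `crm resource cleanup`, run once per affected resource. The sketch below is an assumption about how this step might be scripted, not the documented procedure: `cleanup_failed_actions`, `run`, and `DRY_RUN` are hypothetical names, and the resource and node names are the ones shown in the sample output of this test. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/usr/bin/env bash
# Hypothetical recovery helper: clear failed actions for given resources on a node.
# DRY_RUN=1 (default) prints the crm commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

cleanup_failed_actions() {
    local node="$1"
    shift
    local rsc
    for rsc in "$@"; do
        # crm resource cleanup <resource> <node>
        run crm resource cleanup "$rsc" "$node"
    done
}

# Names taken from the sample "Failed Actions" output in this test.
cleanup_failed_actions prihana rsc_SAPHanaTopology_HDB_HDB00 rsc_SAPHana_HDB_HDB00
```

Running `crm status` again afterwards should show whether the "Failed Actions" section is gone.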