Crash the primary database on node 1

Description: Simulate a complete breakdown of the primary database system.

Run node: Primary SAP HANA database node

Run steps:

  • Crash the primary database system using the following command as <sid>adm:

    [root@prihana ~] sudo su - hdbadm
    hdbadm@prihana:/usr/sap/HDB/HDB00> HDB kill -9
    hdbenv.sh: Hostname prihana defined in $SAP_RETRIEVAL_PATH=/usr/sap/HDB/HDB00/
    prihana differs from host name defined on command line.
    hdbenv.sh: Error: Instance not found for host -9
    killing HDB processes:
    kill -9 6011 /usr/sap/HDB/HDB00/prihana/trace/hdb.sapHDB_HDB00 -d -nw -f
    /usr/sap/HDB/HDB00/prihana/daemon.ini pf=/usr/sap/HDB/SYS/profile/HDB_HDB00_prihana
    kill -9 6027 hdbnameserver
    kill -9 6137 hdbcompileserver
    kill -9 6139 hdbpreprocessor
    kill -9 6484 hdbindexserver -port 30003
    kill -9 6494 hdbxsengine -port 30007
    kill -9 7068 hdbwebdispatcher
    kill orphan HDB processes:
    kill -9 6027 [hdbnameserver] <defunct>
    kill -9 6484 [hdbindexserver] <defunct>

Expected result:

  • The cluster detects the stopped primary SAP HANA database (on node 1) and promotes the secondary SAP HANA database (on node 2) to take over as primary.

    [root@prihana ~] pcs status
    Cluster name: rhelhanaha
    Stack: corosync
    Current DC: sechana (version 1.1.19-8.el7_6.5-c3c624ea3d) - partition with quorum
    Last updated: Tue Nov 10 17:58:19 2020
    Last change: Tue Nov 10 17:57:41 2020 by root via crm_attribute on sechana
    2 nodes configured
    6 resources configured
    Online: [ prihana sechana ]
    Full list of resources:
     clusterfence   (stonith:fence_aws):    Started prihana
     Clone Set: SAPHanaTopology_HDB_00-clone [SAPHanaTopology_HDB_00]
         Started: [ prihana sechana ]
     Master/Slave Set: SAPHana_HDB_00-master [SAPHana_HDB_00]
         Masters: [ sechana ]
         Slaves: [ prihana ]
     hana-oip       (ocf::heartbeat:aws-vpc-move-ip):       Started sechana
    Failed Actions:
    * SAPHana_HDB_00_monitor_59000 on prihana 'master (failed)' (9): call=31,
    status=complete, exitreason='',
        last-rc-change='Tue Nov 10 17:56:52 2020', queued=0ms, exec=0ms
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    [root@prihana ~]
  • The overlay IP address is migrated to the new primary (on node 2); a verification example is shown below.

  • Because AUTOMATED_REGISTER is set to true, the cluster restarts the failed SAP HANA database and registers it against the new primary.
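
  • One way to verify the overlay IP migration (a sketch; the overlay IP 192.168.10.16 and the route table ID rtb-xxxxxxxx are placeholders for the values used in your cluster) is to check the network interfaces on node 2 and, optionally, the VPC route table entry maintained by the aws-vpc-move-ip agent:

    # Placeholder values: overlay IP 192.168.10.16, route table rtb-xxxxxxxx
    # The overlay IP should now be assigned to an interface on node 2:
    [root@sechana ~] ip addr show | grep 192.168.10.16
    # The VPC route for the overlay IP should now target node 2 (requires the
    # AWS CLI and appropriate IAM permissions on the instance):
    [root@sechana ~] aws ec2 describe-route-tables \
        --route-table-ids rtb-xxxxxxxx \
        --query "RouteTables[].Routes[?DestinationCidrBlock=='192.168.10.16/32']"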
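
  • To confirm that node 1 has been re-registered against the new primary, you can check the system replication state as <sid>adm on node 1 after the cluster has restarted the database (a sketch; site names and the replication mode depend on your configuration, and the output should report node 1 in a secondary replication mode with sechana as its source):

    [root@prihana ~] sudo su - hdbadm
    hdbadm@prihana:/usr/sap/HDB/HDB00> hdbnsutil -sr_state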

Recovery procedure:

  • Clean up the cluster "failed actions" on node 1 as root.

    [root@prihana ~] pcs resource cleanup SAPHana_HDB_00 --node prihana
  • After the resource cleanup, verify that the cluster "failed actions" have been cleaned up, as shown in the following example.
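
  • To confirm, run pcs status again as root and check that the "Failed Actions" section is no longer reported (the exact output depends on your cluster state):

    [root@prihana ~] pcs status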