Stop the SAP HANA database on the secondary node
Description — Stop the primary SAP HANA database (on Node 2) during normal cluster operation.
Run node — Primary SAP HANA database node (on Node 2)
Run steps:
- Stop the SAP HANA database gracefully as <sid>adm on node 2.

  sechana:~ # su - hdbadm
  hdbadm@sechana:/usr/sap/HDB/HDB00> HDB stop
  hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
  Stopping instance using: /usr/sap/HDB/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400

  12.11.2020 11:45:21
  Stop
  OK
  Waiting for stopped instance using: /usr/sap/HDB/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2

  12.11.2020 11:45:53
  WaitforStopped
  OK
  hdbdaemon is stopped.
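  Optionally, before checking the cluster, you can confirm that all SAP HANA processes on node 2 have stopped. A minimal check, assuming the same instance number 00 as above, is to query the process list as <sid>adm with sapcontrol:

  hdbadm@sechana:/usr/sap/HDB/HDB00> sapcontrol -nr 00 -function GetProcessList

  Stopped processes are reported with dispstatus GRAY.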
Expected result:
- The cluster detects the stopped primary SAP HANA database (on node 2) and promotes the secondary SAP HANA database (on node 1) to take over as primary.

  sechana:~ # crm status
  Stack: corosync
  Current DC: prihana (version 1.1.18+20180430.b12c320f5-3.24.1-b12c320f5) - partition with quorum
  Last updated: Thu Nov 12 11:47:38 2020
  Last change: Thu Nov 12 11:47:33 2020 by root via crm_attribute on prihana

  2 nodes configured
  6 resources configured

  Online: [ prihana sechana ]

  Full list of resources:

   res_AWS_STONITH   (stonith:external/ec2):       Started prihana
   res_AWS_IP        (ocf::suse:aws-vpc-move-ip):  Started prihana
   Clone Set: cln_SAPHanaTopology_HDB_HDB00 [rsc_SAPHanaTopology_HDB_HDB00]
       Started: [ prihana sechana ]
   Master/Slave Set: msl_SAPHana_HDB_HDB00 [rsc_SAPHana_HDB_HDB00]
       Masters: [ prihana ]
       Slaves: [ sechana ]

  Failed Actions:
  * rsc_SAPHana_HDB_HDB00_monitor_60000 on sechana 'master (failed)' (9): call=46, status=complete, exitreason='',
      last-rc-change='Thu Nov 12 11:46:45 2020', queued=0ms, exec=0ms
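  To follow the takeover while it is in progress, you can also watch the cluster and the system replication attributes from either node; for example, with the standard pacemaker and SAPHanaSR tools on SLES:

  prihana:~ # crm_mon -r
  prihana:~ # SAPHanaSR-showAttr

  crm_mon -r refreshes the cluster status continuously and also lists inactive resources, while SAPHanaSR-showAttr prints the replication attributes (clone state, sync state) that the SAPHana agent maintains for each node.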
- The overlay IP address is migrated to the new primary (on node 1).

  prihana:~ # ip addr show
  1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
      link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
      inet 127.0.0.1/8 scope host lo
         valid_lft forever preferred_lft forever
      inet6 ::1/128 scope host
         valid_lft forever preferred_lft forever
  2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
      link/ether 0a:38:1c:ce:b4:3d brd ff:ff:ff:ff:ff:ff
      inet 11.0.1.132/24 brd 11.0.1.255 scope global eth0
         valid_lft forever preferred_lft forever
      inet 11.0.1.75/32 scope global eth0:1
         valid_lft forever preferred_lft forever
      inet 192.168.10.16/32 scope global eth0
         valid_lft forever preferred_lft forever
      inet6 fe80::838:1cff:fece:b43d/64 scope link
         valid_lft forever preferred_lft forever
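  Because the overlay IP (192.168.10.16/32 in this example) lies outside the VPC CIDR, the aws-vpc-move-ip resource agent also repoints the matching route table entry at the new primary's instance. One way to verify this from the AWS CLI, assuming a hypothetical route table ID rtb-0123456789abcdef0 (replace it with the routing table configured for res_AWS_IP), is:

  prihana:~ # aws ec2 describe-route-tables --route-table-ids rtb-0123456789abcdef0 \
      --query "RouteTables[].Routes[?DestinationCidrBlock=='192.168.10.16/32']"

  The returned route entry should now reference the instance (or network interface) of node 1.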
- With the AUTOMATED_REGISTER parameter set to "true", the cluster restarts the failed SAP HANA database and automatically registers it against the new primary.
Recovery procedure:
- After you run the crm command to clean up the resource, "failed actions" messages should disappear from the cluster status.

  sechana:~ # crm resource cleanup rsc_SAPHana_HDB_HDB00 sechana
  Cleaned up rsc_SAPHana_HDB_HDB00:0 on sechana
  Cleaned up rsc_SAPHana_HDB_HDB00:1 on sechana
  Waiting for 1 replies from the CRMd. OK
- After the resource cleanup, the "Failed Actions" entries no longer appear in the cluster status.

  sechana:~ # crm status
  Stack: corosync
  Current DC: prihana (version 1.1.18+20180430.b12c320f5-3.24.1-b12c320f5) - partition with quorum
  Last updated: Thu Nov 12 11:50:05 2020
  Last change: Thu Nov 12 11:49:39 2020 by root via crm_attribute on prihana

  2 nodes configured
  6 resources configured

  Online: [ prihana sechana ]

  Full list of resources:

   res_AWS_STONITH   (stonith:external/ec2):       Started prihana
   res_AWS_IP        (ocf::suse:aws-vpc-move-ip):  Started prihana
   Clone Set: cln_SAPHanaTopology_HDB_HDB00 [rsc_SAPHanaTopology_HDB_HDB00]
       Started: [ prihana sechana ]
   Master/Slave Set: msl_SAPHana_HDB_HDB00 [rsc_SAPHana_HDB_HDB00]
       Masters: [ prihana ]
       Slaves: [ sechana ]