
Cluster Node Setup

Establish cluster communication between nodes using Corosync and configure required authentication.

Change the hacluster Password

On all cluster nodes, change the password of the operating system user hacluster:

# passwd hacluster

Set Up Passwordless Authentication

SUSE cluster tools provide comprehensive reporting and troubleshooting capabilities for cluster activity. Many of these tools require passwordless SSH access between nodes to collect cluster-wide information effectively. SUSE recommends configuring passwordless SSH for the root user to enable seamless cluster diagnostics and reporting.
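
One example of a tool that depends on this access is crm report, which collects logs and configuration from every node into a single archive for analysis. A minimal sketch (the exact options and time format may differ between crmsh versions):

# crm report -f "2024-01-01 00:00" /tmp/cluster_report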

EC2 instances typically have no root password set. Use the shared /sapmnt filesystem to exchange SSH keys:

On the primary node (<hostname1>):

# ssh-keygen -t rsa -b 4096 -f /root/.ssh/id_rsa -N ''
# cp /root/.ssh/id_rsa.pub /sapmnt/node1_key.pub

On the secondary node (<hostname2>):

# ssh-keygen -t rsa -b 4096 -f /root/.ssh/id_rsa -N ''
# cp /root/.ssh/id_rsa.pub /sapmnt/node2_key.pub
# cat /sapmnt/node1_key.pub >> /root/.ssh/authorized_keys
# chmod 600 /root/.ssh/authorized_keys

Back on the primary node (<hostname1>):

# cat /sapmnt/node2_key.pub >> /root/.ssh/authorized_keys
# chmod 600 /root/.ssh/authorized_keys

Test connectivity from both nodes:

# ssh root@<opposite_hostname> 'hostname'

Clean up temporary files (from either node):

# rm /sapmnt/node1_key.pub /sapmnt/node2_key.pub

An alternative is to follow the SUSE documentation on Running cluster reports without root access.

Warning

Review the security implications for your organization, including root access controls and network segmentation, before implementing this configuration.
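
As one illustration of a root access control (a sketch only; adapt it to your organization's policy), you can restrict root SSH logins to key-based authentication in /etc/ssh/sshd_config on both nodes. This keeps the key-based exchange above working while blocking password logins for root:

PermitRootLogin prohibit-password

Apply the change with:

# systemctl restart sshd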

Configure the Cluster Nodes

Initialize the cluster framework on the first node so that it recognizes both cluster nodes.

On the primary node as root, run:

# crm cluster init -u -n <cluster_name> -N <hostname_1> -N <hostname_2>

Example using values from the Parameter Reference:

# crm cluster init -u -y -n slx-sap-cluster -N slxhost01 -N slxhost02
INFO: Detected "amazon-web-services" platform
INFO: Loading "default" profile from /etc/crm/profiles.yml
INFO: "amazon-web-services" profile does not exist in /etc/crm/profiles.yml
INFO: Configuring csync2
INFO: Starting csync2.socket service on slxhost01
INFO: BEGIN csync2 checking files
INFO: END csync2 checking files
INFO: Configuring corosync (unicast)
WARNING: Not configuring SBD - STONITH will be disabled.
INFO: Hawk cluster interface is now running. To see cluster status, open:
INFO:   https://10.2.10.1:7630/
INFO:   Log in with username 'hacluster'
INFO: Starting pacemaker.service on slxhost01
INFO: BEGIN Waiting for cluster ...........
INFO: END Waiting for cluster
INFO: Loading initial cluster configuration
INFO: Done (log saved to /var/log/crmsh/crmsh.log on slxhost01)
INFO: Adding node slxhost02 to cluster
INFO: Running command on slxhost02: crm cluster join -y -c root@slxhost01
INFO: Configuring csync2
INFO: Starting csync2.socket service
INFO: BEGIN csync2 syncing files in cluster
INFO: END csync2 syncing files in cluster
INFO: Merging known_hosts
INFO: BEGIN Probing for new partitions
INFO: END Probing for new partitions
INFO: Hawk cluster interface is now running. To see cluster status, open:
INFO:   https://10.1.20.7:7630/
INFO:   Log in with username 'hacluster'
INFO: Starting pacemaker.service on slxhost02
INFO: BEGIN Waiting for cluster
INFO: END Waiting for cluster
INFO: Set property "priority" in rsc_defaults to 1
INFO: BEGIN Reloading cluster configuration
INFO: END Reloading cluster configuration
INFO: Done (log saved to /var/log/crmsh/crmsh.log on slxhost02)

This command:

  • Initializes a two-node cluster with the name passed to -n (slx-sap-cluster in the example)

  • Configures unicast communication (-u)

  • Sets up the basic corosync configuration

  • Automatically joins the second node to the cluster

  • Does not configure SBD; an Amazon fencing agent is used for STONITH in Amazon environments instead.

  • QDevice configuration is possible but not covered in this document. Refer to SUSE Linux Enterprise High Availability Documentation - QDevice and QNetD.
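
After the initialization completes, you can review what was created. The following commands display the cluster configuration (CIB) loaded by crm cluster init and the current quorum state of the two-node cluster:

# crm configure show
# corosync-quorumtool -s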

Modify Generated Corosync Configuration

After initializing the cluster, the generated corosync configuration requires modification to be optimized for cloud environments.

1. Edit the corosync configuration:

# vi /etc/corosync/corosync.conf

The generated file typically looks like this:

# Please read the corosync.conf.5 manual page
totem {
    version: 2
    cluster_name: <cluster_name>
    clear_node_high_bit: yes
    interface {
        ringnumber: 0
        mcastport: 5405
        ttl: 1
    }
    transport: udpu
    crypto_hash: sha1
    crypto_cipher: aes256
    token: 5000        # This needs to be changed
    join: 60
    max_messages: 20
    token_retransmits_before_loss_const: 10
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
nodelist {
    node {
        ring0_addr: <node1_primary_ip>    # Only single ring configured
        nodeid: 1
    }
    node {
        ring0_addr: <node2_primary_ip>    # Only single ring configured
        nodeid: 2
    }
}
quorum {
    # Enable and configure quorum subsystem (default: off)
    # see also corosync.conf.5 and votequorum.5
    provider: corosync_votequorum
    expected_votes: 2
    two_node: 1
}

2. Modify the configuration to add the second ring and optimize settings:

totem {
    token: 15000          # Changed from 5000 to 15000
    rrp_mode: passive     # Added for dual ring support
}
nodelist {
    node {
        ring0_addr: <node1_primary_ip>      # Primary network
        ring1_addr: <node1_secondary_ip>    # Added secondary network
        nodeid: 1
    }
    node {
        ring0_addr: <node2_primary_ip>      # Primary network
        ring1_addr: <node2_secondary_ip>    # Added secondary network
        nodeid: 2
    }
}

Example IP configuration:

Network Interface    Node 1        Node 2
ring0_addr           10.2.10.1     10.2.20.1
ring1_addr           10.2.10.2     10.2.20.2
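
With these example addresses, the resulting nodelist section would look like the following sketch (substitute your own IP addresses):

nodelist {
    node {
        ring0_addr: 10.2.10.1
        ring1_addr: 10.2.10.2
        nodeid: 1
    }
    node {
        ring0_addr: 10.2.20.1
        ring1_addr: 10.2.20.2
        nodeid: 2
    }
}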

3. Synchronize the modified configuration to all nodes:

# csync2 -xvF /etc/corosync/corosync.conf
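
To confirm that both nodes now hold an identical file, you can compare checksums (a simple sanity check using the <hostname2> placeholder from earlier):

# md5sum /etc/corosync/corosync.conf
# ssh root@<hostname2> 'md5sum /etc/corosync/corosync.conf'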

4. Restart the cluster on both nodes:

# crm cluster restart
# ssh root@<hostname2> 'crm cluster restart'

Verify Corosync Configuration

Verify network rings are active:

# corosync-cfgtool -s

Example output:

Printing ring status.
Local node ID 1
RING ID 0
        id      = 10.2.10.1
        status  = ring 0 active with no faults
RING ID 1
        id      = 10.2.10.2
        status  = ring 1 active with no faults

Both network rings should report "active with no faults". If either ring is missing or reports a fault, review the corosync configuration and confirm that the changes to /etc/corosync/corosync.conf were synchronized to the secondary node. If csync2 did not propagate the file, copy it manually as sketched below, then restart the cluster if needed.
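
A minimal sketch for copying the configuration by hand and restarting the cluster on the secondary node (using the <hostname2> placeholder from earlier):

# scp /etc/corosync/corosync.conf root@<hostname2>:/etc/corosync/corosync.conf
# ssh root@<hostname2> 'crm cluster restart'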

Configure Cluster Services

Enable pacemaker to start automatically after reboot:

# systemctl enable pacemaker

Enabling pacemaker also handles corosync through service dependencies, so the cluster starts automatically after reboot. For troubleshooting scenarios, you can instead leave the service disabled and start the cluster manually after boot, as sketched below.
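
If you prefer that manual control, one possible approach (a sketch, not a requirement) is:

# systemctl disable pacemaker
# crm cluster start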

Verify Cluster Status

1. Check pacemaker service status:

# systemctl status pacemaker

2. Verify cluster status:

# crm_mon -1

Example output:

Cluster Summary:
  * Stack: corosync
  * Current DC: slxhost01 (version 2.1.5+20221208.a3f44794f) - partition with quorum
  * 2 nodes configured
  * 0 resource instances configured

Node List:
  * Online: [ slxhost01 slxhost02 ]

Active Resources:
  * No active resources
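
Both nodes should be online and the cluster should report a partition with quorum, with no resources configured yet. To keep watching the cluster while you continue the configuration, run the monitor in continuous mode instead of taking a single snapshot (press Ctrl+C to exit):

# crm_mon -r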