Creating a cluster with an external SlurmDB accounting - Amazon ParallelCluster

Creating a cluster with an external SlurmDB accounting

Learn how to configure and create a cluster with external SlurmDB accounting. For more information, see Slurm accounting with Amazon ParallelCluster.

When using the Amazon ParallelCluster command line interface (CLI) or API, you only pay for the Amazon resources that are created when you create or update Amazon ParallelCluster images and clusters. For more information, see Amazon services used by Amazon ParallelCluster.

The Amazon ParallelCluster UI is built on a serverless architecture and you can use it within the Amazon Free Tier category for most cases. For more information, see Amazon ParallelCluster UI costs.

In this tutorial, you use an Amazon CloudFormation quick-create template to create the components needed to deploy a Slurmdbd instance in the same VPC as the cluster. The template creates a basic networking and security configuration for the connection between the cluster and the database.

Note

Starting with version 3.10.0, Amazon ParallelCluster supports external Slurmdbd with the cluster configuration parameter SlurmSettings / ExternalSlurmdbd.

Note

The quick-create template serves as an example. This template doesn't cover all possible use cases. It's your responsibility to create an external Slurmdbd with the configuration and capacity appropriate for your production workloads.

Prerequisites:

Step 1: Create the Slurmdbd stack

In this tutorial, use a CloudFormation quick-create template (us-east-1) to create a Slurmdbd stack. The template requires the following inputs:

Networking
  • VPCId: The ID of the VPC in which to launch the Slurmdbd instance.

  • SubnetId: The ID of the subnet in which to launch the Slurmdbd instance.

  • PrivatePrefix: The CIDR prefix of the VPC.

  • PrivateIp: A secondary private IP to assign to the Slurmdbd instance.

Database connection
  • DBMSClientSG: The security group to attach to the Slurmdbd instance. This security group must allow connections between the database server and the Slurmdbd instance.

  • DBMSDatabaseName: The name of the database.

  • DBMSUsername: The user name for connecting to the database.

  • DBMSPasswordSecretArn: The ARN of the secret containing the database password.

  • DBMSUri: The URI of the database server.

Instance settings
  • InstanceType: The instance type to use for the Slurmdbd instance.

  • KeyName: The Amazon EC2 key pair to use for the Slurmdbd instance.

Slurmdbd settings
  • AMIID: The AMI to use for the Slurmdbd instance. The AMI should be a ParallelCluster AMI. The version of the ParallelCluster AMI determines the version of Slurmdbd.

  • MungeKeySecretArn: The ARN of the secret containing the munge key used to authenticate communications between Slurmdbd and clusters.

  • SlurmdbdPort: The port number that Slurmdbd uses.

  • EnableSlurmdbdSystemService: Enables Slurmdbd as a system service so that it runs when the instance launches.

Warning

If the database was created by a different version of SlurmDB, do not use Slurmdbd as a system service.

If the database contains a large number of entries, the Slurm Database Daemon (SlurmDBD) might need tens of minutes to update the database, and it is unresponsive during that time.

Before upgrading SlurmDB, make a backup of the database. For more information, see the Slurm documentation.
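The template inputs listed in this step can also be collected in a file instead of being typed into the console form. The following is a minimal sketch, assuming you create the stack with the Amazon CLI: it writes the inputs to a parameters file in the JSON format that `aws cloudformation create-stack --parameters file://...` accepts. Every value is a placeholder, the file name `slurmdbd-params.json` is arbitrary, and EnableSlurmdbdSystemService is left out because its accepted values are not listed here.

```shell
#!/bin/sh
# Write the Slurmdbd template inputs to a CloudFormation parameters file.
# All values below are placeholders; replace them with your own.
cat > slurmdbd-params.json <<'EOF'
[
  {"ParameterKey": "VPCId",                 "ParameterValue": "REPLACE_WITH_VPC_ID"},
  {"ParameterKey": "SubnetId",              "ParameterValue": "REPLACE_WITH_SUBNET_ID"},
  {"ParameterKey": "PrivatePrefix",         "ParameterValue": "REPLACE_WITH_VPC_CIDR_PREFIX"},
  {"ParameterKey": "PrivateIp",             "ParameterValue": "REPLACE_WITH_SECONDARY_PRIVATE_IP"},
  {"ParameterKey": "DBMSClientSG",          "ParameterValue": "REPLACE_WITH_SECURITY_GROUP_ID"},
  {"ParameterKey": "DBMSDatabaseName",      "ParameterValue": "REPLACE_WITH_DATABASE_NAME"},
  {"ParameterKey": "DBMSUsername",          "ParameterValue": "REPLACE_WITH_DATABASE_USER"},
  {"ParameterKey": "DBMSPasswordSecretArn", "ParameterValue": "REPLACE_WITH_PASSWORD_SECRET_ARN"},
  {"ParameterKey": "DBMSUri",               "ParameterValue": "REPLACE_WITH_DATABASE_URI"},
  {"ParameterKey": "InstanceType",          "ParameterValue": "REPLACE_WITH_INSTANCE_TYPE"},
  {"ParameterKey": "KeyName",               "ParameterValue": "REPLACE_WITH_KEY_PAIR_NAME"},
  {"ParameterKey": "AMIID",                 "ParameterValue": "REPLACE_WITH_PARALLELCLUSTER_AMI_ID"},
  {"ParameterKey": "MungeKeySecretArn",     "ParameterValue": "REPLACE_WITH_MUNGE_KEY_SECRET_ARN"},
  {"ParameterKey": "SlurmdbdPort",          "ParameterValue": "REPLACE_WITH_PORT_NUMBER"}
]
EOF

# To create the stack from this file (not run here). TEMPLATE_URL is a
# placeholder for the template location behind the quick-create link, and
# the stack name is arbitrary:
#   aws cloudformation create-stack \
#     --stack-name slurmdbd \
#     --template-url "$TEMPLATE_URL" \
#     --parameters file://slurmdbd-params.json
```

If the template creates IAM resources, `create-stack` also needs the matching `--capabilities` option; check the template before running the command.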

Step 2: Create a cluster with external Slurmdbd enabled

The provided Amazon CloudFormation template generates an Amazon CloudFormation stack with some defined outputs.

From the Amazon Web Services Management Console, view the Outputs tab in the Amazon CloudFormation stack to review the entities created. To enable Slurm accounting, some of these outputs must be used in the Amazon ParallelCluster configuration file:

Additionally, from the Parameters tab in the Amazon CloudFormation stack view:

Update the database parameters in your cluster configuration file with the output values. Use the pcluster CLI to create the cluster.

$ pcluster create-cluster -n cluster-3.x -c path/to/cluster-config.yaml
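For orientation, the accounting-related part of the cluster configuration file passed to this command might look like the fragment below. This is a sketch under assumptions, not the complete configuration: it writes the fragment to a scratch file (`external-slurmdbd-fragment.yaml`, an arbitrary name) so that you can merge it into your own file by hand. The host, port, security group, and secret ARN are placeholders for the values shown in the stack's Outputs and Parameters tabs. The `Host` and `Port` keys under `ExternalSlurmdbd`, and the placement of the client security group under `HeadNode`, are how this sketch assumes the settings are laid out; confirm them against the configuration reference for your ParallelCluster version.

```shell
#!/bin/sh
# Write an illustrative configuration fragment to a scratch file.
# All values are placeholders taken from the Slurmdbd stack.
cat > external-slurmdbd-fragment.yaml <<'EOF'
HeadNode:
  Networking:
    AdditionalSecurityGroups:
      - REPLACE_WITH_CLIENT_SECURITY_GROUP_ID   # lets the head node reach Slurmdbd
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    # Must be the same munge key secret that the Slurmdbd stack uses.
    MungeKeySecretArn: REPLACE_WITH_MUNGE_KEY_SECRET_ARN
    ExternalSlurmdbd:
      Host: 10.0.0.10   # placeholder: private IP of the Slurmdbd instance
      Port: 6819        # placeholder: the SlurmdbdPort value (6819 is the slurmdbd default)
EOF
```

The munge key matters because Slurmdbd and the cluster authenticate each other with it; if the two sides use different keys, the connection is rejected.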

After the cluster is created, you can start using Slurm accounting commands such as sacctmgr or sacct.
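As a quick check that accounting is connected, you might run something like the following on the head node. This is a sketch that assumes the Slurm client tools are on the `PATH`; it is guarded so that it only prints a hint when run anywhere else. The variable `accounting_checked` is introduced here for illustration only.

```shell
#!/bin/sh
# Run on the cluster head node after cluster creation completes.
if command -v sacctmgr >/dev/null 2>&1; then
  # List the clusters registered with the accounting database.
  sacctmgr show cluster
  # Show accounting records for jobs from all users.
  sacct --allusers
  accounting_checked=yes
else
  echo "Slurm client tools not found; run this on the head node." >&2
  accounting_checked=no
fi
```

If `sacctmgr show cluster` lists your cluster, the head node is reaching the external Slurmdbd.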

Warning

Traffic between ParallelCluster and the external SlurmDB is not encrypted. Run the cluster and the external SlurmDB in a trusted network.