Configuration of multiple queues - Amazon ParallelCluster
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Configuration of multiple queues

Configuration of multiple queues

With Amazon ParallelCluster version 3, you can configure multiple queues by setting the Scheduler to slurm and specifying more than one queue for SlurmQueues in the configuration file. In this mode, different instance types coexist in the compute nodes that are specified in the ComputeResources section of the configuration file. ComputeResources with different instance types are scaled up or down as needed for the SlurmQueues.

Cluster queue and compute resource quotas
Resource Quota

Slurm queues

50 queues per cluster

Compute resources

50 compute resources per queue

50 compute resources per cluster

Node Counts

Each compute resource in ComputeResources for a queue must have a unique Name, InstanceType, MinCount, and MaxCount. MinCount and MaxCount have default values that define the range of instances for a compute resource in ComputeResources for a queue. You can also specify your own values for MinCount and MaxCount. Each compute resource in ComputeResources is composed of static nodes numbered from 1 to the value of MinCount and dynamic nodes numbered from the value of MinCount to the value of MaxCount.

Example Configuration

The following is an example of a Scheduling section for a cluster configuration file. In this configuration there are two queues named queue1 and queue2 and each of the queues has ComputeResources with a specified MaxCount.

Scheduling: Scheduler: slurm SlurmQueues: - Name: queue1 ComputeResources: - InstanceType: c5.xlarge MaxCount: 5 Name: c5xlarge - InstanceType: c4.xlarge MaxCount: 5 Name: c4xlarge - Name: queue2 ComputeResources: - InstanceType: c5.xlarge MaxCount: 5 Name: c5xlarge

Hostnames

The instances that are launched into the compute fleet are dynamically assigned. Hostnames are generated for each node. By default Amazon ParallelCluster will use the following format of the hostname :

$HOSTNAME=$QUEUE-$STATDYN-$COMPUTE_RESOURCE-$NODENUM

  • $QUEUE is the name of the queue. For example, if the SlurmQueues section has an entry with the Name set to “queue-name” then “$QUEUE” is “queue-name”.

  • $STATDYN is st for static nodes or dy for dynamic nodes.

  • $COMPUTE_RESOURCE is the Name of the ComputeResources compute resource corresponding to this node.

  • $NODENUM is the number of the node. $NODENUM is between one (1) and the value of MinCount for static nodes and between one (1) and MaxCount-MinCount for dynamic nodes.

From the example configuration file above, a given node from queue1 and compute resource c5xlarge has a hostname: queue1-dy-c5xlarge-1.

Both hostnames and fully-qualified domain names (FQDN) are created using Amazon Route 53 hosted zones. The FQDN is $HOSTNAME.$CLUSTERNAME.pcluster, where $CLUSTERNAME is the name of the cluster.

Note that the same format will be used for the Slurm node names as well.

Users can choose to use the default EC2 hostname of the instance powering the compute node instead of the default host name format used by Amazon ParallelCluster. This can be done by setting the UseEc2Hostnames parameter to be true. However, Slurm node names will continue to use the default Amazon ParallelCluster format.