Scenario 3: Spot Instance running multi-node jobs is interrupted - Amazon ParallelCluster

The job fails with a state code of NODE_FAIL and is requeued, unless --no-requeue was specified when the job was submitted. If the interrupted node is a static node, it's replaced. If it's a dynamic node, it's terminated and reset. The other nodes that were running the terminated jobs might be allocated to other pending jobs, or scaled down after the configured SlurmSettings / ScaledownIdletime period has passed.
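The outcome described above can be summarized as a small decision table. The following Python sketch is only an illustrative model of that behavior, not part of ParallelCluster or Slurm: the names `InterruptionOutcome` and `outcome_after_spot_interruption` are hypothetical, and the function assumes that the only inputs that matter are whether the interrupted node is static and whether the job was submitted with --no-requeue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterruptionOutcome:
    """Hypothetical summary of what happens after a Spot interruption."""
    job_state: str      # state code reported for the failed job
    job_requeued: bool  # whether the job goes back into the queue
    node_action: str    # what happens to the interrupted node


def outcome_after_spot_interruption(node_is_static: bool,
                                    no_requeue: bool) -> InterruptionOutcome:
    """Model the scenario: a Spot node running a multi-node job is interrupted.

    node_is_static: True for a static node, False for a dynamic node.
    no_requeue: True if the job was submitted with --no-requeue.
    """
    # Static nodes are replaced; dynamic nodes are terminated and reset.
    node_action = "replaced" if node_is_static else "terminated and reset"
    return InterruptionOutcome(
        job_state="NODE_FAIL",
        job_requeued=not no_requeue,
        node_action=node_action,
    )


if __name__ == "__main__":
    # Print the outcome for every combination of the two inputs.
    for static in (True, False):
        for no_requeue in (False, True):
            print(static, no_requeue,
                  outcome_after_spot_interruption(static, no_requeue))
```

The sketch deliberately leaves out the other nodes of the job, because their fate (reallocation to pending jobs or scale-down after ScaledownIdletime) depends on queue contents and timing rather than on these two inputs.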