Capacity Blocks for ML - Amazon Elastic Compute Cloud

Capacity Blocks for ML

Capacity Blocks for ML allow you to reserve highly sought-after GPU instances on a future date to support your short-duration machine learning (ML) workloads. Instances that run inside a Capacity Block are automatically placed close together inside Amazon EC2 UltraClusters, for low-latency, petabit-scale, nonblocking networking.

With Capacity Blocks, you can see when GPU instance capacity is available on future dates, and you can schedule a Capacity Block to start at a time that works best for you. When you reserve a Capacity Block, you get predictable capacity assurance for GPU instances while paying only for the amount of time that you need. We recommend Capacity Blocks when you need GPUs to support your ML workloads for days or weeks at a time and don't want to pay for a reservation while your GPU instances aren't in use.

The following are some common use cases for Capacity Blocks.

  • ML model training and fine-tuning – Get uninterrupted access to the GPU instances that you reserved to complete ML model training and fine-tuning.

  • ML experiments and prototypes – Run experiments and build prototypes that require GPU instances for short durations.

Capacity Blocks are currently available for p5.48xlarge and p4d.24xlarge instances. The p5.48xlarge instances are available in the US East (Ohio) and US East (N. Virginia) Regions. The p4d.24xlarge instances are available in the US East (Ohio) and US West (Oregon) Regions. You can reserve a Capacity Block with a reservation start time up to eight weeks in the future.

You can use Capacity Blocks to reserve p5 and p4d instances with the following reservation duration and instance quantity options.

  • Reservation durations in 1-day increments, up to 14 days total

  • Reservation instance quantity options of 1, 2, 4, 8, 16, 32, or 64 instances
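
As a rough illustration, the options above can be checked locally before you look for an offering. The following Python sketch is not part of any Amazon SDK. The helper name `validate_capacity_block_request` and its parameters are hypothetical, and it encodes only the limits stated on this page (instance quantities, 1 to 14 whole days, and a start time no more than eight weeks out).

```python
from datetime import datetime, timedelta, timezone

# Limits taken from the options listed on this page.
ALLOWED_INSTANCE_COUNTS = {1, 2, 4, 8, 16, 32, 64}
MAX_DURATION_DAYS = 14
MAX_LEAD_TIME = timedelta(weeks=8)


def validate_capacity_block_request(instance_count, duration_days, start, now):
    """Return a list of problems; an empty list means the request fits the options.

    Hypothetical helper for illustration only; it does not call any Amazon API.
    """
    problems = []
    if instance_count not in ALLOWED_INSTANCE_COUNTS:
        problems.append("instance count must be 1, 2, 4, 8, 16, 32, or 64")
    if not isinstance(duration_days, int) or not 1 <= duration_days <= MAX_DURATION_DAYS:
        problems.append("duration must be a whole number of days from 1 to 14")
    if start - now > MAX_LEAD_TIME:
        problems.append("start time must be no more than eight weeks in the future")
    return problems


# Example values are arbitrary and chosen only to exercise the checks.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
ok = validate_capacity_block_request(8, 7, now + timedelta(days=10), now)
bad = validate_capacity_block_request(3, 15, now + timedelta(weeks=9), now)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with a request at once.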

To reserve a Capacity Block, you start by specifying your capacity needs, including the instance type, the number of instances, amount of time, earliest start date, and latest end date that you need. Then, you can see an available Capacity Block offering that meets your specifications. The Capacity Block offering includes details such as start time, Availability Zone, and reservation price. The price of a Capacity Block offering depends on available supply and demand at the time the offering was delivered. After you reserve a Capacity Block, the price doesn't change. For more information, see Capacity Blocks pricing and billing.
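
The capacity needs described above map naturally onto a single request. The sketch below only builds a dictionary of parameters; it makes no network call. The key names are intended to mirror the request parameters of the EC2 `DescribeCapacityBlockOfferings` action, which is an assumption you should confirm against the API reference. The helper name `build_offering_query` and the example dates are hypothetical.

```python
from datetime import datetime, timezone


def build_offering_query(instance_type, instance_count, duration_days,
                         earliest_start, latest_end):
    """Collect capacity needs into a dict of request parameters.

    Hypothetical helper. Key names are assumed to match the
    DescribeCapacityBlockOfferings request; verify before use.
    """
    return {
        "InstanceType": instance_type,
        "InstanceCount": instance_count,
        # The duration is expressed in hours, so convert from whole days.
        "CapacityDurationHours": duration_days * 24,
        "StartDateRange": earliest_start,
        "EndDateRange": latest_end,
    }


# Arbitrary example: four p5.48xlarge instances for two days.
query = build_offering_query(
    "p5.48xlarge",
    4,
    2,
    datetime(2024, 3, 1, tzinfo=timezone.utc),
    datetime(2024, 3, 10, tzinfo=timezone.utc),
)
# With an SDK such as boto3 installed and credentials configured, a dict
# like this could be passed along the lines of:
#   boto3.client("ec2").describe_capacity_block_offerings(**query)
```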

When you purchase a Capacity Block offering, your reservation is created for the date and number of instances that you selected. When your Capacity Block reservation begins, you can target instance launches by specifying the reservation ID in your launch requests.
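
To make "specifying the reservation ID in your launch requests" concrete, the following sketch assembles launch parameters that target a reservation. It assumes the shape of the EC2 `RunInstances` request, in particular the `capacity-block` market type and the `CapacityReservationTarget` structure; check both against the API reference. The helper name `build_launch_params` and the reservation and AMI IDs are placeholders, not real resources.

```python
def build_launch_params(reservation_id, image_id, instance_type, count):
    """Build launch parameters that target a Capacity Block reservation.

    Hypothetical helper. The structure is assumed to match the RunInstances
    request; it does not launch anything by itself.
    """
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        # Assumed market type for launches into a Capacity Block.
        "InstanceMarketOptions": {"MarketType": "capacity-block"},
        # Target the reservation explicitly by its ID.
        "CapacityReservationSpecification": {
            "CapacityReservationTarget": {
                "CapacityReservationId": reservation_id,
            }
        },
    }


# Placeholder IDs for illustration only.
params = build_launch_params(
    "cr-0123456789abcdef0", "ami-0123456789abcdef0", "p5.48xlarge", 2
)
```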

You can use all the instances you reserved until 30 minutes before the end time of the Capacity Block. With 30 minutes left in your Capacity Block reservation, we begin terminating any instances that are running in the Capacity Block. We use this time to clean up your instances before delivering the Capacity Block to the next customer. The last 30 minutes of the reservation are not charged in the price of the Capacity Block. We emit an event through EventBridge 10 minutes before the termination process begins. For more information, see Monitor Capacity Blocks with EventBridge.
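
The timing described above is simple date arithmetic: termination begins 30 minutes before the reservation end time, and the EventBridge event is emitted 10 minutes before that. The sketch below works these out for a given end time. The function name `shutdown_timeline` and the example date are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def shutdown_timeline(end_time):
    """Derive the key shutdown moments from a Capacity Block end time.

    Hypothetical helper based only on the intervals stated on this page.
    """
    termination_start = end_time - timedelta(minutes=30)
    warning_event = termination_start - timedelta(minutes=10)
    return {
        "warning_event": warning_event,          # EventBridge event emitted
        "termination_start": termination_start,  # instances begin terminating
        "reservation_end": end_time,
    }


# Example: a reservation ending at 11:30 UTC on an arbitrary date.
end = datetime(2024, 3, 15, 11, 30, tzinfo=timezone.utc)
timeline = shutdown_timeline(end)
```

In practice, a workload could use the warning time as its deadline for writing a final checkpoint, because instances begin terminating 10 minutes later.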

Supported platforms

Capacity Blocks for ML currently support p5.48xlarge and p4d.24xlarge instances with default tenancy. When you use the Amazon Web Services Management Console to purchase a Capacity Block, the default platform option is Linux/UNIX. When you use the Amazon Command Line Interface (Amazon CLI) or Amazon SDK to purchase a Capacity Block, the following platform options are available:

  • Linux/UNIX

  • Red Hat Enterprise Linux

  • RHEL with HA

  • SUSE Linux

  • Ubuntu Pro

Considerations

Before you use Capacity Blocks, consider the following details and limitations.

  • Capacity Blocks start and end at 11:30 AM Coordinated Universal Time (UTC).

  • The termination process for instances running in a Capacity Block begins at 11:00 AM Coordinated Universal Time (UTC) on the final day of the reservation.

  • Capacity Blocks can be reserved with a start time up to 8 weeks in the future.

  • Capacity Block modifications and cancellations aren't allowed.

  • Capacity Blocks can't be shared across Amazon accounts or within your Amazon Organization.

  • Capacity Blocks can't be used in a capacity reservation group.

  • The total number of instances that can be reserved in Capacity Blocks across all accounts in your Amazon Organization can't exceed 64 instances on a particular date.

  • To use a Capacity Block, instances must specifically target the reservation ID.

  • Instances in a Capacity Block don't count against your On-Demand Instances limits.

  • For P5 instances using a custom AMI, ensure that you have the required software and configuration for EFA.

  • Capacity Blocks currently can't be used with Amazon EKS managed node groups or Karpenter. For more information about how to create an Amazon EKS self-managed node group, see Capacity Blocks for ML in the Amazon EKS User Guide.

After you create a Capacity Block, you can do the following with the Capacity Block:

For more information about Amazon ParallelCluster, see What is Amazon ParallelCluster.