Use Capacity Blocks for machine learning workloads
Capacity Blocks help you reserve highly sought-after GPU instances on a future date to support your short-duration, machine learning (ML) workloads.
For an overview of Capacity Blocks and how they work, see Capacity Blocks for ML in the Amazon EC2 User Guide.
To start using Capacity Blocks, you create a capacity reservation in a specific Availability Zone. Capacity Blocks are delivered as targeted capacity reservations in a single Availability Zone. When you create your launch template, specify the Capacity Block's reservation ID and instance type. Then, update your Auto Scaling group to use the launch template you created and the Capacity Block's Availability Zone. When your Capacity Block reservation begins, use scheduled scaling to launch the same number of instances as your Capacity Block reservation.
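The group update step described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper name and the example values are hypothetical, and it assumes a group that is configured with Availability Zones rather than subnet IDs. The dictionary keys follow the request parameters of the Auto Scaling UpdateAutoScalingGroup action.

```python
def build_update_group_request(group_name, template_name, template_version, availability_zone):
    """Build keyword arguments for an UpdateAutoScalingGroup request.

    Hypothetical helper: it only assembles the request and does not call AWS.
    """
    version = str(template_version)
    # Design choice of this sketch: refuse the floating version aliases so the
    # group stays on one known template version for the whole reservation.
    if version in ("$Default", "$Latest"):
        raise ValueError("use a specific launch template version number")
    return {
        "AutoScalingGroupName": group_name,
        "LaunchTemplate": {
            "LaunchTemplateName": template_name,
            "Version": version,
        },
        # The single Availability Zone in which the Capacity Block was purchased.
        "AvailabilityZones": [availability_zone],
    }


# Example values are placeholders.
request = build_update_group_request("my-asg", "my-capacity-block-template", 1, "us-east-2a")
```

With boto3, the result could then be passed as `boto3.client("autoscaling").update_auto_scaling_group(**request)`.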
Important
Capacity Blocks are only available for certain Amazon EC2 instance types and Amazon Web Services Regions. For more information, see Prerequisites in the Amazon EC2 User Guide.
Operational guidelines
The following are basic operational guidelines that you should follow when using a Capacity Block with an Auto Scaling group.
- Scale in your Auto Scaling group to zero more than 30 minutes before the Capacity Block reservation end time. Amazon EC2 will terminate any instances that are still running 30 minutes before the end time of the Capacity Block.
- We recommend that you use scheduled scaling to scale out (add instances) and scale in (remove instances) at the appropriate reservation times. For more information, see Scheduled scaling for Amazon EC2 Auto Scaling.
- Add lifecycle hooks as needed to perform a graceful shutdown of your application inside the instances when scaling in. Leave enough time for the lifecycle action to complete before Amazon EC2 starts forcibly terminating your instances 30 minutes before the Capacity Block reservation end time. For more information, see Amazon EC2 Auto Scaling lifecycle hooks.
- Make sure that the Auto Scaling group points to the correct version of the launch template for the entire duration of the reservation. We recommend pointing to a specific version of the launch template instead of the $Default or $Latest version.
Note
If you leave a Capacity Block instance running until the end of the reservation and Amazon EC2 reclaims it, the scaling activities for your Auto Scaling group state that it was "taken out of service in response to an EC2 health check that indicated it had been terminated or stopped", even though it was purposely reclaimed at the end of the Capacity Block. Similarly, Amazon EC2 Auto Scaling will attempt to replace the instance in the same manner as it does for any instance that fails a health check. For more information, see Health checks for instances in an Auto Scaling group.
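The timing guidelines in this section can be sketched as a pair of scheduled actions: one that scales out when the reservation starts, and one that scales in to zero ahead of the 30-minute forced-termination window. The helper name, the action names, and the 15-minute shutdown buffer are assumptions made for illustration; the buffer stands in for however long your lifecycle hook needs. The dictionary keys follow the request parameters of the Auto Scaling PutScheduledUpdateGroupAction action.

```python
from datetime import datetime, timedelta, timezone

# Amazon EC2 terminates remaining instances 30 minutes before the end time.
FORCED_TERMINATION_LEAD = timedelta(minutes=30)


def build_scheduled_actions(group_name, start, end, instance_count,
                            shutdown_buffer=timedelta(minutes=15)):
    """Return (scale_out, scale_in) request dictionaries for one Capacity Block.

    Hypothetical helper: it only assembles the requests and does not call AWS.
    """
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware datetimes")
    # Scale in early enough for graceful shutdown to finish before the
    # forced-termination window opens.
    scale_in_time = end - FORCED_TERMINATION_LEAD - shutdown_buffer
    if scale_in_time <= start:
        raise ValueError("reservation is too short for the requested buffer")
    scale_out = {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": "capacity-block-scale-out",
        "StartTime": start,
        "MinSize": 0,
        "MaxSize": instance_count,
        "DesiredCapacity": instance_count,
    }
    scale_in = {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": "capacity-block-scale-in",
        "StartTime": scale_in_time,
        "MinSize": 0,
        "DesiredCapacity": 0,
    }
    return scale_out, scale_in


# Example reservation window; the dates and instance count are placeholders.
start = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)
end = datetime(2025, 1, 11, 12, 0, tzinfo=timezone.utc)
scale_out, scale_in = build_scheduled_actions("my-asg", start, end, instance_count=4)
```

With boto3, each dictionary could be passed to `boto3.client("autoscaling").put_scheduled_update_group_action(**action)`. Increase `shutdown_buffer` if your lifecycle action needs more time to complete.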
Specify a Capacity Block in your launch template
To target a specific Capacity Block with your Auto Scaling group, create a launch template that specifies the Capacity Block's reservation ID and instance type.
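As one possible illustration, the launch template data for a Capacity Block might be assembled as shown below. This is a sketch under stated assumptions: the helper name and all example IDs are hypothetical placeholders, and the instance type is only an example. The keys follow the launch template data structure of the Amazon EC2 CreateLaunchTemplate action, where the market type is set to `capacity-block` and the reservation is targeted by its ID.

```python
def build_launch_template_data(image_id, instance_type, capacity_reservation_id):
    """Build LaunchTemplateData that targets one Capacity Block.

    Hypothetical helper: it only assembles the data and does not call AWS.
    """
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        # Marks instances launched from this template as Capacity Block instances.
        "InstanceMarketOptions": {"MarketType": "capacity-block"},
        # Targets the Capacity Block by its reservation ID.
        "CapacityReservationSpecification": {
            "CapacityReservationTarget": {
                "CapacityReservationId": capacity_reservation_id,
            }
        },
    }


# All values below are placeholders, not real resource IDs.
data = build_launch_template_data(
    image_id="ami-0123456789abcdef0",
    instance_type="p5.48xlarge",
    capacity_reservation_id="cr-0123456789abcdef0",
)
```

With boto3, the result could be passed as `boto3.client("ec2").create_launch_template(LaunchTemplateName="my-capacity-block-template", LaunchTemplateData=data)`, where the template name is again a placeholder.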
Limitations
- Support for Capacity Blocks is only available if your Auto Scaling group has a compatible configuration. Mixed instances groups and warm pools are not supported.
- You can only target one Capacity Block at a time.
Related resources
- For the prerequisites and recommendations for using P5 instances, see Get started with P5 instances in the Amazon EC2 User Guide.
- Amazon EKS supports using Capacity Blocks for your short-duration machine learning (ML) workloads on Amazon EKS clusters. For more information, see Capacity Blocks for ML in the Amazon EKS User Guide.
- You can use Capacity Blocks with supported instance types and Regions. However, On-Demand Capacity Reservations provide flexibility to reserve capacity for other instance types and Regions. For a tutorial that shows you how to use the On-Demand Capacity Reservation option, see Reserve capacity in specific Availability Zones with Capacity Reservations.