Important Changes to DLAMI - Deep Learning AMI

Frequently Asked Questions

What is changing?

On 11/15/2023, Amazon Deep Learning AMIs (DLAMIs) will be split into two separate groups:

  • DLAMIs that use the NVIDIA proprietary driver (to support P3, P3dn, G3).

  • DLAMIs that use the NVIDIA OSS driver (to support G4dn, G5, P4, P5).

As a result, new DLAMIs will be created for each of the two categories, with new names and new AMI IDs. These DLAMIs will not be interchangeable: a DLAMI from one group will not support the instances supported by the other group. For example, the DLAMI that supports P5 will not support G3, and vice versa.
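
Because the two groups are not interchangeable, a launch script may want to check which group an instance type belongs to before picking an AMI. The following is a minimal sketch, not an official tool. The function name dlami_driver_group and the labels "proprietary" and "oss" are hypothetical, and the mapping of P4 to the p4d and p4de families and of G3 to the g3 and g3s families is an assumption about EC2 instance type names. Any family not named in this document is reported as "unknown" rather than guessed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify an EC2 instance type into one of the two
# DLAMI driver groups described above.
# Prints "proprietary", "oss", or "unknown".
dlami_driver_group() {
  local instance_type="$1"
  # Keep only the family part, e.g. "p3dn" from "p3dn.24xlarge".
  local family="${instance_type%%.*}"
  case "$family" in
    p3|p3dn|g3|g3s)      echo "proprietary" ;;  # P3, P3dn, G3
    g4dn|g5|p4d|p4de|p5) echo "oss" ;;          # G4dn, G5, P4, P5
    *)                   echo "unknown" ;;      # not covered by this document
  esac
}
```

For example, `dlami_driver_group p5.48xlarge` prints `oss`, and `dlami_driver_group g3.4xlarge` prints `proprietary`. Always confirm the result against the release notes of the DLAMI you intend to launch.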

[Figure: DLAMI fork]

Why is this change required?

Currently, DLAMIs for NVIDIA GPUs include a proprietary kernel driver from NVIDIA. However, the upstream Linux kernel community recently accepted a change that isolates proprietary kernel drivers, such as the NVIDIA GPU driver, from communicating with other kernel drivers. This change disables GPUDirect RDMA on P4 and P5 series instances, which is the mechanism that allows GPUs to efficiently use EFA for distributed training. As a result, DLAMIs will use the OpenRM driver (the NVIDIA open source driver), linked against the open source EFA drivers, to support G4dn, G5, P4, and P5. However, the OpenRM driver does not support older instances such as P3 and G3. Therefore, to continue providing current, performant, and secure DLAMIs for both types of instances, we will split DLAMIs into two groups: one with the OpenRM driver (supporting G4dn, G5, P4, and P5) and one with the older proprietary driver (supporting the older P3, P3dn, and G3 instances).

Which DLAMIs are affected by this change?

All DLAMIs are affected by this change.

What does this mean for you?

The new DLAMIs will continue to provide the functionality, performance, and security of the current DLAMIs, as long as they are run on a compatible instance type. If you use DLAMIs, you must ensure that each DLAMI is launched on one of the compatible instances listed in its release notes (see here). In practice, you will need to:

  • Invoke DLAMIs with the right CLI queries (see below)

  • Launch DLAMIs from console and CLI on a compatible instance type

If you launch DLAMIs from the EC2 console Quick Start: each DLAMI description in the EC2 console lists the instance types it supports. Launch the DLAMI on a compatible instance.

[Figure: EC2 quickstart]

If you launch DLAMIs using the CLI, you will have to modify your queries. For example:

Currently, the following CLI query is used for base DLAMIs that support all instances [P3, P3dn, G3, G4dn, G5, P4, P5]:

aws ec2 describe-images --region us-east-1 --owners amazon \
    --filters 'Name=name,Values=Deep Learning Base AMI (Amazon Linux 2) ????????' 'Name=state,Values=available' \
    --query 'reverse(sort_by(Images, &CreationDate))[:1].ImageId' --output text

The new CLI queries will be:

For the base DLAMI supporting P3, P3dn, and G3:

aws ec2 describe-images --region us-east-1 --owners amazon \
    --filters 'Name=name,Values=Deep Learning Base Proprietary Nvidia Driver AMI (Amazon Linux 2) Version ??.?' 'Name=state,Values=available' \
    --query 'reverse(sort_by(Images, &CreationDate))[:1].ImageId' --output text

For the base DLAMI supporting G4dn, G5, P4, and P5:

aws ec2 describe-images --region us-east-1 --owners amazon \
    --filters 'Name=name,Values=Deep Learning Base OSS Nvidia Driver AMI (Amazon Linux 2) Version ??.?' 'Name=state,Values=available' \
    --query 'reverse(sort_by(Images, &CreationDate))[:1].ImageId' --output text
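
The two new queries differ only in the AMI name filter, so a script that previously ran a single query can wrap both in one helper. The following is a minimal sketch under stated assumptions: the function names dlami_name_filter and latest_base_dlami and the group labels "proprietary" and "oss" are hypothetical, and the name patterns are copied from the queries above. It assumes the AWS CLI is installed and configured with credentials.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return the AMI name pattern for a driver group.
# The patterns are the ones used in the new CLI queries above.
dlami_name_filter() {
  case "$1" in
    proprietary) echo "Deep Learning Base Proprietary Nvidia Driver AMI (Amazon Linux 2) Version ??.?" ;;
    oss)         echo "Deep Learning Base OSS Nvidia Driver AMI (Amazon Linux 2) Version ??.?" ;;
    *)           echo "unknown driver group: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print the ID of the newest available base DLAMI
# for a driver group.
# Usage: latest_base_dlami <proprietary|oss> [region]
latest_base_dlami() {
  local name_filter
  name_filter="$(dlami_name_filter "$1")" || return 1
  aws ec2 describe-images --region "${2:-us-east-1}" --owners amazon \
    --filters "Name=name,Values=${name_filter}" 'Name=state,Values=available' \
    --query 'reverse(sort_by(Images, &CreationDate))[:1].ImageId' --output text
}
```

For example, `latest_base_dlami oss us-east-1` runs the same query as the G4dn, G5, P4, and P5 command above. Keeping the name pattern in one place means a future rename only has to be changed once.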

For the new AMIs, refer to the updated release notes here. For how to launch AMIs on EC2 instances, refer to the instructions here.

When should you start using the new DLAMIs?

You should start using the new DLAMIs as soon as possible to get the latest frameworks, dependencies, patches, and functionality. Optionally, if you are using Amazon Linux 2 DLAMIs released before 11/8/2023, you may choose to continue live patching your DLAMIs (see instructions here) until 11/30/2023.

Will there be any loss in functionality with the new DLAMIs?

No, there is no loss of functionality with the new DLAMIs. After the split, the new DLAMIs will continue to provide all the functionality, performance, and security of the old DLAMIs, as long as they are run on a compatible instance. We are splitting the DLAMIs into two groups so that we can continue to offer DLAMIs that are current, performant, and secure on a broad range of instances.

What about DLCs?

Deep Learning Containers (DLCs) do not include the NVIDIA driver, so they are not affected by this change. However, you should ensure that DLCs are run on AMIs that are compatible with the underlying instances.