Configuring DL1 for your custom Amazon Linux 2 AMI
Custom Amazon Linux 2 AMIs in Amazon EKS can support deep learning workloads at scale through additional configuration and Kubernetes add-ons. This document describes the components required to set up a generic Kubernetes solution for an on-premises setup or as a baseline in a larger cloud configuration. To support these workloads, your custom environment needs the following components and setup steps:
- SynapseAI® software drivers loaded on the system: these are included in the AMIs available on GitHub.
- The Habana device plugin: a DaemonSet that automatically registers Habana devices in your Kubernetes cluster and tracks device health.
- Helm 3.x
- MPI Operator
- Create and launch a base AMI from Amazon Linux 2, Ubuntu 18, or Ubuntu 20.
- Follow these instructions to set up the environment for DL1.
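
Once the device plugin DaemonSet is running, one way to confirm that it has registered the Habana devices is to inspect each node's allocatable resources. The following is a minimal Python sketch, not an official tool. It assumes that the plugin advertises devices under the extended resource name `habana.ai/gaudi`, and that `kubectl` is installed and configured for your cluster. The helper names `gaudi_allocatable` and `fetch_nodes` are hypothetical and used only for illustration.

```python
import json
import subprocess

# Assumption: the Habana device plugin advertises devices under this
# extended resource name. Adjust it if your plugin version differs.
GAUDI_RESOURCE = "habana.ai/gaudi"


def gaudi_allocatable(node_list):
    """Map each node name to its allocatable Gaudi device count.

    `node_list` is the parsed JSON from `kubectl get nodes -o json`.
    Nodes that do not advertise the resource are reported with 0.
    """
    counts = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        allocatable = node.get("status", {}).get("allocatable", {})
        counts[name] = int(allocatable.get(GAUDI_RESOURCE, "0"))
    return counts


def fetch_nodes():
    """Return the cluster's node list as a dict by calling kubectl."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `gaudi_allocatable(fetch_nodes())` against a live cluster returns a dictionary of node names and device counts. A DL1 node that reports 0 suggests that the drivers are not loaded or that the device plugin pod is not running on that node.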