Monitoring - Deep Learning AMI

Your DLAMI comes preinstalled with several GPU monitoring tools. This guide also mentions tools that are available to download and install.

  • Monitor GPUs with CloudWatch - a preinstalled utility that reports GPU usage statistics to Amazon CloudWatch.

  • nvidia-smi CLI - a utility to monitor overall GPU compute and memory utilization. This is preinstalled on your Amazon Deep Learning AMI (DLAMI).

  • NVML C library - a C-based API to directly access GPU monitoring and management functions. This is used by the nvidia-smi CLI under the hood and is preinstalled on your DLAMI. It also has Python and Perl bindings to facilitate development in those languages. The CloudWatch reporting utility preinstalled on your DLAMI uses the pynvml package from nvidia-ml-py.

  • NVIDIA DCGM - a cluster management tool that is not preinstalled. Visit the developer page to learn how to install and configure this tool.
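
As a quick illustration of the nvidia-smi CLI listed above, the following minimal Python sketch queries per-GPU utilization and memory use and parses the result. It assumes nvidia-smi is on your PATH and uses its --query-gpu and --format=csv,noheader,nounits options. The function names parse_gpu_stats and query_gpu_stats and the dictionary keys are hypothetical, chosen only for this example. This is not the preinstalled CloudWatch utility.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi through its --query-gpu option.
QUERY_FIELDS = ["index", "utilization.gpu", "memory.used", "memory.total"]


def _to_float(value):
    """Convert a field to float; return None for values such as "[N/A]"."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_gpu_stats(csv_text):
    """Parse 'csv,noheader,nounits' output from nvidia-smi into a list of dicts.

    Hypothetical helper for this example: one dict per GPU, one GPU per line.
    """
    stats = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != len(QUERY_FIELDS):
            continue  # Skip blank or unexpected lines.
        index, util, mem_used, mem_total = fields
        stats.append({
            "index": int(index),
            "gpu_util_percent": _to_float(util),
            "memory_used_mib": _to_float(mem_used),
            "memory_total_mib": _to_float(mem_total),
        })
    return stats


def query_gpu_stats():
    """Run nvidia-smi and return per-GPU stats, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    command = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, check=True
        )
    except subprocess.CalledProcessError:
        return []  # For example, the driver is not loaded.
    return parse_gpu_stats(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpu_stats():
        print(gpu)
```

Keeping the parsing separate from the subprocess call lets you check the parser against sample text on a machine without a GPU. For longer-running monitoring, the NVML bindings avoid starting a new process for every sample.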


Check out NVIDIA's developer blog for the latest information on using the CUDA tools installed on your DLAMI: