Amazon SageMaker AI includes specialized deep learning containers (DLCs), libraries, and tooling for model parallelism and large model inference (LMI). The following resources can help you get started with LMI on SageMaker AI:
- The large model inference (LMI) container documentation
- SageMaker AI endpoint parameters for large model inference
- Deploying uncompressed models
- Deploy large models for inference with TorchServe
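As a rough illustration of how an LMI deployment is configured, the sketch below builds the contents of a `serving.properties` file, the configuration format used by the LMI container. The keys shown (`engine`, `option.model_id`, `option.tensor_parallel_degree`) follow that convention, but the exact supported keys depend on the container version; the S3 path, image URI, role, and instance type in the comments are placeholders, not values from this page.

```python
def lmi_serving_properties(engine="Python",
                           tensor_parallel_degree=4,
                           model_id="s3://my-bucket/my-model/"):
    """Build the text of a serving.properties file for the LMI container.

    All argument values here are illustrative placeholders; substitute
    the engine, parallelism degree, and model location for your model.
    """
    lines = [
        f"engine={engine}",
        f"option.model_id={model_id}",
        f"option.tensor_parallel_degree={tensor_parallel_degree}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(lmi_serving_properties())
    # To deploy, this configuration is packaged with the model and paired
    # with an LMI container image via the SageMaker Python SDK, roughly
    # (all names below are placeholders):
    #
    #   import sagemaker
    #   model = sagemaker.Model(image_uri="<lmi-container-image-uri>",
    #                           role="<execution-role-arn>")
    #   predictor = model.deploy(initial_instance_count=1,
    #                            instance_type="ml.g5.12xlarge")
```

The actual deployment call requires AWS credentials, an execution role, and a Region-specific container image URI, so it is shown only in comments; see the LMI container documentation linked above for the authoritative list of configuration options.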