Deploy foundation models and custom fine-tuned models - Amazon SageMaker AI

Deploy foundation models and custom fine-tuned models

Whether you're deploying pre-trained open-weights or gated foundation models from Amazon SageMaker JumpStart, or your own custom or fine-tuned models stored in Amazon S3 or Amazon FSx, SageMaker HyperPod provides flexible, scalable infrastructure for production inference workloads.

Deploy open-weights and gated foundation models from JumpStart

Description

Deploy from a comprehensive catalog of pre-trained foundation models with automatic optimization and scaling policies tailored to each model family.

Key benefits
  • One-click deployment through Amazon SageMaker Studio UI

  • Auto-scaling based on incoming requests, automatically enabled

  • Pre-optimized containers and configurations for each model family

  • EULA handling for gated models

Deployment options
  • Amazon SageMaker Studio for visual deployment

  • kubectl for Kubernetes-native operations

  • Python SDK for programmatic integration

  • HyperPod CLI for command-line automation

Deploy custom and fine-tuned models from Amazon S3 and Amazon FSx

Description

Bring your own custom and fine-tuned models and leverage SageMaker HyperPod's enterprise infrastructure for production-scale inference. Choose between cost-effective storage with Amazon S3 or a high-performance file system with Amazon FSx.

Key benefits
  • Support for multiple storage backends: Amazon S3, Amazon FSx

  • Flexible container and framework support

  • Custom scaling policies based on your model's characteristics

Deployment options
  • kubectl for Kubernetes-native operations

  • Python SDK for programmatic integration

  • HyperPod CLI for command-line automation
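
The deployment options listed above differ by model source, and a script that automates deployments may need to encode that difference. The following is only an illustrative sketch: `ModelSource`, `DEPLOYMENT_OPTIONS`, `deployment_options`, and `supports_studio` are hypothetical names, not part of any SageMaker SDK or CLI, and the mapping simply restates the options listed above.

```python
from enum import Enum


class ModelSource(Enum):
    """Where the model comes from (illustrative names, not an AWS API)."""
    JUMPSTART = "Amazon SageMaker JumpStart"
    S3 = "Amazon S3"
    FSX = "Amazon FSx"


# Interfaces listed for both deployment paths.
_COMMON_OPTIONS = ["kubectl", "Python SDK", "HyperPod CLI"]

# Amazon SageMaker Studio is listed only for JumpStart models.
DEPLOYMENT_OPTIONS = {
    ModelSource.JUMPSTART: ["Amazon SageMaker Studio"] + _COMMON_OPTIONS,
    ModelSource.S3: list(_COMMON_OPTIONS),
    ModelSource.FSX: list(_COMMON_OPTIONS),
}


def deployment_options(source: ModelSource) -> list[str]:
    """Return a copy of the interfaces listed for the given model source."""
    return list(DEPLOYMENT_OPTIONS[source])


def supports_studio(source: ModelSource) -> bool:
    """Report whether visual deployment through Studio is listed."""
    return "Amazon SageMaker Studio" in DEPLOYMENT_OPTIONS[source]


if __name__ == "__main__":
    for source in ModelSource:
        print(f"{source.value}: {', '.join(deployment_options(source))}")
```

A caller might use `supports_studio(ModelSource.S3)` to decide whether to fall back to kubectl, the Python SDK, or the HyperPod CLI for a custom model.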

The following sections walk you through deploying models from Amazon SageMaker JumpStart and from Amazon S3 and Amazon FSx.