
Use foundation models with the SageMaker Python SDK

All JumpStart foundation models are available to deploy programmatically using the SageMaker Python SDK. Publicly available text generation foundation models can be deployed using the model ID in the Publicly available text generation model table. Proprietary models must be deployed using the model package information after subscribing to the model in Amazon Web Services Marketplace.

The following sections show how to fine-tune foundation models using the JumpStartEstimator class and how to deploy models using the JumpStartModel class, along with additional Python SDK utilities.

Important

Some foundation models require explicit acceptance of an end-user license agreement (EULA). For more information, see EULA acceptance with the SageMaker Python SDK.
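
For example, models that require EULA acceptance can be deployed by passing accept_eula=True to the deploy method. The following is a minimal sketch; the gated Llama 2 model ID is illustrative, and you should confirm the EULA requirements for your chosen model:

from sagemaker.jumpstart.model import JumpStartModel

# Assumption: this gated model ID requires EULA acceptance before deployment
model = JumpStartModel(model_id="meta-textgeneration-llama-2-7b")

# Passing accept_eula=True indicates that you have read and accepted the model EULA
predictor = model.deploy(accept_eula=True)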

To reference available model IDs for all publicly available foundation models, see the Built-in Algorithms with pre-trained Model Table. Search for the name of the foundation model of your choice in the Search bar, change the number of entries shown using the Show entries dropdown menu, or choose the Next text highlighted in blue on the left side of the page to navigate through the available models.

Fine-tune publicly available foundation models with the JumpStartEstimator class

You can fine-tune a built-in algorithm or pre-trained model in just a few lines of code using the SageMaker Python SDK.

  1. First, find the model ID for the model of your choice in the Built-in Algorithms with pre-trained Model Table.

  2. Using the model ID, define your training job as a JumpStart estimator.

    from sagemaker.jumpstart.estimator import JumpStartEstimator

    model_id = "huggingface-textgeneration1-gpt-j-6b"
    estimator = JumpStartEstimator(model_id=model_id)
  3. Run estimator.fit() on your model, pointing to the training data to use for fine-tuning.

    estimator.fit(
        {"train": training_dataset_s3_path, "validation": validation_dataset_s3_path}
    )
  4. Then, use the deploy method to automatically deploy your model for inference. In this example, we use the GPT-J 6B model from Hugging Face.

    predictor = estimator.deploy()
  5. You can then run inference with the deployed model using the predict method.

    question = "What is Southern California often abbreviated as?"
    response = predictor.predict(question)
    print(response)
Note

This example uses the foundation model GPT-J 6B, which is suitable for a wide range of text generation use cases including question answering, named entity recognition, summarization, and more. For more information about model use cases, see Explore the latest foundation models.

You can optionally specify model versions or instance types when creating your JumpStartEstimator. For more information about the JumpStartEstimator class and its parameters, see JumpStartEstimator.
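
For example, the following sketch pins a model version and overrides the training instance type. The version string and instance type shown are placeholder assumptions; use values supported by your model:

from sagemaker.jumpstart.estimator import JumpStartEstimator

# Assumption: "2.*" and ml.g5.12xlarge are illustrative; check what your model supports
estimator = JumpStartEstimator(
    model_id="huggingface-textgeneration1-gpt-j-6b",
    model_version="2.*",
    instance_type="ml.g5.12xlarge",
)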

Check default instance types

You can optionally include specific model versions or instance types when fine-tuning a pre-trained model using the JumpStartEstimator class. All JumpStart models have a default instance type. Retrieve the default training instance type using the following code:

from sagemaker import instance_types

instance_type = instance_types.retrieve_default(
    model_id=model_id,
    model_version=model_version,
    scope="training"
)
print(instance_type)

You can see all supported instance types for a given JumpStart model with the instance_types.retrieve() method.
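
For example, the following sketch lists the supported training instance types for the model defined earlier:

from sagemaker import instance_types

# Retrieve every instance type supported for training this model
supported_instance_types = instance_types.retrieve(
    model_id=model_id,
    model_version=model_version,
    scope="training"
)
print(supported_instance_types)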

Check default hyperparameters

To check the default hyperparameters used for training, use the retrieve_default() method from the sagemaker.hyperparameters module.

from sagemaker import hyperparameters

my_hyperparameters = hyperparameters.retrieve_default(model_id=model_id, model_version=model_version)
print(my_hyperparameters)

# Optionally override default hyperparameters for fine-tuning
my_hyperparameters["epoch"] = "3"
my_hyperparameters["per_device_train_batch_size"] = "4"

# Optionally validate hyperparameters for the model
hyperparameters.validate(
    model_id=model_id, model_version=model_version, hyperparameters=my_hyperparameters
)

For more information on available hyperparameters, see Commonly supported fine-tuning hyperparameters.

Check default metric definitions

You can also check the default metric definitions:

from sagemaker import metric_definitions

print(metric_definitions.retrieve_default(model_id=model_id, model_version=model_version))

Deploy publicly available foundation models with the JumpStartModel class

You can deploy a built-in algorithm or pre-trained model to a SageMaker endpoint in just a few lines of code using the SageMaker Python SDK.

  1. First, find the model ID for the model of your choice in the Built-in Algorithms with pre-trained Model Table.

  2. Using the model ID, define your model as a JumpStart model.

    from sagemaker.jumpstart.model import JumpStartModel

    model_id = "huggingface-text2text-flan-t5-xl"
    my_model = JumpStartModel(model_id=model_id)
  3. Use the deploy method to automatically deploy your model for inference. In this example, we use the FLAN-T5 XL model from Hugging Face.

    predictor = my_model.deploy()
  4. You can then run inference with the deployed model using the predict method.

    question = "What is Southern California often abbreviated as?"
    response = predictor.predict(question)
    print(response)
Note

This example uses the foundation model FLAN-T5 XL, which is suitable for a wide range of text generation use cases including question answering, summarization, chatbot creation, and more. For more information about model use cases, see Explore the latest foundation models.

For more information about the JumpStartModel class and its parameters, see JumpStartModel.

Check default instance types

You can optionally include specific model versions or instance types when deploying a pre-trained model using the JumpStartModel class. All JumpStart models have a default instance type. Retrieve the default deployment instance type using the following code:

from sagemaker import instance_types

instance_type = instance_types.retrieve_default(
    model_id=model_id,
    model_version=model_version,
    scope="inference"
)
print(instance_type)

See all supported instance types for a given JumpStart model with the instance_types.retrieve() method.

Use inference components to deploy multiple models to a shared endpoint

An inference component is a SageMaker hosting object that you can use to deploy one or more models to an endpoint for increased flexibility and scalability. To use inference components, change the endpoint_type for your JumpStart model from the default model-based endpoint to an inference-component-based endpoint.

from sagemaker.enums import EndpointType

predictor = my_model.deploy(
    endpoint_name="jumpstart-model-id-123456789012",
    endpoint_type=EndpointType.INFERENCE_COMPONENT_BASED
)
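
After the inference-component-based endpoint exists, additional models can be deployed to it as separate inference components. The following is a sketch; the model ID and endpoint name are illustrative assumptions:

from sagemaker.enums import EndpointType
from sagemaker.jumpstart.model import JumpStartModel

# Assumption: the endpoint below was created with an inference-component-based deployment
second_model = JumpStartModel(model_id="huggingface-text2text-flan-t5-xl")
second_predictor = second_model.deploy(
    endpoint_name="jumpstart-model-id-123456789012",
    endpoint_type=EndpointType.INFERENCE_COMPONENT_BASED
)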

For more information on creating endpoints with inference components and deploying SageMaker models, see Shared resource utilization with multiple models.

Check valid input and output inference formats

To check valid data input and output formats for inference, you can use the retrieve_options() method from the sagemaker.serializers and sagemaker.deserializers modules.

import sagemaker

print(sagemaker.serializers.retrieve_options(model_id=model_id, model_version=model_version))
print(sagemaker.deserializers.retrieve_options(model_id=model_id, model_version=model_version))

Check supported content and accept types

Similarly, you can use the retrieve_options() method to check the supported content and accept types for a model.

import sagemaker

print(sagemaker.content_types.retrieve_options(model_id=model_id, model_version=model_version))
print(sagemaker.accept_types.retrieve_options(model_id=model_id, model_version=model_version))

For more information about utilities, see Utility APIs.

Use proprietary foundation models with the SageMaker Python SDK

Proprietary models must be deployed using the model package information after subscribing to the model in Amazon Web Services Marketplace. For more information about SageMaker and Amazon Web Services Marketplace, see Buy and Sell Amazon SageMaker Algorithms and Models in Amazon Web Services Marketplace. To find Amazon Web Services Marketplace links for the latest proprietary models, see Getting started with Amazon SageMaker JumpStart.

After subscribing to the model of your choice in Amazon Web Services Marketplace, you can deploy the foundation model using the SageMaker Python SDK and the SDK associated with the model provider. For example, AI21 Labs, Cohere, and LightOn use the "ai21[SM]", cohere-sagemaker, and lightonsage packages, respectively.

For example, to define a JumpStart model using Jurassic-2 Jumbo Instruct from AI21 Labs, use the following code:

import sagemaker
import ai21  # provider SDK, used to invoke the deployed endpoint
from sagemaker import ModelPackage, get_execution_role

role = get_execution_role()
sagemaker_session = sagemaker.Session()

model_package_arn = "arn:aws:sagemaker:us-east-1:865070037744:model-package/j2-jumbo-instruct-v1-1-43-4e47c49e61743066b9d95efed6882f35"

my_model = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,
    sagemaker_session=sagemaker_session
)
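
You can then deploy the model package to an endpoint. The following sketch assumes an illustrative endpoint name and instance type; use the instance types recommended by the model provider in Amazon Web Services Marketplace:

# Assumption: the endpoint name and instance type below are placeholders
my_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.48xlarge",
    endpoint_name="j2-jumbo-instruct"
)

The deployed endpoint can then be invoked through the model provider's SDK (in this example, the ai21 package) using the endpoint name.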

For step-by-step examples, find and run the notebook associated with the proprietary foundation model of your choice in SageMaker Studio Classic. See Use foundation models in Amazon SageMaker Studio Classic for more information. For more information on the SageMaker Python SDK, see ModelPackage.