SageMaker Workflows
As you scale your machine learning (ML) operations, you can use Amazon SageMaker fully managed
workflow services to implement continuous integration and deployment (CI/CD) practices for
your ML lifecycle. With the SageMaker Pipelines SDK, you choose and integrate pipeline steps into a
unified solution that automates the model-building process from data preparation to model
deployment. For Kubernetes based architectures, you can install SageMaker Operators on your
Kubernetes cluster to create SageMaker jobs natively using the Kubernetes API and
command-line Kubernetes tools such as kubectl
. With SageMaker components for Kubeflow
pipelines, you can create and monitor native SageMaker jobs from your Kubeflow Pipelines.
The job parameters, status, and outputs from SageMaker are accessible from the Kubeflow
Pipelines UI. Lastly, if you want to schedule non-interactive batch runs of your Jupyter
notebook, use the notebook-based workflows service to initiate standalone or regular runs on
a schedule you define.
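Under the hood, the Pipelines SDK compiles your step graph into a JSON pipeline definition that the CreatePipeline API accepts. The sketch below builds a minimal definition of that shape with plain Python; the step names, image URI, and role ARN are placeholders, and the `Arguments` payloads are abbreviated:

```python
import json

# Minimal sketch of a SageMaker pipeline definition document.
# The SageMaker Python SDK normally generates this JSON for you from
# Pipeline/Step objects; all names and ARNs below are hypothetical.
def build_pipeline_definition(image_uri, role_arn):
    return {
        "Version": "2020-12-01",  # pipeline definition schema version
        "Parameters": [
            {"Name": "InputData", "Type": "String",
             "DefaultValue": "s3://my-bucket/raw/"}  # placeholder default
        ],
        "Steps": [
            {
                "Name": "PreprocessData",
                "Type": "Processing",
                "Arguments": {
                    # Arguments mirror the CreateProcessingJob API request
                    # (abbreviated here).
                    "AppSpecification": {"ImageUri": image_uri},
                    "RoleArn": role_arn,
                },
            },
            {
                "Name": "TrainModel",
                "Type": "Training",
                # DependsOn makes training wait for the preprocessing step.
                "DependsOn": ["PreprocessData"],
                "Arguments": {
                    "AlgorithmSpecification": {
                        "TrainingImage": image_uri,
                        "TrainingInputMode": "File",
                    },
                    "RoleArn": role_arn,
                },
            },
        ],
    }

definition = build_pipeline_definition(
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",
    "arn:aws:iam::123456789012:role/SageMakerExecutionRole",
)
pipeline_json = json.dumps(definition, indent=2)
```

In practice you would define `Pipeline` and step objects with the SageMaker Python SDK and let it emit this document, rather than writing the JSON by hand.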
In summary, SageMaker offers the following workflow technologies:
- Amazon SageMaker Model Building Pipelines: A tool for building and managing end-to-end ML pipelines.
- Kubernetes Orchestration: SageMaker custom operators for your Kubernetes cluster and SageMaker components for Kubeflow Pipelines.
- SageMaker Notebook Jobs: On-demand or scheduled non-interactive batch runs of your Jupyter notebook.
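With the Kubernetes operators, jobs are described declaratively as custom resources. The manifest below is an illustrative sketch of a SageMaker TrainingJob resource; the `apiVersion`, field names, and values depend on the operator version you install, so treat them as assumptions and check the operator's CRD reference:

```yaml
# Hypothetical TrainingJob manifest for the SageMaker Operators for
# Kubernetes; spec fields mirror the CreateTrainingJob API request.
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: TrainingJob
metadata:
  name: my-training-job
spec:
  trainingJobName: my-training-job
  roleARN: arn:aws:iam::123456789012:role/SageMakerExecutionRole
  algorithmSpecification:
    trainingImage: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest
    trainingInputMode: File
  outputDataConfig:
    s3OutputPath: s3://my-bucket/output/
  resourceConfig:
    instanceCount: 1
    instanceType: ml.m5.xlarge
    volumeSizeInGB: 30
  stoppingCondition:
    maxRuntimeInSeconds: 3600
```

You would apply this with `kubectl apply -f training-job.yaml` and inspect progress with standard tooling such as `kubectl get trainingjob` and `kubectl describe trainingjob my-training-job`.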
You can also use other services that integrate with SageMaker to build your workflow. Options include the following:
- Airflow Workflows: SageMaker APIs to export configurations for creating and managing Airflow workflows.
- AWS Step Functions: Multi-step ML workflows in Python that orchestrate SageMaker infrastructure without having to provision your resources separately.
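To illustrate the Step Functions option: a state machine can invoke SageMaker through a direct service integration, and the `.sync` resource ARN makes the workflow wait for the job to finish. The sketch below builds an Amazon States Language definition as a plain dict; the job parameters, image URI, and role ARN are placeholders:

```python
import json

# Amazon States Language (ASL) sketch: a single Task state that starts
# a SageMaker training job and blocks until it completes (".sync").
definition = {
    "StartAt": "TrainModel",
    "States": {
        "TrainModel": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
            "Parameters": {
                # ".$" fields are resolved from the execution input at runtime.
                "TrainingJobName.$": "$.jobName",
                "RoleArn": "arn:aws:iam::123456789012:role/StepFunctionsSageMakerRole",
                "AlgorithmSpecification": {
                    "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",
                    "TrainingInputMode": "File",
                },
                "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/output/"},
                "ResourceConfig": {
                    "InstanceCount": 1,
                    "InstanceType": "ml.m5.xlarge",
                    "VolumeSizeInGB": 30,
                },
                "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
            },
            "End": True,
        }
    },
}

asl_json = json.dumps(definition, indent=2)
```

In practice you could also generate a definition like this from Python with the AWS Step Functions Data Science SDK rather than writing the ASL by hand.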
For more information on managing SageMaker training and inference, see Amazon SageMaker Python SDK Workflows.