Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Amazon SageMaker Features

Amazon SageMaker includes the following features.

New features for re:Invent 2023

SageMaker includes the following new features for re:Invent 2023.

SageMaker Canvas chat for data prep

SageMaker Canvas chat for data prep helps you create data preparation flows using LLMs.

Code Editor

Code Editor extends Studio so that you can write, test, debug and run your analytics and machine learning code in an environment based on Visual Studio Code - Open Source ("Code-OSS").

Deep learning containers for large model inference

SageMaker has replaced the default NCCL kernels with inference-optimized kernels to improve GPU utilization and offer better performance than the open source (OSS) kernels.

Deploy models for real-time inference

SageMaker Inference provides developer experience and user interface abstractions to help you get started more quickly with model deployment.

SageMaker customers can now improve the utilization of their accelerated compute instances by deploying up to thousands of models to a single SageMaker endpoint, with guaranteed throughput and auto scaling on a per-model basis.

SageMaker Distribution Images

SageMaker Distribution is a collection of Docker images designed for machine learning, data science, and data analytics. The images are available across Studio, Studio Lab, Studio notebooks, and GitHub.

Domain onboarding simplification

A simplified and guided Amazon SageMaker domain onboarding experience with new capabilities for single users and organization administrators. The capabilities include direct IAM Identity Center integration, fine-grained access policy management, seamless management and configuration of SageMaker apps, and VPC and storage configuration.

Amazon S3 Express One Zone

Amazon S3 Express One Zone is a new storage class that provides single-digit millisecond access for the most latency-sensitive applications. Amazon S3 Express One Zone allows customers to colocate their object storage and compute resources in a single Amazon Availability Zone, optimizing both compute performance and costs with increased data processing speed.

Foundation model evaluations (FMEval)

Foundation model evaluations (FMEval) helps you quantify the risk that your language model provides inaccurate, toxic, or biased content, so that you can choose the best model for your use case. Bring your own custom dataset or use a built-in dataset to evaluate any language model. FMEval is integrated with tens of text-based foundation models in SageMaker JumpStart, and you can also bring your own model. You can also create customized evaluations using the FMEval library.

SageMaker HyperPod

SageMaker HyperPod is a capability of SageMaker that provides an always-on machine learning environment on resilient clusters, on which you can run any machine learning workload for developing large machine learning models such as large language models (LLMs) and diffusion models.

Jupyter AI

Jupyter AI and CodeWhisperer have been added to SageMaker Distribution. With this update, users of Studio or Code Editor can easily use generative AI from their notebooks and take advantage of CodeWhisperer's code completion feature.

JupyterLab in Studio

JupyterLab in Studio improves latency and reliability for Studio notebooks.

SageMaker Notebook Jobs

SageMaker Notebook Jobs provides SDK support for notebook jobs so you can schedule your notebook jobs programmatically.

SageMaker Pipelines

SageMaker Pipelines gives you the option to convert your local machine learning code to a SageMaker Pipelines step, from which you can create and run a pipeline.

SageMaker smart sifting

SageMaker smart sifting is a capability of SageMaker Training that improves the efficiency of your training datasets and reduces total training time and cost.

SageMaker Studio

Studio is the latest web-based experience for running ML workflows. Studio offers a suite of IDEs, including Code Editor, a new JupyterLab application, RStudio, and Studio Classic.

Machine learning environments

SageMaker includes the following machine learning environments.

SageMaker geospatial capabilities

Build, train, and deploy ML models using geospatial data.

SageMaker Canvas

An AutoML service that gives people with no coding experience the ability to build models and make predictions with them.

SageMaker Studio

An integrated machine learning environment where you can build, train, deploy, and analyze your models all in the same application.

SageMaker Studio Lab

A free service that gives customers access to Amazon compute resources in an environment based on open-source JupyterLab.

RStudio on Amazon SageMaker

An integrated development environment for R, with a console, syntax-highlighting editor that supports direct code execution, and tools for plotting, history, debugging and workspace management.

Major features

SageMaker includes the following major features, listed in alphabetical order, ignoring any SageMaker prefix.

Amazon Augmented AI

Build the workflows required for human review of ML predictions. Amazon A2I brings human review to all developers, removing the undifferentiated heavy lifting associated with building human review systems or managing large numbers of human reviewers.

AutoML step

Create an AutoML job to automatically train a model in SageMaker Pipelines.

SageMaker Autopilot

Users without machine learning knowledge can quickly build classification and regression models.

Batch Transform

Preprocess datasets, run inference when you don't need a persistent endpoint, and associate input records with inferences to assist the interpretation of results.
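The idea of associating input records with inferences can be illustrated with a small, service-independent sketch. This is not the Batch Transform API; the function name `join_inputs_with_predictions`, the `prediction` key, and the sample records are all hypothetical and exist only to show why keeping each input next to its result makes the output easier to interpret.

```python
def join_inputs_with_predictions(records, predictions):
    """Pair each input record with its inference, in order.

    Toy illustration only (hypothetical helper, not a SageMaker call).
    """
    if len(records) != len(predictions):
        raise ValueError("expected exactly one prediction per input record")
    # Copy each record so the caller's inputs are left untouched.
    return [{**rec, "prediction": pred} for rec, pred in zip(records, predictions)]


# Hypothetical sample data.
records = [{"id": "a", "age": 31}, {"id": "b", "age": 58}]
predictions = [0.12, 0.87]
joined = join_inputs_with_predictions(records, predictions)
```

With the joined output, a reviewer can see which input produced which score without matching two separate files by position.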

SageMaker Clarify

Improve your machine learning models by detecting potential bias, and help explain the predictions that models make.

Collaboration with shared spaces

A shared space consists of a shared JupyterServer application and a shared directory. All user profiles in an Amazon SageMaker domain have access to all shared spaces in the domain.

SageMaker Data Wrangler

Import, analyze, prepare, and featurize data in SageMaker Studio. You can integrate Data Wrangler into your machine learning workflows to simplify and streamline data pre-processing and feature engineering using little to no coding. You can also add your own Python scripts and transformations to customize your data prep workflow.

Data Wrangler data preparation widget

Interact with your data, get visualizations, explore actionable insights, and fix data quality issues.

SageMaker Debugger

Inspect training parameters and data throughout the training process. Automatically detect and alert users to commonly occurring errors such as parameter values getting too large or small.
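As a rough illustration of the kind of check described above, the following sketch flags parameter values that are non-finite, too large, or too small. It is a plain-Python toy, not the Debugger API: the function name `check_values`, the returned labels, and the default thresholds are assumptions chosen for the example.

```python
import math


def check_values(values, large=1e4, small=1e-7):
    """Return a list of issue labels for a sequence of floats.

    Hypothetical helper with assumed thresholds; not a SageMaker Debugger rule.
    """
    issues = []
    if any(math.isnan(v) or math.isinf(v) for v in values):
        issues.append("non_finite")
    magnitudes = [abs(v) for v in values if math.isfinite(v)]
    if magnitudes and max(magnitudes) > large:
        issues.append("too_large")   # largest magnitude exceeds the upper threshold
    if magnitudes and max(magnitudes) < small:
        issues.append("too_small")   # every magnitude is below the lower threshold
    return issues


healthy = check_values([0.5, -0.2])
exploding = check_values([0.1, 5e5])
vanishing = check_values([1e-9, -1e-10])
```

A training loop could run a check like this on each tensor at a fixed step interval and raise an alert when the returned list is not empty.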

SageMaker Edge Manager

Optimize custom models for edge devices, create and manage fleets, and run models with an efficient runtime.

SageMaker Elastic Inference

Speed up the throughput and decrease the latency of getting real-time inferences.

SageMaker Experiments

Experiment management and tracking. You can use the tracked data to reconstruct an experiment, incrementally build on experiments conducted by peers, and trace model lineage for compliance and audit verifications.

SageMaker Feature Store

A centralized store for features and associated metadata so features can be easily discovered and reused. You can create two types of stores, an Online or Offline store. The Online Store can be used for low latency, real-time inference use cases and the Offline Store can be used for training and batch inference.
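The difference between the two store types can be sketched with a toy in-memory model. This is not the Feature Store API; the class `ToyFeatureStore`, its method names, and the sample feature `orders_30d` are hypothetical. The sketch assumes the online store keeps only the most recent record per identifier for fast lookup, while the offline store keeps every record for training and batch use.

```python
class ToyFeatureStore:
    """Toy in-memory model of online and offline stores (not the SageMaker API)."""

    def __init__(self):
        self.online = {}    # record_id -> latest record only (low-latency lookup)
        self.offline = []   # append-only history of every record (training, batch)

    def put_record(self, record_id, event_time, features):
        record = {"record_id": record_id, "event_time": event_time, **features}
        self.offline.append(record)
        current = self.online.get(record_id)
        # Keep the record with the newest event time in the online store.
        if current is None or event_time >= current["event_time"]:
            self.online[record_id] = record

    def get_record(self, record_id):
        """Latest values for one identifier, as real-time inference would need."""
        return self.online.get(record_id)

    def history(self, record_id):
        """All stored values for one identifier, as training would need."""
        return [r for r in self.offline if r["record_id"] == record_id]


store = ToyFeatureStore()
store.put_record("customer-1", 1, {"orders_30d": 2})
store.put_record("customer-1", 2, {"orders_30d": 5})
latest = store.get_record("customer-1")
past = store.history("customer-1")
```

Here `latest` holds only the newest values, while `past` holds both records, which mirrors the split between real-time inference and training or batch inference.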

SageMaker Ground Truth

Build high-quality training datasets by using workers along with machine learning to create labeled datasets.

SageMaker Ground Truth Plus

A turnkey data labeling feature to create high-quality training datasets without having to build labeling applications and manage the labeling workforce on your own.

SageMaker Inference Recommender

Get recommendations on inference instance types and configurations (for example, instance count, container parameters, and model optimizations) to use for your ML models and workloads.

Inference shadow tests

Evaluate any changes to your model-serving infrastructure by comparing its performance against the currently deployed infrastructure.

SageMaker JumpStart

Learn about SageMaker features and capabilities through curated 1-click solutions, example notebooks, and pretrained models that you can deploy. You can also fine-tune the models and deploy them.

SageMaker ML Lineage Tracking

Track the lineage of machine learning workflows.

SageMaker Model Building Pipelines

Create and manage machine learning pipelines integrated directly with SageMaker jobs.

SageMaker Model Cards

Document information about your ML models in a single place for streamlined governance and reporting throughout the ML lifecycle.

SageMaker Model Dashboard

A pre-built, visual overview of all the models in your account. Model Dashboard integrates information from SageMaker Model Monitor, transform jobs, endpoints, lineage tracking, and CloudWatch so you can access high-level model information and track model performance in one unified view.

SageMaker Model Monitor

Monitor and analyze models in production (endpoints) to detect data drift and deviations in model quality.

SageMaker Model Registry

Versioning, artifact and lineage tracking, approval workflow, and cross-account support for deployment of your machine learning models.

SageMaker Neo

Train machine learning models once, then run anywhere in the cloud and at the edge.

Notebook-based Workflows

Run your SageMaker Studio notebook as a non-interactive, scheduled job.

Preprocessing

Analyze and preprocess data, tackle feature engineering, and evaluate models.

SageMaker Projects

Create end-to-end ML solutions with CI/CD by using SageMaker projects.

Reinforcement Learning

Maximize the long-term reward that an agent receives as a result of its actions.
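One common way to make "long-term reward" concrete is the discounted return, where rewards further in the future count for less. The sketch below is a generic illustration, not SageMaker code; the function name `discounted_return` and the discount factor `gamma` are assumptions that do not appear in the description above.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of rewards where the reward at step t is weighted by gamma**t.

    Generic illustration with an assumed discount factor.
    """
    total = 0.0
    # Work backwards so each step folds in the discounted future return.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


episode_return = discounted_return([1, 1, 1], gamma=0.5)  # 1 + 0.5 + 0.25 = 1.75
```

An agent that maximizes this quantity prefers actions that lead to high total reward over the whole episode rather than only the next step.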

SageMaker Role Manager

Administrators can define least-privilege permissions for common ML activities using custom and preconfigured persona-based IAM roles.

SageMaker Serverless Endpoints

A serverless endpoint option for hosting your ML model. Automatically scales in capacity to serve your endpoint traffic. Removes the need to select instance types or manage scaling policies on an endpoint.
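The following sketch shows what removing instance selection looks like in a request: the variant carries a `ServerlessConfig` with `MemorySizeInMB` and `MaxConcurrency` instead of an instance type and count. It only builds a dictionary and makes no AWS call. The helper name `serverless_variant`, the model name, the default values, and the list of allowed memory sizes are assumptions for illustration; check the current service limits before relying on them.

```python
# Assumed set of valid memory sizes (1 GB steps); verify against current limits.
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def serverless_variant(model_name, memory_mb=2048, max_concurrency=5):
    """Build a production-variant dict for a serverless endpoint config.

    Hypothetical helper: it returns plain data and does not call AWS.
    """
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    return {
        "VariantName": "AllTraffic",
        "ModelName": model_name,
        # No InstanceType or InitialInstanceCount: capacity is managed for you.
        "ServerlessConfig": {
            "MemorySizeInMB": memory_mb,
            "MaxConcurrency": max_concurrency,
        },
    }


variant = serverless_variant("my-example-model")  # hypothetical model name
```

A dictionary of this shape would be passed as one entry of the production variants list when creating an endpoint configuration.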

Studio Classic Git extension

A Git extension to enter the URL of a Git repository, clone it into your environment, push changes, and view commit history.

SageMaker Studio Notebooks

The next generation of SageMaker notebooks that include Amazon IAM Identity Center (IAM Identity Center) integration, fast start-up times, and single-click sharing.

SageMaker Studio Notebooks and Amazon EMR

Discover, connect to, create, terminate, and manage Amazon EMR clusters in single-account and cross-account configurations directly from SageMaker Studio.

SageMaker Training Compiler

Train deep learning models faster on scalable GPU instances managed by SageMaker.