Use Notebook Instances to build models

One of the best ways for machine learning (ML) practitioners to use Amazon SageMaker is to train and deploy ML models using SageMaker notebook instances. SageMaker notebook instances set up the environment by launching Jupyter servers on Amazon Elastic Compute Cloud (Amazon EC2) and providing preconfigured kernels with the following packages: the Amazon SageMaker Python SDK, Amazon SDK for Python (Boto3), Amazon Command Line Interface (Amazon CLI), Conda, pandas, deep learning framework libraries, and other libraries for data science and machine learning.
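As a quick sanity check, the following minimal sketch reports which of the packages listed above can be found from the current kernel. The mapping of display names to import names is an assumption made for illustration, not an official inventory of what a notebook instance ships with.

```python
import importlib.util

# Display name -> import name. This mapping is an assumption for illustration,
# not an official list of what a notebook instance kernel contains.
EXPECTED_PACKAGES = {
    "SageMaker Python SDK": "sagemaker",
    "SDK for Python (Boto3)": "boto3",
    "pandas": "pandas",
}


def check_packages(packages):
    """Return {label: bool}, True when the module can be located by the import system."""
    return {
        label: importlib.util.find_spec(module) is not None
        for label, module in packages.items()
    }


# Print one line per package; nothing is imported, only looked up.
for label, found in check_packages(EXPECTED_PACKAGES).items():
    print(f"{label}: {'available' if found else 'missing'}")
```

Run in a notebook cell, this prints one line per package. A "missing" result usually means that the selected kernel is not one of the preconfigured ones.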

Machine Learning with the SageMaker Python SDK

To train, validate, deploy, and evaluate an ML model in a SageMaker notebook instance, use the SageMaker Python SDK. The SageMaker Python SDK abstracts Amazon SDK for Python (Boto3) and SageMaker API operations. It enables you to integrate with and orchestrate other Amazon services, such as Amazon Simple Storage Service (Amazon S3) for saving data and model artifacts, Amazon Elastic Container Registry (Amazon ECR) for importing and serving the ML models, and Amazon Elastic Compute Cloud (Amazon EC2) for training and inference.
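The following sketch shows one way a notebook might open a SageMaker session and collect the values that the later steps rely on: the Region, the execution role, and S3 locations for data and model artifacts. The helper names `s3_uri` and `start_session` and the `demo-prefix` key prefix are hypothetical. The function that touches SageMaker needs the SDK and valid credentials, so it is only defined here, not called.

```python
def s3_uri(bucket, *parts):
    """Join a bucket name and key parts into an s3:// URI (hypothetical helper)."""
    key = "/".join(p.strip("/") for p in parts if p)
    return f"s3://{bucket}/{key}" if key else f"s3://{bucket}"


def start_session(prefix="demo-prefix"):
    """Open a SageMaker session and gather what later steps need.

    Requires the SageMaker Python SDK and credentials, so the import is kept
    inside the function. `prefix` is a hypothetical S3 key prefix.
    """
    import sagemaker

    session = sagemaker.Session()
    bucket = session.default_bucket()  # default S3 bucket paired with the session
    return {
        "session": session,
        "region": session.boto_region_name,      # Region for training and hosting
        "role": sagemaker.get_execution_role(),  # IAM role of the notebook instance
        "data_uri": s3_uri(bucket, prefix, "data"),
        "model_uri": s3_uri(bucket, prefix, "model"),
    }
```

In a notebook instance, calling `start_session()` once at the top of the notebook keeps the bucket, prefix, and role in one place for the cells that follow.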

You can also take advantage of SageMaker features that help you with every stage of a complete ML cycle: data labeling, data preprocessing, model training, model deployment, evaluation of prediction performance, and monitoring of model quality in production.

If you're a first-time SageMaker user, we recommend that you use the SageMaker Python SDK and follow the end-to-end ML tutorial. To find the open source documentation, see the Amazon SageMaker Python SDK.

Tutorial Overview

This Get Started tutorial walks you through how to create a SageMaker notebook instance, open a Jupyter notebook with a preconfigured kernel that uses the Conda environment for machine learning, and start a SageMaker session to run an end-to-end ML cycle. You'll learn how to save a dataset to the default Amazon S3 bucket that is automatically paired with the SageMaker session, submit a training job for an ML model to Amazon EC2, and deploy the trained model for prediction through hosting or batch inference on Amazon EC2.
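The flow described above (upload a dataset, submit a training job, deploy the model) can be sketched with the SageMaker Python SDK roughly as follows. This is an outline under stated assumptions, not the tutorial's exact code: the function names, the key prefix, the instance types, the XGBoost container version, and the hyperparameter values are all illustrative choices, and instance availability can differ by Region.

```python
def upload_dataset(session, local_path, key_prefix="demo-prefix/data"):
    """Upload a local file to the session's default S3 bucket and return its S3 URI.

    With no bucket argument, Session.upload_data uses the session's default bucket.
    """
    return session.upload_data(path=local_path, key_prefix=key_prefix)


def train_and_deploy(session, role, train_uri, output_uri):
    """Submit an XGBoost training job, then host the trained model on an endpoint.

    Needs the SageMaker Python SDK and credentials; all values below are assumptions.
    """
    from sagemaker import image_uris
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput

    # Look up the built-in XGBoost container image for the session's Region.
    image = image_uris.retrieve("xgboost", session.boto_region_name, version="1.2-1")

    estimator = Estimator(
        image_uri=image,
        role=role,
        instance_count=1,
        instance_type="ml.m5.xlarge",  # assumed training instance type
        output_path=output_uri,        # where model artifacts are written
        sagemaker_session=session,
    )
    estimator.set_hyperparameters(objective="binary:logistic", num_round=100)

    # Training runs on managed instances, not on the notebook instance itself.
    estimator.fit({"train": TrainingInput(train_uri, content_type="csv")})

    # Real-time hosting; returns a predictor bound to the new endpoint.
    return estimator.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
```

Keeping the upload step separate from training makes it easy to reuse the same S3 URI for a later batch inference job. Remember to delete the endpoint when you finish experimenting, because hosting instances are billed while they run.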

This tutorial shows a complete ML flow for training the XGBoost model from the SageMaker built-in model pool. You use the US Adult Census dataset, and you evaluate how well the trained SageMaker XGBoost model predicts individuals' income.
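Because the income prediction is a binary classification task, evaluation typically comes down to comparing predicted probabilities against true labels. The following is a minimal, dependency-free sketch of that step. The function name, the 0.5 threshold, and the sample labels and scores are made up for illustration and are not results from the tutorial.

```python
def binary_metrics(labels, scores, threshold=0.5):
    """Accuracy, precision, and recall for 0/1 labels and predicted probabilities."""
    preds = [1 if s > threshold else 0 for s in scores]
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # true positives
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # true negatives
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # false positives
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # false negatives
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels and scores, for illustration only.
labels = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.6]
print(binary_metrics(labels, scores))
```

In practice the scores would come from the deployed endpoint or a batch inference job, and the threshold would be tuned to the relative cost of false positives and false negatives.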