Tune a Machine Learning Model - Amazon Step Functions
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China.

Tune a Machine Learning Model

This sample project demonstrates using SageMaker to tune the hyperparameters of a machine learning model, and to batch transform a test dataset. This sample project creates the following:

  • Three Amazon Lambda functions

  • An Amazon Simple Storage Service (Amazon S3) bucket

  • An Amazon Step Functions state machine

  • Related Amazon Identity and Access Management (IAM) roles

In this project, Step Functions uses a Lambda function to seed an Amazon S3 bucket with a test dataset. It then creates a hyperparameter tuning job using the SageMaker service integration. It then uses a Lambda function to extract the data path, saves the tuning model, extracts the model name, and then runs a batch transform job to perform inference in SageMaker.

For more information about SageMaker and Step Functions service integrations, see the following:

Note

This sample project may incur charges.

For new Amazon users, a free usage tier is available. On this tier, services are free below a certain level of usage. For more information about Amazon costs and the Free Tier, see SageMaker Pricing.

Create the State Machine and Provision Resources

  1. Open the Step Functions console and choose Create a state machine.

  2. Choose Sample Projects, and then choose Tune a machine learning model.

    The state machine Code and Visual Workflow are displayed.

    
                    Hyperparameter workflow.
  3. Choose Next.

    The Deploy resources page is displayed, listing the resources that will be created. For this sample project, the resources include:

    • Three Lambda functions

    • An Amazon S3 bucket

    • A Step Functions state machine

    • Related IAM roles

  4. Choose Deploy Resources.

    Note

    It can take up to 10 minutes for these resources and related IAM permissions to be created. While the Deploy resources page is displayed, you can open the Stack ID link to see which resources are being provisioned.

Start a New Execution

  1. Open the Step Functions console.

  2. On the State machines page, choose the HyperparamTuningAndBatchTransformStateMachine state machine that was created by the sample project and choose Start execution.

  3. On the New execution page, enter an execution name (optional), and then choose Start Execution.

  4. (Optional) To identify your execution, you can specify a name for it in the Name box. By default, Step Functions generates a unique execution name automatically.

    Note

    Step Functions allows you to create state machine, execution, and activity names that contain non-ASCII characters. These non-ASCII names don't work with Amazon CloudWatch. To ensure that you can track CloudWatch metrics, choose a name that uses only ASCII characters.

  5. (Optional) Go to the newly created state machine on the Step Functions Dashboard, and then choose New execution.

  6. When an execution is complete, you can select states on the Visual workflow and browse the Input and Output under Step details.

Example State Machine Code

The state machine in this sample project integrates with SageMaker and Amazon Lambda by passing parameters directly to those resources, and uses an Amazon S3 bucket for the training data source and output.

Browse through this example state machine to see how Step Functions controls Lambda and SageMaker.

For more information about how Amazon Step Functions can control other Amazon services, see Using Amazon Step Functions with other services.

{ "StartAt": "Generate Training Dataset", "States": { "Generate Training Dataset": { "Resource": "arn:aws:lambda:us-west-2:012345678912:function:StepFunctionsSample-SageMa-LambdaForDataGeneration-1TF67BUE5A12U", "Type": "Task", "Next": "HyperparameterTuning (XGBoost)" }, "HyperparameterTuning (XGBoost)": { "Resource": "arn:aws:states:::sagemaker:createHyperParameterTuningJob.sync", "Parameters": { "HyperParameterTuningJobName.$": "$.body.jobName", "HyperParameterTuningJobConfig": { "Strategy": "Bayesian", "HyperParameterTuningJobObjective": { "Type": "Minimize", "MetricName": "validation:rmse" }, "ResourceLimits": { "MaxNumberOfTrainingJobs": 2, "MaxParallelTrainingJobs": 2 }, "ParameterRanges": { "ContinuousParameterRanges": [{ "Name": "alpha", "MinValue": "0", "MaxValue": "1000", "ScalingType": "Auto" }, { "Name": "gamma", "MinValue": "0", "MaxValue": "5", "ScalingType": "Auto" } ], "IntegerParameterRanges": [{ "Name": "max_delta_step", "MinValue": "0", "MaxValue": "10", "ScalingType": "Auto" }, { "Name": "max_depth", "MinValue": "0", "MaxValue": "10", "ScalingType": "Auto" } ] } }, "TrainingJobDefinition": { "AlgorithmSpecification": { "TrainingImage": "433757028032.dkr.ecr.us-west-2.amazonaws.com/xgboost:latest", "TrainingInputMode": "File" }, "OutputDataConfig": { "S3OutputPath": "s3://stepfunctionssample-sagemak-bucketformodelanddata-80fblmdlcs9f/models" }, "StoppingCondition": { "MaxRuntimeInSeconds": 86400 }, "ResourceConfig": { "InstanceCount": 1, "InstanceType": "ml.m4.xlarge", "VolumeSizeInGB": 30 }, "RoleArn": "arn:aws:iam::012345678912:role/StepFunctionsSample-SageM-SageMakerAPIExecutionRol-1MNH1VS5CGGOG", "InputDataConfig": [{ "DataSource": { "S3DataSource": { "S3DataDistributionType": "FullyReplicated", "S3DataType": "S3Prefix", "S3Uri": "s3://stepfunctionssample-sagemak-bucketformodelanddata-80fblmdlcs9f/csv/train.csv" } }, "ChannelName": "train", "ContentType": "text/csv" }, { "DataSource": { "S3DataSource": { "S3DataDistributionType": "FullyReplicated", "S3DataType": "S3Prefix", "S3Uri": "s3://stepfunctionssample-sagemak-bucketformodelanddata-80fblmdlcs9f/csv/validation.csv" } }, "ChannelName": "validation", "ContentType": "text/csv" }], "StaticHyperParameters": { "precision_dtype": "float32", "num_round": "2" } } }, "Type": "Task", "Next": "Extract Model Path" }, "Extract Model Path": { "Resource": "arn:aws:lambda:us-west-2:012345678912:function:StepFunctionsSample-SageM-LambdaToExtractModelPath-V0R37CVARUS9", "Type": "Task", "Next": "HyperparameterTuning - Save Model" }, "HyperparameterTuning - Save Model": { "Parameters": { "PrimaryContainer": { "Image": "433757028032.dkr.ecr.us-west-2.amazonaws.com/xgboost:latest", "Environment": {}, "ModelDataUrl.$": "$.body.modelDataUrl" }, "ExecutionRoleArn": "arn:aws:iam::012345678912:role/StepFunctionsSample-SageM-SageMakerAPIExecutionRol-1MNH1VS5CGGOG", "ModelName.$": "$.body.bestTrainingJobName" }, "Resource": "arn:aws:states:::sagemaker:createModel", "Type": "Task", "Next": "Extract Model Name" }, "Extract Model Name": { "Resource": "arn:aws:lambda:us-west-2:012345678912:function:StepFunctionsSample-SageM-LambdaToExtractModelName-8FUOB30SM5EM", "Type": "Task", "Next": "Batch transform" }, "Batch transform": { "Type": "Task", "Resource": "arn:aws:states:::sagemaker:createTransformJob.sync", "Parameters": { "ModelName.$": "$.body.jobName", "TransformInput": { "CompressionType": "None", "ContentType": "text/csv", "DataSource": { "S3DataSource": { "S3DataType": "S3Prefix", "S3Uri": "s3://stepfunctionssample-sagemak-bucketformodelanddata-80fblmdlcs9f/csv/test.csv" } } }, "TransformOutput": { "S3OutputPath": "s3://stepfunctionssample-sagemak-bucketformodelanddata-80fblmdlcs9f/output" }, "TransformResources": { "InstanceCount": 1, "InstanceType": "ml.m4.xlarge" }, "TransformJobName.$": "$.body.jobName" }, "End": true } } }

For information about how to configure IAM when using Step Functions with other Amazon services, see IAM Policies for integrated services.

IAM Examples

These example Amazon Identity and Access Management (IAM) policies generated by the sample project include the least privilege necessary to execute the state machine and related resources. We recommend that you include only those permissions that are necessary in your IAM policies.

The following IAM policy is attached to the state machine, and allows the state machine execution to access necessary SageMaker, Lambda, and Amazon S3 resources.

{ "Version": "2012-10-17", "Statement": [ { "Action": [ "sagemaker:CreateHyperParameterTuningJob", "sagemaker:DescribeHyperParameterTuningJob", "sagemaker:StopHyperParameterTuningJob", "sagemaker:ListTags", "sagemaker:CreateModel", "sagemaker:CreateTransformJob", "iam:PassRole" ], "Resource": "*", "Effect": "Allow" }, { "Action": [ "lambda:InvokeFunction" ], "Resource": [ "arn:aws:lambda:us-west-2:012345678912:function:StepFunctionsSample-SageMa-LambdaForDataGeneration-1TF67BUE5A12U", "arn:aws:lambda:us-west-2:012345678912:function:StepFunctionsSample-SageM-LambdaToExtractModelPath-V0R37CVARUS9", "arn:aws:lambda:us-west-2:012345678912:function:StepFunctionsSample-SageM-LambdaToExtractModelName-8FUOB30SM5EM" ], "Effect": "Allow" }, { "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:*:*:rule/StepFunctionsGetEventsForSageMakerTrainingJobsRule", "arn:aws:events:*:*:rule/StepFunctionsGetEventsForSageMakerTransformJobsRule", "arn:aws:events:*:*:rule/StepFunctionsGetEventsForSageMakerTuningJobsRule" ], "Effect": "Allow" } ] }

The following IAM policy is referenced in the TrainingJobDefinition and HyperparameterTuning fields of the HyperparameterTuning state.

{ "Version": "2012-10-17", "Statement": [ { "Action": [ "cloudwatch:PutMetricData", "logs:CreateLogStream", "logs:PutLogEvents", "logs:CreateLogGroup", "logs:DescribeLogStreams", "ecr:GetAuthorizationToken", "ecr:BatchCheckLayerAvailability", "ecr:GetDownloadUrlForLayer", "ecr:BatchGetImage", "sagemaker:DescribeHyperParameterTuningJob", "sagemaker:StopHyperParameterTuningJob", "sagemaker:ListTags" ], "Resource": "*", "Effect": "Allow" }, { "Action": [ "s3:GetObject", "s3:PutObject" ], "Resource": "arn:aws:s3:::stepfunctionssample-sagemak-bucketformodelanddata-80fblmdlcs9f/*", "Effect": "Allow" }, { "Action": [ "s3:ListBucket" ], "Resource": "arn:aws:s3:::stepfunctionssample-sagemak-bucketformodelanddata-80fblmdlcs9f", "Effect": "Allow" } ] }

The following IAM policy allows the Lambda function to seed the Amazon S3 bucket with sample data.

{ "Version": "2012-10-17", "Statement": [ { "Action": [ "s3:PutObject" ], "Resource": "arn:aws:s3:::stepfunctionssample-sagemak-bucketformodelanddata-80fblmdlcs9f/*", "Effect": "Allow" } ] }

For information about how to configure IAM when using Step Functions with other Amazon services, see IAM Policies for integrated services.