使用 Amazon EventBridge 自动执行 Amazon SageMaker - Amazon SageMaker
Amazon Web Services 文档中描述的 Amazon Web Services 服务或功能可能因区域而异。要查看适用于中国区域的差异,请参阅中国的 Amazon Web Services 服务入门

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

使用 Amazon EventBridge 自动执行 Amazon SageMaker

Amazon EventBridge 监控 Amazon SageMaker 中的状态更改事件。EventBridge 使您能够自动执行 SageMaker 并自动响应事件,例如训练作业状态更改或终端节点状态更改。SageMaker 的事件将近乎实时传输到 EventBridge。您可以编写简单规则来指示您关注的事件,并指示要在事件匹配规则时执行的自动化操作。有关如何创建规则的示例,请参阅。使用 Amazon EventBridge Cridge 安排管道.

可自动触发的操作包括:

  • 调用 Amazon Lambda 函数

  • 调用 Amazon EC2 Run Command

  • 将事件中继到 Amazon Kinesis Data Streams

  • 激活 Amazon Step Functions 状态机

  • 通知 Amazon SNS 主题或Amazon SMSqueue

训练作业状态更改

指示 SageMaker 训练作业的状态更改。

如果 TrainingJobStatus 的值为 Failed,则事件包含 FailureReason 字段,此字段描述训练作业为何失败。

{ "version": "0", "id": "844e2571-85d4-695f-b930-0153b71dcb42", "detail-type": "SageMaker Training Job State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2018-10-06T12:26:13Z", "region": "us-east-1", "resources": [ "arn:aws:sagemaker:us-east-1:123456789012:training-job/kmeans-1" ], "detail": { "TrainingJobName": "89c96cc8-dded-4739-afcc-6f1dc936701d", "TrainingJobArn": "arn:aws:sagemaker:us-east-1:123456789012:training-job/kmeans-1", "TrainingJobStatus": "Completed", "SecondaryStatus": "Completed", "HyperParameters": { "Hyper": "Parameters" }, "AlgorithmSpecification": { "TrainingImage": "TrainingImage", "TrainingInputMode": "TrainingInputMode" }, "RoleArn": "arn:aws:iam::123456789012:role/SMRole", "InputDataConfig": [ { "ChannelName": "Train", "DataSource": { "S3DataSource": { "S3DataType": "S3DataType", "S3Uri": "S3Uri", "S3DataDistributionType": "S3DataDistributionType" } }, "ContentType": "ContentType", "CompressionType": "CompressionType", "RecordWrapperType": "RecordWrapperType" } ], "OutputDataConfig": { "KmsKeyId": "KmsKeyId", "S3OutputPath": "S3OutputPath" }, "ResourceConfig": { "InstanceType": "InstanceType", "InstanceCount": 3, "VolumeSizeInGB": 20, "VolumeKmsKeyId": "VolumeKmsKeyId" }, "VpcConfig": { }, "StoppingCondition": { "MaxRuntimeInSeconds": 60 }, "CreationTime": "1583831889050", "TrainingStartTime": "1583831889050", "TrainingEndTime": "1583831889050", "LastModifiedTime": "1583831889050", "SecondaryStatusTransitions": [ ], "Tags": { } } }

超级参数优化作业状态更改

指示 SageMaker 超参数优化作业的状态更改。

{ "version": "0", "id": "844e2571-85d4-695f-b930-0153b71dcb42", "detail-type": "SageMaker HyperParameter Tuning Job State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2018-10-06T12:26:13Z", "region": "us-east-1", "resources": [ "arn:aws:sagemaker:us-east-1:123456789012:tuningJob/x" ], "detail": { "HyperParameterTuningJobName": "016bffd3-6d71-4d3a-9710-0a332b2759fc", "HyperParameterTuningJobArn": "arn:aws:sagemaker:us-east-1:123456789012:tuningJob/x", "TrainingJobDefinition": { "StaticHyperParameters": {}, "AlgorithmSpecification": { "TrainingImage": "trainingImageName", "TrainingInputMode": "inputModeFile", "MetricDefinitions": [ { "Name": "metricName", "Regex": "regex" } ] }, "RoleArn": "roleArn", "InputDataConfig": [ { "ChannelName": "channelName", "DataSource": { "S3DataSource": { "S3DataType": "s3DataType", "S3Uri": "s3Uri", "S3DataDistributionType": "s3DistributionType" } }, "ContentType": "contentType", "CompressionType": "gz", "RecordWrapperType": "RecordWrapper" } ], "VpcConfig": { "SecurityGroupIds": [ "securityGroupIds" ], "Subnets": [ "subnets" ] }, "OutputDataConfig": { "KmsKeyId": "kmsKeyId", "S3OutputPath": "s3OutputPath" }, "ResourceConfig": { "InstanceType": "instanceType", "InstanceCount": 10, "VolumeSizeInGB": 500, "VolumeKmsKeyId": "volumeKeyId" }, "StoppingCondition": { "MaxRuntimeInSeconds": 3600 } }, "HyperParameterTuningJobStatus": "status", "CreationTime": "1583831889050", "LastModifiedTime": "1583831889050", "TrainingJobStatusCounters": { "Completed": 1, "InProgress": 0, "RetryableError": 0, "NonRetryableError": 0, "Stopped": 0 }, "ObjectiveStatusCounters": { "Succeeded": 1, "Pending": 0, "Failed": 0 }, "Tags": {} } }

转换作业状态更改

指示 SageMaker 批次转换作业的状态更改。

如果 TransformJobStatus 的值为 Failed,则事件包含 FailureReason 字段,此字段描述训练作业为何失败。

{ "version": "0", "id": "844e2571-85d4-695f-b930-0153b71dcb42", "detail-type": "SageMaker Transform Job State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2018-10-06T12:26:13Z", "region": "us-east-1", "resources": ["arn:aws:sagemaker:us-east-1:123456789012:transform-job/myjob"], "detail": { "TransformJobName": "4b52bd8f-e034-4345-818d-884bdd7c9724", "TransformJobArn": "arn:aws:sagemaker:us-east-1:123456789012:transform-job/myjob", "TransformJobStatus": "another status... GO", "FailureReason": "failed why 1", "ModelName": "i am a beautiful model", "MaxConcurrentTransforms": 5, "MaxPayloadInMB": 10, "BatchStrategy": "Strategizing...", "Environment": { "environment1": "environment2" }, "TransformInput": { "DataSource": { "S3DataSource": { "S3DataType": "s3DataType", "S3Uri": "s3Uri" } }, "ContentType": "content type", "CompressionType": "compression type", "SplitType": "split type" }, "TransformOutput": { "S3OutputPath": "s3Uri", "Accept": "accept", "AssembleWith": "assemblyType", "KmsKeyId": "kmsKeyId" }, "TransformResources": { "InstanceType": "instanceType", "InstanceCount": 3 }, "CreationTime": "2018-10-06T12:26:13Z", "TransformStartTime": "2018-10-06T12:26:13Z", "TransformEndTime": "2018-10-06T12:26:13Z", "Tags": {} } }

终端节点状态更改

指示 SageMaker 托管的实时推理终端节点的状态更改。

下面显示了具有终端节点的事件IN_SERVICE状态。

{ "version": "0", "id": "d2921b5a-b0ad-cace-a8e3-0f159d018e06", "detail-type": "SageMaker Endpoint State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "1583831889050", "region": "us-west-2", "resources": [ "arn:aws:sagemaker:us-west-2:123456789012:endpoint/myendpoint" ], "detail": { "EndpointName": "MyEndpoint", "EndpointArn": "arn:aws:sagemaker:us-west-2:123456789012:endpoint/myendpoint", "EndpointConfigName": "MyEndpointConfig", "ProductionVariants": [ { "DesiredWeight": 1.0, "DesiredInstanceCount": 1.0 } ], "EndpointStatus": "IN_SERVICE", "CreationTime": 1592411992203.0, "LastModifiedTime": 1592411994287.0, "Tags": { } } }

功能组状态更改

表示FeatureGroupStatusOfflineStoreStatus的 SageMaker 功能组。

{ "version": "0", "id": "93201303-abdb-36a4-1b9b-4c1c3e3671c0", "detail-type": "SageMaker Feature Group State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2021-01-26T01:22:01Z", "region": "us-east-1", "resources": [ "arn:aws:sagemaker:us-east-1:123456789012:feature-group/sample-feature-group" ], "detail": { "FeatureGroupArn": "arn:aws:sagemaker:us-east-1:123456789012:feature-group/sample-feature-group", "FeatureGroupName": "sample-feature-group", "RecordIdentifierFeatureName": "RecordIdentifier", "EventTimeFeatureName": "EventTime", "FeatureDefinitions": [ { "FeatureName": "RecordIdentifier", "FeatureType": "Integral" }, { "FeatureName": "EventTime", "FeatureType": "Fractional" } ], "CreationTime": 1611624059000, "OnlineStoreConfig": { "EnableOnlineStore": true }, "OfflineStoreConfig": { "S3StorageConfig": { "S3Uri": "s3://offline/s3/uri" }, "DisableGlueTableCreation": false, "DataCatalogConfig": { "TableName": "sample-feature-group-1611624059", "Catalog": "AwsDataCatalog", "Database": "sagemaker_featurestore" } }, "RoleArn": "arn:aws:iam::123456789012:role/SageMakerRole", "FeatureGroupStatus": "Active", "Tags": {} } }

模型包状态更改

指示 SageMaker 模型包的状态更改。

{ "version": "0", "id": "844e2571-85d4-695f-b930-0153b71dcb42", "detail-type": "SageMaker Model Package State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2021-02-24T17:00:14Z", "region": "us-east-2", "resources": [ "arn:aws:sagemaker:us-east-2:123456789012:model-package/versionedmp-p-idy6c3e1fiqj/2" ], "source": [ "aws.sagemaker" ], "detail": { "ModelPackageGroupName": "versionedmp-p-idy6c3e1fiqj", "ModelPackageVersion": 2, "ModelPackageArn": "arn:aws:sagemaker:us-east-2:123456789012:model-package/versionedmp-p-idy6c3e1fiqj/2", "CreationTime": "2021-02-24T17:00:14Z", "InferenceSpecification": { "Containers": [ { "Image": "257758044811.dkr.ecr.us-east-2.amazonaws.com/sagemaker-xgboost:1.0-1-cpu-py3", "ImageDigest": "sha256:4dc8a7e4a010a19bb9e0a6b063f355393f6e623603361bd8b105f554d4f0c004", "ModelDataUrl": "s3://sagemaker-project-p-idy6c3e1fiqj/versionedmp-p-idy6c3e1fiqj/AbaloneTrain/pipelines-4r83jejmhorv-TrainAbaloneModel-xw869y8C4a/output/model.tar.gz" } ], "SupportedContentTypes": [ "text/csv" ], "SupportedResponseMIMETypes": [ "text/csv" ] }, "ModelPackageStatus": "Completed", "ModelPackageStatusDetails": { "ValidationStatuses": [], "ImageScanStatuses": [] }, "CertifyForMarketplace": false, "ModelApprovalStatus": "Rejected", "MetadataProperties": { "GeneratedBy": "arn:aws:sagemaker:us-east-2:123456789012:pipeline/versionedmp-p-idy6c3e1fiqj/execution/4r83jejmhorv" }, "ModelMetrics": { "ModelQuality": { "Statistics": { "ContentType": "application/json", "S3Uri": "s3://sagemaker-project-p-idy6c3e1fiqj/versionedmp-p-idy6c3e1fiqj/script-2021-02-24-10-55-15-413/output/evaluation/evaluation.json" } } }, "LastModifiedTime": "2021-02-24T17:00:14Z" } }

管道执行状态更改

指示 SageMaker 管道执行的状态更改。

{ "version": "0", "id": "315c1398-40ff-a850-213b-158f73kd93ir", "detail-type": "SageMaker Model Building Pipeline Execution Status Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2021-03-15T16:10:11Z", "region": "us-east-1", "resources": ["arn:aws:sagemaker:us-east-1:123456789012:pipeline/myPipeline-123", "arn:aws:sagemaker:us-east-1:123456789012:pipeline/myPipeline-123/execution/p4jn9xou8a8s"], "detail": { "pipelineExecutionDisplayName": "SomeDisplayName", "currentPipelineExecutionStatus": "Succeeded", "previousPipelineExecutionStatus": "Executing", "executionStartTime": "2021-03-15T16:03:13Z", "executionEndTime": "2021-03-15T16:10:10Z", "pipelineExecutionDescription": "SomeDescription", "pipelineArn": "arn:aws:sagemaker:us-east-1:123456789012:pipeline/myPipeline-123", "pipelineExecutionArn": "arn:aws:sagemaker:us-east-1:123456789012:pipeline/myPipeline-123/execution/p4jn9xou8a8s" } }

管道步骤状态更改

指示 SageMaker 管线步骤的状态更改。

如果存在缓存命中,则该事件将包含cacheHitResult字段中返回的子位置类型。

如果currentStepStatusFailed,则该事件包含failureReason字段,该字段描述步骤失败的原因。

{ "version": "0", "id": "ea37ccbb-5e2b-05e9-4073-1daazc940304", "detail-type": "SageMaker Model Building Pipeline Execution Step Status Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2021-03-15T16:10:10Z", "region": "us-east-1", "resources": ["arn:aws:sagemaker:us-east-1:123456789012:pipeline/myPipeline-123", "arn:aws:sagemaker:us-east-1:123456789012:pipeline/myPipeline-123/execution/p4jn9xou8a8s"], "detail": { "metadata": { "processingJob": { "arn": "arn:aws:sagemaker:us-east-1:123456789012:processing-job/pipelines-p4jn9xou8a8s-myprocessingstep1-tmgxry49ug" } }, "stepStartTime": "2021-03-15T16:03:14Z", "stepEndTime": "2021-03-15T16:10:09Z", "stepName": "myprocessingstep1", "stepType": "Processing", "previousStepStatus": "Executing", "currentStepStatus": "Succeeded", "pipelineArn": "arn:aws:sagemaker:us-east-1:123456789012:pipeline/myPipeline-123", "pipelineExecutionArn": "arn:aws:sagemaker:us-east-1:123456789012:pipeline/myPipeline-123/execution/p4jn9xou8a8s" } }

SageMaker 图像状态更改

指示 SageMaker 映像的状态更改。

{ "version": "0", "id": "cee033a3-17d8-49f8-865f-b9ebf485d9ee", "detail-type": "SageMaker Image State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2021-04-29T01:29:59Z", "region": "us-east-1", "resources": ["arn:aws:sagemaker:us-west-2:123456789012:image/cee033a3-17d8-49f8-865f-b9ebf485d9ee"], "detail": { "ImageName": "cee033a3-17d8-49f8-865f-b9ebf485d9ee", "ImageArn": "arn:aws:sagemaker:us-west-2:123456789012:image/cee033a3-17d8-49f8-865f-b9ebf485d9ee", "ImageStatus": "Creating", "Version": 1.0, "Tags": {} } }

SageMaker 映像版本状态更改

指示 SageMaker 映像版本的状态更改。

{ "version": "0", "id": "07fc4615-ebd7-15fc-1746-243411f09f04", "detail-type": "SageMaker Image Version State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2021-04-29T01:29:59Z", "region": "us-east-1", "resources": ["arn:aws:sagemaker:us-west-2:123456789012:image-version/07800032-2d29-48b7-8f82-5129225b2a85"], "detail": { "ImageArn": "arn:aws:sagemaker:us-west-2:123456789012:image/a70ff896-c832-4fe8-add6-eba25a0f43e6", "ImageVersionArn": "arn:aws:sagemaker:us-west-2:123456789012:image-version/07800032-2d29-48b7-8f82-5129225b2a85", "ImageVersionStatus": "Creating", "Version": 1.0, "Tags": {} } }

有关 SageMaker 作业、终端节点和管道的状态值及其含义的更多信息,请参阅以下链接:

有关更多信息,请参阅 Amazon EventBridge 用户指南