使用 Amazon SageMaker 实现 Amazon EventBridge 自动化 - Amazon SageMaker
AWS 文档中描述的 AWS 服务或功能可能因区域而异。要查看适用于中国区域的差异,请参阅中国的 AWS 服务入门

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

使用 Amazon SageMaker 实现 Amazon EventBridge 自动化

Amazon EventBridge 监控Amazon SageMaker标记、训练、超参数优化、进程作业、推理终端节点和功能组中的状态更改事件。 EventBridge 允许您自动自动化SageMaker并自动响应事件,例如训练作业或终端节点状态更改。中的事件将近实时SageMaker传输到 EventBridge 。您可以编写简单规则来指示您关注的事件,并指示要在事件匹配规则时执行的自动化操作。可自动触发的操作包括:

  • 调用 AWS Lambda 函数

  • 调用 Amazon EC2 Run Command

  • 将事件中继到 Amazon Kinesis Data Streams

  • 激活 AWS Step Functions 状态机

  • 通知 Amazon SNS 主题或 AWS SMS 队列

SageMaker 监控EventBridge的状态更改事件的一些示例包括:

  • SageMaker 训练作业状态更改

    指示SageMaker训练作业的状态更改。

    如果 TrainingJobStatus 的值为 Failed,则事件包含 FailureReason 字段,此字段描述训练作业为何失败。

    { "version": "0", "id": "844e2571-85d4-695f-b930-0153b71dcb42", "detail-type": "SageMaker Training Job State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2018-10-06T12:26:13Z", "region": "us-east-1", "resources": [ "arn:aws:sagemaker:us-east-1:123456789012:training-job/kmeans-1" ], "detail": { "TrainingJobName": "89c96cc8-dded-4739-afcc-6f1dc936701d", "TrainingJobArn": "arn:aws:sagemaker:us-east-1:123456789012:training-job/kmeans-1", "TrainingJobStatus": "Completed", "SecondaryStatus": "Completed", "HyperParameters": { "Hyper": "Parameters" }, "AlgorithmSpecification": { "TrainingImage": "TrainingImage", "TrainingInputMode": "TrainingInputMode" }, "RoleArn": "arn:aws:iam::123456789012:role/SMRole", "InputDataConfig": [ { "ChannelName": "Train", "DataSource": { "S3DataSource": { "S3DataType": "S3DataType", "S3Uri": "S3Uri", "S3DataDistributionType": "S3DataDistributionType" } }, "ContentType": "ContentType", "CompressionType": "CompressionType", "RecordWrapperType": "RecordWrapperType" } ], "OutputDataConfig": { "KmsKeyId": "KmsKeyId", "S3OutputPath": "S3OutputPath" }, "ResourceConfig": { "InstanceType": "InstanceType", "InstanceCount": 3, "VolumeSizeInGB": 20, "VolumeKmsKeyId": "VolumeKmsKeyId" }, "VpcConfig": { }, "StoppingCondition": { "MaxRuntimeInSeconds": 60 }, "CreationTime": "1583831889050", "TrainingStartTime": "1583831889050", "TrainingEndTime": "1583831889050", "LastModifiedTime": "1583831889050", "SecondaryStatusTransitions": [ ], "Tags": { } } }
  • SageMaker HyperParameter 优化作业状态更改

    指示SageMaker超参数优化作业的状态更改。

    { "version": "0", "id": "844e2571-85d4-695f-b930-0153b71dcb42", "detail-type": "SageMaker HyperParameter Tuning Job State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2018-10-06T12:26:13Z", "region": "us-east-1", "resources": [ "arn:aws:sagemaker:us-east-1:123456789012:tuningJob/x" ], "detail": { "HyperParameterTuningJobName": "016bffd3-6d71-4d3a-9710-0a332b2759fc", "HyperParameterTuningJobArn": "arn:aws:sagemaker:us-east-1:123456789012:tuningJob/x", "TrainingJobDefinition": { "StaticHyperParameters": {}, "AlgorithmSpecification": { "TrainingImage": "trainingImageName", "TrainingInputMode": "inputModeFile", "MetricDefinitions": [ { "Name": "metricName", "Regex": "regex" } ] }, "RoleArn": "roleArn", "InputDataConfig": [ { "ChannelName": "channelName", "DataSource": { "S3DataSource": { "S3DataType": "s3DataType", "S3Uri": "s3Uri", "S3DataDistributionType": "s3DistributionType" } }, "ContentType": "contentType", "CompressionType": "gz", "RecordWrapperType": "RecordWrapper" } ], "VpcConfig": { "SecurityGroupIds": [ "securityGroupIds" ], "Subnets": [ "subnets" ] }, "OutputDataConfig": { "KmsKeyId": "kmsKeyId", "S3OutputPath": "s3OutputPath" }, "ResourceConfig": { "InstanceType": "instanceType", "InstanceCount": 10, "VolumeSizeInGB": 500, "VolumeKmsKeyId": "volumeKeyId" }, "StoppingCondition": { "MaxRuntimeInSeconds": 3600 } }, "HyperParameterTuningJobStatus": "status", "CreationTime": "1583831889050", "LastModifiedTime": "1583831889050", "TrainingJobStatusCounters": { "Completed": 1, "InProgress": 0, "RetryableError": 0, "NonRetryableError": 0, "Stopped": 0 }, "ObjectiveStatusCounters": { "Succeeded": 1, "Pending": 0, "Failed": 0 }, "Tags": {} } }
  • SageMaker 转换作业状态更改

    指示SageMaker批量转换作业的状态更改。

    如果 TransformJobStatus 的值为 Failed,则事件包含 FailureReason 字段,此字段描述训练作业为何失败。

    { "version": "0", "id": "844e2571-85d4-695f-b930-0153b71dcb42", "detail-type": "SageMaker Transform Job State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "1583831889050", "region": "us-east-1", "resources": [ "arn:aws:sagemaker:us-east-1:123456789012:transform-job/myjob" ], "detail": { "TransformJobName": "4b52bd8f-e034-4345-818d-884bdd7c9724", "TransformJobArn": "arn:aws:sagemaker:us-east-1:123456789012:transform-job/myjob", "TransformJobStatus": "Completed", "ModelName": "ModelName", "MaxConcurrentTransforms": 5, "MaxPayloadInMB": 10, "BatchStrategy": "Strategy", "Environment": { "environment1": "environment2" }, "TransformInput": { "DataSource": { "S3DataSource": { "S3DataType": "s3DataType", "S3Uri": "s3Uri" } }, "ContentType": "content type", "CompressionType": "compression type", "SplitType": "split type" }, "TransformOutput": { "S3OutputPath": "s3Uri", "Accept": "accept", "AssembleWith": "assemblyType", "KmsKeyId": "kmsKeyId" }, "TransformResources": { "InstanceType": "instanceType", "InstanceCount": 3 }, "CreationTime": "1583831889050", "TransformStartTime": "1583831889050", "TransformEndTime": "1583831889050", "Tags": { } } }
  • SageMaker HyperParameter 优化作业状态更改

    指示SageMaker超参数优化作业的状态更改。

    { "version": "0", "id": "844e2571-85d4-695f-b930-0153b71dcb42", "detail-type": "SageMaker HyperParameter Tuning Job State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "1583831889050", "region": "us-east-1", "resources": [ "arn:aws:sagemaker:us-east-1:123456789012:tuningJob/x" ], "detail": { "HyperParameterTuningJobName": "016bffd3-6d71-4d3a-9710-0a332b2759fc", "HyperParameterTuningJobArn": "arn:aws:sagemaker:us-east-1:123456789012:tuningJob/x", "TrainingJobDefinition": { "StaticHyperParameters": {}, "AlgorithmSpecification": { "TrainingImage": "trainingImageName", "TrainingInputMode": "inputModeFile", "MetricDefinitions": [ { "Name": "metricName", "Regex": "regex" } ] }, "RoleArn": "roleArn", "InputDataConfig": [ { "ChannelName": "channelName", "DataSource": { "S3DataSource": { "S3DataType": "s3DataType", "S3Uri": "s3Uri", "S3DataDistributionType": "s3DistributionType" } }, "ContentType": "contentType", "CompressionType": "gz", "RecordWrapperType": "RecordWrapper" } ], "VpcConfig": { "SecurityGroupIds": [ "securityGroupIds" ], "Subnets": [ "subnets" ] }, "OutputDataConfig": { "KmsKeyId": "kmsKeyId", "S3OutputPath": "s3OutputPath" }, "ResourceConfig": { "InstanceType": "instanceType", "InstanceCount": 10, "VolumeSizeInGB": 500, "VolumeKmsKeyId": "volumeKeyId" }, "StoppingCondition": { "MaxRuntimeInSeconds": 3600 } }, "HyperParameterTuningJobStatus": "status", "CreationTime": "1583831889050", "LastModifiedTime": "1583831889050", "TrainingJobStatusCounters": { "Completed": 1, "InProgress": 0, "RetryableError": 0, "NonRetryableError": 0, "Stopped": 0 }, "ObjectiveStatusCounters": { "Succeeded": 1, "Pending": 0, "Failed": 0 }, "Tags": {} } }
  • SageMaker 终端节点状态更改

    指示SageMaker托管实时推理终端节点的状态更改。

    下面显示了终端节点处于 IN_SERVICE 状态的事件。

    { 'version': '0', 'id': 'd2921b5a-b0ad-cace-a8e3-0f159d018e06', 'detail-type': 'SageMakerEndpointStateChange', 'source': 'aws.sagemaker', 'account': '123456789012', 'time': '1583831889050', 'region': 'us-west-2', 'resources': [ 'arn:aws:sagemaker:us-west-2:123456789012:endpoint/myendpoint' ], 'detail': { 'EndpointName': 'MyEndpoint', 'EndpointArn': 'arn:aws:sagemaker:us-west-2:123456789012:endpoint/myendpoint', 'EndpointConfigName': 'MyEndpointConfig', 'ProductionVariants': [ { 'DesiredWeight': 1.0, 'DesiredInstanceCount': 1.0 } ], 'EndpointStatus': 'IN_SERVICE', 'CreationTime': 1592411992203.0, 'LastModifiedTime': 1592411994287.0, 'Tags': { } } }
  • SageMaker 功能组状态更改

    指示FeatureGroupStatus功能组 或 OfflineStoreStatus SageMaker 中的更改。

    { "version": "0", "id": "93201303-abdb-36a4-1b9b-4c1c3e3671c0", "detail-type": "SageMaker Feature Group State Change", "source": "aws.sagemaker", "account": "123456789012", "time": "2021-01-26T01:22:01Z", "region": "us-east-1", "resources": [ "arn:aws:sagemaker:us-east-1:123456789012:feature-group/sample-feature-group" ], "detail": { "FeatureGroupArn": "arn:aws:sagemaker:us-east-1:123456789012:feature-group/sample-feature-group", "FeatureGroupName": "sample-feature-group", "RecordIdentifierFeatureName": "RecordIdentifier", "EventTimeFeatureName": "EventTime", "FeatureDefinitions": [ { "FeatureName": "RecordIdentifier", "FeatureType": "Integral" }, { "FeatureName": "EventTime", "FeatureType": "Fractional" } ], "CreationTime": 1611624059000, "OnlineStoreConfig": { "EnableOnlineStore": true }, "OfflineStoreConfig": { "S3StorageConfig": { "S3Uri": "s3://offline/s3/uri" }, "DisableGlueTableCreation": false, "DataCatalogConfig": { "TableName": "sample-feature-group-1611624059", "Catalog": "AwsDataCatalog", "Database": "sagemaker_featurestore" } }, "RoleArn": "arn:aws:iam::123456789012:role/SageMakerRole", "FeatureGroupStatus": "Active", "Tags": {} } }

有关 SageMaker 作业和终端节点的状态值及其含义的更多信息,请参阅以下链接:

有关更多信息,请参阅 Amazon EventBridge 用户指南.