Model quality metrics and Amazon CloudWatch monitoring - Amazon SageMaker

Model quality metrics and Amazon CloudWatch monitoring

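A model quality monitoring job computes different metrics to evaluate the quality and performance of your machine learning model. The specific metrics computed depend on the type of ML problem: regression, binary classification, or multiclass classification. Monitoring these metrics is essential for detecting model drift over time. The following sections describe the key model quality metrics for each problem type, and how to set up automated monitoring and alerting with CloudWatch to continuously track your model's performance.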

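Note

The standard deviation of a metric is provided only when at least 200 samples are available. Model Monitor computes it by randomly sampling 80% of the data five times, computing the metric on each sample, and taking the standard deviation of those five results.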
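
The resampling procedure in the note can be sketched as follows. This is a minimal illustration, not Model Monitor's actual implementation: the function name `metric_std` and its parameters are hypothetical, and the choices of sampling without replacement and of the sample (n-1) standard deviation are assumptions that the note does not specify.

```python
import random
import statistics


def metric_std(rows, metric_fn, rounds=5, fraction=0.8, min_samples=200, seed=None):
    """Estimate the spread of a metric by repeated subsampling.

    rows      -- list of records (for example, (label, prediction) pairs)
    metric_fn -- callable that maps a list of records to a single float
    """
    # Below the documented minimum of 200 samples, no standard deviation is reported.
    if len(rows) < min_samples:
        return float("nan")

    rng = random.Random(seed)
    k = int(len(rows) * fraction)

    # ASSUMPTION: each round draws 80% of the rows without replacement.
    values = [metric_fn(rng.sample(rows, k)) for _ in range(rounds)]

    # ASSUMPTION: sample standard deviation; the source does not say which variant is used.
    return statistics.stdev(values)
```

With fewer than 200 records the function returns NaN, which mirrors the "NaN" standard deviations in small examples such as the four-sample binary classification output on this page.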

Regression metrics

The following is an example of the metrics that the model quality monitor computes for a regression problem.

"regression_metrics" : {
  "mae" : { "value" : 0.3711832061068702, "standard_deviation" : 0.0037566388129940394 },
  "mse" : { "value" : 0.3711832061068702, "standard_deviation" : 0.0037566388129940524 },
  "rmse" : { "value" : 0.609248066149471, "standard_deviation" : 0.003079253267651125 },
  "r2" : { "value" : -1.3766111872212665, "standard_deviation" : 0.022653980022771227 }
}

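For reference, the four regression metrics have standard textbook definitions, sketched below. The function name `regression_metrics` is hypothetical, and the code shows the usual formulas rather than Model Monitor's internal implementation. A negative `r2`, as in the sample above, means the predictions fit worse than always predicting the mean of the labels.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R^2 from paired labels and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # R^2 = 1 - SS_res / SS_tot; undefined when every label is identical.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = mse * n
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}
```
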
Binary classification metrics

The following is an example of the metrics that the model quality monitor computes for a binary classification problem.

"binary_classification_metrics" : {
  "confusion_matrix" : { "0" : { "0" : 1, "1" : 2 }, "1" : { "0" : 0, "1" : 1 } },
  "recall" : { "value" : 1.0, "standard_deviation" : "NaN" },
  "precision" : { "value" : 0.3333333333333333, "standard_deviation" : "NaN" },
  "accuracy" : { "value" : 0.5, "standard_deviation" : "NaN" },
  "recall_best_constant_classifier" : { "value" : 1.0, "standard_deviation" : "NaN" },
  "precision_best_constant_classifier" : { "value" : 0.25, "standard_deviation" : "NaN" },
  "accuracy_best_constant_classifier" : { "value" : 0.25, "standard_deviation" : "NaN" },
  "true_positive_rate" : { "value" : 1.0, "standard_deviation" : "NaN" },
  "true_negative_rate" : { "value" : 0.33333333333333337, "standard_deviation" : "NaN" },
  "false_positive_rate" : { "value" : 0.6666666666666666, "standard_deviation" : "NaN" },
  "false_negative_rate" : { "value" : 0.0, "standard_deviation" : "NaN" },
  "receiver_operating_characteristic_curve" : {
    "false_positive_rates" : [ 0.0, 0.0, 0.0, 0.0, 0.0, 1.0 ],
    "true_positive_rates" : [ 0.0, 0.25, 0.5, 0.75, 1.0, 1.0 ]
  },
  "precision_recall_curve" : {
    "precisions" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
    "recalls" : [ 0.0, 0.25, 0.5, 0.75, 1.0 ]
  },
  "auc" : { "value" : 1.0, "standard_deviation" : "NaN" },
  "f0_5" : { "value" : 0.3846153846153846, "standard_deviation" : "NaN" },
  "f1" : { "value" : 0.5, "standard_deviation" : "NaN" },
  "f2" : { "value" : 0.7142857142857143, "standard_deviation" : "NaN" },
  "f0_5_best_constant_classifier" : { "value" : 0.29411764705882354, "standard_deviation" : "NaN" },
  "f1_best_constant_classifier" : { "value" : 0.4, "standard_deviation" : "NaN" },
  "f2_best_constant_classifier" : { "value" : 0.625, "standard_deviation" : "NaN" }
}

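Most of the scalar metrics in the sample follow directly from the confusion matrix. The sketch below assumes that the outer key is the actual label, the inner key is the predicted label, and "1" is the positive class. That layout is inferred from the sample values, not stated on this page, and the function names are hypothetical.

```python
def f_beta(precision, recall, beta):
    """F-beta score; returns 0.0 when precision and recall are both zero."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


def binary_metrics(cm):
    """Derive threshold-based metrics from a 2x2 confusion matrix."""
    # ASSUMPTION: cm[actual][predicted], with "1" as the positive label.
    tn, fp = cm["0"]["0"], cm["0"]["1"]
    fn, tp = cm["1"]["0"], cm["1"]["1"]
    total = tn + fp + fn + tp

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0

    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,  # same quantity as true_positive_rate
        "true_negative_rate": tn / (tn + fp) if tn + fp else 0.0,
        "false_positive_rate": fp / (tn + fp) if tn + fp else 0.0,
        "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
        "f0_5": f_beta(precision, recall, 0.5),
        "f1": f_beta(precision, recall, 1.0),
        "f2": f_beta(precision, recall, 2.0),
    }


# Confusion matrix copied from the sample output above.
sample = {"0": {"0": 1, "1": 2}, "1": {"0": 0, "1": 1}}
metrics = binary_metrics(sample)
```

Under that assumption, the sample matrix gives the precision, recall, accuracy, and F-scores listed in the sample output, up to floating-point rounding. The standard deviations there are "NaN" because the matrix covers only four samples, which is below the 200-sample minimum.
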
Multiclass metrics

The following is an example of the metrics that the model quality monitor computes for a multiclass classification problem.

"multiclass_classification_metrics" : {
  "confusion_matrix" : { "0" : { "0" : 1180, "1" : 510 }, "1" : { "0" : 268, "1" : 138 } },
  "accuracy" : { "value" : 0.6288167938931297, "standard_deviation" : 0.00375663881299405 },
  "weighted_recall" : { "value" : 0.6288167938931297, "standard_deviation" : 0.003756638812994008 },
  "weighted_precision" : { "value" : 0.6983172269629505, "standard_deviation" : 0.006195912915307507 },
  "weighted_f0_5" : { "value" : 0.6803947317178771, "standard_deviation" : 0.005328406973561699 },
  "weighted_f1" : { "value" : 0.6571162346664904, "standard_deviation" : 0.004385008075019733 },
  "weighted_f2" : { "value" : 0.6384024354394601, "standard_deviation" : 0.003867109755267757 },
  "accuracy_best_constant_classifier" : { "value" : 0.19370229007633588, "standard_deviation" : 0.0032049848450732355 },
  "weighted_recall_best_constant_classifier" : { "value" : 0.19370229007633588, "standard_deviation" : 0.0032049848450732355 },
  "weighted_precision_best_constant_classifier" : { "value" : 0.03752057718081697, "standard_deviation" : 0.001241536088657851 },
  "weighted_f0_5_best_constant_classifier" : { "value" : 0.04473443104152011, "standard_deviation" : 0.0014460485504284792 },
  "weighted_f1_best_constant_classifier" : { "value" : 0.06286421244683643, "standard_deviation" : 0.0019113576884608862 },
  "weighted_f2_best_constant_classifier" : { "value" : 0.10570313141262414, "standard_deviation" : 0.002734216826748117 }
}

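The "weighted" metrics average each per-class score using the class's share of the actual labels as its weight. A minimal sketch follows, assuming the confusion matrix is laid out as actual label, then predicted label. That layout is inferred from the sample values, and `weighted_metrics` is a hypothetical name.

```python
def weighted_metrics(cm):
    """Support-weighted precision, recall and F1 from a nested confusion matrix."""
    # ASSUMPTION: cm[actual][predicted]; every label appears as an outer key.
    labels = list(cm)
    total = sum(sum(row.values()) for row in cm.values())
    correct = sum(cm[label].get(label, 0) for label in labels)

    w_precision = w_recall = w_f1 = 0.0
    for label in labels:
        support = sum(cm[label].values())                         # actual count
        predicted = sum(cm[actual].get(label, 0) for actual in labels)
        tp = cm[label].get(label, 0)

        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

        weight = support / total
        w_precision += weight * precision
        w_recall += weight * recall
        w_f1 += weight * f1

    return {
        "accuracy": correct / total,
        "weighted_precision": w_precision,
        "weighted_recall": w_recall,
        "weighted_f1": w_f1,
    }


# Confusion matrix copied from the sample output above.
sample = {"0": {"0": 1180, "1": 510}, "1": {"0": 268, "1": 138}}
result = weighted_metrics(sample)
```

Weighted recall always equals accuracy under this definition, which is why the two values coincide in the sample.
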
Monitoring model quality metrics with CloudWatch

If you set the value of enable_cloudwatch_metrics to True when you create the monitoring schedule, model quality monitoring jobs send all metrics to CloudWatch.

Model quality metrics appear in the following namespaces:

  • For real-time endpoints: aws/sagemaker/Endpoints/model-metrics

  • For batch transform jobs: aws/sagemaker/ModelMonitoring/model-metrics

For a list of the metrics that are emitted, see the preceding sections on this page.

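You can use CloudWatch metrics to create an alarm that fires when a specific metric doesn't meet the threshold you specify. For instructions on how to create CloudWatch alarms, see Create a CloudWatch alarm based on a static threshold in the CloudWatch User Guide.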
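
As an illustration, an alarm on a model quality metric for a real-time endpoint could be defined programmatically along these lines. This is a sketch under several assumptions: the helper names are hypothetical, the dimension names `Endpoint` and `MonitoringSchedule` are not stated on this page and should be checked against the metrics that actually appear in your account, and the period, statistic, and comparison operator are placeholder choices. The keyword arguments are those of the boto3 CloudWatch `put_metric_alarm` call.

```python
ENDPOINT_NAMESPACE = "aws/sagemaker/Endpoints/model-metrics"


def model_quality_alarm_params(endpoint_name, schedule_name, metric_name, threshold):
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": f"{schedule_name}-{metric_name}-below-threshold",
        "Namespace": ENDPOINT_NAMESPACE,
        "MetricName": metric_name,  # for example "f2" or "accuracy"
        # ASSUMPTION: dimension names; verify them in the CloudWatch console.
        "Dimensions": [
            {"Name": "Endpoint", "Value": endpoint_name},
            {"Name": "MonitoringSchedule", "Value": schedule_name},
        ],
        "Statistic": "Average",
        "Period": 600,            # seconds; placeholder value
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "LessThanOrEqualToThreshold",
        "TreatMissingData": "breaching",
    }


def create_alarm(params):
    """Create the alarm; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without it

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

For example, `create_alarm(model_quality_alarm_params("my-endpoint", "my-schedule", "f2", 0.7))` would request an alarm when the average `f2` value is at or below 0.7; the endpoint and schedule names here are placeholders.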