This page is a machine-translated version. If the translation differs from the English original, the English original prevails.
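Amazon SageMaker Model Cards
Use Amazon SageMaker Model Cards to document critical details about your machine learning (ML) models in a single place for streamlined governance and reporting.
Catalog details such as the intended use and risk rating of a model, training details and metrics, evaluation results and observations, and additional call-outs such as considerations, recommendations, and custom information. By creating model cards, you can do the following: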
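- Provide guidance on how a model should be used.
- Support audit activities with detailed descriptions of model training and performance.
- Communicate how a model is intended to support business goals.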
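Model cards provide prescriptive guidance on what information to document and include fields for custom information. After you create a model card, you can export it to PDF or download it to share with relevant stakeholders. Any edit to a model card other than an approval status update creates an additional model card version, so that you keep an immutable record of model changes.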
Prerequisites
To get started with Amazon SageMaker Model Cards, you must have permission to create, edit, view, and export model cards.
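Intended uses of a model
Specifying the intended uses of a model helps ensure that model developers and users have the information they need to train or deploy the model responsibly. The intended uses of a model should describe the scenarios in which the model is appropriate to use as well as the scenarios in which it is not recommended.
We recommend including: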
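- The general purpose of the model
- The use cases for which the model was intended
- The use cases for which the model was not intended
- The assumptions made when developing the model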
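The intended uses of a model go beyond technical details. They describe how a model should be used in production, the scenarios in which it is appropriate to use the model, and additional considerations such as the type of data to use with the model or any assumptions made during development.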
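Risk ratings
Developers create ML models for use cases with varying levels of risk. For example, a model that approves loan applications might carry higher risk than one that detects the category of an email. Because risk profiles vary from model to model, model cards provide a field for you to categorize a model's risk rating.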
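This risk rating can be unknown, low, medium, or high. Use these risk rating fields to label unknown, low, medium, or high risk models and help your organization comply with any existing rules about putting certain models into production.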
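As a minimal sketch of how the intended uses and the risk rating might be recorded together, the following Python snippet builds the intended_uses section as a dictionary. The field names are taken from the model card JSON schema shown later on this page, which lists the risk rating values capitalized (High, Medium, Low, Unknown). The helper normalize_risk_rating and all text values are hypothetical and only illustrate the shape of the data.

```python
# Allowed values copied from the "risk_rating" definition in the model card schema.
ALLOWED_RISK_RATINGS = ("High", "Medium", "Low", "Unknown")


def normalize_risk_rating(rating: str) -> str:
    """Map a case-insensitive rating such as "low" to the schema's enum value.

    Hypothetical helper for illustration; it is not part of any SageMaker SDK.
    """
    candidate = rating.strip().capitalize()
    if candidate not in ALLOWED_RISK_RATINGS:
        raise ValueError(f"Unsupported risk rating: {rating!r}")
    return candidate


# Illustrative values only; the keys come from the schema's "intended_uses" section.
intended_uses = {
    "purpose_of_model": "Classify incoming email into categories.",
    "intended_uses": "Routing support email to the right team.",
    "factors_affecting_model_efficiency": "Assumes English-language email.",
    "risk_rating": normalize_risk_rating("low"),
    "explanations_for_risk_rating": "Misrouted email is reviewed by a person.",
}
```

Normalizing the rating before it is stored is one way to catch a value such as "critical" early, rather than when the model card content is validated.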
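Model card JSON schema
Evaluation details for a model card must be provided in JSON format. If you have existing JSON format evaluation reports generated by SageMaker Clarify or SageMaker Model Monitor, upload them to Amazon S3 and provide an S3 URI to automatically parse the evaluation metrics. For more information and sample reports, see the example metrics in the Amazon SageMaker Model Governance - Model Cards example notebook.
When you create a model card using the SageMaker Python SDK, the model content must follow the model card JSON schema and be provided as a string. Provide model content similar to the following example.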
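To make the JSON shape of evaluation details more concrete, here is a small Python sketch that builds one evaluation entry and serializes it. The keys follow the evaluation_details, simple_metric, and matrix_metric definitions in the schema on this page. The evaluation name, metric names, numbers, and the S3 URI are made-up placeholders, not output from a real evaluation job.

```python
import json

# One evaluation entry; per the schema, "name" is the only required key.
evaluation = {
    "name": "holdout-evaluation",  # hypothetical evaluation name
    "evaluation_observation": "Illustrative note about the evaluation run.",
    "datasets": ["s3://amzn-s3-demo-bucket/holdout.csv"],  # placeholder URI
    "metric_groups": [
        {
            "name": "binary_classification_metrics",
            "metric_data": [
                # simple_metric: "type" is "number", "string", or "boolean"
                {"name": "accuracy", "type": "number", "value": 0.93},
                # matrix_metric: rows of numbers, at most 20 rows and 20 columns
                {
                    "name": "confusion_matrix",
                    "type": "matrix",
                    "value": [[50, 4], [3, 43]],
                    "x_axis_name": ["predicted 0", "predicted 1"],
                    "y_axis_name": ["actual 0", "actual 1"],
                },
            ],
        }
    ],
}

# "evaluation_details" is an array, so the entry is wrapped in a list.
evaluation_details_json = json.dumps([evaluation], indent=2)
```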
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "http://json-schema.org/draft-07/schema#",
  "title": "SageMakerModelCardSchema",
  "description": "Default model card schema",
  "version": "0.1.0",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "model_overview": {
      "description": "Overview about the model",
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "model_description": { "description": "description of model", "type": "string", "maxLength": 1024 },
        "model_owner": { "description": "Owner of model", "type": "string", "maxLength": 1024 },
        "model_creator": { "description": "Creator of model", "type": "string", "maxLength": 1024 },
        "problem_type": { "description": "Problem being solved with the model", "type": "string" },
        "algorithm_type": { "description": "Algorithm used to solve the problem", "type": "string", "maxLength": 1024 },
        "model_id": { "description": "SageMaker Model ARN or non-SageMaker Model ID", "type": "string", "maxLength": 1024 },
        "model_artifact": {
          "description": "Location of model artifacts",
          "type": "array",
          "maxContains": 15,
          "items": { "type": "string", "maxLength": 1024 }
        },
        "model_name": { "description": "Name of the model", "type": "string", "maxLength": 1024 },
        "model_version": { "description": "Version of the model", "type": "number", "minimum": 1 },
        "inference_environment": {
          "description": "Overview about model inference",
          "type": "object",
          "additionalProperties": false,
          "properties": {
            "container_image": {
              "description": "SageMaker inference image URI",
              "type": "array",
              "maxContains": 15,
              "items": { "type": "string", "maxLength": 1024 }
            }
          }
        }
      }
    },
    "intended_uses": {
      "description": "Intended usage of model",
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "purpose_of_model": { "description": "General purpose of model", "type": "string", "maxLength": 2048 },
        "intended_uses": { "description": "Intended use cases", "type": "string", "maxLength": 2048 },
        "factors_affecting_model_efficiency": { "type": "string", "maxLength": 2048 },
        "risk_rating": { "description": "Risk rating for model card", "$ref": "#/definitions/risk_rating" },
        "explanations_for_risk_rating": { "type": "string", "maxLength": 2048 }
      }
    },
    "training_details": {
      "description": "Overview about model training",
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "objective_function": {
          "description": "The objective function that the model optimizes",
          "function": { "$ref": "#/definitions/objective_function" },
          "notes": { "type": "string", "maxLength": 1024 }
        },
        "training_observations": { "type": "string", "maxLength": 1024 },
        "training_job_details": {
          "type": "object",
          "additionalProperties": false,
          "properties": {
            "training_arn": { "description": "SageMaker training job ARN", "type": "string", "maxLength": 1024 },
            "training_datasets": {
              "description": "Location of the model datasets",
              "type": "array",
              "maxContains": 15,
              "items": { "type": "string", "maxLength": 1024 }
            },
            "training_environment": {
              "type": "object",
              "additionalProperties": false,
              "properties": {
                "container_image": {
                  "description": "SageMaker training image URI",
                  "type": "array",
                  "maxContains": 15,
                  "items": { "type": "string", "maxLength": 1024 }
                }
              }
            },
            "training_metrics": {
              "type": "array",
              "items": { "maxItems": 50, "$ref": "#/definitions/training_metric" }
            },
            "user_provided_training_metrics": {
              "type": "array",
              "items": { "maxItems": 50, "$ref": "#/definitions/training_metric" }
            }
          }
        }
      }
    },
    "evaluation_details": {
      "type": "array",
      "default": [],
      "items": {
        "type": "object",
        "required": [ "name" ],
        "additionalProperties": false,
        "properties": {
          "name": { "type": "string", "pattern": ".{1,63}" },
          "evaluation_observation": { "type": "string", "maxLength": 2096 },
          "evaluation_job_arn": { "type": "string", "maxLength": 256 },
          "datasets": {
            "type": "array",
            "items": { "type": "string", "maxLength": 1024 },
            "maxItems": 10
          },
          "metadata": {
            "description": "Additional attributes associated with the evaluation results",
            "type": "object",
            "additionalProperties": { "type": "string", "maxLength": 1024 }
          },
          "metric_groups": {
            "type": "array",
            "default": [],
            "items": {
              "type": "object",
              "required": [ "name", "metric_data" ],
              "properties": {
                "name": { "type": "string", "pattern": ".{1,63}" },
                "metric_data": {
                  "type": "array",
                  "items": {
                    "anyOf": [
                      { "$ref": "#/definitions/simple_metric" },
                      { "$ref": "#/definitions/linear_graph_metric" },
                      { "$ref": "#/definitions/bar_chart_metric" },
                      { "$ref": "#/definitions/matrix_metric" }
                    ]
                  }
                }
              }
            }
          }
        }
      }
    },
    "additional_information": {
      "additionalProperties": false,
      "type": "object",
      "properties": {
        "ethical_considerations": {
          "description": "Any ethical considerations that the model card author wants to document",
          "type": "string",
          "maxLength": 2048
        },
        "caveats_and_recommendations": {
          "description": "Caveats and recommendations for those who might use this model in their applications.",
          "type": "string",
          "maxLength": 2048
        },
        "custom_details": {
          "type": "object",
          "additionalProperties": { "$ref": "#/definitions/custom_property" }
        }
      }
    }
  },
  "definitions": {
    "risk_rating": {
      "description": "Your organization's risk rating of the model",
      "type": "string",
      "enum": [ "High", "Medium", "Low", "Unknown" ]
    },
    "custom_property": { "description": "Additional property in section", "type": "string", "maxLength": 1024 },
    "objective_function": {
      "description": "The objective function that the training job optimizes",
      "additionalProperties": false,
      "properties": {
        "function": { "type": "string", "enum": [ "Maximize", "Minimize" ] },
        "facet": { "type": "string", "maxLength": 63 },
        "condition": { "type": "string", "maxLength": 63 }
      }
    },
    "training_metric": {
      "description": "Training metric data",
      "type": "object",
      "required": [ "name", "value" ],
      "additionalProperties": false,
      "properties": {
        "name": { "type": "string", "pattern": ".{1,255}" },
        "notes": { "type": "string", "maxLength": 1024 },
        "value": { "type": "number" }
      }
    },
    "linear_graph_metric": {
      "type": "object",
      "required": [ "name", "type", "value" ],
      "additionalProperties": false,
      "properties": {
        "name": { "type": "string", "pattern": ".{1,255}" },
        "notes": { "type": "string", "maxLength": 1024 },
        "type": { "type": "string", "enum": [ "linear_graph" ] },
        "value": {
          "anyOf": [
            {
              "type": "array",
              "items": {
                "type": "array",
                "items": { "type": "number" },
                "minItems": 2,
                "maxItems": 2
              },
              "minItems": 1,
              "maxItems": 20
            }
          ]
        },
        "x_axis_name": { "$ref": "#/definitions/axis_name_string" },
        "y_axis_name": { "$ref": "#/definitions/axis_name_string" }
      }
    },
    "bar_chart_metric": {
      "type": "object",
      "required": [ "name", "type", "value" ],
      "additionalProperties": false,
      "properties": {
        "name": { "type": "string", "pattern": ".{1,255}" },
        "notes": { "type": "string", "maxLength": 1024 },
        "type": { "type": "string", "enum": [ "bar_chart" ] },
        "value": {
          "anyOf": [
            { "type": "array", "items": { "type": "number" }, "minItems": 1, "maxItems": 20 }
          ]
        },
        "x_axis_name": { "$ref": "#/definitions/axis_name_array" },
        "y_axis_name": { "$ref": "#/definitions/axis_name_string" }
      }
    },
    "matrix_metric": {
      "type": "object",
      "required": [ "name", "type", "value" ],
      "additionalProperties": false,
      "properties": {
        "name": { "type": "string", "pattern": ".{1,255}" },
        "notes": { "type": "string", "maxLength": 1024 },
        "type": { "type": "string", "enum": [ "matrix" ] },
        "value": {
          "anyOf": [
            {
              "type": "array",
              "items": {
                "type": "array",
                "items": { "type": "number" },
                "minItems": 1,
                "maxItems": 20
              },
              "minItems": 1,
              "maxItems": 20
            }
          ]
        },
        "x_axis_name": { "$ref": "#/definitions/axis_name_array" },
        "y_axis_name": { "$ref": "#/definitions/axis_name_array" }
      }
    },
    "simple_metric": {
      "description": "metric data",
      "type": "object",
      "required": [ "name", "type", "value" ],
      "additionalProperties": false,
      "properties": {
        "name": { "type": "string", "pattern": ".{1,255}" },
        "notes": { "type": "string", "maxLength": 1024 },
        "type": { "type": "string", "enum": [ "number", "string", "boolean" ] },
        "value": {
          "anyOf": [
            { "type": "number" },
            { "type": "string", "maxLength": 63 },
            { "type": "boolean" }
          ]
        },
        "x_axis_name": { "$ref": "#/definitions/axis_name_string" },
        "y_axis_name": { "$ref": "#/definitions/axis_name_string" }
      }
    },
    "axis_name_array": { "type": "array", "items": { "type": "string", "maxLength": 63 } },
    "axis_name_string": { "type": "string", "maxLength": 63 }
  }
}