Compile a Model (AWS Command Line Interface) - Amazon SageMaker

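This section shows how to use the AWS Command Line Interface (CLI) to manage Amazon SageMaker Neo compilation jobs for machine learning models. You can create, describe, stop, and list compilation jobs.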

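  1. Create a compilation job

    With the CreateCompilationJob API operation, you can specify the data input format, the S3 bucket that stores your model, the S3 bucket to which the compiled model is written, and the target hardware device or platform.

    The following examples show how to configure the CreateCompilationJob request depending on whether your target is a device or a platform.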

    Device Example

    The following request specifies a target device, in this case the ml_c5 instance family.

    {
        "CompilationJobName": "neo-compilation-job-demo",
        "RoleArn": "arn:aws:iam::<your-account>:role/service-role/AmazonSageMaker-ExecutionRole-yyyymmddThhmmss",
        "InputConfig": {
            "S3Uri": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/train",
            "DataInputConfig": "{'data': [1,3,1024,1024]}",
            "Framework": "MXNET"
        },
        "OutputConfig": {
            "S3OutputLocation": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/compile",
            "TargetDevice": "ml_c5"
        },
        "StoppingCondition": {
            "MaxRuntimeInSeconds": 300
        }
    }

    If you trained your model with the PyTorch framework and your target device is an ml_* target, you can optionally specify the framework version with the FrameworkVersion field. This field is supported only when compiling PyTorch models for ml_* targets, excluding ml_inf. Supported values are 1.4, 1.5, and 1.6; the default is 1.6.

    When compiling for ml_* instances with PyTorch, use the CompilerOptions field in OutputConfig to provide the data type ("dtype") of the model's input. If it is omitted, "float32" is assumed.

    {
        "CompilationJobName": "neo-compilation-job-demo",
        "RoleArn": "arn:aws:iam::<your-account>:role/service-role/AmazonSageMaker-ExecutionRole-yyyymmddThhmmss",
        "InputConfig": {
            "S3Uri": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/train",
            "DataInputConfig": "{'data': [1,3,1024,1024]}",
            "Framework": "PYTORCH",
            "FrameworkVersion": "1.6"
        },
        "OutputConfig": {
            "S3OutputLocation": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/compile",
            "TargetDevice": "ml_c5",
            "CompilerOptions": "{'dtype': 'long'}"
        },
        "StoppingCondition": {
            "MaxRuntimeInSeconds": 300
        }
    }
    Note

    The FrameworkVersion API field is supported only for PyTorch.

    Platform Example

    The following request specifies a target platform, in this case a configuration for a p3.2xlarge instance.

    {
        "CompilationJobName": "neo-test-compilation-job",
        "RoleArn": "arn:aws:iam::<your-account>:role/service-role/AmazonSageMaker-ExecutionRole-yyyymmddThhmmss",
        "InputConfig": {
            "S3Uri": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/train",
            "DataInputConfig": "{'data': [1,3,1024,1024]}",
            "Framework": "MXNET"
        },
        "OutputConfig": {
            "S3OutputLocation": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/compile",
            "TargetPlatform": {
                "Os": "LINUX",
                "Arch": "X86_64",
                "Accelerator": "NVIDIA"
            },
            "CompilerOptions": "{'cuda-ver': '10.0', 'trt-ver': '6.0.1', 'gpu-code': 'sm_70'}"
        },
        "StoppingCondition": {
            "MaxRuntimeInSeconds": 300
        }
    }
    Note

    In OutputConfig, the TargetDevice and TargetPlatform fields are mutually exclusive. You must specify exactly one of the two.

    To find JSON string examples of DataInputConfig for each framework, see "What input data shapes Neo expects" at https://docs.amazonaws.cn/sagemaker/latest/dg/neo-troubleshooting-compilation.html#neo-troubleshooting-errors-preventing.

    For more information about these settings, see InputConfig, OutputConfig, and TargetPlatform in the SageMaker API reference.
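As a minimal sketch of preparing the request file used in the next step, the following script writes the device example to job.json and runs a rough check that exactly one target key is present. The helper name count_targets is hypothetical and not part of the AWS CLI, and the check is a plain text match rather than a JSON parser, so treat it as a sanity check only.

```shell
# Write the device example from above to job.json (placeholders left unchanged).
cat > job.json <<'EOF'
{
    "CompilationJobName": "neo-compilation-job-demo",
    "RoleArn": "arn:aws:iam::<your-account>:role/service-role/AmazonSageMaker-ExecutionRole-yyyymmddThhmmss",
    "InputConfig": {
        "S3Uri": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/train",
        "DataInputConfig": "{'data': [1,3,1024,1024]}",
        "Framework": "MXNET"
    },
    "OutputConfig": {
        "S3OutputLocation": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/compile",
        "TargetDevice": "ml_c5"
    },
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 300
    }
}
EOF

# count_targets is a hypothetical helper (an assumption for illustration).
# It counts how many of the keys "TargetDevice" / "TargetPlatform" appear
# in the file. This is a text-level match, not real JSON parsing.
count_targets() {
  grep -o -E '"Target(Device|Platform)"[[:space:]]*:' "$1" | wc -l | tr -d ' '
}

if [ "$(count_targets job.json)" -eq 1 ]; then
  echo "job.json sets exactly one target"
else
  echo "job.json must set exactly one of TargetDevice or TargetPlatform" >&2
fi
```

The file contains no # comments because the request file must be plain JSON.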

  2. After you configure the JSON file, run the following command to create the compilation job:

    aws sagemaker create-compilation-job \
        --cli-input-json file://job.json \
        --region us-west-2

    # You should get CompilationJobArn
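The later steps refer to the job name through a JOB_NM variable. One possible way to fill it, sketched here under assumptions, is to capture the returned CompilationJobArn and take the part after the final slash. The function names submit_compilation_job and job_name_from_arn are hypothetical helpers; --query and --output are standard AWS CLI global options.

```shell
# submit_compilation_job is a hypothetical wrapper (not an AWS CLI command).
# Usage: submit_compilation_job [json-file] [region]
# Prints the CompilationJobArn returned by create-compilation-job.
submit_compilation_job() {
  local json_file="${1:-job.json}"
  local region="${2:-us-west-2}"
  aws sagemaker create-compilation-job \
    --cli-input-json "file://$json_file" \
    --region "$region" \
    --query 'CompilationJobArn' \
    --output text
}

# job_name_from_arn is a hypothetical helper. It assumes the ARN ends with
# "compilation-job/<name>" and prints everything after the last slash.
job_name_from_arn() {
  printf '%s\n' "${1##*/}"
}
```

For example, `JOB_ARN=$(submit_compilation_job job.json)` followed by `JOB_NM=$(job_name_from_arn "$JOB_ARN")` would set the variable used by the describe and stop commands.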
  3. Describe the compilation job by running the following command, where $JOB_NM holds the compilation job name:

    aws sagemaker describe-compilation-job \
        --compilation-job-name "$JOB_NM" \
        --region us-west-2
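Because a compilation job runs asynchronously, the describe command is often called repeatedly until the job reaches a final state. The following is a sketch of such a polling loop, not an official waiter. The function name wait_for_compilation_job, the default region, and the 30-second interval are assumptions; the status values are those reported in the CompilationJobStatus field.

```shell
# wait_for_compilation_job is a hypothetical helper (not an AWS CLI command).
# Usage: wait_for_compilation_job <job-name> [region] [poll-seconds]
# Returns 0 on COMPLETED, 1 on FAILED or STOPPED, 2 if the describe call fails.
wait_for_compilation_job() {
  local job_name="$1"
  local region="${2:-us-west-2}"
  local interval="${3:-30}"
  local status

  while true; do
    # Ask only for the status field and print it as plain text.
    status=$(aws sagemaker describe-compilation-job \
      --compilation-job-name "$job_name" \
      --region "$region" \
      --query 'CompilationJobStatus' \
      --output text) || return 2

    case "$status" in
      COMPLETED)      echo "$status"; return 0 ;;
      FAILED|STOPPED) echo "$status"; return 1 ;;
      *)              sleep "$interval" ;;  # STARTING, INPROGRESS, STOPPING
    esac
  done
}
```

If the job ends in FAILED, running the describe command with `--query 'FailureReason'` shows the reason recorded for the failure.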
  4. Stop the compilation job by running the following command:

    aws sagemaker stop-compilation-job \
        --compilation-job-name "$JOB_NM" \
        --region us-west-2

    # The stop-compilation-job operation produces no output
  5. List the compilation jobs by running the following command:

    aws sagemaker list-compilation-jobs \
        --region us-west-2
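When an account has many jobs, the list can be narrowed. The sketch below filters by status and sorts newest first, printing only the job names. The function name list_jobs_by_status and the default region are assumptions for illustration; the filter and sort options correspond to the StatusEquals, SortBy, and SortOrder parameters of ListCompilationJobs.

```shell
# list_jobs_by_status is a hypothetical helper (not an AWS CLI command).
# Usage: list_jobs_by_status <STATUS> [region]
# STATUS is one of INPROGRESS, COMPLETED, FAILED, STARTING, STOPPING, STOPPED.
list_jobs_by_status() {
  local status="$1"
  local region="${2:-us-west-2}"
  aws sagemaker list-compilation-jobs \
    --region "$region" \
    --status-equals "$status" \
    --sort-by CreationTime \
    --sort-order Descending \
    --query 'CompilationJobSummaries[].CompilationJobName' \
    --output text
}
```

For example, `list_jobs_by_status FAILED` would print the names of failed jobs in us-west-2, most recent first.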