Compile a Model (Amazon Command Line Interface) - Amazon SageMaker

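This section describes how to use the Amazon Command Line Interface (CLI) to manage compilation jobs for Amazon SageMaker Neo machine learning models. You can create, describe, stop, and list compilation jobs.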

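  1. Create a compilation job

    With the CreateCompilationJob API operation, you can specify the data input format, the S3 bucket where the model is stored, the S3 bucket to which the compiled model is written, and the target hardware device or platform.

    The following examples show how to configure the CreateCompilationJob API depending on whether your target is a device or a platform.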

    Device Example

    This example specifies a target device for the ml_c5 instance family in the TargetDevice field.

        {
            "CompilationJobName": "neo-compilation-job-demo",
            "RoleArn": "arn:aws:iam::<your-account>:role/service-role/AmazonSageMaker-ExecutionRole-yyyymmddThhmmss",
            "InputConfig": {
                "S3Uri": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/train",
                "DataInputConfig": "{'data': [1,3,1024,1024]}",
                "Framework": "MXNET"
            },
            "OutputConfig": {
                "S3OutputLocation": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/compile",
                "TargetDevice": "ml_c5"
            },
            "StoppingCondition": {
                "MaxRuntimeInSeconds": 300
            }
        }

    If you trained your model with the PyTorch framework and your target device is an ml_* target, you can optionally specify the framework version in the FrameworkVersion field.

    When compiling for ml_* instances using the PyTorch framework, use the CompilerOptions field in OutputConfig to provide the correct data type (dtype) of the model's input. The default is assumed to be "float32". The following example again targets the ml_c5 instance family.

        {
            "CompilationJobName": "neo-compilation-job-demo",
            "RoleArn": "arn:aws:iam::<your-account>:role/service-role/AmazonSageMaker-ExecutionRole-yyyymmddThhmmss",
            "InputConfig": {
                "S3Uri": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/train",
                "DataInputConfig": "{'data': [1,3,1024,1024]}",
                "Framework": "PYTORCH",
                "FrameworkVersion": "1.6"
            },
            "OutputConfig": {
                "S3OutputLocation": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/compile",
                "TargetDevice": "ml_c5",
                "CompilerOptions": "{'dtype': 'long'}"
            },
            "StoppingCondition": {
                "MaxRuntimeInSeconds": 300
            }
        }

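    Notes:
    • If you saved your model by using PyTorch version 2.0 or later, the DataInputConfig field is optional. SageMaker Neo obtains the input configuration from the model definition file that you create with PyTorch. For more information about how to create the definition file, see the PyTorch section under Saving Models for SageMaker Neo.

    • This API field is supported only for PyTorch.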

    Platform Example

    This example specifies a target platform configuration for a p3.2xlarge instance in the TargetPlatform field.

        {
            "CompilationJobName": "neo-test-compilation-job",
            "RoleArn": "arn:aws:iam::<your-account>:role/service-role/AmazonSageMaker-ExecutionRole-yyyymmddThhmmss",
            "InputConfig": {
                "S3Uri": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/train",
                "DataInputConfig": "{'data': [1,3,1024,1024]}",
                "Framework": "MXNET"
            },
            "OutputConfig": {
                "S3OutputLocation": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/compile",
                "TargetPlatform": {
                    "Os": "LINUX",
                    "Arch": "X86_64",
                    "Accelerator": "NVIDIA"
                },
                "CompilerOptions": "{'cuda-ver': '10.0', 'trt-ver': '6.0.1', 'gpu-code': 'sm_70'}"
            },
            "StoppingCondition": {
                "MaxRuntimeInSeconds": 300
            }
        }

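    Note

    In OutputConfig, the TargetDevice and TargetPlatform fields are mutually exclusive. You must specify exactly one of the two.

    To find JSON string examples of DataInputConfig for each framework, see What Input Data Shapes Neo Expects.

    For more information about setting up the configurations, see the InputConfig, OutputConfig, and TargetPlatform API operations in the SageMaker API reference.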
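
Because a request must name exactly one compilation target, it can help to check the request file before submitting it. The following is a minimal sketch, not part of the SageMaker tooling: the helper names target_kind and load_job are hypothetical, and the sketch assumes the request file is read by a strict JSON parser, which would reject annotations such as '#' comments.

```python
import json


def target_kind(job: dict) -> str:
    """Return "device" or "platform" for a CreateCompilationJob request body.

    Raises ValueError when OutputConfig names both targets or neither.
    """
    output_config = job.get("OutputConfig", {})
    has_device = "TargetDevice" in output_config
    has_platform = "TargetPlatform" in output_config
    if has_device == has_platform:
        raise ValueError(
            "OutputConfig must contain exactly one of TargetDevice or TargetPlatform"
        )
    return "device" if has_device else "platform"


def load_job(path: str) -> dict:
    """Parse the request file with Python's strict JSON parser.

    json.load raises json.JSONDecodeError on '#' comments, so a failure
    here suggests the file still contains annotations to remove.
    """
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Hypothetical, trimmed request body; the bucket name is a placeholder.
    example = {
        "CompilationJobName": "neo-compilation-job-demo",
        "OutputConfig": {
            "S3OutputLocation": "s3://<your-bucket>/sagemaker/neo-compilation-job-demo-data/compile",
            "TargetDevice": "ml_c5",
        },
    }
    print(target_kind(example))
```

For a saved request file, target_kind(load_job("job.json")) performs both checks in one call.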

  2. After you configure the JSON file, run the following command to create the compilation job:

        aws sagemaker create-compilation-job \
            --cli-input-json file://job.json \
            --region us-west-2

        # You should get CompilationJobArn
  3. Describe the compilation job by running the following command:

        aws sagemaker describe-compilation-job \
            --compilation-job-name $JOB_NM \
            --region us-west-2
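
The create command returns before compilation finishes, so the describe command is usually repeated until the job reaches a final state. The following is a minimal polling sketch under stated assumptions: the aws executable is on PATH with credentials configured, the statuses COMPLETED, FAILED, and STOPPED are treated as final, and the helper names cli_status and wait_for_job, the 30-second interval, and the poll limit are all illustrative choices rather than anything SageMaker prescribes.

```python
import subprocess
import time
from typing import Callable

# CompilationJobStatus values after which the job no longer changes.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "STOPPED"}


def cli_status(job_name: str, region: str = "us-west-2") -> str:
    """Return the CompilationJobStatus field reported by the CLI."""
    result = subprocess.run(
        [
            "aws", "sagemaker", "describe-compilation-job",
            "--compilation-job-name", job_name,
            "--region", region,
            "--query", "CompilationJobStatus",
            "--output", "text",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def wait_for_job(
    job_name: str,
    get_status: Callable[[str], str] = cli_status,
    poll_seconds: float = 30.0,
    max_polls: int = 20,
) -> str:
    """Poll until the job reaches a terminal status and return that status.

    Raises TimeoutError if no terminal status is seen within max_polls polls.
    """
    for _ in range(max_polls):
        status = get_status(job_name)
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"{job_name} did not finish after {max_polls} polls")
```

Calling wait_for_job("neo-compilation-job-demo") would block until the job finishes or the poll limit is reached. The get_status parameter exists so that the loop can be exercised without calling the service.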
  4. Stop the compilation job by running the following command:

        aws sagemaker stop-compilation-job \
            --compilation-job-name $JOB_NM \
            --region us-west-2

        # There is no output for the stop-compilation-job operation
  5. List the compilation jobs by running the following command:

        aws sagemaker list-compilation-jobs \
            --region us-west-2
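
When an account holds many jobs, the list command can be narrowed with its --status-equals and --name-contains options. The sketch below builds such a command and extracts job names from the JSON response. It assumes the aws executable is on PATH and that the response carries a CompilationJobSummaries array; the helper names list_jobs_command, job_names, and list_job_names are hypothetical.

```python
import json
import subprocess
from typing import List, Optional


def list_jobs_command(
    region: str = "us-west-2",
    status: Optional[str] = None,
    name_contains: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a filtered list-compilation-jobs call."""
    command = [
        "aws", "sagemaker", "list-compilation-jobs",
        "--region", region,
        "--output", "json",
    ]
    if status is not None:
        command += ["--status-equals", status]
    if name_contains is not None:
        command += ["--name-contains", name_contains]
    return command


def job_names(cli_output: str) -> List[str]:
    """Extract the job names from the JSON text printed by the CLI."""
    summaries = json.loads(cli_output).get("CompilationJobSummaries", [])
    return [summary["CompilationJobName"] for summary in summaries]


def list_job_names(**filters: Optional[str]) -> List[str]:
    """Run the CLI with the given filters and return the matching job names."""
    result = subprocess.run(
        list_jobs_command(**filters),
        check=True,
        capture_output=True,
        text=True,
    )
    return job_names(result.stdout)
```

For example, list_job_names(status="COMPLETED", name_contains="neo") would return the names of completed jobs whose names contain "neo". Only the first page of results is read in this sketch.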