Inference

This section shows how to run inference on Deep Learning Containers for Amazon Elastic Container Service (Amazon ECS) using PyTorch and TensorFlow.

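Important

If your account has already created the Amazon ECS service-linked role, that role is used by default for your service unless you specify a role here. The service-linked role is required if your task definition uses the awsvpc network mode. It is also required if the service is configured to use service discovery, an external deployment controller, multiple target groups, or Elastic Inference accelerators, in which case you should not specify a role here. For more information, see Using Service-Linked Roles for Amazon ECS in the Amazon ECS Developer Guide.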

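PyTorch inference

Before you can run a task on an Amazon ECS cluster, you must register a task definition. A task definition is a list of containers grouped together. The following example uses a sample Docker image that adds either CPU or GPU inference scripts to Deep Learning Containers.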

Next steps

To learn about using custom entrypoints with Deep Learning Containers on Amazon ECS, see Custom entrypoints.

TensorFlow inference

The following examples use a sample Docker image that adds either CPU or GPU inference scripts to Deep Learning Containers from your host machine's command line.

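CPU-based inference

Use the following example to run CPU-based inference.

Create a file named ecs-dlc-cpu-inference-taskdef.json with the following contents. You can use this with either TensorFlow or TensorFlow 2. To use it with TensorFlow 2, change the Docker image to a TensorFlow 2 image and clone the r2.0 serving repository branch instead of r1.15.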


{
	"requiresCompatibilities": [
		"EC2"
	],
	"containerDefinitions": [{
		"command": [
			"mkdir -p /test && cd /test && git clone -b r1.15 https://github.com/tensorflow/serving.git && tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=saved_model_half_plus_two --model_base_path=/test/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu"
		],
		"entryPoint": [
			"sh",
			"-c"
		],
		"name": "tensorflow-inference-container",
		"image": "763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:1.15.0-cpu-py36-ubuntu18.04",
		"memory": 8111,
		"cpu": 256,
		"essential": true,
		"portMappings": [{
				"hostPort": 8500,
				"protocol": "tcp",
				"containerPort": 8500
			},
			{
				"hostPort": 8501,
				"protocol": "tcp",
				"containerPort": 8501
			},
			{
				"containerPort": 80,
				"protocol": "tcp"
			}
		],
		"logConfiguration": {
			"logDriver": "awslogs",
			"options": {
				"awslogs-group": "/ecs/tensorflow-inference-gpu",
				"awslogs-region": "us-east-1",
				"awslogs-stream-prefix": "half-plus-two",
				"awslogs-create-group": "true"
			}
		}
	}],
	"volumes": [],
	"networkMode": "bridge",
	"placementConstraints": [],
	"family": "tensorflow-inference"
}
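
The two TensorFlow 2 changes described above (a different image and the r2.0 branch) can be applied to the task definition programmatically. The following is a minimal Python sketch, not part of the original guide: the function name `to_tf2_taskdef` is hypothetical, and the TensorFlow 2 image URI is left as a parameter because the guide does not name one.

```python
import copy


def to_tf2_taskdef(taskdef: dict, tf2_image: str) -> dict:
    """Return a copy of a task definition adapted for TensorFlow 2.

    tf2_image is an assumption supplied by the caller: the URI of a
    TensorFlow 2 inference image of your choice.
    """
    updated = copy.deepcopy(taskdef)  # leave the original untouched
    for container in updated["containerDefinitions"]:
        container["image"] = tf2_image
        # Clone the r2.0 branch of tensorflow/serving instead of r1.15.
        container["command"] = [
            part.replace("git clone -b r1.15", "git clone -b r2.0")
            for part in container["command"]
        ]
    return updated
```

To use it, load ecs-dlc-cpu-inference-taskdef.json with `json.load`, pass the result to `to_tf2_taskdef` together with your image URI, and write the returned dictionary back out with `json.dump` before registering it.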

Register the task definition. Note the revision number in the output and use it in the next step.


aws ecs register-task-definition --cli-input-json file://ecs-dlc-cpu-inference-taskdef.json
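
Instead of reading the revision number by eye, you can extract it from the JSON that the command prints. The sketch below assumes you captured that output as a string; it relies on the `taskDefinition.family` and `taskDefinition.revision` fields of the response, and the helper name `task_definition_ref` is hypothetical.

```python
import json


def task_definition_ref(register_output: str) -> str:
    """Build a 'family:revision' reference from register-task-definition output."""
    task_def = json.loads(register_output)["taskDefinition"]
    return f"{task_def['family']}:{task_def['revision']}"


# Trimmed, illustrative output; the revision number 3 is made up.
sample_output = '{"taskDefinition": {"family": "tensorflow-inference", "revision": 3}}'
print(task_definition_ref(sample_output))
```

The resulting string is the form that the `--task-definition` option of `aws ecs create-service` accepts.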

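Create an Amazon ECS service. When you specify the task definition, replace revision_id with the revision number of the task definition from the output of the previous step.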


aws ecs create-service --cluster ecs-ec2-training-inference \
                       --service-name cli-ec2-inference-cpu \
                       --task-definition tensorflow-inference:revision_id \
                       --desired-count 1 \
                       --launch-type EC2 \
                       --scheduling-strategy="REPLICA" \
                       --region us-east-1

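Verify the service and get the network endpoint by completing the following steps.
1. Open the console at https://console.aws.amazon.com/ecs/v2.
2. Select the ecs-ec2-training-inference cluster.
3. On the Cluster page, choose Services and then cli-ec2-inference-cpu.
4. Once your task is in a RUNNING status, choose the task identifier.
5. Under Logs, choose View logs in CloudWatch. This takes you to the CloudWatch console, where you can view the container logs.
6. Under Containers, expand the container details.
7. Under Name and then Network Bindings, under External Link, note the IP address for port 8501 and use it in the next step.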
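
The port part of the network binding can also be read from the JSON returned by `aws ecs describe-tasks`. The following sketch assumes that output has already been parsed into a dictionary and that each container entry carries a `networkBindings` list; the helper name `host_port_for` and the sample values are illustrative. The external IP address is not covered here, so you still take it from the console.

```python
from typing import Optional


def host_port_for(describe_tasks_output: dict, container_port: int) -> Optional[int]:
    """Return the host port bound to container_port, or None if it is not bound."""
    for task in describe_tasks_output.get("tasks", []):
        for container in task.get("containers", []):
            for binding in container.get("networkBindings", []):
                if binding.get("containerPort") == container_port:
                    return binding.get("hostPort")
    return None


# Illustrative, trimmed describe-tasks output (values are made up).
sample = {
    "tasks": [{
        "containers": [{
            "networkBindings": [
                {"bindIP": "0.0.0.0", "containerPort": 8501, "hostPort": 8501, "protocol": "tcp"},
            ]
        }]
    }]
}
print(host_port_for(sample, 8501))
```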
To run inference, use the following command. Replace the external IP address with the external link IP address from the previous step.
```
curl -d '{"instances": [1.0, 2.0, 5.0]}' -X POST http://<External ip>:8501/v1/models/saved_model_half_plus_two:predict
```
The following is sample output.
```
{
    "predictions": [2.5, 3.0, 4.5
    ]
}
```
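
The sample output is consistent with the model computing y = x/2 + 2, as its name suggests: 1.0 gives 2.5, 2.0 gives 3.0, and 5.0 gives 4.5. The sketch below sends the same request as the curl command using only the Python standard library and includes that formula as a local check. The function names are hypothetical, and `host` stands for the external IP address you noted earlier.

```python
import json
import urllib.request


def half_plus_two(x: float) -> float:
    """Local reference for the expected prediction: y = x / 2 + 2."""
    return 0.5 * x + 2.0


def build_predict_request(host, instances, port=8501,
                          model="saved_model_half_plus_two"):
    """Build the POST request that the curl command above sends."""
    url = f"http://{host}:{port}/v1/models/{model}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


def predict(host, instances, timeout=10.0):
    """Send the request and return the 'predictions' list from the response."""
    request = build_predict_request(host, instances)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["predictions"]
```

Comparing `predict(host, [1.0, 2.0, 5.0])` with `[half_plus_two(x) for x in [1.0, 2.0, 5.0]]` gives a quick end-to-end check once the service is running.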
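Important
If you are unable to connect to the external IP address, be sure that your corporate firewall is not blocking non-standard ports, such as 8501. You can try switching to a guest network to verify.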
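
One way to tell a blocked port apart from a serving problem is to attempt a plain TCP connection before sending any request. This is a minimal sketch using the standard `socket` module; the function name `port_is_reachable` and the default timeout are assumptions, not part of the original guide.

```python
import socket


def port_is_reachable(host: str, port: int = 8501, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If this returns False from your corporate network but True from a guest network, a firewall between you and the instance is the likely cause.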

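GPU-based inference

Use the following example to run GPU-based inference.

Create a file named ecs-dlc-gpu-inference-taskdef.json with the following contents. You can use this with either TensorFlow or TensorFlow 2. To use it with TensorFlow 2, change the Docker image to a TensorFlow 2 image and clone the r2.0 serving repository branch instead of r1.15.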


{
	"requiresCompatibilities": [
		"EC2"
	],
	"containerDefinitions": [{
		"command": [
			"mkdir -p /test && cd /test && git clone -b r1.15 https://github.com/tensorflow/serving.git && tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=saved_model_half_plus_two --model_base_path=/test/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_gpu"
		],
		"entryPoint": [
			"sh",
			"-c"
		],
		"name": "tensorflow-inference-container",
		"image": "763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:1.15.0-gpu-py36-cu100-ubuntu18.04",
		"memory": 8111,
		"cpu": 256,
		"resourceRequirements": [{
			"type": "GPU",
			"value": "1"
		}],
		"essential": true,
		"portMappings": [{
				"hostPort": 8500,
				"protocol": "tcp",
				"containerPort": 8500
			},
			{
				"hostPort": 8501,
				"protocol": "tcp",
				"containerPort": 8501
			},
			{
				"containerPort": 80,
				"protocol": "tcp"
			}
		],
		"logConfiguration": {
			"logDriver": "awslogs",
			"options": {
				"awslogs-group": "/ecs/TFInference",
				"awslogs-region": "us-east-1",
				"awslogs-stream-prefix": "ecs",
				"awslogs-create-group": "true"
			}
		}
	}],
	"volumes": [],
	"networkMode": "bridge",
	"placementConstraints": [],
	"family": "TensorFlowInference"
}
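
Compared with the CPU task definition, this one uses the GPU image and model path and adds a `resourceRequirements` entry that requests one GPU. Before registering, you may want to confirm that the entry is present. The sketch below assumes the task definition has been loaded into a dictionary; the helper name `gpu_count` is hypothetical.

```python
def gpu_count(taskdef: dict) -> int:
    """Sum the GPUs requested across all containers in a task definition."""
    total = 0
    for container in taskdef.get("containerDefinitions", []):
        for requirement in container.get("resourceRequirements", []):
            if requirement.get("type") == "GPU":
                # The value is stored as a string, for example "1".
                total += int(requirement["value"])
    return total
```

For the task definition above this returns 1, and for the CPU task definition it returns 0.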

Register the task definition. Note the revision number in the output and use it in the next step.


aws ecs register-task-definition --cli-input-json file://ecs-dlc-gpu-inference-taskdef.json

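Create an Amazon ECS service. When you specify the task definition, replace revision_id with the revision number of the task definition from the output of the previous step.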


aws ecs create-service --cluster ecs-ec2-training-inference \
                       --service-name cli-ec2-inference-gpu \
                       --task-definition TensorFlowInference:revision_id \
                       --desired-count 1 \
                       --launch-type EC2 \
                       --scheduling-strategy="REPLICA" \
                       --region us-east-1

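Verify the service and get the network endpoint by completing the following steps.
1. Open the console at https://console.aws.amazon.com/ecs/v2.
2. Select the ecs-ec2-training-inference cluster.
3. On the Cluster page, choose Services and then cli-ec2-inference-gpu.
4. Once your task is in a RUNNING status, choose the task identifier.
5. Under Logs, choose View logs in CloudWatch. This takes you to the CloudWatch console, where you can view the container logs.
6. Under Containers, expand the container details.
7. Under Name and then Network Bindings, under External Link, note the IP address for port 8501 and use it in the next step.
To run inference, use the following command. Replace the external IP address with the external link IP address from the previous step.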
```
curl -d '{"instances": [1.0, 2.0, 5.0]}' -X POST http://<External ip>:8501/v1/models/saved_model_half_plus_two:predict
```
The following is sample output.
```
{
    "predictions": [2.5, 3.0, 4.5
    ]
}
```
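Important
If you are unable to connect to the external IP address, be sure that your corporate firewall is not blocking non-standard ports, such as 8501. You can try switching to a guest network to verify.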
