Using TensorFlow Models with Amazon EI

The Amazon EI enabled versions of TensorFlow and TensorFlow Serving let you use Amazon EI accelerators with only minor modifications to your TensorFlow code. The Amazon EI enabled packages are available in the AWS Deep Learning AMI. You can also download the packages from an Amazon S3 bucket to build them into your own Amazon Linux or Ubuntu AMI, or into a Docker container.

With Amazon EI TensorFlow Serving, standard TensorFlow Serving inference remains unchanged. The only difference is that the entry point is a different binary named amazonei_tensorflow_model_server.

For more information, see TensorFlow Serving.

The Amazon EI TensorFlow packages for Python 2 and 3 provide the EIPredictor API. This API function gives you a flexible way to run models on EI as an alternative to using TensorFlow Serving.

This release of Amazon EI TensorFlow Serving has been tested to perform well, with cost-saving benefits, on the following deep learning use cases and network architectures (and similar variants):

Use case                       Example network topologies

Image recognition              Inception, ResNet, MVCNN
Object detection               SSD, RCNN
Neural machine translation     GNMT

Amazon EI TensorFlow Serving Example

The following is an example you can try for serving different models, such as ResNet, using a Single Shot Detector (SSD). As a general rule, you need the servable model and client scripts downloaded to your DLAMI.

To activate the TensorFlow Elastic Inference environment

  • If you are using the AWS Deep Learning AMI, activate the Python 2.7 TensorFlow environment. The example scripts are not compatible with Python 3.x.

    source activate amazonei_tensorflow_p27

To serve and test inference with an SSD ResNet model

  1. Download the model.

    curl -O https://s3-us-west-2.amazonaws.com/aws-tf-serving-ei-example/ssd_resnet.zip
  2. Unzip the model.

    unzip ssd_resnet.zip -d /tmp
  3. Download a picture of three dogs to your home directory.

    curl -O https://raw.githubusercontent.com/awslabs/mxnet-model-server/master/docs/images/3dogs.jpg
  4. Navigate to the folder where AmazonEI_TensorFlow_Serving is installed and run the following command to launch the server. Note that model_base_path must be an absolute path.

    AmazonEI_TensorFlow_Serving_v1.12_v1 --model_name=ssdresnet --model_base_path=/tmp/ssd_resnet50_v1_coco --port=9000
  5. While the server is still running in the foreground, launch another terminal session. In the new terminal, activate the TensorFlow environment.

    source activate amazonei_tensorflow_p27
  6. Use your preferred text editor to create a script that has the following content. Name it ssd_resnet_client.py. This script takes an image filename as a parameter and gets a prediction result from the pretrained model.

    from __future__ import print_function
    import grpc
    import tensorflow as tf
    from PIL import Image
    import numpy as np
    import time
    import os
    from tensorflow_serving.apis import predict_pb2
    from tensorflow_serving.apis import prediction_service_pb2_grpc

    tf.app.flags.DEFINE_string('server', 'localhost:9000', 'PredictionService host:port')
    tf.app.flags.DEFINE_string('image', '', 'path to image in JPEG format')
    FLAGS = tf.app.flags.FLAGS

    if(FLAGS.image == ''):
        print("Supply an Image using '--image [path/to/image]'")
        exit(1)

    coco_classes_txt = "https://raw.githubusercontent.com/amikelive/coco-labels/master/coco-labels-paper.txt"
    local_coco_classes_txt = "/tmp/coco-labels-paper.txt"
    # Downloading coco labels
    os.system("curl -o %s -O %s" % (local_coco_classes_txt, coco_classes_txt))
    # Setting default number of predictions
    NUM_PREDICTIONS = 20
    # Reading coco labels to a list
    with open(local_coco_classes_txt) as f:
        classes = ["No Class"] + [line.strip() for line in f.readlines()]

    def main(_):
        channel = grpc.insecure_channel(FLAGS.server)
        stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
        with Image.open(FLAGS.image) as f:
            f.load()
            # Reading the test image given by the user
            data = np.asarray(f)
        # Setting batch size to 1
        data = np.expand_dims(data, axis=0)
        # Creating a prediction request
        request = predict_pb2.PredictRequest()
        # Setting the model spec name
        request.model_spec.name = 'ssdresnet'
        # Setting up the inputs and tensors from image data
        request.inputs['inputs'].CopyFrom(
            tf.contrib.util.make_tensor_proto(data, shape=data.shape))

        # Iterating over the predictions. The first inference request can take several seconds to complete
        for curpred in range(NUM_PREDICTIONS):
            if(curpred == 0):
                print("The first inference request loads the model into the accelerator and can take several seconds to complete. Please standby!")
            # Start the timer
            start = time.time()
            # This is where the inference actually happens
            result = stub.Predict(request, 60.0)  # 60 second timeout
            print("Inference %d took %f seconds" % (curpred, time.time() - start))

        # Extracting results from output
        outputs = result.outputs
        detection_classes = outputs["detection_classes"]
        # Creating an ndarray from the output TensorProto
        detection_classes = tf.make_ndarray(detection_classes)
        # Getting the number of objects detected in the input image from the output of the predictor
        num_detections = int(tf.make_ndarray(outputs["num_detections"])[0])
        print("%d detection[s]" % (num_detections))
        # Mapping the class ids from the output to class names from the coco labels
        class_label = [classes[int(x)] for x in detection_classes[0][:num_detections]]
        print("SSD Prediction is ", class_label)

    if __name__ == '__main__':
        tf.app.run()
  7. Now run the script, passing the server location and port and the dog photo's filename as parameters.

    python ssd_resnet_client.py --server=localhost:9000 --image 3dogs.jpg

Amazon EI TensorFlow Predictor

The EIPredictor API provides a simple interface for performing repeated inference on a pretrained model. The following code sample shows the available parameters.

ei_predictor = EIPredictor(model_dir,
                           signature_def_key=None,
                           signature_def=None,
                           input_names=None,
                           output_names=None,
                           tags=None,
                           graph=None,
                           config=None,
                           use_ei=True)

output_dict = ei_predictor(feed_dict)

Thus, for a saved model, using EIPredictor is similar to using TensorFlow Predictor. EIPredictor can be used in the following ways:

# The EIPredictor class picks inputs and outputs from the default serving
# signature def with the tag "serve" (similar to TF predictor).
ei_predictor = EIPredictor(model_dir)

# The EIPredictor class picks inputs and outputs from the signature def
# selected with signature_def_key (similar to TF predictor).
ei_predictor = EIPredictor(model_dir, signature_def_key='predict')

# A signature_def can be provided directly (similar to TF predictor).
ei_predictor = EIPredictor(model_dir, signature_def=sig_def)

# You provide the input_names and output_names dicts (similar to TF predictor).
ei_predictor = EIPredictor(model_dir,
                           input_names=input_names,
                           output_names=output_names)

# A tag is used to get the correct signature def (similar to TF predictor).
ei_predictor = EIPredictor(model_dir, tags='serve')

Additional EI Predictor features include:

  • Support for frozen models.

    # For frozen graphs, model_dir takes a file name, and input_names
    # and output_names must be provided in this case.
    ei_predictor = EIPredictor(model_dir,
                               input_names=input_names,
                               output_names=output_names)
  • The ability to disable EI by using the use_ei flag, which defaults to True. This is useful for testing EIPredictor against TensorFlow Predictor (see the sketch after this list).

  • The ability to create an EIPredictor from a TensorFlow Estimator. Given a trained Estimator, you can first export a SavedModel. See the SavedModel documentation for more details. Example usage:

    saved_model_dir = estimator.export_savedmodel(my_export_dir, serving_input_fn)
    ei_predictor = EIPredictor(export_dir=saved_model_dir)

    # Once the EIPredictor is created, inference is done as follows:
    output_dict = ei_predictor(feed_dict)
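
As a minimal sketch of the use_ei flag mentioned above, the following runs the same SavedModel with and without the accelerator and compares one output. The model path and input shape are assumptions reusing the SSD example in this topic; substitute your own.

    import numpy as np
    from tensorflow.contrib.ei.python.predictor.ei_predictor import EIPredictor

    # Hypothetical model path and input; reuse the SSD example's SavedModel.
    model_dir = '/tmp/ssd_resnet50_v1_coco/1/'
    feed_dict = {'inputs': np.zeros((1, 300, 300, 3), dtype=np.uint8)}

    # Run once on the EI accelerator and once locally with EI disabled.
    ei_output = EIPredictor(model_dir, use_ei=True)(feed_dict)
    tf_output = EIPredictor(model_dir, use_ei=False)(feed_dict)

    # The two predictors should report the same number of detections.
    print(int(ei_output['num_detections']), int(tf_output['num_detections']))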

Amazon EI TensorFlow Predictor Example

Installing Amazon EI TensorFlow

EI-enabled TensorFlow comes bundled in the Deep Learning AMI. You can also download pip wheels for Python 2 and 3 from the Amazon EI S3 bucket. Follow these instructions to download and install the pip package:

From the S3 bucket, choose the pip wheel for your Python version and operating system. Copy the path to the pip wheel and run the following command:

curl -O [URL of the pip wheel of your choice]

Install the pip wheel:

pip install [path to downloaded wheel]
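
As a quick sanity check (a minimal sketch, assuming the import path used by the predictor example later in this topic), verify that the wheel installed correctly:

    # Verify the EI-enabled TensorFlow package and the EIPredictor import.
    import tensorflow as tf
    from tensorflow.contrib.ei.python.predictor.ei_predictor import EIPredictor

    print(tf.__version__)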

Try the following example to serve different models, such as ResNet, using a Single Shot Detector (SSD). As a general rule, you need a servable model and client scripts downloaded to your Deep Learning AMI (DLAMI) before proceeding.

To serve and test inference with an SSD model

  1. Download the model. If you already downloaded it in the Serving example, skip this step.

    curl -O https://s3-us-west-2.amazonaws.com/aws-tf-serving-ei-example/ssd_resnet.zip
  2. Unzip the model. Again, you may skip this step if you already have the model.

    unzip ssd_resnet.zip -d /tmp
  3. Download a picture of three dogs to your current directory.

    curl -O https://raw.githubusercontent.com/awslabs/mxnet-model-server/master/docs/images/3dogs.jpg
  4. Open a text editor, such as vim, and paste in the following inference script. Save the file as ssd_resnet_predictor.py.

    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function

    import os
    import sys
    import numpy as np
    import tensorflow as tf
    import matplotlib.image as mpimg
    import time
    from tensorflow.contrib.ei.python.predictor.ei_predictor import EIPredictor

    tf.app.flags.DEFINE_string('image', '', 'path to image in JPEG format')
    FLAGS = tf.app.flags.FLAGS

    if(FLAGS.image == ''):
        print("Supply an Image using '--image [path/to/image]'")
        exit(1)

    coco_classes_txt = "https://raw.githubusercontent.com/amikelive/coco-labels/master/coco-labels-paper.txt"
    local_coco_classes_txt = "/tmp/coco-labels-paper.txt"
    # Downloading coco labels
    os.system("curl -o %s -O %s" % (local_coco_classes_txt, coco_classes_txt))
    # Setting default number of predictions
    NUM_PREDICTIONS = 20
    # Reading coco labels to a list
    with open(local_coco_classes_txt) as f:
        classes = ["No Class"] + [line.strip() for line in f.readlines()]

    def main(_):
        # Reading the test image given by the user
        img = mpimg.imread(FLAGS.image)
        # Setting batch size to 1
        img = np.expand_dims(img, axis=0)
        # Setting up the EIPredictor input
        ssd_resnet_input = {'inputs': img}

        print('Running SSD Resnet on EIPredictor using specified input and outputs')
        # This is the EIPredictor interface, using specified inputs and outputs
        eia_predictor = EIPredictor(
            # Model directory where the saved model is located
            model_dir='/tmp/ssd_resnet50_v1_coco/1/',
            # Specifying the inputs to the Predictor
            input_names={"inputs": "image_tensor:0"},
            # Specifying the output names to tensor for the Predictor
            output_names={"detection_classes": "detection_classes:0",
                          "num_detections": "num_detections:0",
                          "detection_boxes": "detection_boxes:0"},
        )
        pred = None
        # Iterating over the predictions. The first inference request can take several seconds to complete
        for curpred in range(NUM_PREDICTIONS):
            if(curpred == 0):
                print("The first inference request loads the model into the accelerator and can take several seconds to complete. Please standby!")
            # Start the timer
            start = time.time()
            # This is where the inference actually happens
            pred = eia_predictor(ssd_resnet_input)
            print("Inference %d took %f seconds" % (curpred, time.time() - start))

        # Getting the number of objects detected in the input image from the output of the predictor
        num_detections = int(pred["num_detections"])
        print("%d detection[s]" % (num_detections))
        # Getting the class ids from the output
        detection_classes = pred["detection_classes"][0][:num_detections]
        # Mapping the class ids to class names from the coco labels
        print([classes[int(i)] for i in detection_classes])

        print('Running SSD Resnet on EIPredictor using the default Signature Def')
        # This is the EIPredictor interface using the default Signature Def
        eia_predictor = EIPredictor(
            # Model directory where the saved model is located
            model_dir='/tmp/ssd_resnet50_v1_coco/1/',
        )
        # Iterating over the predictions. The first inference request can take several seconds to complete
        for curpred in range(NUM_PREDICTIONS):
            if(curpred == 0):
                print("The first inference request loads the model into the accelerator and can take several seconds to complete. Please standby!")
            # Start the timer
            start = time.time()
            # This is where the inference actually happens
            pred = eia_predictor(ssd_resnet_input)
            print("Inference %d took %f seconds" % (curpred, time.time() - start))

        # Getting the number of objects detected in the input image from the output of the predictor
        num_detections = int(pred["num_detections"])
        print("%d detection[s]" % (num_detections))
        # Getting the class ids from the output
        detection_classes = pred["detection_classes"][0][:num_detections]
        # Mapping the class ids to class names from the coco labels
        print([classes[int(i)] for i in detection_classes])

    if __name__ == "__main__":
        tf.app.run()
  5. Run the inference script.

    python ssd_resnet_predictor.py --image 3dogs.jpg

For more tutorials and examples, see the TensorFlow Python API.

Additional Requirements and Considerations

Supported Model Formats

Amazon EI supports the TensorFlow saved_model format via TensorFlow Serving.
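
If you are starting from your own TensorFlow 1.x session graph, the following is a minimal sketch of exporting it in the saved_model format. The tensor names and export path are illustrative, and the numeric version subdirectory (1) matches the layout the model server expects.

    import tensorflow as tf

    # Illustrative graph; replace with your own model tensors.
    x = tf.placeholder(tf.float32, shape=[None, 2], name='x')
    y = tf.layers.dense(x, 1, name='y')

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Write the saved_model under a numeric version subdirectory.
        tf.saved_model.simple_save(sess, '/tmp/my_model/1',
                                   inputs={'inputs': x},
                                   outputs={'outputs': y})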

OpenSSL Requirement

Amazon EI TensorFlow Serving requires OpenSSL for IAM authentication. OpenSSL is pre-installed in the AWS Deep Learning AMI. If you build your own AMI or Docker container, you must install OpenSSL.

  • Command to install OpenSSL for Ubuntu:

    sudo apt-get install libssl-dev
  • Command to install OpenSSL for Amazon Linux:

    sudo yum install openssl-devel

Warmup

Amazon EI TensorFlow Serving provides a warmup feature to preload models and reduce the delay that is typical of the first inference request. Amazon Elastic Inference TensorFlow Serving only supports warming up the "serving_default" signature definition.
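
Warmup requests follow the standard TensorFlow Serving convention: a TFRecord file of PredictionLog messages placed in the model's assets.extra directory. The following is a minimal sketch assuming the SSD model from the examples above; the paths and input shape are illustrative.

    import numpy as np
    import tensorflow as tf
    from tensorflow_serving.apis import predict_pb2
    from tensorflow_serving.apis import prediction_log_pb2

    # The warmup file lives beside the SavedModel version directory.
    warmup_file = '/tmp/ssd_resnet50_v1_coco/1/assets.extra/tf_serving_warmup_requests'

    # Build one representative request against the serving_default signature.
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'ssdresnet'
    request.model_spec.signature_name = 'serving_default'
    dummy = np.zeros((1, 300, 300, 3), dtype=np.uint8)
    request.inputs['inputs'].CopyFrom(
        tf.contrib.util.make_tensor_proto(dummy, shape=dummy.shape))

    # Record the request as a PredictionLog the server replays at load time.
    log = prediction_log_pb2.PredictionLog(
        predict_log=prediction_log_pb2.PredictLog(request=request))
    with tf.python_io.TFRecordWriter(warmup_file) as writer:
        writer.write(log.SerializeToString())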

Signature Defs

Using multiple signature definitions can have a multiplicative effect on the amount of accelerator memory consumed. If you plan to exercise more than one signature definition for your inference calls, you should test these scenarios when determining the accelerator type for your application.
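
For example, exercising two signature definitions from the same SavedModel loads two sets of tensors, each consuming accelerator memory. This is a sketch with hypothetical signature keys and model path:

    from tensorflow.contrib.ei.python.predictor.ei_predictor import EIPredictor

    # Hypothetical SavedModel exposing two signature defs; each predictor
    # exercises a separate signature and adds to accelerator memory use.
    model_dir = '/tmp/my_multi_signature_model/1/'
    classify_predictor = EIPredictor(model_dir, signature_def_key='classify')
    predict_predictor = EIPredictor(model_dir, signature_def_key='predict')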

For large models, EI tends to have larger memory overhead, which can lead to an out-of-memory error. If you receive this error, try switching to a larger EI accelerator type.