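Using AWS Neuron TensorFlow Serving
This tutorial shows how to construct a graph and add a Neuron compilation step before exporting the saved model for use with TensorFlow Serving. TensorFlow Serving is a serving system that lets you scale up inference across a network. Neuron TensorFlow Serving uses the same API as regular TensorFlow Serving. The only differences are that the saved model must be compiled for AWS Inferentia, and that the entry point is a different binary named tensorflow_model_server_neuron. The binary is located at /usr/local/bin/tensorflow_model_server_neuron and is pre-installed in the DLAMI.
For more information about the Neuron SDK, see the AWS Neuron SDK documentation.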
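Prerequisites
Before using this tutorial, you should have completed the setup steps in Launching a DLAMI Instance with AWS Neuron. You should also be familiar with deep learning and with using the DLAMI.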
Activate the Conda environment
Activate the TensorFlow-Neuron conda environment using the following command:
source activate aws_neuron_tensorflow_p36
To exit the current conda environment, run:
source deactivate
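Before running the scripts in the following sections, it can help to confirm that the expected environment is active. The sketch below relies on the CONDA_DEFAULT_ENV variable that conda sets when an environment is activated. The helper name in_conda_env is hypothetical and is not part of the DLAMI or Neuron tooling.

```python
import os

# Environment name taken from the activation command in this tutorial.
EXPECTED_ENV = "aws_neuron_tensorflow_p36"


def in_conda_env(expected=EXPECTED_ENV, environ=None):
    """Return True if conda reports `expected` as the active environment."""
    # `environ` is injectable so the check can be exercised without a real shell.
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV") == expected


if __name__ == "__main__":
    if in_conda_env():
        print("Active conda environment:", EXPECTED_ENV)
    else:
        print("Expected conda environment is not active:", EXPECTED_ENV)
```

A check like this could run at the top of either script so that a missing activation fails early rather than at import time.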
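Compile and export the saved model
Create a Python script named tensorflow-model-server-compile.py with the following content. The script constructs a graph and compiles it using Neuron. It then exports the compiled graph as a saved model.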
import tensorflow as tf
import tensorflow.neuron
import os

tf.keras.backend.set_learning_phase(0)
model = tf.keras.applications.ResNet50(weights='imagenet')
sess = tf.keras.backend.get_session()
inputs = {'input': model.inputs[0]}
outputs = {'output': model.outputs[0]}

# save the model using tf.saved_model.simple_save
modeldir = "./resnet50/1"
tf.saved_model.simple_save(sess, modeldir, inputs, outputs)

# compile the model for Inferentia
neuron_modeldir = os.path.join(os.path.expanduser('~'), 'resnet50_inf1', '1')
tf.neuron.saved_model.compile(modeldir, neuron_modeldir, batch_size=1)
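The compile script writes the Neuron model to ~/resnet50_inf1/1. The trailing 1 is a version number: TensorFlow Serving looks for numeric subdirectories under the model base path and, by default, serves the highest one. The following is a minimal sketch of that lookup. The helper latest_version_dir is hypothetical; it only mimics the directory convention and is not TensorFlow Serving's own code.

```python
import os


def latest_version_dir(base_path):
    """Return the highest-numbered version directory under base_path, or None."""
    versions = []
    for name in os.listdir(base_path):
        full = os.path.join(base_path, name)
        # Only directories with purely numeric names count as model versions.
        if name.isdecimal() and os.path.isdir(full):
            versions.append((int(name), full))
    if not versions:
        return None
    # Compare numerically so that version 10 ranks above version 2.
    return max(versions)[1]


if __name__ == "__main__":
    base = os.path.join(os.path.expanduser("~"), "resnet50_inf1")
    if os.path.isdir(base):
        print("Version directory expected to be served:", latest_version_dir(base))
    else:
        print("No compiled model found at", base)
```

Writing a recompiled model to a new numbered directory (for example 2) rather than overwriting 1 keeps the earlier version available on disk.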
Compile the model using the following command:
python tensorflow-model-server-compile.py
Your output should look similar to the following:
...
INFO:tensorflow:fusing subgraph neuron_op_d6f098c01c780733 with neuron-cc
INFO:tensorflow:Number of operations in TensorFlow session: 4638
INFO:tensorflow:Number of operations after tf.neuron optimizations: 556
INFO:tensorflow:Number of operations placed on Neuron runtime: 554
INFO:tensorflow:Successfully converted ./resnet50/1 to /home/ubuntu/resnet50_inf1/1
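The operation counts in this output give a quick indication of how much of the graph was compiled for Inferentia: here, 554 of the 556 operations that remain after optimization were placed on the Neuron runtime. The sketch below extracts those counts from log text. The function names and the PATTERNS table are illustrative assumptions, the patterns match only the message wording shown above, and the reading that the remaining operations stay on the TensorFlow side is an assumption as well.

```python
import re

# Sample lines copied from the compiler output shown in this tutorial.
SAMPLE_LOG = """\
INFO:tensorflow:Number of operations in TensorFlow session: 4638
INFO:tensorflow:Number of operations after tf.neuron optimizations: 556
INFO:tensorflow:Number of operations placed on Neuron runtime: 554
"""

PATTERNS = {
    "session_ops": r"Number of operations in TensorFlow session: (\d+)",
    "optimized_ops": r"Number of operations after tf\.neuron optimizations: (\d+)",
    "neuron_ops": r"Number of operations placed on Neuron runtime: (\d+)",
}


def parse_compile_log(text):
    """Extract the operation counts that appear in the log text."""
    counts = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            counts[key] = int(match.group(1))
    return counts


def neuron_fraction(counts):
    """Fraction of post-optimization operations placed on the Neuron runtime."""
    return counts["neuron_ops"] / counts["optimized_ops"]


if __name__ == "__main__":
    counts = parse_compile_log(SAMPLE_LOG)
    print(counts)
    print("Share of operations on Neuron: {:.1%}".format(neuron_fraction(counts)))
```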
Serve the saved model
Once the model has been compiled, you can serve the saved model with the tensorflow_model_server_neuron binary using the following command:
tensorflow_model_server_neuron --model_name=resnet50_inf1 \
    --model_base_path=$HOME/resnet50_inf1/ --port=8500 &
Your output should look similar to the following. The server stages the compiled model in the Inferentia device's DRAM to prepare for inference.
...
2019-11-22 01:20:32.075856: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:311] SavedModel load for tags { serve }; Status: success. Took 40764 microseconds.
2019-11-22 01:20:32.075888: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:105] No warmup data file found at /home/ubuntu/resnet50_inf1/1/assets.extra/tf_serving_warmup_requests
2019-11-22 01:20:32.075950: I tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: resnet50_inf1 version: 1}
2019-11-22 01:20:32.077859: I tensorflow_serving/model_servers/server.cc:353] Running gRPC ModelServer at 0.0.0.0:8500 ...
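Because the server is started in the background, a script that sends requests right away may run before the gRPC port is open. One possible approach, sketched below with only the Python standard library, is to poll the port until a TCP connection succeeds. The helper wait_for_port and its default timeout values are assumptions for illustration. An open port shows only that the server is listening, so treat this as a readiness hint rather than a guarantee.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Connect and close immediately; only reachability matters here.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the server started above, the call would be wait_for_port("localhost", 8500), where 8500 matches the --port flag.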
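Generate inference requests to the model server
Create a Python script named tensorflow-model-server-infer.py with the following content. The script runs inference through gRPC, which is a service framework.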
import numpy as np
import grpc
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
from tensorflow.keras.applications.resnet50 import decode_predictions

if __name__ == '__main__':
    # Open an unencrypted gRPC channel to the local model server
    channel = grpc.insecure_channel('localhost:8500')
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    # Download a sample image and load it at 224x224
    img_file = tf.keras.utils.get_file(
        "./kitten_small.jpg",
        "https://raw.githubusercontent.com/awslabs/mxnet-model-server/master/docs/images/kitten_small.jpg")
    img = image.load_img(img_file, target_size=(224, 224))
    # Add a batch dimension and apply ResNet50 preprocessing
    img_array = preprocess_input(image.img_to_array(img)[None, ...])
    # Build the request; the model name and the 'input' key match the served model
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'resnet50_inf1'
    request.inputs['input'].CopyFrom(
        tf.contrib.util.make_tensor_proto(img_array, shape=img_array.shape))
    result = stub.Predict(request)
    # Convert the response tensor to a NumPy array and decode the top classes
    prediction = tf.make_ndarray(result.outputs['output'])
    print(decode_predictions(prediction))
Run inference on the model through gRPC using the following command:
python tensorflow-model-server-infer.py
Your output should look similar to the following:
[[('n02123045', 'tabby', 0.6918919), ('n02127052', 'lynx', 0.12770271), ('n02123159', 'tiger_cat', 0.08277027), ('n02124075', 'Egyptian_cat', 0.06418919), ('n02128757', 'snow_leopard', 0.009290541)]]
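The result is a list with one entry per input image, and each entry holds (class ID, label, probability) tuples. As a small sketch of how a caller might reduce this to a single answer, the hypothetical helper top_prediction below picks the most probable class. The sample data is copied from the output above, so the example runs without TensorFlow installed.

```python
# Output copied from the sample run in this tutorial: one list per input image,
# each entry a (class_id, label, probability) tuple.
SAMPLE_OUTPUT = [[('n02123045', 'tabby', 0.6918919),
                  ('n02127052', 'lynx', 0.12770271),
                  ('n02123159', 'tiger_cat', 0.08277027),
                  ('n02124075', 'Egyptian_cat', 0.06418919),
                  ('n02128757', 'snow_leopard', 0.009290541)]]


def top_prediction(decoded, image_index=0):
    """Return (label, probability) of the most likely class for one image."""
    # Select by probability rather than relying on the order of the entries.
    class_id, label, probability = max(decoded[image_index], key=lambda e: e[2])
    return label, probability


if __name__ == "__main__":
    label, probability = top_prediction(SAMPLE_OUTPUT)
    print("{}: {:.1%}".format(label, probability))
```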