Encoder Embeddings for Object2Vec - Amazon SageMaker


GPU optimization: Encoder Embeddings

An embedding is a mapping from discrete objects, such as words, to vectors of real numbers.

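Because GPU memory is scarce, you can set the INFERENCE_PREFERRED_MODE environment variable to optimize whether the inference network described in Data Formats for Object2Vec Inference or the encoder embedding inference network is loaded onto the GPU. If most of your inference is for encoder embeddings, specify INFERENCE_PREFERRED_MODE=embedding. The following batch transform example uses 4 ml.p2.xlarge instances and optimizes for encoder embedding inference: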

transformer = o2v.transformer(
    instance_count=4,
    instance_type="ml.p2.xlarge",
    max_concurrent_transforms=2,
    max_payload=1,  # 1 MB
    strategy="MultiRecord",
    env={"INFERENCE_PREFERRED_MODE": "embedding"},  # only useful with GPU
    output_path=output_s3_path,
)

Input: Encoder Embeddings

Content-type: application/json; infer_max_seqlens=<FWD-LENGTH>,<BCK-LENGTH>

Where <FWD-LENGTH> and <BCK-LENGTH> are integers in the range [1,5000] that define the maximum sequence lengths for the forward and backward encoders.

{
  "instances" : [
    {"in0": [6, 17, 606, 19, 53, 67, 52, 12, 5, 10, 15, 10178, 7, 33, 652, 80, 15, 69, 821, 4]},
    {"in0": [22, 1016, 32, 13, 25, 11, 5, 64, 573, 45, 5, 80, 15, 67, 21, 7, 9, 107, 4]},
    {"in0": [774, 14, 21, 206]}
  ]
}
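The request above can be assembled programmatically. The following is a minimal sketch, assuming the token IDs are already available as Python lists; the helper name `build_json_request` and its parameters are hypothetical and not part of any SageMaker API. It only builds the Content-type string and the JSON body in the shape shown above.

```python
import json


def build_json_request(sequences, fwd_len, bck_len, key="in0"):
    """Return (content_type, body) for an encoder-embedding request.

    Hypothetical helper: it mirrors the documented format only.
    """
    # The documented range for both sequence lengths is [1, 5000].
    for n in (fwd_len, bck_len):
        if not 1 <= n <= 5000:
            raise ValueError("infer_max_seqlens values must be in [1, 5000]")
    content_type = f"application/json; infer_max_seqlens={fwd_len},{bck_len}"
    # One {"in0": [...]} (or {"in1": [...]}) object per instance.
    body = json.dumps({"instances": [{key: list(seq)} for seq in sequences]})
    return content_type, body


content_type, body = build_json_request([[774, 14, 21, 206]], fwd_len=100, bck_len=100)
print(content_type)
print(body)
```

The values 100 and 100 are arbitrary illustrations; choose lengths that suit your data.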

Content-type: application/jsonlines; infer_max_seqlens=<FWD-LENGTH>,<BCK-LENGTH>

Where <FWD-LENGTH> and <BCK-LENGTH> are integers in the range [1,5000] that define the maximum sequence lengths for the forward and backward encoders.

{"in0": [6, 17, 606, 19, 53, 67, 52, 12, 5, 10, 15, 10178, 7, 33, 652, 80, 15, 69, 821, 4]}
{"in0": [22, 1016, 32, 13, 25, 11, 5, 64, 573, 45, 5, 80, 15, 67, 21, 7, 9, 107, 4]}
{"in0": [774, 14, 21, 206]}
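In the JSON Lines format, each instance is a separate JSON object on its own line, with no enclosing "instances" array. A minimal sketch of serializing token ID lists this way follows; `to_jsonlines` is a hypothetical helper name, not a library function.

```python
import json


def to_jsonlines(sequences, key="in0"):
    """Serialize each sequence as one JSON object per line (hypothetical helper)."""
    return "\n".join(json.dumps({key: list(seq)}) for seq in sequences)


payload = to_jsonlines([[22, 1016, 32, 13], [774, 14, 21, 206]])
print(payload)
```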

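In both of these formats, you specify only one input type: "in0" or "in1". The inference service then invokes the corresponding encoder and outputs the embeddings for each instance.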
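A client can check this rule before sending a request. The sketch below assumes one reading of it: every instance carries exactly one key, and all instances in a request share that key. How the service itself validates requests is not described here, and `input_type` is a hypothetical helper name.

```python
def input_type(instances):
    """Return "in0" or "in1" if all instances use that single key, else raise.

    Assumption: a request must not mix "in0" and "in1" instances.
    """
    keys = set()
    for inst in instances:
        if len(inst) != 1:
            raise ValueError("each instance must contain exactly one input type")
        keys.update(inst)
    if len(keys) != 1 or not keys <= {"in0", "in1"}:
        raise ValueError('all instances must use the same key, "in0" or "in1"')
    return keys.pop()


print(input_type([{"in0": [774, 14, 21, 206]}, {"in0": [6, 17, 606]}]))
```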

Output: Encoder Embeddings

Content-type: application/json

{
  "predictions": [
    {"embeddings":[0.057368703186511,0.030703511089086,0.099890425801277,0.063688032329082,0.026327300816774,0.003637571120634,0.021305780857801,0.004316598642617,0.0,0.003397724591195,0.0,0.000378780066967,0.0,0.0,0.0,0.007419463712722]},
    {"embeddings":[0.150190666317939,0.05145975202322,0.098204270005226,0.064249359071254,0.056249320507049,0.01513972133398,0.047553978860378,0.0,0.0,0.011533712036907,0.011472506448626,0.010696629062294,0.0,0.0,0.0,0.008508535102009]}
  ]
}

Content-type: application/jsonlines

{"embeddings":[0.057368703186511,0.030703511089086,0.099890425801277,0.063688032329082,0.026327300816774,0.003637571120634,0.021305780857801,0.004316598642617,0.0,0.003397724591195,0.0,0.000378780066967,0.0,0.0,0.0,0.007419463712722]}
{"embeddings":[0.150190666317939,0.05145975202322,0.098204270005226,0.064249359071254,0.056249320507049,0.01513972133398,0.047553978860378,0.0,0.0,0.011533712036907,0.011472506448626,0.010696629062294,0.0,0.0,0.0,0.008508535102009]}
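Both output formats carry the same information: one "embeddings" vector per input instance. A minimal sketch of reading either format into a list of vectors follows, assuming the response body is available as a string; `parse_embeddings` is a hypothetical helper name.

```python
import json


def parse_embeddings(body, content_type="application/json"):
    """Return a list of embedding vectors from a response body (hypothetical helper)."""
    if content_type.startswith("application/jsonlines"):
        # One {"embeddings": [...]} object per non-empty line.
        records = [json.loads(line) for line in body.splitlines() if line.strip()]
    else:
        # A single object with a "predictions" array.
        records = json.loads(body)["predictions"]
    return [record["embeddings"] for record in records]


json_body = '{"predictions": [{"embeddings": [0.1, 0.0]}, {"embeddings": [0.2, 0.3]}]}'
jsonl_body = '{"embeddings": [0.1, 0.0]}\n{"embeddings": [0.2, 0.3]}'
print(parse_embeddings(json_body))
print(parse_embeddings(jsonl_body, "application/jsonlines"))
```

The short two-element vectors are made-up placeholders that keep the example readable.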

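The vector length of the embeddings output by the inference service is equal to the value of one of the following hyperparameters that you specify at training time: enc0_token_embedding_dim, enc1_token_embedding_dim, or enc_dim.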
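Because every returned vector should have the same length, a simple client-side sanity check is possible. The sketch below assumes you know which hyperparameter value applies to your model and pass it as `expected_dim`; both that name and `check_embedding_dim` are hypothetical.

```python
def check_embedding_dim(embeddings, expected_dim):
    """Raise ValueError if any vector's length differs from expected_dim."""
    bad = [i for i, vec in enumerate(embeddings) if len(vec) != expected_dim]
    if bad:
        raise ValueError(f"instances {bad} do not have length {expected_dim}")
    return True


# Placeholder vectors; the sample outputs shown earlier have 16 elements each.
check_embedding_dim([[0.0] * 16, [0.5] * 16], expected_dim=16)
```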