Creating a batch inference job (Amazon SDK) - Amazon Personalize


Creating a batch inference job (Amazon SDK)

When you are finished preparing your input data for batch recommendations, you are ready to create a batch inference job with the CreateBatchInferenceJob operation.

Creating a batch inference job

Use the following code to create a batch inference job. Specify a job name, the Amazon Resource Name (ARN) of your solution version, and the ARN of the IAM service role that you created for Amazon Personalize during setup. This role must have read and write access to your input and output Amazon S3 buckets.

We recommend using a different location for your output data (either a separate folder or a different Amazon S3 bucket). Use the following syntax for the input and output locations:

s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.json
s3://<name of your S3 bucket>/<output folder name>/

For numResults, specify the number of items that you want Amazon Personalize to predict for each line of input data. Optionally, provide a filter ARN to filter recommendations. If your filter uses placeholder parameters, make sure the values for those parameters are included in your input JSON. For more information, see Filtering batch recommendations and user segments (custom resources).
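As a sketch of what that input might look like, the helper below writes a JSON Lines input file for a user-based recipe (one JSON object with a userId per line). The optional filterValues key, and any parameter names you would put inside it, are assumptions shown here only to illustrate filter placeholder parameters.

```python
import json

def write_batch_input(user_ids, path, filter_values=None):
    """Write one JSON object per line, the shape Amazon Personalize batch input expects.

    filter_values (optional, illustrative assumption): a dict of filter
    placeholder parameter names to values, attached to every input line.
    """
    with open(path, "w") as f:
        for user_id in user_ids:
            line = {"userId": str(user_id)}
            if filter_values:
                line["filterValues"] = filter_values
            f.write(json.dumps(line) + "\n")

# Write a three-user input file with no filter parameters.
write_batch_input(["4638", "663", "3384"], "batch_input.json")
```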

SDK for Python (Boto3)

This example includes the optional User-Personalization recipe specific itemExplorationConfig hyperparameters: explorationWeight and explorationItemAgeCutOff. Optionally include explorationWeight and explorationItemAgeCutOff values to configure exploration. For more information, see User-Personalization recipe.

import boto3

personalize_rec = boto3.client(service_name='personalize')

personalize_rec.create_batch_inference_job(
    solutionVersionArn = "Solution version ARN",
    jobName = "Batch job name",
    roleArn = "IAM service role ARN",
    filterArn = "Filter ARN",
    batchInferenceJobConfig = {
        # optional USER_PERSONALIZATION recipe hyperparameters
        "itemExplorationConfig": {
            "explorationWeight": "0.3",
            "explorationItemAgeCutOff": "30"
        }
    },
    jobInput = {"s3DataSource": {"path": "s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.json"}},
    jobOutput = {"s3DataDestination": {"path": "s3://<name of your S3 bucket>/<output folder name>/"}}
)
SDK for Java 2.x

This example includes the optional User-Personalization recipe specific itemExplorationConfig fields: explorationWeight and explorationItemAgeCutOff. Optionally include explorationWeight and explorationItemAgeCutOff values to configure exploration. For more information, see User-Personalization recipe.

public static String createPersonalizeBatchInferenceJob(PersonalizeClient personalizeClient,
                                                        String solutionVersionArn,
                                                        String jobName,
                                                        String filterArn,
                                                        String s3InputDataSourcePath,
                                                        String s3DataDestinationPath,
                                                        String roleArn,
                                                        String explorationWeight,
                                                        String explorationItemAgeCutOff) {

    long waitInMilliseconds = 60 * 1000;
    String status;
    String batchInferenceJobArn;

    try {
        // Set up data input and output parameters.
        S3DataConfig inputSource = S3DataConfig.builder()
            .path(s3InputDataSourcePath)
            .build();
        S3DataConfig outputDestination = S3DataConfig.builder()
            .path(s3DataDestinationPath)
            .build();

        BatchInferenceJobInput jobInput = BatchInferenceJobInput.builder()
            .s3DataSource(inputSource)
            .build();
        BatchInferenceJobOutput jobOutputLocation = BatchInferenceJobOutput.builder()
            .s3DataDestination(outputDestination)
            .build();

        // Optional code to build the User-Personalization specific item exploration config.
        HashMap<String, String> explorationConfig = new HashMap<>();
        explorationConfig.put("explorationWeight", explorationWeight);
        explorationConfig.put("explorationItemAgeCutOff", explorationItemAgeCutOff);

        BatchInferenceJobConfig jobConfig = BatchInferenceJobConfig.builder()
            .itemExplorationConfig(explorationConfig)
            .build();
        // End optional User-Personalization recipe specific code.

        CreateBatchInferenceJobRequest createBatchInferenceJobRequest = CreateBatchInferenceJobRequest.builder()
            .solutionVersionArn(solutionVersionArn)
            .jobInput(jobInput)
            .jobOutput(jobOutputLocation)
            .jobName(jobName)
            .filterArn(filterArn)
            .roleArn(roleArn)
            .batchInferenceJobConfig(jobConfig) // Optional
            .build();

        batchInferenceJobArn = personalizeClient.createBatchInferenceJob(createBatchInferenceJobRequest)
            .batchInferenceJobArn();

        DescribeBatchInferenceJobRequest describeBatchInferenceJobRequest = DescribeBatchInferenceJobRequest.builder()
            .batchInferenceJobArn(batchInferenceJobArn)
            .build();

        long maxTime = Instant.now().getEpochSecond() + 3 * 60 * 60;

        // Wait until the batch inference job is complete.
        while (Instant.now().getEpochSecond() < maxTime) {
            BatchInferenceJob batchInferenceJob = personalizeClient
                .describeBatchInferenceJob(describeBatchInferenceJobRequest)
                .batchInferenceJob();
            status = batchInferenceJob.status();
            System.out.println("Batch inference job status: " + status);

            if (status.equals("ACTIVE") || status.equals("CREATE FAILED")) {
                break;
            }
            try {
                Thread.sleep(waitInMilliseconds);
            } catch (InterruptedException e) {
                System.out.println(e.getMessage());
            }
        }
        return batchInferenceJobArn;
    } catch (PersonalizeException e) {
        System.out.println(e.awsErrorDetails().errorMessage());
    }
    return "";
}
SDK for JavaScript v3
// Get service clients module and commands using ES6 syntax.
import { CreateBatchInferenceJobCommand } from "@aws-sdk/client-personalize";
import { personalizeClient } from "./libs/personalizeClients.js";
// Or, create the client here:
// const personalizeClient = new PersonalizeClient({ region: "REGION" });

// Set the batch inference job's parameters.
export const createBatchInferenceJobParam = {
  jobName: 'JOB_NAME',
  jobInput: {           /* required */
    s3DataSource: {     /* required */
      path: 'INPUT_PATH' /* required */
      // kmsKeyArn: 'INPUT_KMS_KEY_ARN' /* optional */
    }
  },
  jobOutput: {             /* required */
    s3DataDestination: {   /* required */
      path: 'OUTPUT_PATH'  /* required */
      // kmsKeyArn: 'OUTPUT_KMS_KEY_ARN' /* optional */
    }
  },
  roleArn: 'ROLE_ARN', /* required */
  solutionVersionArn: 'SOLUTION_VERSION_ARN', /* required */
  numResults: 20 /* optional integer */
};

export const run = async () => {
  try {
    const response = await personalizeClient.send(
      new CreateBatchInferenceJobCommand(createBatchInferenceJobParam)
    );
    console.log("Success", response);
    return response; // For unit tests.
  } catch (err) {
    console.log("Error", err);
  }
};
run();

The batch job can take a while to complete. You can check the job's status by calling DescribeBatchInferenceJob and passing the batchInferenceJobArn as an input parameter. You can also list all of the Amazon Personalize batch inference jobs in your Amazon environment by calling ListBatchInferenceJobs.
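In Python, a status-polling helper along the lines of the Java example above might look like the following sketch. It checks for the same terminal statuses the Java code does ("ACTIVE" or "CREATE FAILED"); the ARN and client setup in the usage comment are placeholders.

```python
import time

def wait_for_batch_inference_job(personalize, batch_inference_job_arn,
                                 poll_seconds=60, max_polls=180):
    """Poll DescribeBatchInferenceJob until the job finishes; return the final status."""
    for _ in range(max_polls):
        job = personalize.describe_batch_inference_job(
            batchInferenceJobArn=batch_inference_job_arn
        )["batchInferenceJob"]
        status = job["status"]
        print("Batch inference job status: " + status)
        if status in ("ACTIVE", "CREATE FAILED"):
            return status
        time.sleep(poll_seconds)
    return "TIMED_OUT"

# Usage (requires AWS credentials and a real job ARN):
# import boto3
# personalize = boto3.client("personalize")
# wait_for_batch_inference_job(personalize, "Batch inference job ARN")
```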

Creating a batch inference job that generates themes

To generate themes for similar items, you must use the Similar-Items recipe, and your Items dataset must have a textual field and a column of item name data. For more information about recommendations with themes, see Batch recommendations with themes from Content Generator.

The following code creates a batch inference job that generates recommendations with themes. Leave batchInferenceJobMode set to "THEME_GENERATION". Replace COLUMN_NAME with the name of the column that stores your item name data.

import boto3

personalize_rec = boto3.client(service_name='personalize')

personalize_rec.create_batch_inference_job(
    solutionVersionArn = "Solution version ARN",
    jobName = "Batch job name",
    roleArn = "IAM service role ARN",
    filterArn = "Filter ARN",
    batchInferenceJobMode = "THEME_GENERATION",
    themeGenerationConfig = {
        "fieldsForThemeGeneration": {
            "itemName": "COLUMN_NAME"
        }
    },
    jobInput = {"s3DataSource": {"path": "s3://<name of your S3 bucket>/<folder name>/<input JSON file name>.json"}},
    jobOutput = {"s3DataDestination": {"path": "s3://<name of your S3 bucket>/<output folder name>/"}}
)
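When the job completes, Amazon Personalize writes one JSON object per input line to your output location. The following sketch reads such a file after it has been downloaded locally; the file name and the exact output keys shown (recommendedItems, theme) are assumptions, and the parser reads them defensively since the keys present depend on the recipe and job mode.

```python
import json

def read_batch_output(path):
    """Parse a downloaded batch inference output file (one JSON object per line)."""
    results = []
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            record = json.loads(line)
            output = record.get("output", {})
            results.append({
                "input": record.get("input"),
                "recommendedItems": output.get("recommendedItems", []),
                # Assumed present only for theme generation jobs.
                "theme": output.get("theme"),
            })
    return results
```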