
Getting started (SDK for Java 2.x)

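This tutorial shows you how to use Amazon Personalize with the AWS SDK for Java 2.x.

To avoid incurring unnecessary charges, when you finish the getting started exercise, delete the resources that you created in this tutorial by following Cleaning up resources.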

Prerequisites

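Complete the prerequisite steps before you start this tutorial.

After you complete the prerequisites, add the Amazon Personalize dependencies to your pom.xml file and import the Amazon Personalize packages.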

  1. Add the following dependencies to your pom.xml file. The latest version numbers might differ from the ones in this example.

    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>personalize</artifactId>
      <version>2.16.83</version>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>personalizeruntime</artifactId>
      <version>2.16.83</version>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>personalizeevents</artifactId>
      <version>2.16.83</version>
    </dependency>
  2. Add the following import statements to your project.

    // import client packages
    import software.amazon.awssdk.services.personalize.PersonalizeClient;
    import software.amazon.awssdk.services.personalizeruntime.PersonalizeRuntimeClient;
    // The PersonalizeEventsClient is optional. Import it if you are going to add
    // interactions to the Interactions dataset in real time.
    import software.amazon.awssdk.services.personalizeevents.PersonalizeEventsClient;
    // Region and exception packages
    import software.amazon.awssdk.regions.Region;
    import software.amazon.awssdk.awscore.exception.AwsServiceException;
    import software.amazon.awssdk.services.personalize.model.PersonalizeException;
    // schema packages
    import software.amazon.awssdk.services.personalize.model.CreateSchemaRequest;
    // dataset group packages
    import software.amazon.awssdk.services.personalize.model.CreateDatasetGroupRequest;
    import software.amazon.awssdk.services.personalize.model.DescribeDatasetGroupRequest;
    // dataset packages
    import software.amazon.awssdk.services.personalize.model.CreateDatasetRequest;
    // dataset import job packages
    import software.amazon.awssdk.services.personalize.model.CreateDatasetImportJobRequest;
    import software.amazon.awssdk.services.personalize.model.DataSource;
    import software.amazon.awssdk.services.personalize.model.DatasetImportJob;
    import software.amazon.awssdk.services.personalize.model.DescribeDatasetImportJobRequest;
    // solution packages
    import software.amazon.awssdk.services.personalize.model.CreateSolutionRequest;
    import software.amazon.awssdk.services.personalize.model.CreateSolutionResponse;
    // solution version packages
    import software.amazon.awssdk.services.personalize.model.DescribeSolutionRequest;
    import software.amazon.awssdk.services.personalize.model.CreateSolutionVersionRequest;
    import software.amazon.awssdk.services.personalize.model.CreateSolutionVersionResponse;
    import software.amazon.awssdk.services.personalize.model.DescribeSolutionVersionRequest;
    // campaign packages
    import software.amazon.awssdk.services.personalize.model.CreateCampaignRequest;
    import software.amazon.awssdk.services.personalize.model.CreateCampaignResponse;
    // get recommendations packages
    import software.amazon.awssdk.services.personalizeruntime.model.GetRecommendationsRequest;
    import software.amazon.awssdk.services.personalizeruntime.model.GetRecommendationsResponse;
    import software.amazon.awssdk.services.personalizeruntime.model.PredictedItem;
    // Java utility packages used by the methods in this tutorial
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.time.Instant;
    import java.util.List;

After you add the Amazon Personalize dependencies to your pom.xml file and import the required packages, instantiate the following Amazon Personalize clients:

    // region is a software.amazon.awssdk.regions.Region value for the Region you use,
    // such as Region.US_WEST_2 (an example only).
    PersonalizeClient personalizeClient = PersonalizeClient.builder()
            .region(region)
            .build();

    PersonalizeRuntimeClient personalizeRuntimeClient = PersonalizeRuntimeClient.builder()
            .region(region)
            .build();

    // A PersonalizeEventsClient is optional for this tutorial. Use this client if you
    // want to add new interactions to the Interactions dataset in real time.
    PersonalizeEventsClient personalizeEventsClient = PersonalizeEventsClient.builder()
            .region(region)
            .build();

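After you initialize your Amazon Personalize clients, import the historical data that you created when you completed the getting started prerequisites. To import historical data into Amazon Personalize, do the following: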

  1. Save the following Avro schema as a JSON file in your working directory. This schema matches the data that you created when you completed the getting started prerequisites.

    {
      "type": "record",
      "name": "Interactions",
      "namespace": "com.amazonaws.personalize.schema",
      "fields": [
        {
          "name": "USER_ID",
          "type": "string"
        },
        {
          "name": "ITEM_ID",
          "type": "string"
        },
        {
          "name": "TIMESTAMP",
          "type": "long"
        }
      ],
      "version": "1.0"
    }
  2. Use the following createSchema method to create a schema in Amazon Personalize. Pass the following as parameters: an Amazon Personalize service client, the name for your schema, and the file path of the schema JSON file that you created in the previous step. The method returns the Amazon Resource Name (ARN) of your new schema. Store it for later use.

    // Requires java.io.IOException, java.nio.file.Files, and java.nio.file.Paths.
    public static String createSchema(PersonalizeClient personalizeClient, String schemaName, String filePath) {

        String schema;
        try {
            // Read the schema JSON file into a string.
            schema = new String(Files.readAllBytes(Paths.get(filePath)));
        } catch (IOException e) {
            // Without the file contents there is no schema to create.
            System.err.println(e.getMessage());
            return "";
        }

        try {
            CreateSchemaRequest createSchemaRequest = CreateSchemaRequest.builder()
                    .name(schemaName)
                    .schema(schema)
                    .build();

            String schemaArn = personalizeClient.createSchema(createSchemaRequest).schemaArn();
            System.out.println("Schema arn: " + schemaArn);
            return schemaArn;

        } catch (PersonalizeException e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
        return "";
    }
  3. Create a dataset group. Use the following createDatasetGroup method to create a dataset group. Pass the following as parameters: an Amazon Personalize service client and the name for the dataset group. The method returns the ARN of your new dataset group. Store it for later use.

    public static String createDatasetGroup(PersonalizeClient personalizeClient, String datasetGroupName) {

        try {
            CreateDatasetGroupRequest createDatasetGroupRequest = CreateDatasetGroupRequest.builder()
                    .name(datasetGroupName)
                    .build();
            return personalizeClient.createDatasetGroup(createDatasetGroupRequest).datasetGroupArn();
        } catch (PersonalizeException e) {
            System.out.println(e.awsErrorDetails().errorMessage());
        }
        // An empty string signals that the dataset group was not created.
        return "";
    }
  4. Create an Interactions dataset. Use the following createDataset method to create an Interactions dataset. Pass the following as parameters: an Amazon Personalize service client, the name for your dataset, your dataset group's ARN, Interactions for the dataset type, and your schema's ARN. The method returns the ARN of your new dataset. Store it for later use.

    public static String createDataset(PersonalizeClient personalizeClient,
                                       String datasetName,
                                       String datasetGroupArn,
                                       String datasetType,
                                       String schemaArn) {
        try {
            CreateDatasetRequest request = CreateDatasetRequest.builder()
                    .name(datasetName)
                    .datasetGroupArn(datasetGroupArn)
                    .datasetType(datasetType)
                    .schemaArn(schemaArn)
                    .build();

            String datasetArn = personalizeClient.createDataset(request)
                    .datasetArn();
            System.out.println("Dataset " + datasetName + " created.");
            return datasetArn;

        } catch (PersonalizeException e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
        return "";
    }
  5. Import your data with a dataset import job. Use the following createPersonalizeDatasetImportJob method to create a dataset import job.

    Pass the following as parameters: an Amazon Personalize service client, a name for the job, your Interactions dataset's ARN, the Amazon S3 bucket path (s3://bucket name/folder name/ratings.csv) where you stored the training data, and your service role's ARN (you created this role as part of the getting started prerequisites). The method polls the job once a minute, for up to three hours, until its status is ACTIVE or CREATE FAILED, and then returns the ARN of your dataset import job. Optionally store it for later use.

    public static String createPersonalizeDatasetImportJob(PersonalizeClient personalizeClient,
                                                           String jobName,
                                                           String datasetArn,
                                                           String s3BucketPath,
                                                           String roleArn) {

        long waitInMilliseconds = 60 * 1000; // 60 seconds between status checks
        String status;
        String datasetImportJobArn;

        try {
            DataSource importDataSource = DataSource.builder()
                    .dataLocation(s3BucketPath)
                    .build();

            CreateDatasetImportJobRequest createDatasetImportJobRequest = CreateDatasetImportJobRequest.builder()
                    .datasetArn(datasetArn)
                    .dataSource(importDataSource)
                    .jobName(jobName)
                    .roleArn(roleArn)
                    .build();

            datasetImportJobArn = personalizeClient.createDatasetImportJob(createDatasetImportJobRequest)
                    .datasetImportJobArn();

            DescribeDatasetImportJobRequest describeDatasetImportJobRequest = DescribeDatasetImportJobRequest.builder()
                    .datasetImportJobArn(datasetImportJobArn)
                    .build();

            // Stop polling after 3 hours.
            long maxTime = Instant.now().getEpochSecond() + 3 * 60 * 60;

            while (Instant.now().getEpochSecond() < maxTime) {

                DatasetImportJob datasetImportJob = personalizeClient
                        .describeDatasetImportJob(describeDatasetImportJobRequest)
                        .datasetImportJob();

                status = datasetImportJob.status();
                System.out.println("Dataset import job status: " + status);

                if (status.equals("ACTIVE") || status.equals("CREATE FAILED")) {
                    break;
                }
                try {
                    Thread.sleep(waitInMilliseconds);
                } catch (InterruptedException e) {
                    System.out.println(e.getMessage());
                }
            }
            return datasetImportJobArn;

        } catch (PersonalizeException e) {
            System.out.println(e.awsErrorDetails().errorMessage());
        }
        return "";
    }
  6. (Optional) Add an event tracker and record events. For more information, see Recording events.
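
Because the import job reads the CSV file against the schema from step 1, it can help to confirm locally that the file's header row contains the schema's field names before you upload it. The following is a minimal sketch using only the Java standard library. The class name InteractionsCsvCheck and the method missingColumns are hypothetical, and the sketch assumes that column names must match the schema field names exactly (case-sensitive), which you should verify against the service documentation.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the AWS SDK.
class InteractionsCsvCheck {

    // Field names taken from the Interactions schema shown in step 1.
    static final List<String> REQUIRED_COLUMNS = List.of("USER_ID", "ITEM_ID", "TIMESTAMP");

    // Returns the required columns that do not appear in the CSV header line.
    static Set<String> missingColumns(String headerLine) {
        Set<String> present = new LinkedHashSet<>();
        for (String column : headerLine.split(",")) {
            present.add(column.trim());
        }
        Set<String> missing = new LinkedHashSet<>(REQUIRED_COLUMNS);
        missing.removeAll(present);
        return missing;
    }

    public static void main(String[] args) {
        // Sample header lines for illustration; pass the first line of your own file instead.
        System.out.println(missingColumns("USER_ID,ITEM_ID,TIMESTAMP")); // prints []
        System.out.println(missingColumns("USER_ID, ITEM_ID"));          // prints [TIMESTAMP]
    }
}
```

An empty result means every schema field has a matching column. The check looks only at column names, not at the data types of the values.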

For more information about importing data into Amazon Personalize, see Preparing and importing data.

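After you import your data, create a solution and a solution version as follows. The solution contains the configuration for training a model, and the solution version is the trained model.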

  1. Create a solution with the following createPersonalizeSolution method. Pass the following as parameters: an Amazon Personalize service client, your dataset group's Amazon Resource Name (ARN), a name for the solution, and the ARN for the User-Personalization recipe (arn:aws:personalize:::recipe/aws-user-personalization). The method returns the ARN of your new solution. Store it for later use.

    public static String createPersonalizeSolution(PersonalizeClient personalizeClient,
                                                   String datasetGroupArn,
                                                   String solutionName,
                                                   String recipeArn) {

        try {
            CreateSolutionRequest solutionRequest = CreateSolutionRequest.builder()
                    .name(solutionName)
                    .datasetGroupArn(datasetGroupArn)
                    .recipeArn(recipeArn)
                    .build();

            CreateSolutionResponse solutionResponse = personalizeClient.createSolution(solutionRequest);
            return solutionResponse.solutionArn();

        } catch (PersonalizeException e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
        return "";
    }
  2. Create a solution version with the following createPersonalizeSolutionVersion method. Pass as a parameter the ARN of the solution from the previous step. The following code first checks that your solution is ready and then creates a solution version. During training, the code uses the DescribeSolutionVersion operation to retrieve the solution version's status. When training is complete, the method returns the ARN of your new solution version. Store it for later use.

    public static String createPersonalizeSolutionVersion(PersonalizeClient personalizeClient, String solutionArn) {

        long maxTime = 0;
        long waitInMilliseconds = 30 * 1000; // 30 seconds
        String solutionStatus = "";
        String solutionVersionStatus = "";
        String solutionVersionArn = "";

        try {
            DescribeSolutionRequest describeSolutionRequest = DescribeSolutionRequest.builder()
                    .solutionArn(solutionArn)
                    .build();

            // Both polling loops share this 3-hour deadline.
            maxTime = Instant.now().getEpochSecond() + 3 * 60 * 60;

            // Wait until solution is active.
            while (Instant.now().getEpochSecond() < maxTime) {

                solutionStatus = personalizeClient.describeSolution(describeSolutionRequest).solution().status();
                System.out.println("Solution status: " + solutionStatus);

                if (solutionStatus.equals("ACTIVE") || solutionStatus.equals("CREATE FAILED")) {
                    break;
                }
                try {
                    Thread.sleep(waitInMilliseconds);
                } catch (InterruptedException e) {
                    System.out.println(e.getMessage());
                }
            }

            if (solutionStatus.equals("ACTIVE")) {

                CreateSolutionVersionRequest createSolutionVersionRequest = CreateSolutionVersionRequest.builder()
                        .solutionArn(solutionArn)
                        .build();

                CreateSolutionVersionResponse createSolutionVersionResponse =
                        personalizeClient.createSolutionVersion(createSolutionVersionRequest);
                solutionVersionArn = createSolutionVersionResponse.solutionVersionArn();
                System.out.println("Solution version ARN: " + solutionVersionArn);

                DescribeSolutionVersionRequest describeSolutionVersionRequest = DescribeSolutionVersionRequest.builder()
                        .solutionVersionArn(solutionVersionArn)
                        .build();

                // Wait until training finishes, fails, or the deadline passes.
                while (Instant.now().getEpochSecond() < maxTime) {

                    solutionVersionStatus = personalizeClient.describeSolutionVersion(describeSolutionVersionRequest)
                            .solutionVersion().status();
                    System.out.println("Solution version status: " + solutionVersionStatus);

                    if (solutionVersionStatus.equals("ACTIVE") || solutionVersionStatus.equals("CREATE FAILED")) {
                        break;
                    }
                    try {
                        Thread.sleep(waitInMilliseconds);
                    } catch (InterruptedException e) {
                        System.out.println(e.getMessage());
                    }
                }
                return solutionVersionArn;
            }
        } catch (PersonalizeException e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
        return "";
    }
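
The dataset import job and solution version methods repeat the same pattern: call a describe operation, check for ACTIVE or CREATE FAILED, sleep, and give up after a deadline. The following sketch shows one way that pattern could be factored into a reusable helper. It is an illustration only, not part of the SDK: StatusPoller and waitForTerminalStatus are hypothetical names, and the status source is a plain Supplier so that the example runs without an AWS account.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper, not part of the AWS SDK.
class StatusPoller {

    // The two statuses that end the polling loops in this tutorial.
    static final Set<String> TERMINAL = Set.of("ACTIVE", "CREATE FAILED");

    // Polls statusSupplier until it reports a terminal status or the timeout elapses.
    // Returns the last status seen, which is non-terminal if the wait timed out
    // or the thread was interrupted.
    static String waitForTerminalStatus(Supplier<String> statusSupplier,
                                        Duration interval,
                                        Duration timeout) {
        Instant deadline = Instant.now().plus(timeout);
        String status = statusSupplier.get();
        while (!TERMINAL.contains(status) && Instant.now().isBefore(deadline)) {
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag so callers can see the interruption.
                Thread.currentThread().interrupt();
                return status;
            }
            status = statusSupplier.get();
        }
        return status;
    }

    public static void main(String[] args) {
        // Stand-in for a describe call: reports ACTIVE on the third poll.
        int[] calls = {0};
        Supplier<String> fakeDescribe = () -> ++calls[0] < 3 ? "CREATE IN_PROGRESS" : "ACTIVE";
        String result = waitForTerminalStatus(fakeDescribe, Duration.ofMillis(10), Duration.ofSeconds(5));
        System.out.println("Final status: " + result + " after " + calls[0] + " polls");
    }
}
```

With a real client, the supplier could wrap the call used in the method above, for example () -> personalizeClient.describeSolutionVersion(describeSolutionVersionRequest).solutionVersion().status(). Returning the last status, rather than nothing, lets the caller distinguish success, failure, and timeout.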

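For more information, see Creating a solution. After you create a solution version, you can evaluate its performance before continuing. For more information, see Step 4: Evaluating a solution version with metrics.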

After you train and evaluate your solution version, deploy it with an Amazon Personalize campaign. Use the following createPersonalCampaign method to deploy a solution version. Pass the following as parameters: an Amazon Personalize service client, the Amazon Resource Name (ARN) of the solution version that you created in the previous step, and a name for the campaign. The method returns the ARN of your new campaign. Store it for later use.

    public static String createPersonalCampaign(PersonalizeClient personalizeClient,
                                                String solutionVersionArn,
                                                String name) {

        try {
            CreateCampaignRequest createCampaignRequest = CreateCampaignRequest.builder()
                    .minProvisionedTPS(1)
                    .solutionVersionArn(solutionVersionArn)
                    .name(name)
                    .build();

            CreateCampaignResponse campaignResponse = personalizeClient.createCampaign(createCampaignRequest);
            System.out.println("The campaign ARN is " + campaignResponse.campaignArn());
            return campaignResponse.campaignArn();

        } catch (PersonalizeException e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
        // Required by the compiler; not reached after System.exit.
        return "";
    }

For more information about Amazon Personalize campaigns, see Creating a campaign.

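After you create a campaign, you can use it to get recommendations. Use the following getRecs method to get recommendations for a user. Pass the following as parameters: an Amazon Personalize runtime client, the Amazon Resource Name (ARN) of the campaign that you created in the previous step, and a user ID (for example, 123) from the historical data that you imported. The method prints the list of recommended items to the screen.

    // Requires java.util.List and software.amazon.awssdk.awscore.exception.AwsServiceException.
    public static void getRecs(PersonalizeRuntimeClient personalizeRuntimeClient, String campaignArn, String userId) {

        try {
            GetRecommendationsRequest recommendationsRequest = GetRecommendationsRequest.builder()
                    .campaignArn(campaignArn)
                    .numResults(20)
                    .userId(userId)
                    .build();

            GetRecommendationsResponse recommendationsResponse =
                    personalizeRuntimeClient.getRecommendations(recommendationsRequest);
            List<PredictedItem> items = recommendationsResponse.itemList();

            // Print each recommended item with its score.
            for (PredictedItem item : items) {
                System.out.println("Item Id is : " + item.itemId());
                System.out.println("Item score is : " + item.score());
            }
        } catch (AwsServiceException e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
    }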
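
In an application you would usually do more with the item IDs and scores than print them. As a rough sketch of one possible post-processing step, the following code keeps only the items at or above a score threshold and returns their IDs, highest score first. The ScoredItem class is a hypothetical stand-in for PredictedItem so that the example runs without the SDK, and the item IDs, scores, and threshold are made-up values.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the AWS SDK.
class RecommendationFilter {

    // Stand-in for PredictedItem, holding only the two values that getRecs prints.
    static final class ScoredItem {
        final String itemId;
        final double score;

        ScoredItem(String itemId, double score) {
            this.itemId = itemId;
            this.score = score;
        }
    }

    // Returns the IDs of up to `limit` items whose score is at least `minScore`,
    // ordered from highest to lowest score.
    static List<String> topItemIds(List<ScoredItem> items, double minScore, int limit) {
        return items.stream()
                .filter(item -> item.score >= minScore)
                .sorted(Comparator.comparingDouble((ScoredItem item) -> item.score).reversed())
                .limit(limit)
                .map(item -> item.itemId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ScoredItem> items = List.of(
                new ScoredItem("a", 0.10),
                new ScoredItem("b", 0.45),
                new ScoredItem("c", 0.30));
        System.out.println(topItemIds(items, 0.2, 2)); // prints [b, c]
    }
}
```

To apply the same idea to a real response, you could map each PredictedItem to its itemId() and score() values first. Whether a fixed score threshold makes sense depends on the recipe, so treat the cutoff here as a placeholder.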

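Complete Amazon Personalize project

For an all-in-one project that shows you how to complete the Amazon Personalize workflow with the SDK for Java 2.x, see Creating an Amazon Personalize application in the AWS SDK examples repository. The project includes training multiple solution versions with different recipes and recording events with the PutEvents operation.

For additional examples, see the Personalize folder in the AWS SDK examples repository.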