Getting started with a Domain dataset group (SDK for JavaScript v3) - Amazon Personalize

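This tutorial shows you how to use the Amazon SDK for JavaScript v3 to create a Domain dataset group for the VIDEO_ON_DEMAND domain. In this tutorial, you create a recommender for the Top picks for you use case.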

To view the code used in this tutorial on GitHub, see the Amazon Personalize code examples for SDK for JavaScript v3 in the Amazon SDK Code Examples repository.

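To avoid incurring unnecessary charges, delete the resources that you created in this tutorial when you finish this getting started exercise. For more information, see Cleaning up resources.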

Prerequisites

The following are the prerequisite steps for completing this tutorial:

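  • Complete the Getting started prerequisites to set up the required permissions and create the training data. If you also completed Getting started with a Domain dataset group (console), you can reuse the same source data. If you use your own source data, make sure that it is formatted as described in the prerequisites.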

  • Set up the SDK for JavaScript and your Amazon credentials as described in the Setting up the SDK for JavaScript procedure in the Amazon SDK for JavaScript Developer Guide.

Tutorial

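In the following steps, you install the required dependencies. Then you create a dataset group, import data, create a recommender for the Top picks for you use case, and get recommendations.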

If you use Node.js, you can run each code example by saving it as a JavaScript file and then running node <fileName.js>.

After you complete the prerequisites, install the following Amazon Personalize dependencies:

  • @aws-sdk/client-personalize

  • @aws-sdk/client-personalize-runtime

  • @aws-sdk/client-personalize-events (optional for this tutorial, but required if you want to record events after you create your recommender)

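The following is an example of a package.json file that you can use. To install the dependencies with Node.js, navigate to the directory where you saved the package.json file and run npm install.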

{
  "name": "personalize-js-project",
  "version": "1.0.0",
  "description": "personalize operations",
  "type": "module",
  "author": "Author Name <email@address.com>",
  "license": "ISC",
  "dependencies": {
    "@aws-sdk/client-personalize": "^3.350.0",
    "@aws-sdk/client-personalize-events": "^3.350.0",
    "@aws-sdk/client-personalize-runtime": "^3.350.0",
    "fs": "^0.0.1-security"
  },
  "compilerOptions": {
    "resolveJsonModule": true,
    "esModuleInterop": true
  }
}

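After you install the dependencies, create your Amazon Personalize clients. In this tutorial, the code examples assume that you create the clients in a file named personalizeClients.js stored in a directory named libs.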

The following is an example of a personalizeClients.js file.

import { PersonalizeClient } from "@aws-sdk/client-personalize";
import { PersonalizeRuntimeClient } from "@aws-sdk/client-personalize-runtime";
import { PersonalizeEventsClient } from "@aws-sdk/client-personalize-events";

// Set your Amazon region.
const REGION = "region"; // e.g. "us-east-1"

const personalizeClient = new PersonalizeClient({ region: REGION });
const personalizeEventsClient = new PersonalizeEventsClient({ region: REGION });
const personalizeRuntimeClient = new PersonalizeRuntimeClient({ region: REGION });

export { personalizeClient, personalizeEventsClient, personalizeRuntimeClient };

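After you create your Amazon Personalize clients, import the historical data that you created when you completed the Getting started prerequisites. To import historical data into Amazon Personalize, do the following: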

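  1. Save the following Avro schema as a JSON file in your working directory. This schema matches the columns in the CSV file that you created when you completed Creating the training data (Domain dataset group).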

    {
      "type": "record",
      "name": "Interactions",
      "namespace": "com.amazonaws.personalize.schema",
      "fields": [
        { "name": "USER_ID", "type": "string" },
        { "name": "ITEM_ID", "type": "string" },
        { "name": "EVENT_TYPE", "type": "string" },
        { "name": "TIMESTAMP", "type": "long" }
      ],
      "version": "1.0"
    }

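  2. Create a domain schema in Amazon Personalize with the following createDomainSchema.js code. Replace SCHEMA_PATH with the path to the schema.json file that you just created. Update createDomainSchemaParam to specify a name for the schema, and for domain, specify VIDEO_ON_DEMAND.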

    // Get service clients module and commands using ES6 syntax.
    import { CreateSchemaCommand } from "@aws-sdk/client-personalize";
    import { personalizeClient } from "./libs/personalizeClients.js";
    // Or, create the client here.
    // const personalizeClient = new PersonalizeClient({ region: "REGION"});

    import fs from 'fs';

    let schemaFilePath = "SCHEMA_PATH";
    let mySchema = "";

    try {
      mySchema = fs.readFileSync(schemaFilePath).toString();
    } catch (err) {
      mySchema = 'TEST'; // for unit tests.
    }

    // Set the domain schema parameters.
    export const createDomainSchemaParam = {
      name: 'NAME', /* required */
      schema: mySchema, /* required */
      domain: 'DOMAIN' /* required for a domain dataset group, specify ECOMMERCE or VIDEO_ON_DEMAND */
    };

    export const run = async () => {
      try {
        const response = await personalizeClient.send(new CreateSchemaCommand(createDomainSchemaParam));
        console.log("Success", response);
        return response; // For unit tests.
      } catch (err) {
        console.log("Error", err);
      }
    };
    run();

  3. Create a Domain dataset group in Amazon Personalize with the following createDomainDatasetGroup.js code. Update domainDatasetGroupParams to specify a name for the dataset group, and for domain, specify VIDEO_ON_DEMAND.

    // Get service clients module and commands using ES6 syntax.
    import { CreateDatasetGroupCommand } from "@aws-sdk/client-personalize";
    import { personalizeClient } from "./libs/personalizeClients.js";
    // Or, create the client here.
    // const personalizeClient = new PersonalizeClient({ region: "REGION"});

    // Set the domain dataset group parameters.
    export const domainDatasetGroupParams = {
      name: 'NAME', /* required */
      domain: 'DOMAIN' /* required for a domain dsg, specify ECOMMERCE or VIDEO_ON_DEMAND */
    };

    export const run = async () => {
      try {
        const response = await personalizeClient.send(new CreateDatasetGroupCommand(domainDatasetGroupParams));
        console.log("Success", response);
        return response; // For unit tests.
      } catch (err) {
        console.log("Error", err);
      }
    };
    run();

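  4. Create an Item interactions dataset in Amazon Personalize with the following createDataset.js code. Update createDatasetParam to specify the Amazon Resource Names (ARNs) of the dataset group and schema that you just created, give the dataset a name, and for datasetType, specify Interactions.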

    // Get service clients module and commands using ES6 syntax.
    import { CreateDatasetCommand } from "@aws-sdk/client-personalize";
    import { personalizeClient } from "./libs/personalizeClients.js";
    // Or, create the client here.
    // const personalizeClient = new PersonalizeClient({ region: "REGION"});

    // Set the dataset's parameters.
    export const createDatasetParam = {
      datasetGroupArn: 'DATASET_GROUP_ARN', /* required */
      datasetType: 'DATASET_TYPE', /* required */
      name: 'NAME', /* required */
      schemaArn: 'SCHEMA_ARN' /* required */
    };

    export const run = async () => {
      try {
        const response = await personalizeClient.send(new CreateDatasetCommand(createDatasetParam));
        console.log("Success", response);
        return response; // For unit tests.
      } catch (err) {
        console.log("Error", err);
      }
    };
    run();

  5. Import the data with the following createDatasetImportJob.js code. Update datasetImportJobParam to specify the following:

    • Specify a name for the job and specify the ARN of your Interactions dataset.

    • For dataLocation, specify the Amazon S3 bucket path (s3://bucket name/folder name/ratings.csv) where you stored the training data.

    • For roleArn, specify the Amazon Resource Name (ARN) of your Amazon Personalize service role. You created this role as part of the Getting started prerequisites.

    // Get service clients module and commands using ES6 syntax.
    import { CreateDatasetImportJobCommand } from "@aws-sdk/client-personalize";
    import { personalizeClient } from "./libs/personalizeClients.js";
    // Or, create the client here.
    // const personalizeClient = new PersonalizeClient({ region: "REGION"});

    // Set the dataset import job parameters.
    export const datasetImportJobParam = {
      datasetArn: 'DATASET_ARN', /* required */
      dataSource: { /* required */
        dataLocation: 'S3_PATH'
      },
      jobName: 'NAME', /* required */
      roleArn: 'ROLE_ARN' /* required */
    };

    export const run = async () => {
      try {
        const response = await personalizeClient.send(new CreateDatasetImportJobCommand(datasetImportJobParam));
        console.log("Success", response);
        return response; // For unit tests.
      } catch (err) {
        console.log("Error", err);
      }
    };
    run();
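
Step 1 relies on the Avro schema matching the columns of your CSV file. The following is a minimal local check that could be run before the import in step 5. It is a sketch, not part of the tutorial's code: the helper name findSchemaMismatches is hypothetical, and it assumes that the first line of the CSV file is a plain comma-separated header row with no quoted fields.

```javascript
// Compare a CSV header line with the field names of an Avro schema object.
// Assumption: the header is a simple comma-separated list without quoting.
const findSchemaMismatches = (schema, csvHeaderLine) => {
  const schemaFields = schema.fields.map((field) => field.name);
  const csvColumns = csvHeaderLine
    .trim()
    .split(",")
    .map((column) => column.trim());
  return {
    // Fields the schema declares that the CSV header lacks.
    missingFromCsv: schemaFields.filter((name) => !csvColumns.includes(name)),
    // Columns in the CSV header that the schema does not declare.
    missingFromSchema: csvColumns.filter((name) => !schemaFields.includes(name)),
  };
};

// The Interactions schema from step 1 of this tutorial.
const interactionsSchema = {
  type: "record",
  name: "Interactions",
  namespace: "com.amazonaws.personalize.schema",
  fields: [
    { name: "USER_ID", type: "string" },
    { name: "ITEM_ID", type: "string" },
    { name: "EVENT_TYPE", type: "string" },
    { name: "TIMESTAMP", type: "long" },
  ],
  version: "1.0",
};

// Illustrative header line; in practice, read the first line of your own CSV file.
const headerLine = "USER_ID,ITEM_ID,EVENT_TYPE,TIMESTAMP";
console.log(findSchemaMismatches(interactionsSchema, headerLine));
```

If both arrays in the result are empty, the header and the schema declare the same column names. The check compares names only and does not validate data types or row contents.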

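After your dataset import job completes, you are ready to create a recommender. To create a recommender, use the following createRecommender.js code. Update createRecommenderParam as follows: specify a name for the recommender, specify the ARN of your dataset group, and for recipeArn, specify arn:aws:personalize:::recipe/aws-vod-top-picks.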

// Get service clients module and commands using ES6 syntax.
import { CreateRecommenderCommand } from "@aws-sdk/client-personalize";
import { personalizeClient } from "./libs/personalizeClients.js";
// Or, create the client here.
// const personalizeClient = new PersonalizeClient({ region: "REGION"});

// Set the recommender's parameters.
export const createRecommenderParam = {
  name: 'NAME', /* required */
  recipeArn: 'RECIPE_ARN', /* required */
  datasetGroupArn: 'DATASET_GROUP_ARN' /* required */
};

export const run = async () => {
  try {
    const response = await personalizeClient.send(new CreateRecommenderCommand(createRecommenderParam));
    console.log("Success", response);
    return response; // For unit tests.
  } catch (err) {
    console.log("Error", err);
  }
};
run();
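
Creating a recommender is asynchronous, as is the dataset import job in the earlier step, so the resource is not usable until its status becomes ACTIVE. The following is a minimal polling sketch under stated assumptions. The helper name waitUntilActive, the default interval, and the attempt limit are hypothetical choices rather than part of the tutorial, and the status strings ("ACTIVE", "CREATE FAILED") should be checked against the Amazon Personalize API reference. The helper takes a caller-supplied getStatus function, so it makes no SDK calls itself.

```javascript
// Poll a caller-supplied async status getter until it reports ACTIVE,
// reports a failure, or the attempt limit is reached.
// intervalMs and maxAttempts are illustrative defaults, not recommended values.
const waitUntilActive = async (getStatus, { intervalMs = 60000, maxAttempts = 120 } = {}) => {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus();
    console.log(`Attempt ${attempt}: status is ${status}`);
    if (status === "ACTIVE") {
      return status;
    }
    if (status === "CREATE FAILED") {
      throw new Error("Resource creation failed");
    }
    if (attempt < maxAttempts) {
      // Wait before asking again.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Status did not become ACTIVE after ${maxAttempts} attempts`);
};

// Possible wiring for a recommender (commented out because it needs the SDK,
// credentials, and a real ARN). It assumes DescribeRecommenderCommand is imported
// from "@aws-sdk/client-personalize" and the client from personalizeClients.js:
//
// await waitUntilActive(async () => {
//   const response = await personalizeClient.send(
//     new DescribeRecommenderCommand({ recommenderArn: "RECOMMENDER_ARN" })
//   );
//   return response.recommender.status;
// });
```

Passing the status getter as a function keeps the polling loop independent of any one resource type, so the same helper could wrap a dataset import job lookup instead.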

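After you create a recommender, you can use it to get recommendations. Use the following getRecommendations.js code to get recommendations for a user. Update getRecommendationsParam to specify the ARN of the recommender that you created in the previous step, and specify a user ID (for example, 123).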

// Get service clients module and commands using ES6 syntax.
import { GetRecommendationsCommand } from "@aws-sdk/client-personalize-runtime";
import { personalizeRuntimeClient } from "./libs/personalizeClients.js";
// Or, create the client here.
// const personalizeRuntimeClient = new PersonalizeRuntimeClient({ region: "REGION"});

// Set the recommendation request parameters.
export const getRecommendationsParam = {
  recommenderArn: 'RECOMMENDER_ARN', /* required */
  userId: 'USER_ID', /* required */
  numResults: 15 /* optional */
};

export const run = async () => {
  try {
    const response = await personalizeRuntimeClient.send(new GetRecommendationsCommand(getRecommendationsParam));
    console.log("Success!", response);
    return response; // For unit tests.
  } catch (err) {
    console.log("Error", err);
  }
};
run();
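
The code above logs the whole response object. As a rough sketch of how the recommended items might be pulled out of it, the following helper formats each entry of the response's itemList. The helper name formatRecommendations is hypothetical, and the sample response contains made-up IDs and scores for illustration only. It assumes each item carries an itemId and possibly a score, and falls back to "n/a" when a score is absent.

```javascript
// Turn a GetRecommendations-style response into printable lines.
// Assumption: response.itemList is an array of { itemId, score } objects.
const formatRecommendations = (response) =>
  (response.itemList ?? []).map(
    (item, index) => `${index + 1}. ${item.itemId} (score: ${item.score ?? "n/a"})`
  );

// Hypothetical response used only for illustration; the values are invented.
const sampleResponse = {
  itemList: [
    { itemId: "item-1", score: 0.42 },
    { itemId: "item-2", score: 0.17 },
  ],
};

formatRecommendations(sampleResponse).forEach((line) => console.log(line));
```

In the tutorial's run function, a call such as formatRecommendations(response) could replace or accompany the existing console.log of the raw response.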