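Getting started with a Domain dataset group (SDK for JavaScript v3)
This tutorial shows you how to use the Amazon SDK for JavaScript v3 to create a Domain dataset group for the VIDEO_ON_DEMAND domain. In this tutorial, you create a recommender for the Top picks for you use case.
To view the code used in this tutorial on GitHub, see Amazon Personalize code examples for SDK for JavaScript v3 in the Amazon SDK Code Examples repository.
To avoid incurring unnecessary charges, delete the resources you created in this tutorial when you finish this getting started exercise. For more information, see Cleaning up resources.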
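Prerequisites
The following are the prerequisite steps for completing this tutorial:
- Complete the Getting started prerequisites to set up the required permissions and create the training data. If you also completed Getting started with a Domain dataset group (console), you can reuse the same source data. If you use your own source data, make sure that it is formatted as described in the prerequisites.
- Set up the SDK for JavaScript and your Amazon credentials as described in the Setting up the SDK for JavaScript procedure in the Amazon SDK for JavaScript Developer Guide.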
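Tutorial
In the following steps, you install the required dependencies. Then you create a dataset group, import data, create a recommender for the Top picks for you use case, and get recommendations.
If you use Node.js, you can run each code example by saving it as a JavaScript file and then running node <fileName.js>.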
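After you complete the prerequisites, install the following Amazon Personalize dependencies:
- @aws-sdk/client-personalize
- @aws-sdk/client-personalize-runtime
- @aws-sdk/client-personalize-events (optional for this tutorial, but required if you want to record events after you create your recommender)
The following is an example package.json file that you can use. To install the dependencies with Node.js, navigate to where you saved the package.json file and run npm install.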
    {
      "name": "personalize-js-project",
      "version": "1.0.0",
      "description": "personalize operations",
      "type": "module",
      "author": "Author Name <email@address.com>",
      "license": "ISC",
      "dependencies": {
        "@aws-sdk/client-personalize": "^3.350.0",
        "@aws-sdk/client-personalize-events": "^3.350.0",
        "@aws-sdk/client-personalize-runtime": "^3.350.0",
        "fs": "^0.0.1-security"
      },
      "compilerOptions": {
        "resolveJsonModule": true,
        "esModuleInterop": true
      }
    }
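After you install the dependencies, create your Amazon Personalize clients. In this tutorial, the code examples assume that you create the clients in a file named personalizeClients.js, stored in a directory named libs.
The following is an example of the personalizeClients.js file.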
    import { PersonalizeClient } from "@aws-sdk/client-personalize";
    import { PersonalizeRuntimeClient } from "@aws-sdk/client-personalize-runtime";
    import { PersonalizeEventsClient } from "@aws-sdk/client-personalize-events";

    // Set your Amazon region.
    const REGION = "region"; // e.g. "us-east-1"

    const personalizeClient = new PersonalizeClient({ region: REGION });
    const personalizeEventsClient = new PersonalizeEventsClient({ region: REGION });
    const personalizeRuntimeClient = new PersonalizeRuntimeClient({ region: REGION });

    export { personalizeClient, personalizeEventsClient, personalizeRuntimeClient };
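After you create your Amazon Personalize clients, import the historical data that you created when you completed the Getting started prerequisites. To import historical data into Amazon Personalize, do the following:
- Save the following Avro schema as a JSON file in your working directory. This schema matches the columns in the CSV file that you created when you completed Creating the training data (Domain dataset group).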
    {
      "type": "record",
      "name": "Interactions",
      "namespace": "com.amazonaws.personalize.schema",
      "fields": [
        { "name": "USER_ID", "type": "string" },
        { "name": "ITEM_ID", "type": "string" },
        { "name": "EVENT_TYPE", "type": "string" },
        { "name": "TIMESTAMP", "type": "long" }
      ],
      "version": "1.0"
    }
- Use the following createDomainSchema.js code to create an Amazon Personalize domain schema. Replace SCHEMA_PATH with the path to the schema.json file that you just created. Update createDomainSchemaParam to specify a name for the schema, and for domain, specify VIDEO_ON_DEMAND.
    // Get service clients module and commands using ES6 syntax.
    import { CreateSchemaCommand } from "@aws-sdk/client-personalize";
    import { personalizeClient } from "./libs/personalizeClients.js";
    // Or, create the client here.
    // const personalizeClient = new PersonalizeClient({ region: "REGION"});

    import fs from 'fs';

    let schemaFilePath = "SCHEMA_PATH";
    let mySchema = "";

    try {
      mySchema = fs.readFileSync(schemaFilePath).toString();
    } catch (err) {
      mySchema = 'TEST' // for unit tests.
    }

    // Set the domain schema parameters.
    export const createDomainSchemaParam = {
      name: 'NAME', /* required */
      schema: mySchema, /* required */
      domain: 'DOMAIN' /* required for a domain dataset group, specify ECOMMERCE or VIDEO_ON_DEMAND */
    };

    export const run = async () => {
      try {
        const response = await personalizeClient.send(new CreateSchemaCommand(createDomainSchemaParam));
        console.log("Success", response);
        return response; // For unit tests.
      } catch (err) {
        console.log("Error", err);
      }
    };
    run();
- Use the following createDomainDatasetGroup.js code to create an Amazon Personalize Domain dataset group. Update domainDatasetGroupParams to specify a name for the dataset group, and for domain, specify VIDEO_ON_DEMAND.
    // Get service clients module and commands using ES6 syntax.
    import { CreateDatasetGroupCommand } from "@aws-sdk/client-personalize";
    import { personalizeClient } from "./libs/personalizeClients.js";
    // Or, create the client here.
    // const personalizeClient = new PersonalizeClient({ region: "REGION"});

    // Set the domain dataset group parameters.
    export const domainDatasetGroupParams = {
      name: 'NAME', /* required */
      domain: 'DOMAIN' /* required for a domain dsg, specify ECOMMERCE or VIDEO_ON_DEMAND */
    }

    export const run = async () => {
      try {
        const response = await personalizeClient.send(new CreateDatasetGroupCommand(domainDatasetGroupParams));
        console.log("Success", response);
        return response; // For unit tests.
      } catch (err) {
        console.log("Error", err);
      }
    };
    run();
- Use the following createDataset.js code to create an Item interactions dataset in Amazon Personalize. Update createDatasetParam to specify the Amazon Resource Names (ARNs) of the dataset group and schema that you just created, give the dataset a name, and for datasetType, specify Interactions.
    // Get service clients module and commands using ES6 syntax.
    import { CreateDatasetCommand } from "@aws-sdk/client-personalize";
    import { personalizeClient } from "./libs/personalizeClients.js";
    // Or, create the client here.
    // const personalizeClient = new PersonalizeClient({ region: "REGION"});

    // Set the dataset's parameters.
    export const createDatasetParam = {
      datasetGroupArn: 'DATASET_GROUP_ARN', /* required */
      datasetType: 'DATASET_TYPE', /* required */
      name: 'NAME', /* required */
      schemaArn: 'SCHEMA_ARN' /* required */
    }

    export const run = async () => {
      try {
        const response = await personalizeClient.send(new CreateDatasetCommand(createDatasetParam));
        console.log("Success", response);
        return response; // For unit tests.
      } catch (err) {
        console.log("Error", err);
      }
    };
    run();
- Use the following createDatasetImportJob.js code to import your data. Update datasetImportJobParam to specify the following:
  - Give the job a name and specify the ARN of your interactions dataset.
  - For dataLocation, specify the Amazon S3 bucket path (s3://bucket name/folder name/ratings.csv) where you stored the training data.
  - For roleArn, specify the Amazon Resource Name of your Amazon Personalize service role. You created this role as part of the Getting started prerequisites.
    // Get service clients module and commands using ES6 syntax.
    import { CreateDatasetImportJobCommand } from "@aws-sdk/client-personalize";
    import { personalizeClient } from "./libs/personalizeClients.js";
    // Or, create the client here.
    // const personalizeClient = new PersonalizeClient({ region: "REGION"});

    // Set the dataset import job parameters.
    export const datasetImportJobParam = {
      datasetArn: 'DATASET_ARN', /* required */
      dataSource: { /* required */
        dataLocation: 'S3_PATH'
      },
      jobName: 'NAME', /* required */
      roleArn: 'ROLE_ARN' /* required */
    }

    export const run = async () => {
      try {
        const response = await personalizeClient.send(new CreateDatasetImportJobCommand(datasetImportJobParam));
        console.log("Success", response);
        return response; // For unit tests.
      } catch (err) {
        console.log("Error", err);
      }
    };
    run();
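Each import step above needs an ARN returned by an earlier step. As a minimal sketch (not part of the original sample code), the parameter objects for createDataset.js and createDatasetImportJob.js could be built from the earlier responses instead of being pasted in by hand. The helper names buildCreateDatasetParam and buildDatasetImportJobParam are hypothetical. The sketch assumes that the responses expose schemaArn, datasetGroupArn, and datasetArn, as the CreateSchema, CreateDatasetGroup, and CreateDataset operations do.

```javascript
// Hypothetical helpers for illustration; they are not part of the SDK.

// Build the CreateDatasetCommand input from the responses of the
// CreateSchemaCommand and CreateDatasetGroupCommand calls.
function buildCreateDatasetParam(schemaResponse, datasetGroupResponse, name) {
  if (!schemaResponse?.schemaArn || !datasetGroupResponse?.datasetGroupArn) {
    throw new Error("Missing schemaArn or datasetGroupArn in an earlier response");
  }
  return {
    datasetGroupArn: datasetGroupResponse.datasetGroupArn,
    datasetType: "Interactions",
    name,
    schemaArn: schemaResponse.schemaArn,
  };
}

// Build the CreateDatasetImportJobCommand input from the response of the
// CreateDatasetCommand call, rejecting a data location that is not an S3 URI.
function buildDatasetImportJobParam(datasetResponse, s3Path, roleArn, jobName) {
  if (!datasetResponse?.datasetArn) {
    throw new Error("Missing datasetArn in the CreateDataset response");
  }
  if (typeof s3Path !== "string" || !s3Path.startsWith("s3://")) {
    throw new Error("dataLocation must start with s3://");
  }
  return {
    datasetArn: datasetResponse.datasetArn,
    dataSource: { dataLocation: s3Path },
    jobName,
    roleArn,
  };
}
```

Each run function in the samples returns its response, so a single script could pass those return values to these helpers and hand the result to the next command.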
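After the dataset import job completes, you are ready to create a recommender. To create a recommender, use the following createRecommender.js code. Update createRecommenderParam as follows: give the recommender a name, specify the ARN of your dataset group, and for recipeArn, specify arn:aws:personalize:::recipe/aws-vod-top-picks.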
    // Get service clients module and commands using ES6 syntax.
    import { CreateRecommenderCommand } from "@aws-sdk/client-personalize";
    import { personalizeClient } from "./libs/personalizeClients.js";
    // Or, create the client here.
    // const personalizeClient = new PersonalizeClient({ region: "REGION"});

    // Set the recommender's parameters.
    export const createRecommenderParam = {
      name: 'NAME', /* required */
      recipeArn: 'RECIPE_ARN', /* required */
      datasetGroupArn: 'DATASET_GROUP_ARN' /* required */
    }

    export const run = async () => {
      try {
        const response = await personalizeClient.send(new CreateRecommenderCommand(createRecommenderParam));
        console.log("Success", response);
        return response; // For unit tests.
      } catch (err) {
        console.log("Error", err);
      }
    };
    run();
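Creating a recommender is asynchronous, as is the dataset import job that must complete before it. The following is a minimal polling sketch, assuming you pass in a function that resolves to the resource's current status string. waitForActive and its intervalMs and maxAttempts options are hypothetical names, not part of the SDK. The sketch treats ACTIVE as success and CREATE FAILED as failure, which are the status values Amazon Personalize reports for these resources.

```javascript
// Hypothetical helper: poll until a resource reports ACTIVE.
// getStatus is any async function that resolves to the current status string.
async function waitForActive(getStatus, { intervalMs = 60000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === "ACTIVE") {
      return status;
    }
    if (status === "CREATE FAILED") {
      throw new Error("Resource creation failed");
    }
    if (attempt < maxAttempts) {
      // Wait before polling again.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Resource not ACTIVE after ${maxAttempts} attempts`);
}

// Example wiring for a recommender. This assumes the personalizeClient from
// libs/personalizeClients.js and DescribeRecommenderCommand imported from
// @aws-sdk/client-personalize:
//
// await waitForActive(async () => {
//   const response = await personalizeClient.send(
//     new DescribeRecommenderCommand({ recommenderArn: "RECOMMENDER_ARN" })
//   );
//   return response.recommender.status;
// });
```

The same helper could wrap DescribeDatasetImportJobCommand (reading response.datasetImportJob.status) so that the recommender is only created once the import has finished.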
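After you create a recommender, you use it to get recommendations. Use the following getRecommendations.js code to get recommendations for a user. Update getRecommendationsParam to specify the ARN of the recommender that you created in the previous step, and specify a user ID (for example, 123).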
    // Get service clients module and commands using ES6 syntax.
    import { GetRecommendationsCommand } from "@aws-sdk/client-personalize-runtime";
    import { personalizeRuntimeClient } from "./libs/personalizeClients.js";
    // Or, create the client here.
    // const personalizeRuntimeClient = new PersonalizeRuntimeClient({ region: "REGION"});

    // Set the recommendation request parameters.
    export const getRecommendationsParam = {
      recommenderArn: 'RECOMMENDER_ARN', /* required */
      userId: 'USER_ID', /* required */
      numResults: 15 /* optional */
    }

    export const run = async () => {
      try {
        const response = await personalizeRuntimeClient.send(new GetRecommendationsCommand(getRecommendationsParam));
        console.log("Success!", response);
        return response; // For unit tests.
      } catch (err) {
        console.log("Error", err);
      }
    };
    run();
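The sample above logs the whole response. The GetRecommendations response carries the recommended items in an itemList array, each entry having an itemId. As a minimal sketch, the item IDs could be pulled out in ranked order as follows. extractItemIds is a hypothetical helper name, and the sample response is made-up data for illustration only.

```javascript
// Hypothetical helper: return the recommended item IDs in ranked order.
// Returns an empty array if the response has no itemList.
function extractItemIds(response) {
  const items = response?.itemList ?? [];
  return items.map((item) => item.itemId);
}

// Made-up response shaped like a GetRecommendations result.
const sampleResponse = {
  itemList: [{ itemId: "596" }, { itemId: "153" }],
  recommendationId: "RID-example",
};

console.log(extractItemIds(sampleResponse)); // prints [ '596', '153' ]
```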