Importing more training data into datasets - Amazon Personalize

Importing more training data into datasets

As your catalog grows, import additional training data into your datasets. This helps maintain and improve the relevance of Amazon Personalize recommendations. You can import more data with bulk or individual data import operations.

To add columns of data to an existing dataset, replace the dataset's schema with a new one that includes the added columns, then import the new columns of data. For more information, see Replacing a dataset's schema to add new columns.

Importing data with individual import operations

After you import data into an Amazon Personalize dataset, you can update it by importing additional individual records, including item interactions, action interactions, users, items, or actions. Importing data individually allows you to add small batches of records to your Amazon Personalize datasets as your catalog grows.

When you import records individually, Amazon Personalize appends the new records to the dataset. To update an individual item, user, or action, you can import a record with the same ID but with the modified attributes. You can import up to 10 records per individual import operation.
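The 10-record limit means a larger set of records has to be split across several calls. The following is a minimal sketch of that batching, assuming the boto3 `personalize-events` client and its `put_items` operation. The helper names (`chunked`, `put_items_in_batches`) and the shape of the input records are hypothetical and chosen only for illustration.

```python
import json
from typing import Any, Dict, Iterator, List

# Per-operation limit for individual imports, as stated above.
MAX_RECORDS_PER_CALL = 10


def chunked(records: List[Dict[str, Any]],
            size: int = MAX_RECORDS_PER_CALL) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of `records`, each with at most `size` entries."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def put_items_in_batches(client: Any, dataset_arn: str,
                         items: List[Dict[str, Any]]) -> int:
    """Send items with PutItems, at most 10 per call. Returns the number of calls.

    `client` is assumed to behave like boto3.client("personalize-events").
    Each entry in `items` is assumed to look like
    {"itemId": "...", "properties": {...}} (a hypothetical input shape).
    """
    calls = 0
    for batch in chunked(items):
        client.put_items(
            datasetArn=dataset_arn,
            items=[
                {
                    "itemId": item["itemId"],
                    # PutItems expects the metadata as a JSON string.
                    "properties": json.dumps(item["properties"]),
                }
                for item in batch
            ],
        )
        calls += 1
    return calls
```

In real use, you would pass `boto3.client("personalize-events")` as `client` and the ARN of your Items dataset as `dataset_arn`. Sending a record whose `itemId` already exists in the dataset is how you would update that item's attributes, as described above.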

For more information on importing records individually, see Importing individual records. For information about recording real-time events, see Recording real-time events to influence recommendations.

Updating existing bulk data

If you previously created a dataset import job for a dataset, you can add to or replace the bulk data by creating another import job. By default, a dataset import job replaces any existing data in the dataset that you imported in bulk. To append the new records to the existing data instead, change the job's import mode.
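As an illustration of the import mode choice, the sketch below builds the request parameters for the `CreateDatasetImportJob` operation, where `importMode` is `FULL` to replace existing bulk data and `INCREMENTAL` to append to it. The function name `build_import_job_request` and every argument value in the usage note are hypothetical placeholders.

```python
from typing import Any, Dict


def build_import_job_request(job_name: str, dataset_arn: str,
                             s3_location: str, role_arn: str,
                             append: bool = False) -> Dict[str, Any]:
    """Return keyword arguments for create_dataset_import_job.

    append=False -> importMode "FULL": replace existing bulk data (the default behavior).
    append=True  -> importMode "INCREMENTAL": add the new records to existing data.
    """
    return {
        "jobName": job_name,
        "datasetArn": dataset_arn,
        "dataSource": {"dataLocation": s3_location},
        "roleArn": role_arn,
        "importMode": "INCREMENTAL" if append else "FULL",
    }
```

With boto3, the result could be passed as `boto3.client("personalize").create_dataset_import_job(**request)`, where `request` is the dictionary returned above. Keeping the request construction separate from the call makes the chosen mode easy to inspect before any data is replaced.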

The following are guidelines and requirements for updating bulk data:

  • To append data to an Item interactions dataset or Action interactions dataset with a dataset import job, you must have at least 1,000 new item interaction or action interaction records.

  • If you already created a recommender or deployed a custom solution version with a campaign, how new bulk records influence recommendations depends on the domain use case or recipe that you use. For more information, see How new data influences real-time recommendations.

  • Within 20 minutes of completing a bulk import, Amazon Personalize updates any filters you created in the dataset group with your new bulk data. This update allows Amazon Personalize to use the most recent data when filtering recommendations for your users.
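The minimum record count for appending interactions can be checked locally before you create the import job. The following sketch assumes the new interactions are in a CSV file with a single header row. The helper names are hypothetical, and the check is only a convenience, not part of the Amazon Personalize API.

```python
import csv
import io

# Minimum number of new interaction records required to append, per the guideline above.
MIN_APPEND_INTERACTIONS = 1000


def count_data_rows(csv_text: str) -> int:
    """Count non-empty rows after the header row in CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row, if any
    return sum(1 for row in reader if row)


def can_append_interactions(csv_text: str) -> bool:
    """Return True if the CSV holds enough new interaction records to append."""
    return count_data_rows(csv_text) >= MIN_APPEND_INTERACTIONS
```

For example, after reading a local copy of the file into a string, `can_append_interactions(text)` tells you whether the file meets the minimum before you upload it and start an append job.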

For more information about creating a dataset import job, see Importing bulk records with a dataset import job.