Getting batch recommendations - Amazon Personalize
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Getting batch recommendations

With custom resources, you can get item recommendations with an asynchronous batch flow. For example, you might get product recommendations for all users on an email list or item-to-item similarities across an inventory.

To get batch recommendations for items, you use a batch inference job. A batch inference job imports your batch input data from an Amazon S3 bucket, uses your custom solution version to generate item recommendations, and then exports the recommendations to an Amazon S3 bucket. Depending on the recipe, your input data is a list of users, a list of items, or a list of users each with a collection of items.
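The three input shapes described above can be sketched in Python as follows. The field names `userId`, `itemId`, and `itemList`, and the one-JSON-object-per-line layout, are assumptions made for illustration; check Preparing input data for batch recommendations for the exact format your recipe expects.

```python
import json

# Assumed field names (userId, itemId, itemList); confirm them against the
# input data documentation for your recipe before using this in production.

def user_records(user_ids):
    """Input for recipes that take a list of users."""
    return [{"userId": str(u)} for u in user_ids]

def item_records(item_ids):
    """Input for recipes that take a list of items."""
    return [{"itemId": str(i)} for i in item_ids]

def ranking_records(user_to_items):
    """Input for recipes that take users, each with a collection of items."""
    return [
        {"userId": str(user), "itemList": [str(i) for i in items]}
        for user, items in user_to_items.items()
    ]

def to_json_lines(records):
    """Serialize records as one JSON object per line (assumed layout)."""
    return "\n".join(json.dumps(r) for r in records) + "\n"
```

For example, `to_json_lines(user_records([4638, 663]))` produces two lines, one JSON object per user, ready to upload to your input location in Amazon S3.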

If your solution uses the Similar-Items recipe and you have an Items dataset with textual data and item title data, you can generate batch recommendations with themes for each group of items. For more information, see Batch recommendations with themes from Content Generator.

When generating batch recommendations, Amazon Personalize considers all bulk data that was present when the latest solution version was created. This data can be imported with an import mode of FULL or INCREMENTAL. For newer bulk records to influence batch recommendations, you must create a new solution version and then create the batch inference job.

Amazon Personalize uses data from individual imports when generating batch recommendations as follows:

  • New interactions with existing items and users: If you use the User-Personalization or Personalized-Ranking recipes, Amazon Personalize considers new interactions data with existing items and users within about 15 minutes of data import. To make sure events are considered, we recommend that you wait at least 15 minutes after import before you start a batch inference job. For all other recipes, you must create a new solution version for streamed events to influence batch recommendations.

  • New users: For users without interactions data, recommendations are initially for only popular items. If you use User-Personalization or Personalized-Ranking and you record events for the user, their recommendations might become more relevant within about 15 minutes after import, without retraining. To make sure events are considered, we recommend that you wait at least 15 minutes after import before you start a batch inference job. For all other recipes, you must create a new solution version for streamed events to influence batch recommendations for users without interactions data.

  • New items: With User-Personalization, when you create a batch inference job and specify the latest fully trained solution version for your solution, Amazon Personalize automatically updates the solution version to include new items in recommendations with exploration. If you don't specify the latest solution version, no update occurs. For any other recipe, you must create a new solution version for new items to be featured in batch recommendations. For more information about exploration, see Exploration.
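The 15-minute recommendation in the list above can be enforced in a scheduling script. The following is a minimal sketch; the helper names and the `IMPORT_SETTLE_TIME` constant are hypothetical, and the import timestamp is assumed to come from your own records of when the last individual import finished.

```python
from datetime import datetime, timedelta, timezone

# Recommended minimum wait after an individual import (from the guidance above).
IMPORT_SETTLE_TIME = timedelta(minutes=15)

def earliest_job_start(last_import_time):
    """Earliest time a batch inference job should start after an import."""
    return last_import_time + IMPORT_SETTLE_TIME

def seconds_to_wait(last_import_time, now=None):
    """Seconds still to wait before starting the job (0.0 if none)."""
    now = now or datetime.now(timezone.utc)
    remaining = earliest_job_start(last_import_time) - now
    return max(0.0, remaining.total_seconds())
```

A script might call `time.sleep(seconds_to_wait(last_import_time))` before it creates the job. Note that this wait applies only to User-Personalization and Personalized-Ranking; other recipes need a new solution version.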

Batch workflow

The batch workflow is as follows:

  1. Prepare and upload your input data in JSON format to an Amazon S3 bucket. The format of your input data depends on the recipe you use. See Preparing input data for batch recommendations.

  2. Create a separate location for your output data, either a folder or a different Amazon S3 bucket.

  3. Create a batch inference job. See Creating a batch inference job.

  4. When the batch inference is complete, retrieve the item recommendations from your output location in Amazon S3.
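Step 3 of the workflow can be sketched with the AWS SDK for Python (Boto3). The helper names below are hypothetical, and the ARNs and S3 paths you pass in are your own. The sketch builds the request separately from the API call, so that you can inspect the request without AWS access, and it rejects an output location that is the same folder as the input file.

```python
def build_batch_job_request(job_name, solution_version_arn, role_arn,
                            input_path, output_path, num_results=25):
    """Build keyword arguments for create_batch_inference_job."""
    # Folder that contains the input file, e.g. s3://bucket/input
    input_folder = input_path.rsplit("/", 1)[0]
    if input_folder == output_path.rstrip("/"):
        raise ValueError("Use a separate output location from the input data")
    return {
        "jobName": job_name,
        "solutionVersionArn": solution_version_arn,
        "roleArn": role_arn,
        "numResults": num_results,
        "jobInput": {"s3DataSource": {"path": input_path}},
        "jobOutput": {"s3DataDestination": {"path": output_path}},
    }

def submit_batch_job(request):
    """Create the job and return its ARN (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed
    personalize = boto3.client("personalize")
    response = personalize.create_batch_inference_job(**request)
    return response["batchInferenceJobArn"]
```

After you submit the job, you can check its status with `describe_batch_inference_job` and then read the results from the output location once the job is complete.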

Guidelines and requirements

The following are guidelines and requirements for getting batch recommendations:

  • Your Amazon Personalize IAM service role must have permission to read and add files to your Amazon S3 buckets. For information on granting permissions, see Service role policy for batch workflows. For more information on bucket permissions, see User policy examples in the Amazon Simple Storage Service Developer Guide. If you use Amazon Key Management Service (Amazon KMS) for encryption, you must grant Amazon Personalize and your Amazon Personalize IAM service role permission to use your key. For more information, see Giving Amazon Personalize permission to use your Amazon KMS key.

  • You must create a custom solution and solution version before you create a batch inference job. However, you don't need to create an Amazon Personalize campaign. If you created a Domain dataset group, you can still create custom resources.

  • To generate themes with recommendations, you must use the Similar-Items recipe, and you must have an Items dataset with textual data and item title data. For more information about themed recommendations, see Batch recommendations with themes from Content Generator.

  • Your input data must be formatted as described in Preparing input data for batch recommendations.

  • You can't get batch recommendations with the Trending-Now or Next-Best-Action recipes.

  • If you use a filter with placeholder parameters, you must include the values for the parameters in your input data in a filterValues object. For more information, see Providing filter values in your input JSON.

  • We recommend that you use a different location for your output data (either a folder or a different Amazon S3 bucket) than your input data.

  • Batch recommendations might not be exactly the same as real-time recommendations. This is because batch inference jobs take longer to complete and only consider data available 15 minutes before the start of the job.
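For the guideline on filters with placeholder parameters, the following sketch shows one way to attach a filterValues object to an input record. The parameter name `GENRES` is hypothetical, and encoding multiple values as a comma-separated string of quoted values is an assumption; confirm the exact encoding in Providing filter values in your input JSON.

```python
import json

def with_filter_values(record, **params):
    """Return a copy of an input record with a filterValues object added.

    Each keyword argument is a placeholder parameter name mapped to a list
    of values. Values are encoded as a comma-separated string of quoted
    values (assumed format).
    """
    filter_values = {
        name: ",".join(json.dumps(str(v)) for v in values)
        for name, values in params.items()
    }
    return {**record, "filterValues": filter_values}

# Hypothetical example: one user, one placeholder parameter named GENRES.
line = json.dumps(with_filter_values({"userId": "5"}, GENRES=["horror", "comedy"]))
```

Each record built this way is written as its own line in the input file, alongside the user or item fields that your recipe requires.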

Batch workflow scoring

Batch recommendations include scores as follows: