Item interactions dataset - Amazon Personalize

Item interactions dataset

An item interaction is a positive interaction event between a user and an item in your catalog, such as a user watching a movie, viewing a listing, or purchasing a pair of shoes. You import data about your users' interactions with your items into an Item interactions dataset. You can record multiple event types, such as click, watch, or like.

For example, if a user clicks a particular item and then likes the item, you can have Amazon Personalize use these events as training data. For each event, you would record the user's ID, the item's ID, the timestamp (in Unix epoch time format), and the event type (click or like). You would then add both item interaction events to an Item interactions dataset.
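The click-then-like example above can be sketched as two records written to CSV. This is a minimal illustration, not an official import format reference: the column names USER_ID, ITEM_ID, and TIMESTAMP are assumptions chosen to match the EVENT_TYPE naming used later in this section, and the IDs and timestamps are made up.

```python
import csv
import io

# Assumed column names; this section itself only names EVENT_TYPE and EVENT_VALUE.
FIELDS = ["USER_ID", "ITEM_ID", "TIMESTAMP", "EVENT_TYPE"]


def make_event(user_id, item_id, timestamp, event_type):
    """Return one item interaction record as a dict keyed by column name."""
    return {
        "USER_ID": str(user_id),
        "ITEM_ID": str(item_id),
        "TIMESTAMP": int(timestamp),  # Unix epoch time, in seconds
        "EVENT_TYPE": event_type,
    }


def to_csv(events):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(events)
    return buffer.getvalue()


# A user clicks an item and then likes it 30 seconds later (hypothetical values).
events = [
    make_event("user-1", "item-42", 1700000000, "click"),
    make_event("user-1", "item-42", 1700000030, "like"),
]
csv_text = to_csv(events)
print(csv_text)
```

Each interaction is its own row, so the same user and item pair appears twice with different event types and timestamps.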

For all use cases (Domain dataset groups) and recipes (custom resources), your item interactions data must have the following:

  • At minimum 1000 item interaction records from users interacting with items in your catalog. These interactions can come from bulk imports, streamed events, or both.

  • At minimum 25 unique user IDs with at least two item interactions for each.

For quality recommendations, we recommend that you have at minimum 50,000 item interactions from at least 1,000 users with two or more item interactions each.
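Before importing, you might check your data against the minimums listed above. The following is a minimal local sketch, assuming you can produce a list with one user ID per interaction record; the function and constant names are hypothetical and are not part of any Amazon Personalize API.

```python
from collections import Counter

MIN_RECORDS = 1000   # minimum item interaction records
MIN_USERS = 25       # minimum unique users...
MIN_PER_USER = 2     # ...each with at least this many interactions


def check_minimums(user_ids):
    """Summarize whether a list of per-record user IDs meets the stated minimums."""
    counts = Counter(user_ids)
    total = sum(counts.values())
    qualifying = sum(1 for n in counts.values() if n >= MIN_PER_USER)
    return {
        "total_records": total,
        "qualifying_users": qualifying,
        "meets_minimums": total >= MIN_RECORDS and qualifying >= MIN_USERS,
    }


# Hypothetical data: 30 users with 40 interactions each (1200 records).
sample = [f"user-{i % 30}" for i in range(1200)]
report = check_minimums(sample)
print(report)
```

The same function could be reused with the higher recommended figures (50,000 interactions and 1,000 users) by changing the constants.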

To create a recommender or a custom solution, you must at minimum create an Item interactions dataset. This section provides information about the following types of item interactions data you can import into Amazon Personalize.

Event type and event value data

An Item interactions dataset can store event type and event value data for each interaction. Only custom resources use event value data.

Event type data

Amazon Personalize uses event type data, such as click or purchase data, to identify user intent and interest. If you create domain recommenders, all use cases require event type data, and some use cases require specific event types. You are free to use additional event types. For more information, see Choosing a use case.

If you create custom resources, you can choose the events used for training by event type. If your dataset has multiple event types in an EVENT_TYPE column, and you do not provide an event type when you configure a custom solution, Amazon Personalize uses all item interactions data for training with equal weight regardless of type. For more information, see Choosing the item interaction data used for training.

Positive and negative event types

Amazon Personalize assumes any interaction is a positive one. Interactions with a negative event type, such as dislike, won't necessarily keep the item from appearing in the user's future recommendations.

The following are ways to have negative events and users' disinterest influence recommendations:

Event value data (custom resources)

Event value data might be the percentage of a movie that a user watched or a rating out of 10. If you create custom solutions and import event value data along with event type data, you can choose records used for training based on type and value. With domain recommenders, Amazon Personalize doesn't use event value data and you can't filter events before training.

To choose records based on type and value, record an event type and event value for each event. The value you choose for each event depends on what data you want to exclude and what event types you are recording. For example, you might base the value on user activity, such as the percentage of a video the user watched for watch event types.

When you configure a solution, you set a specific value as a threshold to exclude records from training. For example, suppose your EVENT_VALUE data for events with an EVENT_TYPE of watch is the percentage of a video that a user watched. If you set the event value threshold to 0.5 and the event type to watch, Amazon Personalize trains the model using only watch interaction events with an EVENT_VALUE greater than or equal to 0.5.

For more information, see Choosing the item interaction data used for training.
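To make the threshold behavior concrete, the following sketch reproduces the selection rule described above on a few in-memory records. It is a local illustration only, under the assumption that records are dicts keyed by column name; it does not show how Amazon Personalize applies the threshold internally, and the function name and sample data are hypothetical.

```python
def select_for_training(records, event_type, threshold):
    """Keep records of the given type whose EVENT_VALUE is >= threshold."""
    return [
        r for r in records
        if r["EVENT_TYPE"] == event_type and float(r["EVENT_VALUE"]) >= threshold
    ]


# Hypothetical watch events where EVENT_VALUE is the fraction of the video watched.
records = [
    {"USER_ID": "u1", "ITEM_ID": "v1", "EVENT_TYPE": "watch", "EVENT_VALUE": 0.9},
    {"USER_ID": "u1", "ITEM_ID": "v2", "EVENT_TYPE": "watch", "EVENT_VALUE": 0.5},
    {"USER_ID": "u2", "ITEM_ID": "v1", "EVENT_TYPE": "watch", "EVENT_VALUE": 0.2},
    {"USER_ID": "u2", "ITEM_ID": "v3", "EVENT_TYPE": "click", "EVENT_VALUE": 1.0},
]

kept = select_for_training(records, event_type="watch", threshold=0.5)
print([r["ITEM_ID"] for r in kept])
```

With a threshold of 0.5 and an event type of watch, the record at exactly 0.5 is kept, the 0.2 watch event is excluded, and the click event is excluded because its type does not match.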

Contextual metadata

With certain recipes and recommender use cases, Amazon Personalize can use contextual metadata when identifying underlying patterns that reveal the most relevant items for your users. Contextual metadata is interactions data you collect on the user's environment at the time of an event, such as their location or device type.

Including contextual metadata allows you to provide a more personalized experience for existing users. For example, if customers shop differently when accessing your catalog from a phone compared to a computer, include contextual metadata about the user's device. Recommendations will then be more relevant based on how they are browsing.

Additionally, contextual metadata helps decrease the cold-start phase for new or unidentified users. The cold-start phase refers to the period when your recommendation engine provides less relevant recommendations due to the lack of historical information regarding that user.

For Domain dataset groups, the following recommender use cases can use contextual metadata:

For custom resources, recipes that use contextual metadata include the following:

For more information on contextual information, see the following Amazon Machine Learning Blog post: Increasing the relevance of your Amazon Personalize recommendations by leveraging contextual information.

Impressions data

If you use a domain use case that provides personalization or the User-Personalization recipe, Amazon Personalize can model impressions data that you upload to an Item interactions dataset. Impressions are lists of items that were visible to a user when they interacted with (for example, clicked or watched) a particular item.

Amazon Personalize uses impressions data to determine what items to include in exploration. With exploration, recommendations include some items or actions that would be typically less likely to be recommended for the user, such as new items or actions, items or actions with few interactions, or items or actions less relevant for the user based on their previous behavior. The more frequently an item occurs in impressions data, the less likely it is that Amazon Personalize includes the item in exploration. Impression values can have at most 1000 characters (including the vertical bar character).

For Domain dataset groups, the following recommender use cases can use impressions data:

For more information about exploration, see Exploration. Amazon Personalize can model two types of impressions: implicit impressions and explicit impressions.

Implicit impressions

Implicit impressions are the recommendations, retrieved from Amazon Personalize, that you show the user. You can integrate them into your recommendation workflow by including the RecommendationId (returned by the GetRecommendations and GetPersonalizedRanking operations) as input for future PutEvents requests. Amazon Personalize derives the implicit impressions based on your recommendation data.

For example, you might have an application that provides recommendations for streaming video. Your recommendation workflow using implicit impressions might be as follows:

  1. You request video recommendations for one of your users using the Amazon Personalize GetRecommendations API operation.

  2. Amazon Personalize generates recommendations for the user using your model (solution version) and returns them with a recommendationId in the API response.

  3. You show the video recommendations to your user in your application.

  4. When your user interacts with (for example, clicks) a video, record the choice in a call to the PutEvents API and include the recommendationId as a parameter. For a code sample, see Recording impressions data.

  5. Amazon Personalize uses the recommendationId to derive the impression data from the previous video recommendations, and then uses the impression data to guide exploration, where future recommendations include new videos with less interactions data or relevance.

    For more information on recording events with implicit impression data, see Recording impressions data.
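Step 4 of the workflow above can be sketched as a helper that builds the arguments for a PutEvents request carrying the recommendationId. This is a minimal sketch, not the code sample from Recording impressions data: the helper name and all IDs are hypothetical, and the field names (trackingId, userId, sessionId, eventList, sentAt) are given here as assumptions about the PutEvents request shape that you should confirm against the API reference.

```python
from datetime import datetime, timezone


def build_click_event(tracking_id, user_id, session_id, item_id,
                      recommendation_id, sent_at=None):
    """Build keyword arguments for a PutEvents call with an implicit impression."""
    return {
        "trackingId": tracking_id,
        "userId": user_id,
        "sessionId": session_id,
        "eventList": [
            {
                "eventType": "click",
                "itemId": item_id,
                "sentAt": sent_at or datetime.now(timezone.utc),
                # ID returned with the recommendations the user was shown;
                # Amazon Personalize derives the impression from it.
                "recommendationId": recommendation_id,
            }
        ],
    }


# Hypothetical identifiers for illustration only.
request = build_click_event(
    tracking_id="TRACKING-ID-EXAMPLE",
    user_id="user-1",
    session_id="session-1",
    item_id="video-7",
    recommendation_id="RID-EXAMPLE",
    sent_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(request["eventList"][0]["recommendationId"])
```

Assuming the AWS SDK for Python is installed and configured, a request like this would typically be sent with `boto3.client("personalize-events").put_events(**request)`. Building the arguments in a separate function keeps the payload easy to inspect before anything is sent.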

Explicit impressions

Explicit impressions are impressions that you manually record and send to Amazon Personalize. Use explicit impressions to manipulate results from Amazon Personalize. The order of the items has no impact.

For example, you might have a shopping application that provides recommendations for shoes. If you only recommend shoes that are currently in stock, you can specify these items using explicit impressions. Your recommendation workflow using explicit impressions might be as follows:

  1. You request recommendations for one of your users using the Amazon Personalize GetRecommendations API.

  2. Amazon Personalize generates recommendations for the user using your model (solution version) and returns them in the API response.

  3. You show the user only the recommended shoes that are in stock.

  4. For real-time incremental data import, when your user interacts with (for example, clicks) a pair of shoes, you record the choice in a call to the PutEvents API and list the recommended items that are in stock in the impression parameter. For a code sample, see Recording impressions data.

    For importing impressions in historical item interactions data, you can list explicit impressions in your CSV file and separate each item with a '|' character. The vertical bar character counts towards the 1000 character limit. For an example, see Formatting explicit impressions.

  5. Amazon Personalize uses the impression data to guide exploration, where future recommendations include new shoes with less interactions data or relevance.
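The in-stock filtering and the '|'-separated CSV value from the workflow above can be sketched as follows. This is a minimal local illustration with hypothetical function names and item IDs; it enforces the 1000 character limit mentioned in this section, counting the vertical bars.

```python
MAX_IMPRESSION_CHARS = 1000  # limit stated above; vertical bars count toward it


def in_stock_only(recommended, in_stock):
    """Return the recommended items that are in stock, preserving their order."""
    return [item_id for item_id in recommended if item_id in in_stock]


def format_impression(item_ids):
    """Join item IDs with '|' for a CSV impression value, enforcing the limit."""
    value = "|".join(item_ids)
    if len(value) > MAX_IMPRESSION_CHARS:
        raise ValueError(
            f"impression value is {len(value)} characters; "
            f"the limit is {MAX_IMPRESSION_CHARS}"
        )
    return value


# Hypothetical recommendations and stock list.
recommended = ["shoe-1", "shoe-2", "shoe-3", "shoe-4"]
in_stock = {"shoe-1", "shoe-3", "shoe-4"}

shown = in_stock_only(recommended, in_stock)
impression_value = format_impression(shown)
print(impression_value)
```

For streamed events, the `shown` list is what you would pass in the impression parameter of the PutEvents call; for bulk imports, `impression_value` is the string you would write to the CSV file.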