User-Personalization recipe
The User-Personalization (aws-user-personalization) recipe is optimized for all personalized recommendation scenarios. It predicts the items that a user will interact with based on Interactions, Items, and Users datasets. When recommending items, it uses automatic item exploration.
With exploration, recommendations include some items that would be typically less likely to be recommended for the user, such as new items, items with few interactions, or items less relevant for the user based on their previous behavior. This improves item discovery and engagement when you have a fast-changing catalog, or when new items, such as news articles or promotions, are more relevant to users when fresh.
When you deploy your solution version with a campaign, you can balance how much to explore (recommending items with less interactions data or relevance more frequently) against how much to exploit (basing recommendations on what is already known about the user, that is, relevance).
Automatic updates
With User-Personalization, for real-time recommendations, Amazon Personalize automatically updates the latest model (solution version) every two hours behind the scenes to include new items in recommendations through exploration. For batch item recommendations, Amazon Personalize updates the solution version you specify in the batch inference job when the solution version is the latest for your solution.
With each update, Amazon Personalize updates the solution version to consider any new items through exploration. It also uses any new interactions data, including impressions data, to determine which items to include or not include in exploration. This is not a full retraining; you should still train a new solution version weekly with trainingMode set to FULL so the model can learn from your users' behavior and any item metadata.
There is no cost for automatic updates. For real-time recommendations, the solution version must be deployed with an Amazon Personalize campaign for updates to occur. Your campaign automatically uses the updated solution version. No new solution version is created when an auto update completes and no new model metrics are generated. This is because no full retraining occurs. If you create a new solution version, Amazon Personalize will not automatically update older solution versions, even if you have deployed them in a campaign. Updates also do not occur if you have deleted your dataset.
If every two hours is not frequent enough, you can manually create a solution version with trainingMode set to UPDATE to include those new items in recommendations. Just remember that Amazon Personalize automatically updates only your latest fully trained solution version, so the manually updated solution version won't be automatically updated in the future.
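As a rough illustration of the cadence described above, the following sketch picks a trainingMode for a manually triggered solution version: FULL once a week has passed since the last full training, UPDATE otherwise. The helper names (choose_training_mode, start_training) and the seven-day constant are assumptions made for this example and are not part of Amazon Personalize; the personalize argument is expected to be a boto3 'personalize' client.

```python
from datetime import datetime, timedelta

# Assumption for this sketch: "weekly" full retraining means every 7 days.
FULL_RETRAIN_INTERVAL = timedelta(days=7)


def choose_training_mode(last_full_training, now):
    """Return 'FULL' if a weekly full retraining is due, else 'UPDATE'."""
    if last_full_training is None:
        return "FULL"  # nothing has been fully trained yet
    if now - last_full_training >= FULL_RETRAIN_INTERVAL:
        return "FULL"
    return "UPDATE"


def start_training(personalize, solution_arn, training_mode):
    """Create a solution version with the given trainingMode and return its ARN.

    `personalize` is assumed to be a boto3 'personalize' client.
    """
    response = personalize.create_solution_version(
        solutionArn=solution_arn,
        trainingMode=training_mode,
    )
    return response["solutionVersionArn"]
```

With a real client, this might be called as start_training(boto3.client('personalize'), solution_arn, choose_training_mode(last_full, datetime.now())). A solution version created with UPDATE is not itself updated automatically afterward.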
Automatic update requirements
Automatic update requirements for real-time recommendations include the following:
- You must deploy the solution version with a campaign (for more information, see Creating a campaign). The campaign automatically uses the latest automatically updated solution version.
- The solution version must be trained with trainingMode set to FULL (this is the default when creating a solution version). You must provide new item or interactions data since the last automatic update.
Automatic update requirements for batch item recommendations include the following:
- The solution version you specify in the batch inference job must be the latest solution version for your solution.
- The solution version must be trained with trainingMode set to FULL (this is the default when creating a solution version). You must provide new item or interactions data since the last automatic update.
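The two requirement lists above can be read as simple checks. The sketch below encodes only the listed conditions; SolutionVersionState and both function names are hypothetical, and the fields would have to be filled in from your own bookkeeping rather than from a single Amazon Personalize API call.

```python
from dataclasses import dataclass


@dataclass
class SolutionVersionState:
    """Hypothetical snapshot of the facts the requirements refer to."""
    training_mode: str            # 'FULL' or 'UPDATE'
    is_latest_for_solution: bool  # latest solution version of its solution
    deployed_in_campaign: bool    # currently served by a campaign
    has_new_data: bool            # new item or interactions data since last update


def eligible_for_realtime_updates(state):
    """Real-time: deployed with a campaign, trained with FULL, new data present."""
    return (
        state.deployed_in_campaign
        and state.training_mode == "FULL"
        and state.has_new_data
    )


def eligible_for_batch_updates(state):
    """Batch: latest solution version, trained with FULL, new data present."""
    return (
        state.is_latest_for_solution
        and state.training_mode == "FULL"
        and state.has_new_data
    )
```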
Working with impressions data
Unlike other recipes, which solely use positive interactions (clicking, watching, or purchasing), the User-Personalization recipe can also use impressions data. Impressions are lists of items that were visible to a user when they interacted with (clicked, watched, purchased, and so on) a particular item.
Amazon Personalize uses impressions data to determine what items to include in exploration. The more frequently an item occurs in impressions data, the less likely it is that Amazon Personalize includes the item in exploration. For more information, see Impressions data.
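To make "how frequently an item occurs in impressions data" concrete, the sketch below tallies item occurrences across a few impression lists. The event dictionaries and the impression_counts helper are hypothetical; how Amazon Personalize actually weighs these frequencies during exploration is internal and is not reproduced here.

```python
from collections import Counter


def impression_counts(events):
    """Count how often each item ID appears across all impression lists."""
    counts = Counter()
    for event in events:
        counts.update(event.get("impression", []))
    return counts


# Hypothetical events: the item interacted with, plus the items shown alongside it.
events = [
    {"itemId": "A", "impression": ["A", "B", "C"]},
    {"itemId": "B", "impression": ["A", "B", "D"]},
    {"itemId": "A", "impression": ["A", "C"]},
]

counts = impression_counts(events)
# Items shown most often come first; per the text above, these are the
# items less likely to be included in exploration.
most_shown_first = [item for item, _ in counts.most_common()]
```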
Properties and hyperparameters
The User-Personalization recipe has the following properties:
- Name – aws-user-personalization
- Recipe Amazon Resource Name (ARN) – arn:aws:personalize:::recipe/aws-user-personalization
- Algorithm ARN – arn:aws:personalize:::algorithm/aws-user-personalization
For more information, see Choosing a recipe.
The following table describes the hyperparameters for the User-Personalization recipe. A hyperparameter is an algorithm parameter that you can adjust to improve model performance. Algorithm hyperparameters control how the model performs. Featurization hyperparameters control how to filter the data to use in training. The process of choosing the best value for a hyperparameter is called hyperparameter optimization (HPO). For more information, see Hyperparameters and HPO.
The table also provides the following information for each hyperparameter:
- Range: [lower bound, upper bound]
- Value type: Integer, Continuous (float), Categorical (Boolean, list, string)
- HPO tunable: Can the parameter participate in HPO?
Name | Description |
---|---|
Algorithm hyperparameters | |
hidden_dimension | The number of hidden variables used in the model. Hidden variables recreate users' purchase history and item statistics to generate ranking scores. Specify a greater number of hidden dimensions when your Interactions dataset includes more complicated patterns. Using more hidden dimensions requires a larger dataset and more time to process. To decide on the best value, use HPO. To use HPO, set … Default value: 149. Range: [32, 256]. Value type: Integer. HPO tunable: Yes. |
bptt | Determines whether to use the back-propagation through time technique. Back-propagation through time is a technique that updates weights in recurrent neural network-based algorithms. Use … Default value: 32. Range: [2, 32]. Value type: Integer. HPO tunable: Yes. |
recency_mask | Determines whether the model should consider the latest popularity trends in the Interactions dataset. Latest popularity trends might include sudden changes in the underlying patterns of interaction events. To train a model that places more weight on recent events, set … Value type: Boolean. HPO tunable: Yes. |
Featurization hyperparameters | |
min_user_history_length_percentile | The minimum percentile of user history lengths to include in model training. History length is the total amount of data about a user. Use … For example, setting … Default value: 0.0. Range: [0.0, 1.0]. Value type: Float. HPO tunable: No. |
max_user_history_length_percentile | The maximum percentile of user history lengths to include in model training. History length is the total amount of data about a user. Use … For example, setting … Default value: 0.99. Range: [0.0, 1.0]. Value type: Float. HPO tunable: No. |
Item exploration campaign configuration hyperparameters | |
exploration_weight | Determines how frequently recommendations include items with less interactions data or relevance. The closer the value is to 1.0, the more exploration. At zero, no exploration occurs and recommendations are based on current data (relevance). For more information, see CampaignConfig. Default value: 0.3. Range: [0.0, 1.0]. Value type: Float. HPO tunable: No. |
exploration_item_age_cut_off | Specify the maximum item age in days since the latest interaction across all items in the Interactions dataset. This defines the scope of item exploration based on item age. Amazon Personalize determines an item's age based on its creation timestamp or, if creation timestamp data is missing, interactions data. For more information about how Amazon Personalize determines an item's age, see Creation timestamp data. To increase the number of items Amazon Personalize considers during exploration, enter a greater value. The minimum is 1 day and the default is 30 days. Recommendations might include items that are older than the item age cut-off you specify. This is because these items are relevant to the user and exploration wasn't needed to identify them. Default value: 30.0. Range: Positive floats. Value type: Float. HPO tunable: No. |
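The ranges in the table can be checked before values are sent to the service. The sketch below is one possible way to do that, not an official helper: build_solution_config and build_campaign_config are hypothetical names, and the grouping of keys under algorithmHyperParameters and featureTransformationParameters is assumed from the table's two hyperparameter groups. Values are converted to strings on the assumption that these maps are string-to-string, as in the itemExplorationConfig example later on this page.

```python
def _check_range(name, value, low, high):
    """Raise ValueError if value falls outside the closed range [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")


def build_solution_config(hidden_dimension=149, bptt=32,
                          min_history_pct=0.0, max_history_pct=0.99):
    """Build a solutionConfig dict, validating against the table's ranges."""
    _check_range("hidden_dimension", hidden_dimension, 32, 256)
    _check_range("bptt", bptt, 2, 32)
    _check_range("min_user_history_length_percentile", min_history_pct, 0.0, 1.0)
    _check_range("max_user_history_length_percentile", max_history_pct, 0.0, 1.0)
    return {
        "algorithmHyperParameters": {
            "hidden_dimension": str(hidden_dimension),
            "bptt": str(bptt),
        },
        "featureTransformationParameters": {
            "min_user_history_length_percentile": str(min_history_pct),
            "max_user_history_length_percentile": str(max_history_pct),
        },
    }


def build_campaign_config(exploration_weight=0.3, item_age_cut_off=30.0):
    """Build a campaignConfig dict for item exploration."""
    _check_range("explorationWeight", exploration_weight, 0.0, 1.0)
    if item_age_cut_off < 1:
        # The table states a minimum of 1 day.
        raise ValueError("explorationItemAgeCutOff must be at least 1 day")
    return {
        "itemExplorationConfig": {
            "explorationWeight": str(exploration_weight),
            "explorationItemAgeCutOff": str(item_age_cut_off),
        }
    }
```

The returned dictionaries are intended to be passed as the solutionConfig argument of create_solution and the campaignConfig argument of create_campaign, respectively.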
Training with the User-Personalization recipe (console)
To use the User-Personalization recipe to generate recommendations in the console, first train a new solution version using the recipe. Then deploy a campaign using the solution version and use the campaign to get recommendations.
Training a new solution version with the User-Personalization recipe (console)
1. Open the Amazon Personalize console at https://console.amazonaws.cn/personalize/home and sign in to your account.
2. Create a Custom dataset group with a new schema and upload your dataset with impressions data. Optionally include CREATION_TIMESTAMP and Unstructured text metadata in your Items dataset so Amazon Personalize can more accurately calculate the age of an item and identify cold items.
   For more information on importing data, see Step 2: Preparing and importing data.
3. On the Dataset groups page, choose the new dataset group that contains the dataset or datasets with impressions data.
4. In the navigation pane, choose Solutions and recipes and choose Create solution.
5. On the Create solution page, for the Solution name, enter the name of your new solution.
6. For Solution type, choose Item recommendation to get item recommendations for your users.
7. For Recipe, choose aws-user-personalization. The Solution configuration section appears, providing several configuration options.
8. In Solution configuration, if your Interactions dataset has EVENT_TYPE or both EVENT_TYPE and EVENT_VALUE columns, optionally use the Event type and Event value threshold fields to choose the interactions data that Amazon Personalize uses when training the model.
   For more information, see Choosing the interactions data used for training.
9. Optionally configure hyperparameters for your solution. For a list of User-Personalization recipe properties and hyperparameters, see Properties and hyperparameters.
10. Choose Create and train solution to start training. The Dashboard page displays.
    You can navigate to the solution details page to track training progress in the Solution versions section. When training is complete, the status is Active.
Creating a campaign and getting recommendations (console)
When your solution version status is Active, you are ready to create your campaign and get recommendations as follows:
1. On either the solution details page or the Campaigns page, choose Create new campaign.
2. On the Create new campaign page, for Campaign details, provide the following information:
   - Campaign name: Enter the name of the campaign. The text you enter here appears on the Campaign dashboard and details page.
   - Solution: Choose the solution that you just created.
   - Solution version ID: Choose the ID of the solution version that you just created.
   - Minimum provisioned transactions per second: Set the minimum provisioned transactions per second that Amazon Personalize supports. For more information, see the CreateCampaign operation.
3. For Campaign configuration, provide the following information:
   - Exploration weight: Configure how much to explore. The more exploration you specify, the more frequently recommendations include items with less interactions data or relevance. The closer the value is to 1, the more exploration. At zero, no exploration occurs and recommendations are based on current data (relevance).
   - Exploration item age cut off: Enter the maximum item age, in days since the latest interaction, to define the scope of item exploration. To increase the number of items Amazon Personalize considers during exploration, enter a greater value.
     For example, if you enter 10, only items with interactions data from the 10 days since the latest interaction in the dataset are considered during exploration.
     Note
     Recommendations might include items without interactions data from outside this time frame. This is because these items are relevant to the user's interests, and exploration wasn't required to identify them.
4. Choose Create campaign.
5. On the campaign details page, when the campaign status is Active, you can use the campaign to get recommendations and record impressions. For more information, see Step 5: Get recommendations in "Getting Started."
Amazon Personalize automatically updates your latest solution version every two hours to include new data. Your campaign automatically uses the updated solution version. For more information, see Automatic updates.
To manually update the campaign, you first create and train a new solution version using the console or the CreateSolutionVersion operation, with trainingMode set to UPDATE. You then manually update the campaign on the Campaign page of the console or by using the UpdateCampaign operation.
Note
Amazon Personalize doesn't automatically update solution versions you created before November 17, 2020.
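The exploration item age cut-off example from the campaign configuration step (a value of 10 limits exploration to items from the 10 days before the latest interaction) can be sketched as a simple date filter. This is a simplified illustration only: items_in_exploration_window and the sample data are hypothetical, and the way Amazon Personalize determines item age internally is not reproduced here.

```python
from datetime import datetime, timedelta


def items_in_exploration_window(item_timestamps, latest_interaction, cut_off_days):
    """Return item IDs whose timestamp is within cut_off_days of the latest interaction."""
    window_start = latest_interaction - timedelta(days=cut_off_days)
    return sorted(
        item_id
        for item_id, ts in item_timestamps.items()
        if window_start <= ts <= latest_interaction
    )


# Hypothetical data: one timestamp per item (creation time or interaction time).
latest = datetime(2024, 3, 31)
item_timestamps = {
    "new-article": datetime(2024, 3, 29),
    "recent-promo": datetime(2024, 3, 22),
    "old-catalog-item": datetime(2024, 1, 5),
}

# With a 10-day cut-off, only the two recent items fall inside the window.
in_scope = items_in_exploration_window(item_timestamps, latest, cut_off_days=10)
```

Raising the cut-off widens the window, so more items are in scope for exploration. Items outside the window can still be recommended when they are relevant to the user.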
Training with the User-Personalization recipe (Python SDK)
When you have created a dataset group and uploaded your dataset(s) with impressions data, you can train a solution with the User-Personalization recipe. Optionally include CREATION_TIMESTAMP and Unstructured text metadata in your Items dataset so Amazon Personalize can more accurately calculate the age of an item and identify cold items. For more information on creating dataset groups and uploading training data, see Datasets and schemas.
To train a solution with the User-Personalization recipe using the Amazon SDK
1. Create a new solution using the create_solution method.

   Replace 'solution name' with your solution name and 'dataset group arn' with the Amazon Resource Name (ARN) of your dataset group.

       import boto3

       personalize = boto3.client('personalize')

       print('Creating solution')
       create_solution_response = personalize.create_solution(
           name = 'solution name',
           recipeArn = 'arn:aws:personalize:::recipe/aws-user-personalization',
           datasetGroupArn = 'dataset group arn',
       )
       solution_arn = create_solution_response['solutionArn']
       print('solution_arn: ', solution_arn)

   For a list of aws-user-personalization recipe properties and hyperparameters, see Properties and hyperparameters.

2. Create a new solution version with the updated training data and set trainingMode to FULL using the following code snippet. Replace 'solution arn' with the ARN of your solution.

       import boto3

       personalize = boto3.client('personalize')

       # FULL trains a new model on all of the data in the dataset group
       create_solution_version_response = personalize.create_solution_version(
           solutionArn = 'solution arn',
           trainingMode = 'FULL'
       )
       new_solution_version_arn = create_solution_version_response['solutionVersionArn']
       print('solution_version_arn:', new_solution_version_arn)

3. When Amazon Personalize is finished creating your solution version, create your campaign with the following parameters:

   - Provide a new 'campaign name' and the 'solution version arn' generated in step 2.
   - Modify the explorationWeight item exploration configuration hyperparameter to configure how much to explore. Items with less interactions data or relevance are recommended more frequently the closer the value is to 1.0. The default value is 0.3.
   - Modify the explorationItemAgeCutOff item exploration configuration hyperparameter to provide the maximum duration, in days relative to the latest interaction, for which items should be explored. The larger the value, the more items are considered during exploration.

   Use the following Python snippet to create a new campaign with an exploration weight of 0.3 and an exploration item age cut-off of 30 days. Creating a campaign usually takes a few minutes but can take over an hour.

       import boto3

       personalize = boto3.client('personalize')

       # itemExplorationConfig values are passed as strings
       create_campaign_response = personalize.create_campaign(
           name = 'campaign name',
           solutionVersionArn = 'solution version arn',
           minProvisionedTPS = 1,
           campaignConfig = {"itemExplorationConfig": {"explorationWeight": "0.3", "explorationItemAgeCutOff": "30"}}
       )
       campaign_arn = create_campaign_response['campaignArn']
       print('campaign_arn:', campaign_arn)

   With User-Personalization, Amazon Personalize automatically updates your solution version every two hours to include new data. Your campaign automatically uses the updated solution version. For more information, see Automatic updates.

   To manually update the campaign, you first create and train a new solution version using the console or the CreateSolutionVersion operation, with trainingMode set to UPDATE. You then manually update the campaign on the Campaign page of the console or by using the UpdateCampaign operation.

   Note
   Amazon Personalize doesn't automatically update solution versions you created before November 17, 2020.
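Both the campaign creation step and the manual campaign update depend on the solution version having finished training. One way to handle that, sketched here under assumptions, is to poll describe_solution_version until the status is ACTIVE and then call update_campaign. The helper names, the polling interval, and the retry limit are hypothetical choices for this example; personalize is expected to be a boto3 'personalize' client.

```python
import time


def wait_until_active(personalize, solution_version_arn,
                      poll_seconds=60, max_polls=180, sleep=time.sleep):
    """Poll describe_solution_version until the solution version is ACTIVE."""
    for _ in range(max_polls):
        response = personalize.describe_solution_version(
            solutionVersionArn=solution_version_arn
        )
        status = response["solutionVersion"]["status"]
        if status == "ACTIVE":
            return
        if status == "CREATE FAILED":
            raise RuntimeError(f"Training failed for {solution_version_arn}")
        sleep(poll_seconds)  # still pending or in progress
    raise TimeoutError(f"{solution_version_arn} did not become ACTIVE in time")


def point_campaign_at(personalize, campaign_arn, solution_version_arn):
    """Switch an existing campaign to the given (already ACTIVE) solution version."""
    return personalize.update_campaign(
        campaignArn=campaign_arn,
        solutionVersionArn=solution_version_arn,
    )
```

A typical sequence would be to create the solution version, call wait_until_active, and then call either create_campaign (first deployment) or point_campaign_at (manual update). The sleep parameter exists only so the polling loop can be exercised without real delays.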
Getting recommendations and recording impressions (SDK for Python (Boto3))
When your campaign is created, you can use it to get recommendations for a user and record impressions. For information on getting batch recommendations using the Amazon SDKs see Creating a batch inference job (Amazon SDKs).
To get recommendations and record impressions
1. Call the get_recommendations method. Change 'campaign arn' to the ARN of your new campaign and 'user id' to the userId of the user.

       import boto3

       # Runtime client used to request recommendations from a campaign
       personalize_runtime = boto3.client('personalize-runtime')

       rec_response = personalize_runtime.get_recommendations(
           campaignArn = 'campaign arn',
           userId = 'user id'
       )
       print(rec_response['recommendationId'])

2. Create a new event tracker for sending PutEvents requests. Replace 'event tracker name' with the name of your event tracker and 'dataset group arn' with the ARN of your dataset group.

       import boto3

       personalize = boto3.client('personalize')

       event_tracker_response = personalize.create_event_tracker(
           name = 'event tracker name',
           datasetGroupArn = 'dataset group arn'
       )
       event_tracker_arn = event_tracker_response['eventTrackerArn']
       event_tracking_id = event_tracker_response['trackingId']
       print('eventTrackerArn:{},\n eventTrackingId:{}'.format(event_tracker_arn, event_tracking_id))

3. Use the recommendationId from step 1 and the 'event tracking id' from step 2 to create a new PutEvents request. This request logs the new impression data from the user's session. Change 'user id' to the ID of the user.

       import boto3
       from datetime import datetime

       # Events client used to send PutEvents requests
       personalize_events = boto3.client('personalize-events')

       # rec_response is the get_recommendations response from step 1
       personalize_events.put_events(
           trackingId = 'event tracking id',
           userId = 'user id',
           sessionId = '1',
           eventList = [{
               'sentAt': datetime.now().timestamp(),
               'eventType': 'click',
               'itemId': rec_response['itemList'][0]['itemId'],
               'recommendationId': rec_response['recommendationId'],
               'impression': [item['itemId'] for item in rec_response['itemList']],
           }]
       )
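The event sent in the last step can also be assembled by a small helper, which makes it easy to inspect the impression list before calling put_events. This is a sketch rather than a required pattern: build_click_event is a hypothetical name, and the sample response below only mimics the itemList and recommendationId fields used in the steps above.

```python
from datetime import datetime, timezone


def build_click_event(rec_response, clicked_index=0, sent_at=None):
    """Build one PutEvents event recording a click plus the items that were shown."""
    items = rec_response["itemList"]
    return {
        "sentAt": sent_at or datetime.now(timezone.utc),
        "eventType": "click",
        "itemId": items[clicked_index]["itemId"],
        "recommendationId": rec_response["recommendationId"],
        # Every recommended item was visible to the user, so all go in the impression.
        "impression": [item["itemId"] for item in items],
    }


# Hypothetical response shaped like the get_recommendations output used above.
sample_response = {
    "recommendationId": "RID-example",
    "itemList": [{"itemId": "item-1"}, {"itemId": "item-2"}, {"itemId": "item-3"}],
}
event = build_click_event(sample_response, clicked_index=1)
```

The resulting dictionary can be passed as the single element of eventList in put_events.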
Sample Jupyter notebook
For a sample Jupyter notebook that shows how to use the User-Personalization recipe, see User Personalization with Exploration.