Frequently asked questions - Amazon Personalize

Frequently asked questions

The following are answers to frequently asked questions related to importing data, training, model deployment, recommendations, and filters in Amazon Personalize.

For more questions and answers, see the Amazon Personalize Cheat Sheet in the Amazon Personalize samples repository.

Data import and management

What format should my bulk data be in?

Your bulk data must be in comma-separated values (CSV) format. The first row of your CSV file must contain column headers. The column headers in your CSV file need to map to the schema to create the dataset. If your data includes any non-ASCII encoded characters, your CSV file must be encoded in UTF-8 format. Don't enclose headers in quotation marks ("). TIMESTAMP and CREATION_TIMESTAMP data must be in UNIX epoch time format. For more information on timestamp data, see Timestamp data. For more information about schemas, see Schemas.

For complete data format guidelines, see Data format guidelines. If you're not sure how to format your data, you can use Amazon SageMaker Data Wrangler (Data Wrangler) to prepare your data. For more information, see Preparing and importing data using Amazon SageMaker Data Wrangler.
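As a minimal sketch of these formatting rules, the following Python snippet writes an interactions file with an unquoted header row, UNIX epoch timestamps, and UTF-8 encoding. The column names (USER_ID, ITEM_ID, EVENT_TYPE, TIMESTAMP) and the sample rows are assumptions for illustration; your headers must match the fields in your own schema.

```python
import csv
import io
from datetime import datetime, timezone


def to_epoch_seconds(moment: datetime) -> int:
    """Convert a timezone-aware datetime to UNIX epoch seconds."""
    return int(moment.timestamp())


def write_interactions_csv(rows, stream) -> None:
    """Write interactions as CSV: header row first, then one row per event."""
    writer = csv.writer(stream, lineterminator="\n")
    # Header names are assumed here; they must map to your schema fields.
    writer.writerow(["USER_ID", "ITEM_ID", "EVENT_TYPE", "TIMESTAMP"])
    for user_id, item_id, event_type, moment in rows:
        writer.writerow([user_id, item_id, event_type, to_epoch_seconds(moment)])


# Hypothetical sample data; the second item ID contains a non-ASCII character.
sample_rows = [
    ("user-1", "item-42", "view", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    ("user-2", "café-mug", "purchase", datetime(2024, 1, 2, tzinfo=timezone.utc)),
]

buffer = io.StringIO()
write_interactions_csv(sample_rows, buffer)

# Encode as UTF-8 before uploading, because the data includes non-ASCII text.
csv_bytes = buffer.getvalue().encode("utf-8")
```

The default quoting mode of csv.writer only adds quotation marks when a value contains a delimiter, a quotation mark, or a line break, so plain header names are written unquoted.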

How much training data do I need?

For all use cases (Domain dataset groups) and custom recipes, your interactions data must have the following:

  • At minimum 1,000 item interaction records from users interacting with items in your catalog. These interactions can be from bulk imports, streamed events, or both.

  • At minimum 25 unique user IDs with at least two item interactions for each.

For quality recommendations, we recommend that you have at minimum 50,000 item interactions from at least 1,000 users with two or more item interactions each.

You can start out with an empty Item interactions dataset and, when you have recorded enough data, create your recommender (Domain dataset group) or custom solution version using only newly recorded events. Some recipes and use cases may have additional data requirements. For information on use case requirements, see Choosing a use case. For information on recipe requirements, see Choosing a recipe.
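Before you train, you might check your own records against the minimums listed above. The following is a minimal sketch that assumes your interactions are available locally as (user ID, item ID) pairs; the function and constant names are hypothetical, and the check doesn't cover additional recipe or use case requirements.

```python
from collections import Counter

MIN_INTERACTIONS = 1000      # minimum item interaction records
MIN_USERS_WITH_TWO = 25      # minimum unique users with two or more interactions


def meets_minimum_requirements(interactions) -> bool:
    """Return True if the (user_id, item_id) pairs satisfy both minimums."""
    per_user = Counter(user_id for user_id, _item_id in interactions)
    total_interactions = sum(per_user.values())
    users_with_two = sum(1 for count in per_user.values() if count >= 2)
    return (
        total_interactions >= MIN_INTERACTIONS
        and users_with_two >= MIN_USERS_WITH_TWO
    )


# Hypothetical data: 25 users with 40 interactions each (1,000 in total).
example = [(f"user-{u}", f"item-{i}") for u in range(25) for i in range(40)]
print(meets_minimum_requirements(example))
```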

How do I update an item or user's attributes?

Use the Amazon Personalize console or the PutItems or PutUsers operations to import the item or user again with the same item ID or user ID and the modified attributes.
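For example, an item update through PutItems could look like the following sketch. It assumes events_client is a boto3 "personalize-events" client that you create yourself; the dataset ARN, item ID, and attribute names are placeholders, and the attribute keys must correspond to fields in your own Items schema.

```python
import json


def build_put_items_request(dataset_arn: str, item_id: str, attributes: dict) -> dict:
    """Build PutItems arguments; properties are passed as a JSON string."""
    return {
        "datasetArn": dataset_arn,
        "items": [{"itemId": item_id, "properties": json.dumps(attributes)}],
    }


def update_item(events_client, dataset_arn: str, item_id: str, attributes: dict) -> None:
    """Re-import an existing item ID so that its attributes are replaced."""
    events_client.put_items(**build_put_items_request(dataset_arn, item_id, attributes))


# Usage (placeholder values, requires AWS credentials):
#   import boto3
#   events_client = boto3.client("personalize-events")
#   update_item(events_client, "<items-dataset-arn>", "item-42", {"price": "19.99"})
```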

How do I delete an item or user?

Amazon Personalize doesn't support deleting a specific item or user. To make sure that an item or user doesn't appear in recommendations, use a filter to exclude items. For more information, see Filtering recommendations and user segments.

How do I delete a schema?

You can delete a schema only with the DeleteSchema operation. You can't use the Amazon Personalize console to delete a schema.
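With the AWS SDK for Python (Boto3), the DeleteSchema operation is exposed as delete_schema. The following minimal sketch assumes personalize_client is a boto3 "personalize" client; the wrapper function name and the ARN are placeholders.

```python
def delete_schema(personalize_client, schema_arn: str) -> None:
    """Delete a schema by its Amazon Resource Name (ARN)."""
    personalize_client.delete_schema(schemaArn=schema_arn)


# Usage (placeholder ARN, requires AWS credentials):
#   import boto3
#   delete_schema(boto3.client("personalize"), "<schema-arn>")
```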

Creating a custom solution and solution version

What recipe should I use?

The Amazon Personalize recipe that you use depends on your use case. For information on matching use cases to recipes, see Choosing a recipe. The Amazon Personalize Cheat Sheet also includes use case and recipe information.

How often should I train?

We recommend using automatic training with at least a weekly training frequency. Automatic training makes it easier for you to maintain recommendation relevance. Your training frequency depends on your business requirements, the recipe that you use, and how frequently you import data. For more information, see Configuring automatic training. For information about maintaining relevance, see Maintaining recommendation relevance.
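As a sketch of a weekly schedule, the following helper builds arguments for the CreateSolution operation. It assumes the performAutoTraining flag and the autoTrainingConfig scheduling expression in the rate(value unit) form described in Configuring automatic training; verify both against the current API reference before relying on them. The helper name and all ARNs are placeholders.

```python
def build_auto_training_request(
    name: str, dataset_group_arn: str, recipe_arn: str, days: int = 7
) -> dict:
    """Build CreateSolution arguments with an automatic training schedule."""
    if days < 1:
        raise ValueError("training frequency must be at least 1 day")
    unit = "day" if days == 1 else "days"
    return {
        "name": name,
        "datasetGroupArn": dataset_group_arn,
        "recipeArn": recipe_arn,
        "performAutoTraining": True,
        "solutionConfig": {
            # Assumed format: rate(<value> <unit>), for example rate(7 days).
            "autoTrainingConfig": {"schedulingExpression": f"rate({days} {unit})"}
        },
    }


# Usage (placeholder ARNs, requires AWS credentials):
#   import boto3
#   request = build_auto_training_request("my-solution", "<dsg-arn>", "<recipe-arn>")
#   boto3.client("personalize").create_solution(**request)
```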

Should I use AutoML?

No. Instead, we recommend that you match your use case to the available Amazon Personalize recipes and choose the recipe that fits. For information on matching use cases to recipes, see Choosing a recipe.

Model deployment (custom campaigns)

What should I set for my campaign's minProvisionedTPS?

A high minProvisionedTPS will increase your cost. We recommend starting with 1 for minProvisionedTPS (the default). Track your usage using Amazon CloudWatch metrics, and increase the minProvisionedTPS as necessary.
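A campaign that starts at the recommended value might be created as in the following sketch. It assumes personalize_client is a boto3 "personalize" client; the wrapper function name, campaign name, and solution version ARN are placeholders.

```python
def create_campaign_with_min_tps(
    personalize_client, name: str, solution_version_arn: str, min_provisioned_tps: int = 1
) -> str:
    """Create a campaign and return its ARN; starts at 1 TPS unless overridden."""
    response = personalize_client.create_campaign(
        name=name,
        solutionVersionArn=solution_version_arn,
        minProvisionedTPS=min_provisioned_tps,
    )
    return response["campaignArn"]


# Usage (placeholder values, requires AWS credentials):
#   import boto3
#   arn = create_campaign_with_min_tps(
#       boto3.client("personalize"), "my-campaign", "<solution-version-arn>"
#   )
```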

How do I monitor the cost of my campaigns?

The Amazon Personalize Monitor project provides a CloudWatch dashboard, custom metrics, utilization alarms, and cost optimization functions for Amazon Personalize campaigns. See the Amazon Personalize Monitor in the Amazon Personalize samples repository.

How do I set a maximum transaction throughput for a campaign?

You can only set the minimum throughput for a campaign. When you create an Amazon Personalize campaign, you specify a dedicated transaction capacity for creating real-time recommendations for your application users. If your TPS increases beyond minProvisionedTPS, Amazon Personalize auto-scales the provisioned capacity up and down, but never below the minProvisionedTPS. For more information, see Minimum provisioned transactions per second and auto-scaling.

Recommendations

How can I tell if my Amazon Personalize model is generating quality recommendations?

Evaluate the performance of your solution version with offline and online metrics (see Evaluating an Amazon Personalize solution version with metrics) and online testing (such as A/B testing). For more information about A/B testing, see Measuring recommendation impact with A/B testing.

How do I delete my batch inference job and why is its status "active"?

You can't delete batch inference jobs. When a batch inference job's status is active, the job is complete. You can access your recommendations in the output Amazon S3 bucket or folder. You won't incur additional cost from the batch inference job once the job is complete. However, you may incur additional charges from other services, such as Amazon S3 for input and output data storage.

Why does my SIMS-backed campaign recommend items that are not similar based on metadata?

SIMS uses your Item interactions dataset to determine similarity, not item metadata such as color or price. SIMS identifies the co-occurrence of the item in user histories in your Item interactions dataset to recommend similar items. For more information, see SIMS recipe.

Can I get more than 500 items from a single GetRecommendations API operation?

500 is the maximum number of items that you can retrieve in a single GetRecommendations request. This value cannot be increased.
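A caller might guard against requesting more than this limit, as in the following sketch. It assumes runtime_client is a boto3 "personalize-runtime" client; the helper name and the ARN are placeholders.

```python
MAX_RESULTS_PER_REQUEST = 500  # upper limit for a single GetRecommendations request


def get_recommended_item_ids(
    runtime_client, campaign_arn: str, user_id: str, num_results: int
) -> list:
    """Request recommendations, capping numResults at the per-request maximum."""
    response = runtime_client.get_recommendations(
        campaignArn=campaign_arn,
        userId=user_id,
        numResults=min(num_results, MAX_RESULTS_PER_REQUEST),
    )
    return [item["itemId"] for item in response["itemList"]]


# Usage (placeholder ARN, requires AWS credentials):
#   import boto3
#   ids = get_recommended_item_ids(
#       boto3.client("personalize-runtime"), "<campaign-arn>", "user-1", 600
#   )
```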

Filtering recommendations

Why aren't my recommendations filtered as expected?

This can occur for a variety of reasons:

  • There may be an issue with the format or syntax of your filter expression. For examples of correctly formatted filter expressions, see Filter expression examples.

  • Amazon Personalize considers up to 100 of the most recent interactions per user per event type. This is an adjustable quota. You can request a quota increase using the Service Quotas console. If you don't import item interactions for a user for three months, your filters no longer consider the user's historical data. To consider this data, you must import the user's entire event history again.

For more information, see Filtering recommendations and user segments.

How can I remove already purchased items from recommendations?

For ECOMMERCE Domain dataset groups, if you create a recommender with the Recommended for you or Customers who viewed X also viewed use case, Amazon Personalize automatically filters items the user purchased based on the userId that you specify and Purchase events.

For other Domain dataset group use cases or custom resources, use a filter to remove purchased items. Add a Purchased event type attribute to your data, record Purchase events with the PutEvents operation, and create a filter that removes purchased items from recommendations. Use the same event type value in your recorded events and in the filter expression. For example:

EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("purchased")
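The two steps can be sketched with the AWS SDK for Python (Boto3) as follows. Events are recorded with the PutEvents operation (put_events on a "personalize-events" client), and the filter is created with create_filter on a "personalize" client. The helper names, tracking ID, session ID, and ARNs are placeholders, and the event type value "purchased" is taken from the expression above.

```python
from datetime import datetime, timezone

PURCHASE_EVENT_TYPE = "purchased"  # must match the value used in the filter


def build_purchase_filter_expression(event_type: str = PURCHASE_EVENT_TYPE) -> str:
    """Build the filter expression that excludes purchased items."""
    return f'EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("{event_type}")'


def record_purchase(events_client, tracking_id, user_id, session_id, item_id, sent_at):
    """Record one purchase event for a user."""
    events_client.put_events(
        trackingId=tracking_id,
        userId=user_id,
        sessionId=session_id,
        eventList=[
            {"eventType": PURCHASE_EVENT_TYPE, "itemId": item_id, "sentAt": sent_at}
        ],
    )


def create_purchase_filter(personalize_client, name: str, dataset_group_arn: str) -> str:
    """Create the filter and return its ARN."""
    response = personalize_client.create_filter(
        name=name,
        datasetGroupArn=dataset_group_arn,
        filterExpression=build_purchase_filter_expression(),
    )
    return response["filterArn"]


# Usage (placeholder values, requires AWS credentials):
#   import boto3
#   record_purchase(boto3.client("personalize-events"), "<tracking-id>", "user-1",
#                   "session-1", "item-42", datetime.now(timezone.utc))
#   create_purchase_filter(boto3.client("personalize"), "exclude-purchased",
#                          "<dataset-group-arn>")
```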

For more information, see Filtering recommendations and user segments.