Configuring columns used when training - Amazon Personalize
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Configuring columns used when training

Important

After you create a solution, you can’t change its configuration. By default, all new solutions use automatic training. With automatic training, you incur training costs while your solution is active. To avoid unnecessary costs, make sure to delete the solution when you are finished. For information about training costs, see Amazon Personalize pricing.

If your recipe generates item recommendations or user segments, you can modify the columns Amazon Personalize considers when creating a solution version (training a model). By default, Amazon Personalize uses all columns that can be used when training. Columns with the boolean data type and custom string fields that aren't categorical or textual aren't used when training. You can't exclude EVENT_TYPE columns.
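The EVENT_TYPE restriction can be checked on the client side before a request is sent. The following is a minimal sketch of such a check, not part of any Amazon SDK. The function name find_event_type_exclusions and the dataset type keys (Interactions, Items) are assumptions made for illustration.

```python
def find_event_type_exclusions(excluded_columns):
    """Return (dataset_type, column) pairs that try to exclude EVENT_TYPE.

    `excluded_columns` has the shape of excludedDatasetColumns: each key is a
    dataset type and each value is a list of column names.
    """
    return [
        (dataset_type, column)
        for dataset_type, columns in excluded_columns.items()
        for column in columns
        if column.upper() == "EVENT_TYPE"
    ]


# Dataset type keys and column names below are illustrative assumptions.
candidate = {
    "Interactions": ["EVENT_TYPE", "EVENT_VALUE"],
    "Items": ["PRICE"],
}
problems = find_event_type_exclusions(candidate)
```

An empty result means the exclusion map does not touch EVENT_TYPE. The check does not cover any other validation the service might apply.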

You can change the columns used when training to control what data Amazon Personalize uses when training a model (creating a solution version). You might do this to experiment with different combinations of training data, or to exclude columns without meaningful data. For example, you might have a column that you want to use only to filter recommendations. You can exclude this column from training, and Amazon Personalize then considers it only when filtering.

If you have already created a solution and you want to modify the columns it uses when training, you can clone the solution. When you clone a solution, you can use the configuration of the existing solution as a starting point, such as the recipe and hyperparameters, and make any changes as necessary. For more information, see Cloning a solution (console).
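A similar result can be approximated with the SDK for Python (Boto3) by reading the existing solution's configuration and creating a new solution from it. The following is a rough sketch, not the console's clone feature. The helper names build_clone_request and clone_solution are hypothetical, and the sketch assumes that the solutionConfig returned by DescribeSolution can be passed back to CreateSolution unchanged. Only the recipe, dataset group, and solution configuration are carried over.

```python
import copy


def build_clone_request(solution, new_name, excluded_columns):
    """Build create_solution keyword arguments from an existing solution.

    `solution` is assumed to be the 'solution' dict of a DescribeSolution
    response. The original dict is left unmodified.
    """
    config = copy.deepcopy(solution.get("solutionConfig", {}))
    training = config.setdefault("trainingDataConfig", {})
    training["excludedDatasetColumns"] = excluded_columns
    return {
        "name": new_name,
        "datasetGroupArn": solution["datasetGroupArn"],
        "recipeArn": solution["recipeArn"],
        "solutionConfig": config,
    }


def clone_solution(solution_arn, new_name, excluded_columns):
    """Create a new solution from an existing one (needs boto3 and credentials)."""
    import boto3  # imported lazily so build_clone_request works without boto3

    personalize = boto3.client("personalize")
    existing = personalize.describe_solution(solutionArn=solution_arn)["solution"]
    request = build_clone_request(existing, new_name, excluded_columns)
    return personalize.create_solution(**request)["solutionArn"]
```

Keeping the request construction in a separate function makes it possible to inspect the new configuration before any solution is created.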

You can configure the columns Amazon Personalize uses when training with the Amazon Personalize console, Amazon Command Line Interface (Amazon CLI), or Amazon SDK. For information about choosing columns with the Amazon Personalize console, see the advanced configuration steps in Creating a solution (console). After you create a solution, you can view the columns the solution uses on the solution's details page of the Amazon Personalize console, or with the DescribeSolution operation.
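As an illustration of the DescribeSolution route, the following sketch reads the excluded columns from a response with the SDK for Python (Boto3). The helper names get_excluded_columns and fetch_excluded_columns are hypothetical. The sketch assumes the excluded columns appear under solutionConfig, trainingDataConfig, excludedDatasetColumns in the response, which mirrors the request structure described on this page.

```python
def get_excluded_columns(describe_solution_response):
    """Extract the excludedDatasetColumns map from a DescribeSolution response.

    Returns an empty dict when the solution has no exclusions configured.
    """
    config = describe_solution_response.get("solution", {}).get("solutionConfig", {})
    return config.get("trainingDataConfig", {}).get("excludedDatasetColumns", {})


def fetch_excluded_columns(solution_arn):
    """Call DescribeSolution and return the excluded columns (needs boto3 and credentials)."""
    import boto3  # imported lazily so get_excluded_columns works without boto3

    personalize = boto3.client("personalize")
    response = personalize.describe_solution(solutionArn=solution_arn)
    return get_excluded_columns(response)
```

An empty result suggests the solution uses the default behavior, in which all columns that can be used when training are used.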

Configuring columns used when training (Amazon CLI)

To exclude columns from training, provide the excludedDatasetColumns object in the trainingDataConfig as part of the solution configuration. For each key, provide the dataset type. For each value, provide the list of columns to exclude. The following code shows how to exclude columns from training when you create a solution with the Amazon CLI.

aws personalize create-solution \
  --name solution name \
  --dataset-group-arn dataset group ARN \
  --recipe-arn recipe ARN \
  --solution-config "{\"trainingDataConfig\": {\"excludedDatasetColumns\": { \"datasetType\" : [ \"column1Name\", \"column2Name\"]}}}"
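Escaping the JSON for --solution-config by hand is error-prone. One option is to generate the value and the full command with a short Python script. The following is a minimal sketch. The helper names solution_config_json and create_solution_command are hypothetical, and the script only builds the command string without running it.

```python
import json
import shlex


def solution_config_json(excluded_columns):
    """Serialize the solution configuration that excludes the given columns."""
    return json.dumps(
        {"trainingDataConfig": {"excludedDatasetColumns": excluded_columns}}
    )


def create_solution_command(name, dataset_group_arn, recipe_arn, excluded_columns):
    """Return a shell-quoted create-solution command as a single string."""
    args = [
        "aws", "personalize", "create-solution",
        "--name", name,
        "--dataset-group-arn", dataset_group_arn,
        "--recipe-arn", recipe_arn,
        "--solution-config", solution_config_json(excluded_columns),
    ]
    # shlex.join quotes each argument so the JSON survives the shell intact.
    return shlex.join(args)
```

For example, calling create_solution_command with a solution name, the two ARNs, and a map of dataset type to column names returns a command that can be pasted into a POSIX shell.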

Configuring columns used when training (Amazon SDKs)

To exclude columns from training, provide the excludedDatasetColumns object in the trainingDataConfig as part of the solution configuration. For each key, provide the dataset type. For each value, provide the list of columns to exclude. The following code shows how to exclude columns from training when you create a solution with the SDK for Python (Boto3).

import boto3

personalize = boto3.client('personalize')

# Each key of excludedDatasetColumns is a dataset type;
# each value is the list of columns to exclude from training.
create_solution_response = personalize.create_solution(
    name='solution name',
    recipeArn='recipe ARN',
    datasetGroupArn='dataset group ARN',
    solutionConfig={
        "trainingDataConfig": {
            "excludedDatasetColumns": {
                "datasetType": ["COLUMN_A", "COLUMN_B"]
            }
        }
    }
)

# The response contains the ARN of the new solution.
solution_arn = create_solution_response['solutionArn']
print('solution_arn: ', solution_arn)