Getting batch user segments with custom resources
To get user segments, you use a batch segment job. A batch segment job is a tool that imports your batch input data from an Amazon S3 bucket and uses your solution version trained with a USER_SEGMENTATION recipe to generate user segments for each row of input data.
Depending on the recipe, the input data is a list of items or item metadata attributes in JSON format. For item attributes, your input data can include expressions to create user segments based on multiple metadata attributes. A batch segment job exports user segments to an output Amazon S3 bucket. Each user segment is sorted in descending order based on the probability that each user will interact with the item in your input data.
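As a rough illustration of the two input shapes described above, the following Python sketch writes one JSON object per line. The field names `itemId` and `itemAttributes`, the `ITEMS.genres` column, and the sample values are assumptions for illustration only; confirm the exact format for your recipe in Preparing input data for user segments.

```python
import json


def to_json_lines(records):
    """Serialize a list of dicts as JSON Lines (one JSON object per line)."""
    return "\n".join(json.dumps(record) for record in records)


# Assumed shape for an item-based recipe: one item ID per line.
item_input = to_json_lines([{"itemId": "105"}, {"itemId": "106"}])

# Assumed shape for an attribute-based recipe: a single expression that
# combines multiple metadata attributes. ITEMS.genres is a hypothetical column.
attribute_input = to_json_lines([
    {"itemAttributes": 'ITEMS.genres = "Comedy" AND ITEMS.genres = "Action"'},
])

print(item_input)
print(attribute_input)
```

Each line of the resulting file corresponds to one user segment in the job output.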
When generating user segments, Amazon Personalize considers data in datasets from bulk and individual imports:
- For bulk data, Amazon Personalize generates segments using only the bulk data present at the last full solution version training, and only bulk data that you imported with an import mode of FULL (replacing existing data).
- For data from individual data import operations, Amazon Personalize generates user segments using the data present at the last full solution version training. To have newer records impact user segments, create a new solution version and then create a batch segment job.
Generating user segments works as follows:
- Prepare and upload your input data in JSON format to an Amazon S3 bucket. The format of your input data depends on the recipe you use and the job you are creating. See Preparing input data for user segments.
- Create a separate location for your output data, either a different folder or a different Amazon S3 bucket.
- Create a batch segment job. See Getting user segments with a batch segment job.
- When the batch segment job is complete, retrieve the user segments from your output location in Amazon S3.
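The job-creation step above can be sketched with the AWS SDK for Python (Boto3). This is a minimal sketch, not a complete workflow: the helper names, ARNs, bucket paths, and the `numResults` value are hypothetical placeholders, and the client is passed in so that the request can be inspected before anything is sent.

```python
def build_batch_segment_job_request(job_name, solution_version_arn, role_arn,
                                    input_path, output_path, num_results):
    """Assemble keyword arguments for the CreateBatchSegmentJob operation."""
    if not input_path.startswith("s3://") or not output_path.startswith("s3://"):
        raise ValueError("input and output locations must be s3:// paths")
    return {
        "jobName": job_name,
        "solutionVersionArn": solution_version_arn,
        "roleArn": role_arn,
        # Number of users to return for each line of input data.
        "numResults": num_results,
        "jobInput": {"s3DataSource": {"path": input_path}},
        "jobOutput": {"s3DataDestination": {"path": output_path}},
    }


def start_batch_segment_job(personalize_client, request):
    """Submit the request and return the ARN of the new batch segment job."""
    response = personalize_client.create_batch_segment_job(**request)
    return response["batchSegmentJobArn"]


# Placeholder values for illustration only.
request = build_batch_segment_job_request(
    job_name="example-segment-job",
    solution_version_arn="arn:aws:personalize:us-west-2:111122223333:solution/example/version-id",
    role_arn="arn:aws:iam::111122223333:role/ExamplePersonalizeRole",
    input_path="s3://amzn-s3-demo-bucket/input/items.json",
    output_path="s3://amzn-s3-demo-bucket/output/",
    num_results=50,
)

# With credentials configured, the job could then be started like this:
#   import boto3
#   job_arn = start_batch_segment_job(boto3.client("personalize"), request)
```

After the job is submitted, its status can be polled with `describe_batch_segment_job` until it completes, at which point the output files are available in the output location.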
Guidelines and requirements for getting user segments
The following are guidelines and requirements for getting batch user segments:
- You must use a USER_SEGMENTATION recipe.
- Your Amazon Personalize IAM service role needs permission to read and add files to your Amazon S3 buckets. For information on granting permissions, see Service role policy for batch workflows. For more information on bucket permissions, see User policy examples in the Amazon Simple Storage Service Developer Guide.
  If you use Amazon Key Management Service (Amazon KMS) for encryption, you must grant Amazon Personalize and your Amazon Personalize IAM service role permission to use your key. For more information, see Giving Amazon Personalize permission to use your Amazon KMS key.
- You must create a custom solution and solution version before you create a batch segment job. However, you don't need to create an Amazon Personalize campaign. If you created a Domain dataset group, you can still create custom resources.
- Your input data must be formatted as described in Preparing input data for user segments.
- If you use the Item-Attribute-Affinity recipe, the attributes in your input data can't include unstructured textual item metadata, such as a product description.
- If you use a filter with placeholder parameters, you must include the values for the parameters in your input data in a filterValues object. For more information, see Providing filter values in your input JSON.
- We recommend that you use a different location for your output data (either a folder or a different Amazon S3 bucket) than your input data.
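Two of the guidelines above lend themselves to a small pre-flight check before you create a job. The following is a sketch under stated assumptions: the helper names are hypothetical, the `itemId` field and the `COUNTRY` parameter are placeholders, and wrapping each filter value in escaped double quotes is an assumed convention that should be confirmed in Providing filter values in your input JSON.

```python
import json


def build_input_line(item_id, filter_values=None):
    """Return one JSON line of input, optionally with a filterValues object."""
    record = {"itemId": item_id}
    if filter_values:
        # Assumed convention: each value is wrapped in double quotes and
        # multiple values for one parameter are joined with commas.
        record["filterValues"] = {
            name: ",".join(f'"{value}"' for value in values)
            for name, values in filter_values.items()
        }
    return json.dumps(record)


def same_location(input_path, output_path):
    """Return True if the output folder is the folder that holds the input file."""
    input_folder = input_path.rsplit("/", 1)[0] + "/"
    output_folder = output_path if output_path.endswith("/") else output_path + "/"
    return input_folder == output_folder


line = build_input_line("105", {"COUNTRY": ["Japan"]})
print(line)

if same_location("s3://amzn-s3-demo-bucket/input/items.json",
                 "s3://amzn-s3-demo-bucket/input/"):
    print("Consider writing output to a separate folder or bucket.")
```

Checking the locations up front keeps generated segment files from being mixed with input files in the same folder.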