BUCKETIZATION
Bucketization (called Binning in the console) takes a column of numeric values, groups the values into bins defined by numeric ranges, and outputs a new column that shows the bin for each row. You can define the bins using either splits or a percentage. The first example below uses splits, and the second uses a percentage.
Parameters
- sourceColumn – The name of an existing column.
- targetColumn – The name of the new column to be created.
- bucketNames – List of bucket names.
- splits – List of bucket levels. Buckets are consecutive, and an upper bound for a bucket will be a lower bound for the next bucket.
- percentage – Each bucket will be described as a percentage.
Example using splits
{
    "Action": {
        "Operation": "BUCKETIZATION",
        "Parameters": {
            "sourceColumn": "level",
            "targetColumn": "bin",
            "bucketNames": "[\"Bin1\",\"Bin2\",\"Bin3\"]",
            "splits": "[\"-Infinity\",\"2\",\"20\",\"Infinity\"]"
        }
    }
}
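The four splits in the example above bound three consecutive ranges, one per bucket name. The Python sketch below illustrates how such splits could map values to bins. It is not the service's implementation: the helper name assign_bucket is hypothetical, and the sketch assumes each range includes its lower bound and excludes its upper bound, which this reference does not specify.

```python
from bisect import bisect_right


def assign_bucket(value, splits, bucket_names):
    """Return the name of the bucket whose range contains value.

    Assumption (not stated in the reference): each range includes its
    lower bound and excludes its upper bound.
    """
    if len(splits) != len(bucket_names) + 1:
        raise ValueError("expected exactly one more split than bucket names")
    # bisect_right counts the splits that are <= value; subtracting 1
    # gives the index of the range that starts at or below value.
    index = bisect_right(splits, value) - 1
    # Clamp so a value equal to the final split stays in the last bucket.
    index = max(0, min(index, len(bucket_names) - 1))
    return bucket_names[index]


# Same splits and bucket names as the example above, parsed into numbers.
splits = [float("-inf"), 2.0, 20.0, float("inf")]
names = ["Bin1", "Bin2", "Bin3"]

for level in (1, 2, 19.5, 20, 500):
    print(level, assign_bucket(level, splits, names))
```

Under that assumption, a level of 1 falls in Bin1, levels of 2 and 19.5 fall in Bin2, and levels of 20 and 500 fall in Bin3.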
Example using a percentage
{
    "Action": {
        "Operation": "BUCKETIZATION",
        "Parameters": {
            "sourceColumn": "level",
            "targetColumn": "bin",
            "bucketNames": "[\"Bin1\",\"Bin2\"]",
            "percentage": "50"
        }
    }
}
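In both examples, bucketNames and splits are passed as strings that contain JSON arrays, and percentage is passed as a string. The following Python sketch shows one way to build such a step so the quoting comes out right. The helper name bucketization_step is hypothetical, and treating splits and percentage as mutually exclusive is an assumption drawn from the two examples rather than a documented rule.

```python
import json


def bucketization_step(source_column, target_column, bucket_names,
                       splits=None, percentage=None):
    """Build a BUCKETIZATION recipe step as a Python dict.

    Assumption: exactly one of splits or percentage is supplied,
    mirroring the two examples in this section.
    """
    if (splits is None) == (percentage is None):
        raise ValueError("provide exactly one of splits or percentage")

    params = {
        "sourceColumn": source_column,
        "targetColumn": target_column,
        # List-valued parameters are encoded as JSON text inside a string.
        "bucketNames": json.dumps(list(bucket_names), separators=(",", ":")),
    }
    if splits is not None:
        # Each split is written as a string, as in the splits example.
        params["splits"] = json.dumps([str(s) for s in splits],
                                      separators=(",", ":"))
    else:
        params["percentage"] = str(percentage)

    return {"Action": {"Operation": "BUCKETIZATION", "Parameters": params}}


step = bucketization_step(
    "level", "bin", ["Bin1", "Bin2", "Bin3"],
    splits=["-Infinity", 2, 20, "Infinity"],
)
print(json.dumps(step, indent=4))
```

Serializing the returned dict with json.dumps escapes the inner quotation marks automatically, which avoids hand-writing sequences such as \"Bin1\".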