Data science recipe steps - Amazon Glue DataBrew

Data science recipe steps

Use these recipe steps to tabulate and summarize data from different perspectives, or to perform advanced transformations.

SCALE

Scales or normalizes the range of data in a numeric column.

Parameters
  • sourceColumn – The name of an existing column.

  • strategy – The operation to be applied to the column values:

    • MIN_MAX – Rescales the values into the range [0, 1].

    • SCALE_BETWEEN – Rescales the values into a range bounded by two specified values.

    • MEAN_NORMALIZATION – Rescales the data to have a mean (μ) of 0, with all values falling within the range [-1, 1].

    • Z_SCORE – Linearly scales the data values to have a mean (μ) of 0 and a standard deviation (σ) of 1. Best suited to data that contains outliers.

  • targetColumn – The name of a column to contain the results.
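
The four strategies can be sketched in plain Python to show what each one computes. This is a minimal illustration using the textbook formulas, not DataBrew's implementation: the function names, the `new_min`/`new_max` arguments, and the choice of population standard deviation are assumptions made for this example, and the service may differ in details such as how it handles constant columns or missing values.

```python
from statistics import mean, pstdev


def min_max(values):
    """Rescale into [0, 1]. Assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def scale_between(values, new_min, new_max):
    """Rescale into [new_min, new_max] (hypothetical argument names)."""
    return [new_min + u * (new_max - new_min) for u in min_max(values)]


def mean_normalization(values):
    """Center on the mean and divide by the range, so the mean becomes 0
    and every value falls within [-1, 1]."""
    mu = mean(values)
    spread = max(values) - min(values)
    return [(v - mu) / spread for v in values]


def z_score(values):
    """Center on the mean and divide by the standard deviation.
    Population standard deviation is an assumption of this sketch."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


votes = [10, 20, 30, 40, 50]
print(min_max(votes))             # [0.0, 0.25, 0.5, 0.75, 1.0]
print(scale_between(votes, -1, 1))  # [-1.0, -0.5, 0.0, 0.5, 1.0]
print(mean_normalization(votes))  # [-0.5, -0.25, 0.0, 0.25, 0.5]
print(z_score(votes))
```

Note that MIN_MAX, SCALE_BETWEEN, and MEAN_NORMALIZATION all divide by the range of the column, so a single extreme value compresses every other value toward one end of the output range. Z_SCORE divides by the standard deviation instead and leaves the output unbounded.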

Example

{
    "Action": {
        "Operation": "NORMALIZATION",
        "Parameters": {
            "sourceColumn": "all_votes",
            "strategy": "MIN_MAX",
            "targetColumn": "all_votes_normalized"
        }
    }
}