Entropy - Amazon Glue
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Entropy

Checks whether the entropy value of a column matches a given expression. Entropy measures the level of information that's contained in a message. Given the probability distribution over values in a column, entropy describes how many bits are required to identify a value.

Syntax

Entropy <COL_NAME> <EXPRESSION>
  • COL_NAME – The name of the column that you want to evaluate the data quality rule against.

    Supported column types: Any column type

  • EXPRESSION – An expression to run against the rule type response in order to produce a Boolean value. For more information, see Expressions.

Example: Column entropy

The following example rule checks that the column named Feedback has an entropy value greater than one.

Entropy "Star_Rating" > 1 Entropy "First_Name" > 1 where "Customer_ID < 10"

Sample dynamic rules

  • Entropy "colA" < max(last(10))

  • Entropy "colA" between min(last(10)) and max(last(10))