Viewing column statistics - Amazon Glue

Viewing column statistics

After the statistics are generated successfully, the Data Catalog stores them so that the cost-based optimizers in Amazon Athena and Amazon Redshift can make optimal choices when running queries. The statistics vary based on the column type.

Amazon Web Services Management Console
To view column statistics for a table
  • After you run a column statistics task, the Column statistics tab on the Table details page shows the statistics for the table.

    The screenshot shows the column statistics generated from the most recent run.

    The following statistics are available:

    • Column name: The name of the column for which statistics were generated.

    • Last updated: The date and time when the statistics were generated.

    • Average length: The average length of values in the column.

    • Distinct values: Total number of distinct values in the column. We estimate the number of distinct values in a column with 5% relative error.

    • Max value: The largest value in the column.

    • Min value: The smallest value in the column.

    • Max length: The length of the longest value in the column.

    • Null values: The total number of null values in the column.

    • True values: The total number of true values in the column.

    • False values: The total number of false values in the column.

    • numFiles: The total number of files in the table. This value is available under the Advanced properties tab.
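Because the available statistics depend on the column type, it can help to see which fields to expect for each type. The following Python sketch is illustrative only: the field names follow the per-type structures of the Glue ColumnStatisticsData API object (for example, BooleanColumnStatisticsData), while the function name summarize_column and the sample entry are hypothetical. How each field maps to a console label is an assumption, not something this page states.

```python
# Per-type payload key and the fields it carries, following the
# Glue ColumnStatisticsData structure. Which console label each
# field corresponds to is an assumption made for illustration.
NUMERIC_FIELDS = ("MinimumValue", "MaximumValue",
                  "NumberOfDistinctValues", "NumberOfNulls")

FIELDS_BY_TYPE = {
    "BOOLEAN": ("BooleanColumnStatisticsData",
                ("NumberOfTrues", "NumberOfFalses", "NumberOfNulls")),
    "STRING": ("StringColumnStatisticsData",
               ("AverageLength", "MaximumLength",
                "NumberOfDistinctValues", "NumberOfNulls")),
    "BINARY": ("BinaryColumnStatisticsData",
               ("AverageLength", "MaximumLength", "NumberOfNulls")),
    "LONG": ("LongColumnStatisticsData", NUMERIC_FIELDS),
    "DOUBLE": ("DoubleColumnStatisticsData", NUMERIC_FIELDS),
    "DECIMAL": ("DecimalColumnStatisticsData", NUMERIC_FIELDS),
    "DATE": ("DateColumnStatisticsData", NUMERIC_FIELDS),
}


def summarize_column(entry):
    """Flatten one ColumnStatistics entry into a simple dict.

    Only the fields that apply to the column's statistics type and
    are present in the entry are copied into the result.
    """
    data = entry["StatisticsData"]
    payload_key, fields = FIELDS_BY_TYPE[data["Type"]]
    payload = data.get(payload_key, {})
    summary = {"ColumnName": entry["ColumnName"], "Type": data["Type"]}
    for field in fields:
        if field in payload:
            summary[field] = payload[field]
    return summary


# Hypothetical entry shaped like one item of ColumnStatisticsList.
sample_entry = {
    "ColumnName": "is_active",
    "ColumnType": "boolean",
    "StatisticsData": {
        "Type": "BOOLEAN",
        "BooleanColumnStatisticsData": {
            "NumberOfTrues": 7,
            "NumberOfFalses": 3,
            "NumberOfNulls": 0,
        },
    },
}

print(summarize_column(sample_entry))
```

A Boolean column therefore reports true, false, and null counts but no minimum, maximum, or length values, whereas a string column reports lengths and distinct values but no true or false counts.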

Amazon CLI

The following example shows how to retrieve column statistics using the Amazon CLI.

aws glue get-column-statistics-for-table \
    --database-name <test_db> \
    --table-name <test_tble> \
    --column-names <col1>

You can also view the column statistics using the GetColumnStatisticsForTable API operation.
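
As a minimal sketch of calling that operation from code, the following Python function wraps the boto3 Glue client method get_column_statistics_for_table. The helper name fetch_column_statistics and the batch_size parameter are hypothetical; batching in groups of 100 assumes the documented limit on the number of column names per request, so verify that limit against the API reference. The client is passed in as an argument, which keeps the helper easy to test without Amazon credentials.

```python
def fetch_column_statistics(glue_client, database_name, table_name,
                            column_names, batch_size=100):
    """Retrieve column statistics for a table in batches.

    glue_client is expected to be a boto3 Glue client, for example
    boto3.client("glue"). Returns a (statistics, errors) tuple that
    combines the ColumnStatisticsList and Errors of every response.
    """
    statistics, errors = [], []
    for start in range(0, len(column_names), batch_size):
        batch = column_names[start:start + batch_size]
        response = glue_client.get_column_statistics_for_table(
            DatabaseName=database_name,
            TableName=table_name,
            ColumnNames=batch,
        )
        statistics.extend(response.get("ColumnStatisticsList", []))
        # Columns without statistics are reported in Errors rather
        # than raising an exception, so collect them separately.
        errors.extend(response.get("Errors", []))
    return statistics, errors


# Example usage (requires boto3 and configured credentials):
#
#   import boto3
#   glue = boto3.client("glue")
#   stats, errs = fetch_column_statistics(
#       glue, "test_db", "test_table", ["col1"])
#   for column in stats:
#       print(column["ColumnName"], column["StatisticsData"]["Type"])
```

Checking the returned errors list is worthwhile because a column that has no stored statistics appears there instead of in the statistics list.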