Viewing column statistics
After the statistics are generated successfully, the Data Catalog stores them so that the
cost-based optimizers in Amazon Athena and Amazon Redshift can make optimal choices when
running queries. The available statistics vary based on the type of the column.
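As a rough illustration of how the statistics vary by column type, the following sketch lists the members that the Glue API's column statistics structures carry for each type. The constant and function names (STATISTICS_BY_TYPE, statistics_for_type) are hypothetical and used only for illustration; the member names are assumed to match the Glue API's per-type statistics structures.

```python
# Hypothetical lookup table: which statistics members exist per column type.
# Member names are assumed to follow the Glue API's per-type structures
# (for example, BooleanColumnStatisticsData or StringColumnStatisticsData).
_RANGE_STATS = ["MinimumValue", "MaximumValue", "NumberOfNulls", "NumberOfDistinctValues"]

STATISTICS_BY_TYPE = {
    "BOOLEAN": ["NumberOfTrues", "NumberOfFalses", "NumberOfNulls"],
    "DATE": _RANGE_STATS,
    "DECIMAL": _RANGE_STATS,
    "DOUBLE": _RANGE_STATS,
    "LONG": _RANGE_STATS,
    "STRING": ["MaximumLength", "AverageLength", "NumberOfNulls", "NumberOfDistinctValues"],
    "BINARY": ["MaximumLength", "AverageLength", "NumberOfNulls"],
}


def statistics_for_type(stat_type):
    """Return the statistics members expected for a statistics type (case-insensitive)."""
    key = stat_type.upper()
    if key not in STATISTICS_BY_TYPE:
        raise ValueError(f"unknown statistics type: {stat_type!r}")
    # Return a copy so callers cannot mutate the shared table.
    return list(STATISTICS_BY_TYPE[key])


if __name__ == "__main__":
    for name in STATISTICS_BY_TYPE:
        print(name, "->", ", ".join(statistics_for_type(name)))
```

Under these assumptions, true and false counts appear only for Boolean columns, length statistics only for string and binary columns, and minimum and maximum values only for numeric and date columns.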
- Amazon Web Services Management Console
To view column statistics for a table
- After you run the column statistics task, the Column statistics tab on the Table details page shows the statistics for the table.
The following statistics are available:
Column name: The name of the column for which statistics were generated.
Last updated: The date and time when the statistics were generated.
Average length: The average length of values in the column.
Distinct values: Total number of distinct values in the column. We estimate the number of
distinct values in a column with 5% relative error.
Max value: The largest value in the column.
Min value: The smallest value in the column.
Max length: The length of the longest value in the column.
Null values: The total number of null values in the column.
True values: The total number of true values in the column.
False values: The total number of false values in the column.
- numFiles: The total number of files in the table. This value is available under the Advanced properties tab.
- Amazon CLI
The following example shows how to retrieve column statistics using the
Amazon CLI.
aws glue get-column-statistics-for-table \
    --database-name database_name \
    --table-name table_name \
    --column-names <column_name>
You can also view the column statistics using the GetColumnStatisticsForTable API operation.
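The API operation can also be called from an SDK. The following is a minimal sketch using the AWS SDK for Python (boto3) and its get_column_statistics_for_table client method. The helper name fetch_column_statistics and the shape of the dictionary it returns are hypothetical, and the response fields it reads (ColumnStatisticsList, StatisticsData, Errors) are assumed to match the Glue API reference.

```python
def fetch_column_statistics(database_name, table_name, column_names, client=None):
    """Return (stats, errors) for the requested columns of a Data Catalog table.

    `client` is an optional Glue client; when omitted, a default boto3 client
    is created, which requires boto3 and configured credentials.
    """
    if client is None:
        import boto3  # imported lazily so the helper can be exercised with a stub client
        client = boto3.client("glue")

    response = client.get_column_statistics_for_table(
        DatabaseName=database_name,
        TableName=table_name,
        ColumnNames=list(column_names),
    )

    stats = {}
    for entry in response.get("ColumnStatisticsList", []):
        data = entry["StatisticsData"]
        # Assumption: the populated member is named after the type,
        # for example LONG -> LongColumnStatisticsData.
        member = data["Type"].capitalize() + "ColumnStatisticsData"
        stats[entry["ColumnName"]] = {
            "type": data["Type"],
            "last_updated": entry.get("AnalyzedTime"),
            "values": data.get(member, {}),
        }

    # Columns whose statistics could not be retrieved are reported separately.
    errors = {
        err.get("ColumnName"): err.get("Error", {}).get("ErrorMessage")
        for err in response.get("Errors", [])
    }
    return stats, errors


if __name__ == "__main__":
    # Placeholder names, as in the CLI example above.
    stats, errors = fetch_column_statistics("database_name", "table_name", ["column_name"])
    for column, info in stats.items():
        print(column, info["type"], info["values"])
    for column, message in errors.items():
        print("no statistics for", column, ":", message)
```

Passing the client as a parameter keeps the parsing logic testable without network access; in normal use the default client is sufficient.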