Updating column statistics
Keeping statistics current improves query performance by enabling the query planner to choose optimal plans. The Data Catalog doesn't refresh column statistics automatically. To refresh them, you must explicitly run the Generate statistics task from the Amazon Glue console.
If you are not using Amazon Glue's statistics generation feature in the console, you can manually update column statistics using the UpdateColumnStatisticsForTable API operation or Amazon CLI. The following example shows how to update column statistics using Amazon CLI.
aws glue update-column-statistics-for-table --cli-input-json '{
    "CatalogId": "111122223333",
    "DatabaseName": "test_db",
    "TableName": "test_table",
    "ColumnStatisticsList": [
        {
            "ColumnName": "col1",
            "ColumnType": "Boolean",
            "AnalyzedTime": "1970-01-01T00:00:00",
            "StatisticsData": {
                "Type": "BOOLEAN",
                "BooleanColumnStatisticsData": {
                    "NumberOfTrues": 5,
                    "NumberOfFalses": 5,
                    "NumberOfNulls": 0
                }
            }
        }
    ]
}'
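The same update can also be sent through the UpdateColumnStatisticsForTable API operation from code. The following is a minimal sketch using the boto3 Glue client, assuming boto3 is installed and credentials are configured. The helper names build_boolean_stats and update_column_statistics are hypothetical and exist only for this illustration. The account ID, database, table, and column values are the placeholders from the CLI example.

```python
from datetime import datetime, timezone


def build_boolean_stats(column_name, trues, falses, nulls, analyzed_time=None):
    """Build one ColumnStatisticsList entry for a Boolean column.

    Hypothetical helper: it only assembles the request structure and
    does not call Amazon Glue.
    """
    return {
        "ColumnName": column_name,
        "ColumnType": "Boolean",
        # Defaults to the current UTC time when no timestamp is supplied.
        "AnalyzedTime": analyzed_time or datetime.now(timezone.utc),
        "StatisticsData": {
            "Type": "BOOLEAN",
            "BooleanColumnStatisticsData": {
                "NumberOfTrues": trues,
                "NumberOfFalses": falses,
                "NumberOfNulls": nulls,
            },
        },
    }


def update_column_statistics(catalog_id, database, table, stats, client=None):
    """Send the statistics to the Data Catalog and return any per-column errors.

    Hypothetical wrapper around the Glue client's
    update_column_statistics_for_table call.
    """
    if client is None:
        # Imported lazily so the request-building helper works without boto3.
        import boto3

        client = boto3.client("glue")
    response = client.update_column_statistics_for_table(
        CatalogId=catalog_id,
        DatabaseName=database,
        TableName=table,
        ColumnStatisticsList=stats,
    )
    # The response lists the columns whose statistics could not be updated.
    return response.get("Errors", [])
```

To reproduce the CLI example, call update_column_statistics("111122223333", "test_db", "test_table", [build_boolean_stats("col1", 5, 5, 0)]) and check that the returned list of errors is empty.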