Updating column statistics

Keeping column statistics current improves query performance because the query planner can choose better plans. The Data Catalog doesn't refresh column statistics automatically. To refresh them, you must explicitly run the Generate statistics task from the Amazon Glue console.

If you don't use the statistics generation feature in the Amazon Glue console, you can update column statistics manually with the UpdateColumnStatisticsForTable API operation or the Amazon CLI. The following example shows how to update column statistics using the Amazon CLI.

aws glue update-column-statistics-for-table --cli-input-json '{
    "CatalogId": "111122223333",
    "DatabaseName": "test_db",
    "TableName": "test_table",
    "ColumnStatisticsList": [
        {
            "ColumnName": "col1",
            "ColumnType": "Boolean",
            "AnalyzedTime": "1970-01-01T00:00:00",
            "StatisticsData": {
                "Type": "BOOLEAN",
                "BooleanColumnStatisticsData": {
                    "NumberOfTrues": 5,
                    "NumberOfFalses": 5,
                    "NumberOfNulls": 0
                }
            }
        }
    ]
}'
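
The same update can also be made by calling the UpdateColumnStatisticsForTable API operation from an SDK. The following is a minimal sketch in Python, assuming the boto3 library is installed and credentials and a Region are already configured. The helper names build_boolean_stats and update_column_statistics are hypothetical and chosen only for this illustration. The catalog ID, database, table, and column values are the placeholders from the CLI example.

```python
from datetime import datetime, timezone


def build_boolean_stats(column_name, trues, falses, nulls, analyzed_time=None):
    """Build one ColumnStatisticsList entry for a boolean column.

    Hypothetical helper for this sketch; it mirrors the JSON structure
    used in the CLI example.
    """
    for label, count in (("trues", trues), ("falses", falses), ("nulls", nulls)):
        if count < 0:
            raise ValueError(f"{label} must be non-negative, got {count}")
    return {
        "ColumnName": column_name,
        # Same type string as in the CLI example.
        "ColumnType": "Boolean",
        # Default to the current UTC time when no analysis time is given.
        "AnalyzedTime": analyzed_time or datetime.now(timezone.utc),
        "StatisticsData": {
            "Type": "BOOLEAN",
            "BooleanColumnStatisticsData": {
                "NumberOfTrues": trues,
                "NumberOfFalses": falses,
                "NumberOfNulls": nulls,
            },
        },
    }


def update_column_statistics(catalog_id, database, table, stats, client=None):
    """Send the statistics to the Data Catalog and return per-column errors.

    Hypothetical wrapper for this sketch. Pass `client` to reuse an
    existing Glue client; otherwise one is created with default settings.
    """
    if client is None:
        import boto3  # imported lazily so the builder works without boto3

        client = boto3.client("glue")
    response = client.update_column_statistics_for_table(
        CatalogId=catalog_id,
        DatabaseName=database,
        TableName=table,
        ColumnStatisticsList=stats,
    )
    # The response carries an "Errors" list with one entry per column
    # whose statistics could not be updated.
    return response.get("Errors", [])
```

To reproduce the CLI example, build the entry with build_boolean_stats("col1", 5, 5, 0) and pass it in a list to update_column_statistics("111122223333", "test_db", "test_table", ...). An empty return value means that no per-column errors were reported. Separating the payload builder from the API call is a design choice for this sketch: it lets you inspect or validate the statistics before anything is sent.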