Optimize queries with Amazon Glue partition indexing and filtering
When Athena queries partitioned tables, it retrieves and filters the available table partitions to the subset relevant to your query. As new data and partitions are added, more time is required to process the partitions, and query runtime can increase. If you have a table with a large number of partitions that grows over time, consider using Amazon Glue partition indexing and filtering. Partition indexing allows Athena to optimize partition processing and improve query performance on highly partitioned tables. Setting up partition filtering in a table's properties is a two-step process:
-
Creating a partition index in Amazon Glue.
-
Enabling partition filtering for the table.
Creating a partition index
For steps on creating a partition index in Amazon Glue, see Working with partition indexes in the Amazon Glue Developer Guide. For the limitations on partition indexes in Amazon Glue, see the About partition indexes section on that page.
Enabling partition filtering
To enable partition filtering for the table, you must set a new table property in
Amazon Glue. For steps on how to set table properties in Amazon Glue, refer to the Setting up partition projection
-
For Key, add
partition_filtering.enabled
-
For Value, add
true
You can disable partition filtering on this table at any time by setting the
partition_filtering.enabled
value to false
.
After you complete the above steps, you can return to the Athena console to query the data.
For more information about using partition indexing and filtering, see Improve Amazon Athena query performance using Amazon Glue Data Catalog partition indexes