Use Amazon Glue partition indexing and filtering - Amazon Athena
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Use Amazon Glue partition indexing and filtering

When Athena queries partitioned tables, it retrieves and filters the available table partitions to the subset relevant to your query. As new data and partitions are added, more time is required to process the partitions, and query runtime can increase. If you have a table with a large number of partitions that grows over time, consider using Amazon Glue partition indexing and filtering. Partition indexing allows Athena to optimize partition processing and improve query performance on highly partitioned tables. Setting up partition filtering in a table's properties is a two-step process:

  1. Creating a partition index in Amazon Glue.

  2. Enabling partition filtering for the table.

Creating a partition index

For steps on creating a partition index in Amazon Glue, see Working with partition indexes in the Amazon Glue Developer Guide. For the limitations on partition indexes in Amazon Glue, see the About partition indexes section on that page.

Enabling partition filtering

To enable partition filtering for the table, you must set a new table property in Amazon Glue. For steps on how to set table properties in Amazon Glue, refer to the Setting up partition projection page. When you edit the table details in Amazon Glue, add the following key-value pair to the Table properties section:

  • For Key, add partition_filtering.enabled

  • For Value, add true

You can disable partition filtering on this table at any time by setting the partition_filtering.enabled value to false.

After you complete the above steps, you can return to the Athena console to query the data.

For more information about using partition indexing and filtering, see Improve Amazon Athena query performance using Amazon Glue Data Catalog partition indexes in the Amazon Big Data Blog.