Use Amazon Glue Data Catalog to connect to your data
Athena uses the Amazon Glue Data Catalog to store metadata such as table and column names for your data stored in Amazon S3. This metadata becomes the databases, tables, and views that you see in the Athena query editor.
When using Athena with the Amazon Glue Data Catalog, you can use Amazon Glue to create databases and tables (schemas) to be queried in Athena, or you can use Athena to create schemas and then use them in Amazon Glue and related services.
To define schema information for Amazon Glue, you can use a form in the Athena console, use the query editor in Athena, or create an Amazon Glue crawler in the Amazon Glue console. Amazon Glue crawlers automatically infer database and table schema from your data in Amazon S3. Using a form offers more customization. Writing your own CREATE TABLE statements requires more effort, but offers the most control. For more information, see CREATE TABLE.
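To make the hand-written option concrete, the following is a minimal sketch of a Python helper that assembles a CREATE EXTERNAL TABLE statement for delimited text data. The function name build_create_table_ddl, the database sales_db, the table orders, its columns, and the bucket path are all hypothetical and chosen only for illustration. The sketch covers simple delimited files and does not handle partitions, SerDe properties, or other table options.

```python
def build_create_table_ddl(database, table, columns, location, delimiter=","):
    """Return an Athena CREATE EXTERNAL TABLE statement for delimited text data.

    columns is a list of (name, type) tuples, for example [("id", "string")].
    All names passed in are the caller's own; nothing here is looked up in Athena.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # One backtick-quoted column definition per line, separated by commas.
    column_lines = ",\n".join(f"  `{name}` {col_type}" for name, col_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"{column_lines}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        f"FIELDS TERMINATED BY '{delimiter}'\n"
        f"LOCATION '{location}'"
    )


# Hypothetical database, table, columns, and S3 path, for illustration only.
ddl = build_create_table_ddl(
    "sales_db",
    "orders",
    [("order_id", "string"), ("amount", "double")],
    "s3://amzn-s3-demo-bucket/orders/",
)
print(ddl)
```

The resulting string could be pasted into the Athena query editor. Running it registers the table's metadata in the Data Catalog, where Amazon Glue and related services can then use it.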
Additional Resources
- For more information about the Amazon Glue Data Catalog, see Data Catalog and crawlers in Amazon Glue in the Amazon Glue Developer Guide.
- For an illustrative article showing how to use Amazon Glue and Athena to process XML data, see Process and analyze highly nested and large XML files using Amazon Glue and Amazon Athena in the Amazon Big Data Blog.
- Separate charges apply to Amazon Glue. For more information, see Amazon Glue pricing.
Topics
- Register and use data catalogs in Athena
- Register a Data Catalog from another account
- Control access to data catalogs with IAM policies
- Use a form in the Athena console to add an Amazon Glue table
- Use a crawler to add a table
- Optimize queries with Amazon Glue partition indexing and filtering
- Use the Amazon CLI to recreate an Amazon Glue database and its tables
- Create tables for ETL jobs
- Work with CSV data in Amazon Glue
- Work with geospatial data in Amazon Glue