Accessing the Data Catalog - Amazon Glue

Accessing the Data Catalog

You can use the Amazon Glue Data Catalog to discover and understand your data. The Data Catalog provides a consistent way to maintain schema definitions, data types, locations, and other metadata. You can access the Data Catalog using the following methods:

  • Amazon Glue console – You can access and manage the Data Catalog through the Amazon Glue console, a web-based user interface. The console allows you to browse and search for databases, tables, and their associated metadata, as well as create, update, and delete metadata definitions.

  • Amazon Glue crawler – Crawlers are programs that automatically scan your data sources and populate the Data Catalog with metadata. You can create and run crawlers to discover and catalog data from various sources like Amazon S3, Amazon RDS, Amazon DynamoDB, Amazon CloudWatch, and JDBC-compliant relational databases such as MySQL and PostgreSQL, as well as several non-Amazon sources such as Snowflake and Google BigQuery.

  • Amazon Glue APIs – You can access the Data Catalog programmatically using the Amazon Glue APIs. These APIs enable automation and integration with other applications and services.

  • Amazon Command Line Interface (Amazon CLI) – You can use the Amazon CLI to access and manage the Data Catalog from the command line. The CLI provides commands for creating, updating, and deleting metadata definitions, as well as querying and retrieving metadata information.

  • Integration with other Amazon services – The Data Catalog integrates with various other Amazon services, which can then use the metadata stored in the catalog. For example, you can use Amazon Athena to query data sources using the metadata in the Data Catalog, and use Amazon Lake Formation to manage data access and governance for the Data Catalog resources.
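
As an illustration of the programmatic access described above, the following is a minimal sketch in Python. It assumes the boto3 SDK, whose Glue client exposes `get_databases` and `get_tables` paginators. The helper names (`iter_databases`, `iter_tables`, `summarize_table`, `summarize_catalog`) are hypothetical and introduced only for this example. The helpers accept any client object, so they do not import boto3 themselves.

```python
from typing import Any, Dict, Iterator, List


def iter_databases(glue_client: Any) -> Iterator[str]:
    """Yield the name of every database in the Data Catalog."""
    paginator = glue_client.get_paginator("get_databases")
    for page in paginator.paginate():
        for database in page.get("DatabaseList", []):
            yield database["Name"]


def iter_tables(glue_client: Any, database_name: str) -> Iterator[Dict[str, Any]]:
    """Yield every table definition stored in one catalog database."""
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database_name):
        yield from page.get("TableList", [])


def summarize_table(table: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a table definition to its name, location, and column types."""
    # StorageDescriptor may be absent, so fall back to empty values.
    descriptor = table.get("StorageDescriptor", {})
    return {
        "name": table["Name"],
        "location": descriptor.get("Location"),
        "columns": [
            (column["Name"], column.get("Type"))
            for column in descriptor.get("Columns", [])
        ],
    }


def summarize_catalog(glue_client: Any) -> Dict[str, List[Dict[str, Any]]]:
    """Map each database name to summaries of the tables it contains."""
    return {
        name: [summarize_table(t) for t in iter_tables(glue_client, name)]
        for name in iter_databases(glue_client)
    }


# Example use (assumes boto3 is installed and credentials are configured):
#   import boto3
#   print(summarize_catalog(boto3.client("glue")))
```

Paginators are used here because a catalog can hold more databases or tables than a single API response returns. The same metadata is also available from the command line and the console, so this sketch is one option among the access methods listed above, not a required approach.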