Creating objects in the Amazon Glue Data Catalog - Amazon Lake Formation

Amazon Lake Formation uses the Amazon Glue Data Catalog (Data Catalog) to store metadata about data lakes, data sources, transforms, and targets. Metadata is data about the underlying data in your dataset. Each Amazon Web Services account has one Data Catalog per Amazon Web Services Region.

Metadata in the Data Catalog is organized in a three-level hierarchy comprising catalogs, databases, and tables. The Data Catalog organizes data from various sources into logical containers called catalogs. Each catalog represents data from a source such as an Amazon Redshift data warehouse, an Amazon DynamoDB database, or a third-party data source such as Snowflake or MySQL. More than 30 external data sources are integrated through federated connectors. You can also create new catalogs in the Data Catalog to store data in S3 Table Buckets or Redshift Managed Storage (RMS).

Tables store information about the underlying data, including schema information, partition information, and data location. Databases are collections of tables. The Data Catalog also contains resource links, which point to shared catalogs, databases, and tables in external accounts and are used for cross-account access to data in the data lake.
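To make the table metadata and resource links described above more concrete, the following minimal sketch builds two plain Python dictionaries. Their field names (`StorageDescriptor`, `Columns`, `Location`, `PartitionKeys`, `TargetTable`) follow the `TableInput` structure of the Glue `CreateTable` API as exposed by boto3. The helper function names, account ID, bucket, and table names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def build_table_input(
    name: str,
    columns: List[Tuple[str, str]],
    partition_keys: List[Tuple[str, str]],
    location: str,
) -> Dict:
    """Describe a table: its schema, partition keys, and data location."""
    return {
        "Name": name,
        "StorageDescriptor": {
            # Schema information: one entry per column.
            "Columns": [{"Name": n, "Type": t} for n, t in columns],
            # Data location, for example an Amazon S3 prefix.
            "Location": location,
        },
        # Partition information: columns the data is partitioned by.
        "PartitionKeys": [{"Name": n, "Type": t} for n, t in partition_keys],
    }


def build_table_resource_link_input(
    link_name: str,
    owner_account_id: str,
    shared_database: str,
    shared_table: str,
) -> Dict:
    """Describe a resource link that points to a table shared by another account."""
    return {
        "Name": link_name,
        "TargetTable": {
            "CatalogId": owner_account_id,  # account that owns the shared table
            "DatabaseName": shared_database,
            "Name": shared_table,
        },
    }


# Hypothetical example values.
orders_table = build_table_input(
    name="orders",
    columns=[("order_id", "bigint"), ("amount", "double")],
    partition_keys=[("order_date", "date")],
    location="s3://example-bucket/orders/",
)
orders_link = build_table_resource_link_input(
    link_name="orders_link",
    owner_account_id="111122223333",
    shared_database="sales",
    shared_table="orders",
)
```

Under these assumptions, either dictionary could be passed as the `TableInput` argument of a boto3 Glue client's `create_table` call, together with the name of the local database that should contain the table or link. Creating the object also requires the appropriate Lake Formation and IAM permissions, which this sketch does not cover.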

The Data Catalog is a nested catalog object that contains catalogs, databases, and tables. It is referenced by the Amazon Web Services account ID and is the default catalog for an account in an Amazon Web Services Region. Tables are addressed through a three-level hierarchy (catalog.database.table) with the following levels:

  • Catalog – The topmost level of the Data Catalog's three-level metadata hierarchy. You can add multiple catalogs in a Data Catalog through federation.

  • Database – The second level of the metadata hierarchy, comprising tables and views. A database is also referred to as a schema in many data systems, such as Amazon Redshift and Trino.

  • Table and view – The third level of the Data Catalog's three-level metadata hierarchy.
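The three levels can be sketched as a small data model. This is an illustrative, in-memory example only: the `TableRef` class, the `parse_table_ref` and `table_exists` helpers, and the sample hierarchy are hypothetical and not part of any Amazon Web Services SDK. It also assumes that none of the three names contains a dot.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class TableRef:
    """A fully qualified reference: catalog.database.table."""

    catalog: str
    database: str
    table: str

    def qualified_name(self) -> str:
        return f"{self.catalog}.{self.database}.{self.table}"


def parse_table_ref(name: str) -> TableRef:
    """Split a dotted name into its catalog, database, and table parts."""
    parts = name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected catalog.database.table, got {name!r}")
    return TableRef(*parts)


# Hypothetical contents: catalog -> database -> set of table names.
Hierarchy = Dict[str, Dict[str, Set[str]]]

SAMPLE_HIERARCHY: Hierarchy = {
    "123456789012": {"sales": {"orders", "customers"}},
}


def table_exists(hierarchy: Hierarchy, ref: TableRef) -> bool:
    """Walk the hierarchy level by level: catalog, then database, then table."""
    return ref.table in hierarchy.get(ref.catalog, {}).get(ref.database, set())


ref = parse_table_ref("123456789012.sales.orders")
found = table_exists(SAMPLE_HIERARCHY, ref)
```

The lookup mirrors the hierarchy itself: a table is only reachable through its database, and a database only through its catalog.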

All Iceberg tables in Amazon S3 are stored in the default Data Catalog, whose catalog ID is the Amazon Web Services account ID. You can also create federated catalogs in the Amazon Glue Data Catalog that store definitions of tables in Amazon Redshift, Amazon S3 Table storage, or third-party data sources.
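The default-catalog rule can be sketched as follows. The helper builds keyword arguments shaped like those accepted by the boto3 Glue client's `get_table` call (`CatalogId`, `DatabaseName`, `Name`) and falls back to the account ID when no catalog is given. The function name, the sample names, and the federated catalog ID string are hypothetical placeholders. This section does not specify the ID format of federated catalogs, so the placeholder should not be read as a real identifier.

```python
import re
from typing import Dict, Optional

# Amazon Web Services account IDs are 12-digit numbers.
ACCOUNT_ID_PATTERN = re.compile(r"\d{12}")


def get_table_kwargs(
    account_id: str,
    database: str,
    table: str,
    catalog_id: Optional[str] = None,
) -> Dict[str, str]:
    """Build lookup arguments, defaulting the catalog to the account ID."""
    if not ACCOUNT_ID_PATTERN.fullmatch(account_id):
        raise ValueError(f"not a 12-digit account ID: {account_id!r}")
    return {
        # Default Data Catalog: catalog ID equals the account ID.
        "CatalogId": catalog_id if catalog_id else account_id,
        "DatabaseName": database,
        "Name": table,
    }


# Table in the default Data Catalog (hypothetical names).
default_lookup = get_table_kwargs("123456789012", "sales", "orders")

# Table in a federated catalog; the catalog ID here is a placeholder only.
federated_lookup = get_table_kwargs(
    "123456789012", "analytics", "events", catalog_id="example-federated-catalog-id"
)
```

With boto3 installed and credentials configured, a dictionary like `default_lookup` could be unpacked into `get_table(**default_lookup)` on a Glue client.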