Creating an Amazon S3 Tables catalog in the Amazon Glue Data Catalog - Amazon Lake Formation

Amazon S3 Tables provide S3 storage that is optimized for analytics workloads, improving query performance while reducing costs. The data in S3 Tables is stored in a new bucket type, the table bucket, which stores tables as subresources. S3 Tables have built-in support for the Apache Iceberg standard, so you can query tabular data in Amazon S3 table buckets with popular query engines such as Apache Spark.

You can integrate Amazon S3 table buckets and tables with Amazon Glue Data Catalog (Data Catalog), and register the catalog as a Lake Formation data location from the Lake Formation console or using service APIs.

For more information, see Using Amazon S3 Tables with Amazon analytics services in the Amazon Simple Storage Service User Guide.
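
As a rough illustration of the service API route, the following sketch builds the request that an Amazon Glue `CreateCatalog` call might take to create the federated catalog. Only the catalog name `s3tablescatalog` comes from this page. The ARN pattern, the `aws:s3tables` connection name, the empty default-permission lists, the `aws-cn` partition default, and the placeholder account ID and Region are assumptions; check them against the Amazon Glue API reference before you rely on them.

```python
import json


def build_s3tables_catalog_request(account_id: str, region: str,
                                   partition: str = "aws-cn") -> dict:
    """Build keyword arguments for a Glue CreateCatalog call (sketch only)."""
    # Assumed ARN pattern that covers every table bucket in the account and Region.
    identifier = f"arn:{partition}:s3tables:{region}:{account_id}:bucket/*"
    return {
        "Name": "s3tablescatalog",  # federated catalog name given in this topic
        "CatalogInput": {
            "FederatedCatalog": {
                "Identifier": identifier,
                "ConnectionName": "aws:s3tables",  # assumed connection name
            },
            # Assumed: empty lists so that no default permissions are granted.
            "CreateDatabaseDefaultPermissions": [],
            "CreateTableDefaultPermissions": [],
        },
    }


if __name__ == "__main__":
    # Placeholder account ID and Region, for illustration only.
    request = build_s3tables_catalog_request("111122223333", "cn-north-1")
    print(json.dumps(request, indent=2))
```

To send the request, you could pass the dictionary to `boto3.client("glue").create_catalog(**request)`, which requires boto3 and credentials with the appropriate permissions. Building the request separately keeps the sketch runnable without access to an account.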

How Data Catalog and Lake Formation integration works

When you integrate S3 Tables with the Data Catalog and Lake Formation, the Amazon Glue service creates a single federated catalog called s3tablescatalog in your account's default Data Catalog for your Amazon Web Services Region. The integration maps all Amazon S3 table bucket resources in your account and Amazon Web Services Region under the federated catalog in the following manner:

  • Amazon S3 table buckets become a multi-level catalog in the Data Catalog.

  • The associated Amazon S3 namespace is registered as a database in the Data Catalog.

  • The Amazon S3 tables in the table bucket become tables in the Data Catalog.

Mapping of objects between S3 Tables and Amazon Glue Data Catalog.
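
The mapping described above can be sketched as a small helper that turns an S3 Tables resource into the identifiers you would use in the Data Catalog. The catalog ID layout `<account-id>:s3tablescatalog/<table-bucket>` is an assumption based on common usage, and the function and class names are hypothetical. The account ID, bucket, namespace, and table values are placeholders.

```python
from dataclasses import dataclass

FEDERATED_CATALOG = "s3tablescatalog"  # federated catalog name given in this topic


@dataclass(frozen=True)
class CatalogReference:
    """Where an S3 table appears in the Data Catalog (illustrative type)."""
    catalog_id: str  # child catalog that represents the table bucket
    database: str    # the S3 Tables namespace
    table: str       # the S3 table


def to_data_catalog(account_id: str, table_bucket: str,
                    namespace: str, table: str) -> CatalogReference:
    """Map an S3 Tables resource to Data Catalog identifiers (sketch only)."""
    # Assumed ID layout: <account-id>:s3tablescatalog/<table-bucket>
    catalog_id = f"{account_id}:{FEDERATED_CATALOG}/{table_bucket}"
    return CatalogReference(catalog_id=catalog_id, database=namespace, table=table)


if __name__ == "__main__":
    ref = to_data_catalog("111122223333", "amzn-s3-demo-bucket", "sales", "orders")
    print(ref)
```

The namespace and table names pass through unchanged. Only the table bucket is rewritten, because it becomes a catalog nested under s3tablescatalog.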

After integrating with Lake Formation, you can create Apache Iceberg tables in the table bucket catalogs and access them through integrated Amazon analytics engines such as Amazon Athena and Amazon EMR, as well as through third-party analytics engines.
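
As one possible way to access a table from Amazon Athena, the sketch below builds the arguments for an Athena `StartQueryExecution` call. It assumes that Athena addresses a table bucket as the catalog `s3tablescatalog/<table-bucket>`. The helper name, bucket, namespace, table, and output location are hypothetical placeholders.

```python
def athena_query_request(table_bucket: str, namespace: str, table: str,
                         output_location: str, limit: int = 10) -> dict:
    """Build keyword arguments for Athena StartQueryExecution (sketch only)."""
    # Quote identifiers so names with special characters still parse.
    query = f'SELECT * FROM "{namespace}"."{table}" LIMIT {int(limit)}'
    return {
        "QueryString": query,
        "QueryExecutionContext": {
            # Assumed catalog naming: s3tablescatalog/<table-bucket>
            "Catalog": f"s3tablescatalog/{table_bucket}",
            "Database": namespace,
        },
        "ResultConfiguration": {"OutputLocation": output_location},
    }


if __name__ == "__main__":
    request = athena_query_request(
        "amzn-s3-demo-bucket", "sales", "orders",
        "s3://amzn-s3-demo-results/athena/",  # placeholder results location
    )
    print(request["QueryString"])
```

You could pass the result to `boto3.client("athena").start_query_execution(**request)`. The calling principal also needs Lake Formation permissions on the table for the query to succeed.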