

# Register and use data catalogs in Athena
<a name="gdc-register"></a>

Athena supports mounting and connecting to multiple data catalogs. 
+ You can mount Amazon Redshift data in the Amazon Glue Data Catalog and query it from Athena without having to copy or move data. For more information, see [Bringing Amazon Redshift data into the Amazon Glue Data Catalog](https://docs.amazonaws.cn/lake-formation/latest/dg/managing-namespaces-datacatalog.html).
+  Connect the Amazon Glue Data Catalog to external data sources using Amazon Glue connections, and create federated catalogs to centrally manage permissions to the data with fine-grained access control using Lake Formation. For more information, see [Register your connection as a Glue Data Catalog](register-connection-as-gdc.md).
+ Create catalogs from Amazon S3 table buckets, and use Lake Formation to centrally manage access permissions and restrict user access to objects within the table bucket. For more information, see [Working with Amazon S3 Tables and table buckets](https://docs.amazonaws.cn/AmazonS3/latest/userguide/s3-tables.html) in the Amazon S3 User Guide.

**Note**  
For any Glue catalog, you can only register a multi-level catalog like `123412341234:my_catalog/my_child`. You cannot register a single-level catalog like `123412341234:linkcontainer` or `my_catalog`. Single-level catalogs can only be queried by using the Glue data catalog directly in the Athena query. For more information, see [Query Amazon Glue data catalogs in Athena](gdc-register-query-the-data-source.md). 

**Topics**
+ [

# Register Redshift data catalogs in Athena
](gdc-register-rs.md)
+ [

# Register federated catalogs in Athena
](gdc-register-federated.md)
+ [

# Register S3 table bucket catalogs and query Tables from Athena
](gdc-register-s3-table-bucket-cat.md)
+ [

# Query Amazon Glue data catalogs in Athena
](gdc-register-query-the-data-source.md)

# Register Redshift data catalogs in Athena
<a name="gdc-register-rs"></a>

Athena can read and write data stored in Redshift clusters or serverless namespaces that have been registered in the Amazon Glue Data Catalog. This works in tandem with Amazon Lake Formation, which provides centralized security and governance, ensuring that data access is managed consistently across different query engines and maintaining fine-grained access controls for the shared Redshift data.

## Considerations and limitations
<a name="gdc-register-rs-considerations-and-limitations"></a>
+ **Materialized views** – Amazon Redshift materialized views are queryable from Athena but creating materialized views using Athena or Spark is not supported.
+ DDL operations, including setting Amazon Glue Data Catalog configuration and operations on Amazon Redshift managed storage tables, are not supported.

## Prerequisites
<a name="gdc-register-rs-prerequisites"></a>

Before you can query a Amazon Glue data catalog from Athena, complete the following tasks:

1. Create and register an Amazon Redshift cluster or serverless namespace to the Amazon Glue Data Catalog. For more information, see [Registering a cluster to the Amazon Glue Data Catalog](https://docs.amazonaws.cn/redshift/latest/mgmt/register-cluster.html) or [Registering namespaces to the Amazon Glue Data Catalog](https://docs.amazonaws.cn/redshift/latest/mgmt/serverless_datasharing-register-namespace.html) in the Amazon Redshift Management guide.

1. Create a data catalog in Amazon Lake Formation from the registered namespace. For more information, see [Creating Amazon Redshift federated catalogs](https://docs.amazonaws.cn/lake-formation/latest/dg/create-ns-catalog.html) in the Amazon Lake Formation Developer Guide.

1. (Optional) Use Lake Formation to set fine-grained access controls on the catalog. For more information, see [Bringing your data into the Amazon Glue Data Catalog](https://docs.amazonaws.cn/lake-formation/latest/dg/bring-your-data-overview.html) in the Amazon Lake Formation Developer Guide.

## Register a Redshift data catalog with the Athena console
<a name="gdc-register-rs-console-steps"></a>

To register a Redshift data catalog with the Athena console, perform the following steps.

1. Open the Athena console at [https://console.aws.amazon.com/athena/](https://console.aws.amazon.com/athena/).

1. In the navigation pane, choose **Data sources and catalogs**.

1. On the **Data sources and catalogs** page, choose **Create data source**.

1. For **Choose a data source**, choose **Amazon S3 - Amazon Glue Data Catalog**.

1. In the **Amazon Glue Data Catalog** section, for **Data source account**, choose **Amazon Glue Data Catalog in this account**.

1. For **Create a table or register a catalog**, choose **Register a new Amazon Glue Catalog**.

1. In the **Data source details** section, for **Data source name**, enter the name that you want to use to specify the data source in your SQL queries or use the default name that is generated.

1. For **Catalog**, choose **Browse** to search for a list of Amazon Glue catalogs in the same account. If you don't see any existing catalogs, create one in [Amazon Glue console](https://console.aws.amazon.com/glue/). 

1. In the **Browse Amazon Glue catalogs** dialog box, select the catalog that you want to use, and then choose **Choose**.

1. (Optional) For **Tags**, enter any key/value pairs that you want to associate with the data source.

1. Choose **Next**.

1. On the **Review and create** page, verify that the information that you entered is correct, and then choose **Create data source**.

# Register federated catalogs in Athena
<a name="gdc-register-federated"></a>

After you create connections to federated data sources, you can register them as federated data catalogs for simplified data discovery and manage data access with fine-grained permissions using Lake Formation. For more information, see [Register your connection as a Glue Data Catalog](register-connection-as-gdc.md).

## Considerations and limitations
<a name="gdc-register-federated-consideration"></a>
+ DDL operations are not supported on federated catalogs. 
+ You can register the following connectors to integrate with Amazon Glue for fine-grained access control:
  + [Azure Data Lake Storage](connectors-adls-gen2.md)
  + [Azure Synapse](connectors-azure-synapse.md)
  + [BigQuery](connectors-bigquery.md)
  + [CMDB](connectors-cmdb.md)
  + [Db2](connectors-ibm-db2.md)
  + [Db2 iSeries](connectors-ibm-db2-as400.md)
  + [DocumentDB](connectors-docdb.md)
  + [DynamoDB](connectors-dynamodb.md)
  + [Google Cloud Storage](connectors-gcs.md)
  + [HBase](connectors-hbase.md)
  + [MySQL](connectors-mysql.md)
  + [OpenSearch](connectors-opensearch.md)
  + [Oracle](connectors-oracle.md)
  + [PostgreSQL](connectors-postgresql.md)
  + [Redshift](connectors-redshift.md)
  + [SAP HANA](connectors-sap-hana.md)
  + [Snowflake](connectors-snowflake.md)
  + [SQL Server](connectors-microsoft-sql-server.md)
  + [Timestream](connectors-timestream.md)
  + [TPC-DS](connectors-tpcds.md)
+ When you create a resource link for Glue connection federation, the name of [resource link](https://docs.amazonaws.cn/lake-formation/latest/dg/create-resource-link-database.html) must be same as the database name of the producer.
+ Currently, only lowercase table and column names are recognized even if the data source is case insensitive.

# Register S3 table bucket catalogs and query Tables from Athena
<a name="gdc-register-s3-table-bucket-cat"></a>

Amazon S3 table buckets are a bucket type in Amazon S3 that is purpose-built to store tabular data in Apache Iceberg tables. Table buckets automate table management tasks such as compaction, snapshot management, and garbage collection to continuously optimize query performance and minimize cost. Whether you're just starting out, or have thousands of tables in your Iceberg environment, table buckets simplify data lakes at any scale. For more information, see [Table buckets](https://docs.amazonaws.cn/AmazonS3/latest/userguide/s3-tables-buckets.html).

## Considerations and limitations
<a name="gdc-register-s3-table-consideration"></a>
+ All DDL operations supported for Iceberg tables are supported for S3 Tables with the following exceptions:
  + `ALTER TABLE RENAME`, `CREATE VIEW`, and `ALTER DATABASE` are not supported.
  + `OPTIMIZE` and `VACUUM` – You can manage compaction and snapshot management in S3. For more information, see [S3 Tables maintenance documentation](https://docs.amazonaws.cn/AmazonS3/latest/userguide/s3-tables-maintenance.html).
+ DDL queries on S3 Tables registered as Athena data sources are not supported.
+ Query result reuse is not supported.
+ In workgroups with SSE-KMS, CSE-KMS encryption enabled, you can't run write operations like `INSERT`, `UPDATE`, `DELETE`, or `MERGE` on S3 Tables.
+ In workgroups with S3 Requester Pays option enabled, you can't run DML operations on S3 Tables.

## Query S3 Tables from Athena
<a name="gdc-register-s3-table-prereq-setup"></a>

**Complete these prerequisite steps before you query S3 Tables in Athena**

1. Create an S3 table bucket. For more information, see [Creating a table bucket](https://docs.amazonaws.cn/AmazonS3/latest/userguide/s3-tables-buckets-create.html) in Amazon Simple Storage Service User Guide. 

1. Make sure that the integration of your table buckets with Amazon Glue Data Catalog is successful. For required permissions and setup steps, see [Prerequisites for S3 Tables integration](https://docs.amazonaws.cn/glue/latest/dg/s3tables-catalog-prerequisites.html) and [Enabling S3 Tables integration with Glue Data Catalog](https://docs.amazonaws.cn/glue/latest/dg/enable-s3-tables-catalog-integration.html) in the Amazon Glue Developer Guide.

1. For the principal you use to run queries with Athena, grant permissions on the S3 Table catalog using one of the following approaches: 

   **Option 1: Use IAM permissions**

   When using IAM access control, your principal needs permissions on both Amazon Glue Data Catalog resources and Amazon S3 Tables resources.

   The following list contains all `s3tables` permissions required to perform any supported DDL or DML operation against your S3 Tables in Athena:
   + `s3tables:GetTableBucket`
   + `s3tables:GetNamespace`
   + `s3tables:GetTable`
   + `s3tables:GetTableData`
   + `s3tables:PutTableData`
   + `s3tables:ListNamespaces`
   + `s3tables:ListTables`
   + `s3tables:DeleteNamespace`
   + `s3tables:DeleteTable`
   + `s3tables:CreateNamespace`
   + `s3tables:CreateTable`
   + `s3tables:UpdateTableMetadataLocation`

   Apply these permissions to specific S3 table bucket and S3 Table resources or use `*` as the resource to grant access to all table buckets and tables in your account. These permissions can be combined with the [https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonAthenaFullAccess.html](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonAthenaFullAccess.html) managed policy to enable complete functionality.

   **Option 2: Use Lake Formation permissions**

   Alternatively, to enable fine-grained access control you can grant Lake Formation permissions on the S3 Table catalog, either through the Lake Formation console or Amazon CLI. This requires registering your S3 table buckets as a Lake Formation data location. For more information, see [Creating an Amazon S3 Tables catalog in the Amazon Glue Data Catalog](https://docs.amazonaws.cn/lake-formation/latest/dg/create-s3-tables-catalog.html) in the Lake Formation Developer Guide.

------
#### [ Amazon Web Services Management Console ]

   1. Open the Amazon Lake Formation console at [https://console.amazonaws.cn/lakeformation/](https://console.amazonaws.cn/lakeformation/) and sign in as a data lake administrator. For more information on how to create a data lake administrator, see [Create a data lake administrator](https://docs.amazonaws.cn/lake-formation/latest/dg/initial-lf-config.html#create-data-lake-admin).

   1. In the navigation pane, choose **Data permissions** and then choose **Grant**. 

   1. On the **Grant Permissions** page, under **Principals**, choose the principal that you want to use to submit query from Athena.

   1. Under **LF-Tags or catalog resources**, choose **Named Data Catalog resources**.

   1. For **Catalogs**, choose a glue data catalog that you created from the integration of your table bucket. For example, *<accoundID>*:s3tablescatalog/*amzn-s3-demo-bucket*.

   1. For **Catalog permissions**, choose **Super**.

   1. Choose **Grant**.

------
#### [ Amazon CLI ]

   Run the following command with Lake Formation data lake administrator role to grant access to the principal that you use to submit query from Athena. 

   ```
   aws lakeformation grant-permissions \
   --region <region (Example,us-east-1)> \
   --cli-input-json \
   '{
       "Principal": {
           "DataLakePrincipalIdentifier": "<user or role ARN (Example, arn:aws:iam::<Account ID>:role/ExampleRole>"
       },
       "Resource": {
           "Catalog": {
               "Id":"<Account ID>:s3tablescatalog/amzn-s3-demo-bucket"
           }
       },
       "Permissions": ["ALL"]
   }'
   ```

------

**Submit queries for S3 Tables**

1. Submit a `CREATE DATABASE` query from Athena with the above granted user/role. In this example, `s3tablescatalog` is the parent Glue Data Catalog created from the integration and ` s3tablescatalog/amzn-s3-demo-bucket` is the child Glue Data Catalog created for each S3 table bucket. There are two ways in which you can query.

------
#### [ Option 1 ]

   Specify the child Glue Data Catalog (`s3tablescatalog/amzn-s3-demo-bucket`) directly from console or Amazon CLI.

   **Using Amazon Web Services Management Console**

   1. Open the Athena console at [https://console.aws.amazon.com/athena/](https://console.aws.amazon.com/athena/).

   1. In the left navigation, for **Data source name**, choose **AwsDataCatalog**.

   1. For **Catalog**, choose **s3tablescatalog/*amzn-s3-demo-bucket***.

   1. In the query editor, enter a query like `CREATE DATABASE test_namespace`.

   **Using Amazon CLI**

   Run the following command.

   ```
   aws athena start-query-execution \ 
   --query-string 'CREATE DATABASE `test_namespace`' \ 
   --query-execution-context '{"Catalog": "s3tablescatalog/amzn-s3-demo-bucket"}' \
   --work-group "primary"
   ```

------
#### [ Option 2 ]

   Create Athena data catalog from the child Glue Data Catalog in the Athena console and specify it as a catalog in the query. For more information, see [Register S3 table bucket catalogs as Athena data sources](#gdc-register-s3-table-console-steps).

------

1. With the database that you created in previous step, use `CREATE TABLE` to create a table. The following example creates a table in the *`test_namespace`* database that you previously created in the `s3tablescatalog/amzn-s3-demo-bucket` Glue catalog.

------
#### [ Amazon Web Services Management Console ]

   1. In the left navigation, for **Data source name**, choose **AwsDataCatalog**.

   1. For **Catalog**, choose **s3tablescatalog/*amzn-s3-demo-bucket***.

   1. For **Database**, choose **test\$1namespace**.

   1. In the query editor, run the following query.

      ```
      CREATE TABLE daily_sales (
              sale_date date,
              product_category
              string, sales_amount double)
      PARTITIONED BY (month(sale_date))
      TBLPROPERTIES ('table_type' = 'iceberg')
      ```

------
#### [ Amazon CLI ]

   Run the following command.

   ```
   aws athena start-query-execution \
   --query-string "CREATE TABLE daily_sales (
           sale_date date,
           product_category
           string, sales_amount double)
   PARTITIONED BY (month(sale_date))
   TBLPROPERTIES ('table_type' = 'iceberg')" \
   --query-execution-context '{"Catalog": "s3tablescatalog/amzn-s3-demo-bucket", "Database":"test_namespace"}' \
   --work-group "primary"
   ```

------

1. Insert data into the table that you created in the previous step.

------
#### [ Amazon Web Services Management Console ]

   1. In the left navigation, for **Data source name**, choose **AwsDataCatalog**.

   1. For **Catalog**, choose **s3tablescatalog/*amzn-s3-demo-bucket***.

   1. For **Database**, choose **test\$1namespace**.

   1. In the query editor, run the following query.

      ```
      INSERT INTO daily_sales
      VALUES 
          (DATE '2024-01-15', 'Laptop', 900.00),
          (DATE '2024-01-15', 'Monitor', 250.00),
          (DATE '2024-01-16', 'Laptop', 1350.00),
          (DATE '2024-02-01', 'Monitor', 300.00);
      ```

------
#### [ Amazon CLI ]

   Run the following command.

   ```
   aws athena start-query-execution \
   --query-string "INSERT INTO \"s3tablescatalog/amzn-s3-demo-bucket\".test_namespace.daily_sales
   VALUES 
   (DATE '2024-01-15', 'Laptop', 900.00),
   (DATE '2024-01-15', 'Monitor', 250.00),
   (DATE '2024-01-16', 'Laptop', 1350.00),
   (DATE '2024-02-01', 'Monitor', 300.00)"\ 
   --work-group "primary"
   ```

------

1. After inserting data into the table, you can query it.

------
#### [ Amazon Web Services Management Console ]

   1. In the left navigation, for **Data source name**, choose **AwsDataCatalog**.

   1. For **Catalog**, choose **s3tablescatalog/*amzn-s3-demo-bucket***.

   1. For **Database**, choose **test\$1namespace**.

   1. In the query editor, run the following query.

      ```
      SELECT
          product_category,
          COUNT(*) AS units_sold,
          SUM(sales_amount) AS total_revenue,
          AVG(sales_amount) AS average_price
      FROM
          daily_sales
      WHERE
          sale_date BETWEEN DATE '2024-02-01' 
                       AND DATE '2024-02-29'
      GROUP BY
          product_category
      ORDER BY
          total_revenue DESC
      ```

------
#### [ Amazon CLI ]

   Run the following command.

   ```
   aws athena start-query-execution \
   --query-string "SELECT product_category,
       COUNT(*) AS units_sold,
       SUM(sales_amount) AS total_revenue,
       AVG(sales_amount) AS average_price
   FROM \"s3tablescatalog/amzn-s3-demo-bucket\".test_namespace.daily_sales
   WHERE sale_date BETWEEN DATE '2024-02-01' AND DATE '2024-02-29'
   GROUP BY product_category
   ORDER BY total_revenue DESC"\
   --work-group "primary"
   ```

------

## Create S3 Tables in Athena
<a name="gdc-create-s3-tables-athena"></a>

Athena supports creating tables in existing S3 Table namespaces or namespaces created in Athena with `CREATE DATABASE` statements. To create an S3 Table from Athena, the syntax is the same as when you [create a regular Iceberg table](querying-iceberg-creating-tables.md) except you don't specify the `LOCATION`, as shown in the following example.

```
CREATE TABLE
[db_name.]table_name (col_name data_type [COMMENT col_comment] [, ...] )
[PARTITIONED BY (col_name | transform, ... )]
[TBLPROPERTIES ([, property_name=property_value] )]
```

You can also create S3 Tables using CREATE TABLE AS SELECT (CTAS) statements. For more information, see [CTAS for S3 Tables](#ctas-s3-tables).

## Register S3 table bucket catalogs as Athena data sources
<a name="gdc-register-s3-table-console-steps"></a>

To register S3 table bucket catalogs with the Athena console, perform the following steps.

1. Open the Athena console at [https://console.aws.amazon.com/athena/](https://console.aws.amazon.com/athena/).

1. In the navigation pane, choose **Data sources and catalogs**.

1. On the **Data sources and catalogs** page, choose **Create data source**.

1. For **Choose a data source**, choose **Amazon S3 - Amazon Glue Data Catalog**.

1. In the **Amazon Glue Data Catalog** section, for **Data source account**, choose **Amazon Glue Data Catalog in this account**.

1. For **Create a table or register a catalog**, choose **Register a new Amazon Glue Catalog**.

1. In the **Data source details** section, for **Data source name**, enter the name that you want to use to specify the data source in your SQL queries or use the default name that is generated.

1. For **Catalog**, choose **Browse** to search for a list of Amazon Glue catalogs in the same account. If you don't see any existing catalogs, create one in [Amazon Glue console](https://console.aws.amazon.com/glue/). 

1. In the **Browse Amazon Glue catalogs** dialog box, select the catalog that you want to use, and then choose **Choose**.

1. (Optional) For **Tags**, enter any key/value pairs that you want to associate with the data source.

1. Choose **Next**.

1. On the **Review and create** page, verify that the information that you entered is correct, and then choose **Create data source**.

## CTAS for S3 Tables
<a name="ctas-s3-tables"></a>

Amazon Athena now supports CREATE TABLE AS SELECT (CTAS) operations for S3 Tables. This feature enables you to create new S3 Tables based on the results of a SELECT query. 

When creating a CTAS query for an S3 Table, there are a few important differences compared to standard Athena tables:
+ You must omit the location property because S3 Tables automatically manage their own storage locations.
+ The `table_type` property defaults to `ICEBERG`, so you don't need to explicitly specify it in your query.
+ If you don't specify a format, the system automatically uses `PARQUET` as the default format for your data.
+ All other properties follow the same syntax as regular Iceberg tables.

Before you create S3 Tables using CTAS, ensure that you have the necessary permissions configured in IAM or Amazon Lake Formation. Specifically, you need permissions to create tables in the S3 Tables catalog. Without these permissions, your CTAS operations will fail.

**Note**  
If your CTAS query fails, you might have to delete your table using the [S3 Tables API](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-delete.html) before attempting to re-run your query. you cannot use the Athena `DROP TABLE` statements to remove the table that was partially created by the query. 

**Example**

```
CREATE TABLE "s3tablescatalog/amzn-s3-demo-bucket"."namespace"."s3-table-name"
WITH (
    format = 'PARQUET'
)
AS SELECT *
FROM source_table;
```

# Query Amazon Glue data catalogs in Athena
<a name="gdc-register-query-the-data-source"></a>

To query data catalogs from Athena, do one of the following.
+ Register the catalog in Athena as a data source, then use the data source name to query the catalog. In this usage, the following queries are equivalent.

  ```
  SELECT * FROM my_data_source.my_database.my_table
  ```
+ If you are querying a catalog that has not been registered as an Athena data source, you can supply the full path to the catalog in your `SELECT` queries, as in the following example.

  ```
  SELECT * FROM "my_catalog/my_subcatalog".my_database.my_table
  ```
+ You can also do this through the Amazon Web Services Management Console.

  1. Open the Athena console at [https://console.aws.amazon.com/athena/](https://console.aws.amazon.com/athena/)

  1. In the query editor, for **Data source**, choose **AwsDataCatalog**.

  1. For **Catalog**, choose the name of the catalog that you want to use.

  1. For **Database**, choose the database that contains the table that you want to query.

  1. Enter a query like `SELECT * FROM my_table`, and then choose **Run**.