Querying metadata tables with Amazon analytics services
You can query your S3 managed metadata tables with Amazon analytics services such as Amazon Athena, Amazon Redshift, and Amazon EMR.
Before you can run queries, you must first integrate the Amazon managed S3 table buckets in your Amazon Web Services account and Region with Amazon analytics services.
Querying metadata tables with Amazon Athena
After you integrate your Amazon managed S3 table buckets with Amazon analytics services, you can start querying your metadata tables in Athena. In your queries, do the following:
-
Specify your catalog as s3tablescatalog/aws-s3 and your database as b_general_purpose_bucket_name (which is typically the namespace for your metadata tables).
-
Make sure to surround your metadata table namespace names with quotation marks (") or backticks (`); otherwise, the query might not work.
For more information, see Querying Amazon S3 tables with Athena.
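As a minimal sketch of the two rules above, the following Python helper builds a query string that names the catalog and database explicitly and wraps each identifier in quotation marks. The function names, the amzn-s3-demo-bucket bucket name, the inventory table name, and the LIMIT clause are illustrative assumptions, not part of any Amazon SDK. Check the Database field in the Athena query editor to confirm the exact namespace for your bucket.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quote."""
    return '"' + name.replace('"', '""') + '"'


def build_metadata_query(bucket_name: str, table: str = "inventory", limit: int = 10) -> str:
    """Build a SELECT statement against a metadata table.

    Assumes the catalog is s3tablescatalog/aws-s3 and that the namespace
    follows the typical b_<general purpose bucket name> pattern.
    """
    catalog = "s3tablescatalog/aws-s3"
    database = f"b_{bucket_name}"  # assumed namespace pattern
    qualified = ".".join(quote_identifier(part) for part in (catalog, database, table))
    return f"SELECT * FROM {qualified} LIMIT {int(limit)}"


# Hypothetical bucket name, used only for illustration.
print(build_metadata_query("amzn-s3-demo-bucket"))
```

You can paste the printed statement into the Athena query editor. Quoting every part of the name keeps characters such as the slash in the catalog name and hyphens in the namespace from being parsed as operators.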
You can also run queries in Athena from the Amazon S3 console.
The following procedure uses the Amazon S3 console to access the Athena query editor so that you can query a table with Amazon Athena.
To query a metadata table
-
Sign in to the Amazon Web Services Management Console and open the Amazon S3 console at https://console.amazonaws.cn/s3/.
-
In the left navigation pane, choose General purpose buckets.
-
On the General purpose buckets tab, choose the bucket that contains the metadata configuration for the metadata table that you want to query.
-
On the bucket details page, choose the Metadata tab.
-
Choose Query table with Athena, and then choose one of the sample queries for journal or inventory tables.
-
The Amazon Athena console opens and the Athena query editor appears with a sample query loaded for you. Modify this query as needed for your use case.
In the query editor, the Catalog field should be populated with s3tablescatalog/aws-s3. The Database field should be populated with the namespace where your table is stored (for example, b_general-purpose-bucket-name).
Note
If you don't see these values in the Catalog and Database fields, make sure that you've integrated your Amazon managed table bucket with Amazon analytics services in this Region. For more information, see Using Amazon S3 Tables with Amazon analytics services.
-
To run the query, choose Run.
Note
-
If you receive the error "Insufficient permissions to execute the query. Principal does not have any privilege on specified resource" when you try to run a query in Athena, you must be granted the necessary Lake Formation permissions on the table. For more information, see Granting permission on a table or database. Also make sure that you have the appropriate Amazon Identity and Access Management (IAM) permissions to query metadata tables. For more information, see Permissions for querying metadata tables.
-
If you receive the error "Iceberg cannot access the requested resource" when you try to run the query, go to the Amazon Lake Formation console and make sure that you've granted yourself permissions on the table bucket catalog and database (namespace) that you created. Don't specify a table when granting these permissions. For more information, see Granting permission on a table or database.
Querying metadata tables with Amazon Redshift
After you integrate your Amazon managed S3 table buckets with Amazon analytics services, do the following:
-
Create a resource link to your metadata table namespace (typically b_general_purpose_bucket_name).
-
Make sure to surround your metadata table namespace names with quotation marks (") or backticks (`); otherwise, the query might not work.
After that's done, you can start querying your metadata tables in the Amazon Redshift console. For more information, see Accessing Amazon S3 tables with Amazon Redshift.
Querying metadata tables with Amazon EMR
To query your metadata tables by using Amazon EMR, you create an Amazon EMR cluster configured for Apache Iceberg and connect to your metadata tables by using Apache Spark. You can set this up either by integrating your Amazon managed S3 table buckets with Amazon analytics services or by using the open-source Amazon S3 Tables Catalog for Iceberg client catalog.
Note
When using Apache Spark on Amazon EMR or other third-party engines to query your metadata tables, we recommend that you use the Amazon S3 Tables Iceberg REST endpoint. Your query might not run successfully if you don't use this endpoint. For more information, see Accessing tables using the Amazon S3 Tables Iceberg REST endpoint.
For more information, see Accessing Amazon S3 tables with Amazon EMR.