Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Integrating Amazon S3 Tables with Amazon analytics services

This topic covers the prerequisites and procedures needed to integrate your Amazon S3 table buckets with Amazon analytics services. For an overview of how the integration works, see S3 Tables integration overview.

Note

This integration uses the Amazon Glue Data Catalog and might incur Amazon Glue request and storage costs. For more information, see Amazon Glue Pricing.

Additional pricing applies for running queries on S3 Tables. For more information, see pricing information for the query engine that you're using.

Prerequisites for integration

The following prerequisites are required to integrate table buckets with Amazon analytics services:

Important

When creating tables, make sure that you use all lowercase letters in your table names and table definitions. For example, make sure that your column names are all lowercase. If your table name or table definition contains capital letters, the table isn't supported by Amazon Lake Formation or the Amazon Glue Data Catalog. In this case, your table won't be visible to Amazon analytics services such as Amazon Athena, even if your table buckets are integrated with Amazon analytics services.

If your table definition contains capital letters, you receive the following error message when running a SELECT query in Athena: "GENERIC_INTERNAL_ERROR: Get table request failed: com.amazonaws.services.glue.model.ValidationException: Unsupported Federation Resource - Invalid table or column names."
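As a minimal sketch of how you might catch this problem before creating a table, the following shell function flags any name that contains an uppercase letter. The function name `is_all_lowercase` and the sample names are hypothetical and used only for illustration. The function checks only the lowercase rule described above, not any other naming constraints.

```shell
# Returns 0 if the name contains no uppercase letters, 1 otherwise.
# This checks only the lowercase rule, not any other naming rules.
is_all_lowercase() {
  case "$1" in
    *[[:upper:]]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical table and column names, used only for illustration.
for name in daily_sales order_id OrderDate; do
  if is_all_lowercase "$name"; then
    echo "ok:       $name"
  else
    echo "rejected: $name (contains uppercase letters)"
  fi
done
```

Running a check like this on table and column names before you create a table can help you avoid the Athena error shown above.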

Integrating table buckets with Amazon analytics services

You can integrate table buckets with Data Catalog and Amazon analytics services using IAM access controls by default, or optionally use Lake Formation access controls.

When you integrate using IAM access controls, you need IAM permissions to access Amazon S3 table buckets and tables, Data Catalog objects, and the query engine that you're using. If you choose to integrate using Lake Formation, both IAM access controls and Lake Formation grants determine access to Data Catalog resources. For more information about Lake Formation integration, see the Amazon Lake Formation Developer Guide.

The following sections describe how to use the Amazon S3 console or the Amazon CLI to configure the integration with IAM access controls.

To integrate table buckets with IAM access controls using the Amazon S3 console

  1. Open the Amazon S3 console at https://console.amazonaws.cn/s3/.

  2. In the left navigation pane, choose Table buckets.

  3. Choose Create table bucket.

    The Create table bucket page opens.

  4. Enter a Table bucket name and make sure that the Enable integration checkbox is selected.

  5. Choose Create table bucket. Amazon S3 attempts to automatically integrate your table buckets in that Region.

To integrate table buckets with IAM access controls using the Amazon CLI

The following steps show how to use the Amazon CLI to integrate table buckets. To use these steps, replace the user input placeholders with your own information.

  1. Create a table bucket.

    aws s3tables create-table-bucket \
        --region us-east-1 \
        --name amzn-s3-demo-table-bucket
  2. Create a file called catalog.json that contains the following catalog:

    {
        "Name": "s3tablescatalog",
        "CatalogInput": {
            "FederatedCatalog": {
                "Identifier": "arn:aws-cn:s3tables:us-east-1:111122223333:bucket/*",
                "ConnectionName": "aws:s3tables"
            },
            "CreateDatabaseDefaultPermissions": [
                {
                    "Principal": {
                        "DataLakePrincipalIdentifier": "IAM_ALLOWED_PRINCIPALS"
                    },
                    "Permissions": ["ALL"]
                }
            ],
            "CreateTableDefaultPermissions": [
                {
                    "Principal": {
                        "DataLakePrincipalIdentifier": "IAM_ALLOWED_PRINCIPALS"
                    },
                    "Permissions": ["ALL"]
                }
            ],
            "AllowFullTableExternalDataAccess": "True"
        }
    }

    Create the s3tablescatalog catalog by using the following command. Creating this catalog populates the Amazon Glue Data Catalog with objects corresponding to table buckets, namespaces, and tables.

    aws glue create-catalog \
        --region us-east-1 \
        --cli-input-json file://catalog.json
  3. Verify that the s3tablescatalog catalog was added in Amazon Glue by using the following command:

    aws glue get-catalog --catalog-id s3tablescatalog
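Because the catalog definition in step 2 embeds your Region and account ID, it can be convenient to generate catalog.json rather than edit it by hand. The following is a minimal sketch under that assumption. The function name `write_catalog_json` and its arguments are hypothetical, and the JSON body mirrors the example in step 2, including its ARN partition.

```shell
# Writes the catalog definition from step 2 to a file.
# Usage: write_catalog_json REGION ACCOUNT_ID [OUTPUT_FILE]
# The function name and arguments are illustrative, not part of the Amazon CLI.
# The ARN partition (aws-cn) is copied from the example in step 2.
write_catalog_json() {
  wcj_region="$1"
  wcj_account_id="$2"
  wcj_out="${3:-catalog.json}"
  cat > "$wcj_out" <<EOF
{
  "Name": "s3tablescatalog",
  "CatalogInput": {
    "FederatedCatalog": {
      "Identifier": "arn:aws-cn:s3tables:${wcj_region}:${wcj_account_id}:bucket/*",
      "ConnectionName": "aws:s3tables"
    },
    "CreateDatabaseDefaultPermissions": [
      {
        "Principal": { "DataLakePrincipalIdentifier": "IAM_ALLOWED_PRINCIPALS" },
        "Permissions": ["ALL"]
      }
    ],
    "CreateTableDefaultPermissions": [
      {
        "Principal": { "DataLakePrincipalIdentifier": "IAM_ALLOWED_PRINCIPALS" },
        "Permissions": ["ALL"]
      }
    ],
    "AllowFullTableExternalDataAccess": "True"
  }
}
EOF
}
```

For example, `write_catalog_json us-east-1 111122223333` writes catalog.json to the current directory. You can then pass that file to the `aws glue create-catalog` command shown in step 2.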

The Amazon analytics services integration process has been updated to use IAM permissions by default. If you've already set up the integration, you can continue to use your current integration. To change your existing integration to use IAM permissions instead, see the Amazon Lake Formation Developer Guide. Alternatively, you can redo the integration: delete your existing setup in the Amazon Glue Data Catalog and Amazon Lake Formation, and then run the integration again, as described in the following steps. Redoing the integration removes all existing Lake Formation grants and associated access permissions to the s3tablescatalog catalog.

  1. Open the Amazon Lake Formation console at https://console.amazonaws.cn/lakeformation/, and sign in as a data lake administrator. For more information about how to create a data lake administrator, see Create a data lake administrator in the Amazon Lake Formation Developer Guide.

  2. Delete your s3tablescatalog catalog by doing the following:

    • In the left navigation pane, choose Catalogs.

    • Select the option button next to the s3tablescatalog catalog in the Catalogs list. On the Actions menu, choose Delete.

  3. Deregister the data location for the s3tablescatalog catalog by doing the following:

    • In the left navigation pane, go to the Administration section, and choose Data lake locations.

    • Select the option button next to the s3tablescatalog data lake location, for example, s3://tables:region:account-id:bucket/*.

    • On the Actions menu, choose Remove.

    • In the confirmation dialog box that appears, choose Remove.

  4. Now that you've deleted your s3tablescatalog catalog and data lake location, you can follow the steps to integrate your table buckets with Amazon analytics services by using the updated integration process.

Note

If you want to work with SSE-KMS encrypted tables in integrated Amazon analytics services, the role you use needs to have permission to use your Amazon KMS key for encryption operations. For more information, see Granting IAM principals permissions to work with encrypted tables in integrated Amazon analytics services.