VPC connectivity for S3 Tables
All tables in S3 Tables are in the Apache Iceberg format and are made up of two types of S3 objects: data files, which store the data, and metadata files, which track information about the data files at different points in time. All table bucket, namespace, and table operations (for example, CreateNamespace, CreateTable, and so on) are routed through an S3 Tables endpoint (s3tables.region.amazonaws.com), and all object-level operations that read or write the data and metadata files continue to be routed through an S3 service endpoint (s3.region.amazonaws.com).
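The split between the two endpoints can be pictured with a small sketch. The helper name endpoint_for and the operation set below are illustrative assumptions, not part of any Amazon SDK, and the set is deliberately not exhaustive; only the two host patterns come from the text above.

```python
# Illustrative, non-exhaustive set of bucket, namespace, and table operations.
# Hypothetical helper for explanation only; not an Amazon SDK API.
S3_TABLES_OPERATIONS = {
    "CreateNamespace",
    "CreateTable",
    "ListTableBuckets",
    "ListTables",
}


def endpoint_for(operation: str, region: str) -> str:
    """Return the endpoint host that an operation is routed to."""
    if operation in S3_TABLES_OPERATIONS:
        # Table bucket, namespace, and table operations go to S3 Tables.
        return f"s3tables.{region}.amazonaws.com"
    # Anything else is treated here as an object-level read or write,
    # which goes to the regular S3 service endpoint.
    return f"s3.{region}.amazonaws.com"


print(endpoint_for("CreateTable", "us-east-1"))
print(endpoint_for("GetObject", "us-east-1"))
```

In practice the SDK or query engine picks the endpoint for you; the sketch only shows why a VPC needs a network path to both hosts.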
To access S3 Tables, Amazon S3 supports two types of VPC endpoints by using Amazon PrivateLink: gateway endpoints and interface endpoints. A gateway endpoint is a gateway that you specify in your route table to access S3 from your VPC over the Amazon network. Interface endpoints extend the functionality of gateway endpoints by using private IP addresses to route requests to Amazon S3 from within your VPC, on premises, or from a VPC in another Amazon Web Services Region by using VPC peering or Amazon Transit Gateway.
To access S3 Tables from a VPC, we recommend creating two VPC endpoints: one for S3 and one for S3 Tables. You can create either a gateway or an interface endpoint to route object-level (file) operations to S3, and an interface endpoint to route bucket-level and table-level operations to S3 Tables. For more information about gateway endpoints, see Gateway endpoints in the Amazon PrivateLink User Guide.
To learn more about using Amazon PrivateLink to create and work with endpoints for S3 Tables, see the following topics. To create a VPC interface endpoint, see Create a VPC endpoint in the Amazon PrivateLink Guide.
Creating VPC endpoints for S3 Tables
When you create a VPC endpoint, S3 Tables generates two types of endpoint-specific DNS names: Regional and Zonal.
- A Regional DNS name is of the following format: VPCendpointID.s3tables.AWSregion.vpce.amazonaws.com. For example, for the VPC endpoint ID vpce-1a2b3c4d, the DNS name generated will be similar to vpce-1a2b3c4d-5e6f.s3tables.us-east-1.vpce.amazonaws.com
- A Zonal DNS name is of the following format: VPCendpointID-AvailabilityZone.s3tables.AWSregion.vpce.amazonaws.com. For example, for the VPC endpoint ID vpce-1a2b3c4d-5e6f, the DNS name generated will be similar to vpce-1a2b3c4d-5e6f-us-east-1a.s3tables.us-east-1.vpce.amazonaws.com
A Zonal DNS name includes your Availability Zone. You might use Zonal DNS names if your architecture isolates Availability Zones. Endpoint-specific S3 DNS names can be resolved from the S3 public DNS domain.
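As a minimal sketch of the two name formats described above, the following helpers assemble a Regional and a Zonal DNS name from their parts. The function names are hypothetical. The names are generated by the service when the endpoint is created, so treat this as a way to read the format rather than a way to derive real names; copy the actual values from the endpoint details.

```python
def regional_dns_name(endpoint_id: str, region: str) -> str:
    """Format: VPCendpointID.s3tables.AWSregion.vpce.amazonaws.com"""
    return f"{endpoint_id}.s3tables.{region}.vpce.amazonaws.com"


def zonal_dns_name(endpoint_id: str, availability_zone: str, region: str) -> str:
    """Format: VPCendpointID-AvailabilityZone.s3tables.AWSregion.vpce.amazonaws.com"""
    return f"{endpoint_id}-{availability_zone}.s3tables.{region}.vpce.amazonaws.com"


# Placeholder values taken from the examples in this section.
print(regional_dns_name("vpce-1a2b3c4d-5e6f", "us-east-1"))
print(zonal_dns_name("vpce-1a2b3c4d-5e6f", "us-east-1a", "us-east-1"))
```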
You can also use Private DNS options to simplify routing S3 traffic over VPC endpoints and help you take advantage of the lowest-cost network path available to your application. Private DNS maps the public endpoint of S3 Tables, for instance, s3tables.region.amazonaws.com, to a private IP in your VPC. You can use private DNS options to route Regional S3 traffic without updating your S3 clients to use the endpoint-specific DNS names of your interface endpoints.
Note
Amazon PrivateLink for Amazon S3 doesn't support using Amazon S3 dual-stack endpoints. For more information, see Using Amazon S3 dual-stack endpoints.
Accessing table buckets and tables through endpoints using the Amazon CLI
You can use the Amazon Command Line Interface (Amazon CLI) to access table buckets and tables through the interface endpoints. With the Amazon CLI, aws s3 commands route traffic through the Amazon S3 endpoint. The aws s3tables Amazon CLI commands use the Amazon S3 Tables endpoint.
An example of an s3tables VPC endpoint is vpce-0123456afghjipljw-nmopsqea.s3tables.region.vpce.amazonaws.com
An s3tables VPC endpoint doesn't include a bucket name. You can access the s3tables VPC endpoint using the aws s3tables Amazon CLI commands.
An example of an s3 VPC endpoint is amzn-s3-demo-bucket.vpce-0123456afghjipljw-nmopsqea.s3.region.vpce.amazonaws.com
You can access the s3 VPC endpoint using the aws s3 Amazon CLI commands.
To access table buckets and tables through interface endpoints using the Amazon CLI, use the --region and --endpoint-url parameters. To perform table bucket and table-level actions, use the S3 Tables endpoint URL. To perform object-level actions, use the Amazon S3 endpoint URL.
In the following examples, replace the user input placeholders with your own information.
Example 1: Use an endpoint URL to list table buckets in your account
aws s3tables list-table-buckets --endpoint-url https://vpce-0123456afghjipljb-aac.s3tables.us-east-1.vpce.amazonaws.com --region us-east-1
Example 2: Use an endpoint URL to list tables in your bucket
aws s3tables list-tables --table-bucket-arn arn:aws:s3tables:us-east-1:123456789301:bucket/amzn-s3-demo-bucket --endpoint-url https://vpce-0123456afghjipljb-aac.s3tables.us-east-1.vpce.amazonaws.com --region us-east-1
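If you script these calls, it can help to assemble the command from its parts so the endpoint URL and Region are always passed together. The following is a minimal sketch; the helper name s3tables_cli_command is hypothetical, and the endpoint host is the placeholder value from the examples above.

```python
import shlex
from typing import List, Sequence


def s3tables_cli_command(
    subcommand: str,
    endpoint_host: str,
    region: str,
    extra_args: Sequence[str] = (),
) -> List[str]:
    """Build an argument list for an `aws s3tables` call through a VPC endpoint."""
    return [
        "aws", "s3tables", subcommand,
        *extra_args,
        "--endpoint-url", f"https://{endpoint_host}",
        "--region", region,
    ]


# Placeholder endpoint host; replace with your own interface endpoint DNS name.
cmd = s3tables_cli_command(
    "list-table-buckets",
    "vpce-0123456afghjipljb-aac.s3tables.us-east-1.vpce.amazonaws.com",
    "us-east-1",
)
print(shlex.join(cmd))
# To execute the command, pass the list to subprocess.run(cmd, check=True)
# on a host that has the CLI installed and network access to the endpoint.
```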
Configuring a VPC network when using query engines
Use the following steps to configure a VPC network when using query engines.
- To get started, you can create or update a VPC. For more information, see Create a VPC.
- For table and table bucket level operations that route to S3 Tables, create a new interface endpoint. For more information, see Access an Amazon service using an interface VPC endpoint.
- For all object-level operations that route to Amazon S3, create a gateway endpoint or an interface endpoint. For more information on gateway endpoints, see Create a gateway endpoint.
- Next, configure your data resources and launch an Amazon EMR cluster. For more information, see Getting started with Amazon EMR.
- You can then submit a Spark application with an additional configuration by selecting your DNS names from the VPC endpoint. For example, spark.sql.catalog.ice_catalog.s3tables.endpoint and https://interface-endpoint.s3tables.us-east-1.vpce.amazonaws.com. For more information, see Submit work to your Amazon EMR cluster.
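The last step above pairs a Spark property with the interface endpoint URL. As a minimal sketch, assuming the catalog is named ice_catalog as in the example, the following hypothetical helper turns that pair into the --conf argument that spark-submit accepts. Any other catalog settings your job needs are outside the scope of this sketch.

```python
from typing import List


def s3tables_endpoint_conf(catalog: str, endpoint_url: str) -> List[str]:
    """Return spark-submit arguments that point a catalog at a VPC endpoint."""
    key = f"spark.sql.catalog.{catalog}.s3tables.endpoint"
    return ["--conf", f"{key}={endpoint_url}"]


# Placeholder endpoint URL from the example above; replace with the DNS name
# of your own S3 Tables interface endpoint.
args = s3tables_endpoint_conf(
    "ice_catalog",
    "https://interface-endpoint.s3tables.us-east-1.vpce.amazonaws.com",
)
print(" ".join(args))
```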
Restricting access to S3 Tables within the VPC network
Similar to resource-based policies, you can attach an endpoint policy to your VPC endpoint that controls the access to tables and table buckets. In the following example, the interface endpoint policy restricts access to only specific table buckets.
{
  "Version": "2012-10-17",
  "Id": "Policy141511512309",
  "Statement": [
    {
      "Sid": "Access-to-specific-bucket-only",
      "Principal": "*",
      "Action": "s3tables:*",
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3tables:region:account_id:bucket/amzn-s3-demo-bucket",
        "arn:aws:s3tables:region:account_id:bucket/amzn-s3-demo-bucket/*"
      ]
    }
  ]
}
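When the same policy has to be produced for several table buckets, generating the document avoids hand-editing ARNs. The following is a minimal sketch that reproduces the policy above for a given Region, account ID, and bucket name. The function name is hypothetical, and the account ID shown is the placeholder used earlier in this section.

```python
import json
from typing import Any, Dict


def table_bucket_endpoint_policy(
    region: str, account_id: str, bucket_name: str
) -> Dict[str, Any]:
    """Build an endpoint policy that allows access to one table bucket only."""
    bucket_arn = f"arn:aws:s3tables:{region}:{account_id}:bucket/{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Id": "Policy141511512309",
        "Statement": [
            {
                "Sid": "Access-to-specific-bucket-only",
                "Principal": "*",
                "Action": "s3tables:*",
                "Effect": "Allow",
                # The bucket itself, plus everything under it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


policy = table_bucket_endpoint_policy("us-east-1", "123456789301", "amzn-s3-demo-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the policy field of the interface endpoint.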