Accessing a shared federated catalog
Amazon Lake Formation cross-account capabilities let you securely share distributed data lakes across multiple Amazon Web Services accounts, across Amazon organizations, or directly with IAM principals in another account, with fine-grained access to the metadata and the underlying data.
Lake Formation uses the Amazon Resource Access Manager (Amazon RAM) service to facilitate resource sharing. When you share a catalog resource with another account, Amazon RAM sends an invitation to the grantee account to accept or reject the resource grant.
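The invitation flow described above can be sketched from the grantee account with the Amazon CLI. This is a minimal sketch, assuming the CLI is already configured with credentials for the grantee account; the function names are hypothetical helpers, not part of any Amazon tool.

```shell
# Sketch only: helper functions for the grantee account.
# Assumes the CLI is configured with credentials for that account.

list_pending_invitations() {
  # Prints the ARN of every resource share invitation still in the PENDING state.
  aws ram get-resource-share-invitations \
    --query "resourceShareInvitations[?status=='PENDING'].resourceShareInvitationArn" \
    --output text
}

accept_invitation() {
  # $1: invitation ARN, as printed by list_pending_invitations.
  aws ram accept-resource-share-invitation \
    --resource-share-invitation-arn "$1"
}
```

A typical use would be to run list_pending_invitations, pick the ARN that belongs to the shared catalog, and pass it to accept_invitation.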
Integrated analytical services such as Amazon Athena and Redshift Spectrum require resource links to be able to include shared resources in queries. Principals need to create a resource link in their Amazon Glue Data Catalog to a shared resource from another Amazon Web Services account. For more information about resource links, see How resource links work in Lake Formation.
A catalog link container is a Data Catalog object that references a federated database-level catalog, either a local one or one shared from another Amazon Web Services account. You can also create database links and table links within a catalog link container. When you create a database link or a table link, you must specify a target resource that resides under the same target Amazon Redshift database-level catalog (Amazon Redshift database).
To create a catalog link container, you need the Lake Formation CREATE_CATALOG or the glue:CreateCatalog permission.
Creating a catalog link container to a cross-account federated catalog
You can create a catalog link container that points to a Redshift database-level federated catalog in any Amazon Region by using the Amazon Lake Formation console, Amazon Glue CreateCatalog API, or Amazon Command Line Interface (Amazon CLI).
To create a catalog link container to a shared catalog (console)
- Open the Amazon Lake Formation console at https://console.amazonaws.cn/lakeformation/. Sign in as a principal who has the Lake Formation CREATE_CATALOG permission.
- In the navigation pane, choose Catalogs, and then choose Create catalog.
- On the Set catalog details page, provide the following information:
  - Name: Enter a name that adheres to the same rules as a catalog name. The name can be the same as the target shared catalog.
  - Type: Choose Catalog link container as the type of catalog.
  - Source: Choose Redshift.
  - Target Redshift catalog: Select a Redshift database-level federated catalog or choose a local (owned) catalog from the list.
    The list contains all the catalogs shared with your account. Note that the catalog owner account ID is listed with each catalog. If you don't see a catalog that you know was shared with your account, check the following:
    - If you aren't a data lake administrator, check that the data lake administrator granted you Lake Formation permissions on the catalog.
    - If you are a data lake administrator, and your account is not in the same Amazon organization as the granting account, ensure that you have accepted the Amazon Resource Access Manager (Amazon RAM) resource share invitation for the catalog. For more information, see Accepting a resource share invitation from Amazon RAM.
- To enable Apache Iceberg query engines to read and write to Amazon Redshift namespaces, Amazon Glue creates a managed Amazon Redshift cluster with the compute and storage resources required to perform read and write operations without impacting Amazon Redshift data warehouse workloads. You need to provide an IAM role with the permissions required to transfer data to and from the Amazon S3 bucket.
- Choose Next.
- (Optional) Choose Add permissions to grant permissions to other principals.
  However, granting permissions on a catalog link container doesn't grant permissions on the target (linked) catalog. You must grant permissions on the target catalog separately for the catalog link to be visible in Athena.
- Review the catalog link container details and choose Create catalog.
  You can then view the link container name under the Catalogs page.
Now, you can create database links and table links in the catalog link container to enable access from query engines.
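Because permissions on the link container do not carry over to the target catalog, a separate grant on the target is needed. The following is a minimal sketch of such a grant with the Amazon CLI, assuming the caller is allowed to grant on the target catalog and that the Catalog resource accepts an Id in the account-ID:catalog-path form used elsewhere on this page. The function name and the choice of the DESCRIBE permission are illustrative assumptions.

```shell
# Sketch only: grant DESCRIBE on the target (linked) catalog to a principal.
# The function name is hypothetical; adjust the permission list to your needs.

grant_describe_on_catalog() {
  # $1: principal ARN (IAM user or role)
  # $2: target catalog ID, for example 123456789012:nscatalog/dev
  aws lakeformation grant-permissions \
    --principal "DataLakePrincipalIdentifier=$1" \
    --resource "{\"Catalog\": {\"Id\": \"$2\"}}" \
    --permissions DESCRIBE
}
```

The function would be called with the principal ARN first and the target catalog ID second.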
Create a catalog link container CLI example
In the following example, the TargetRedshiftCatalog object specifies the ARN of the Amazon Redshift federated database-level catalog (Amazon Redshift database). DataLakeAccess must be enabled when you create the catalog link container.

    aws glue create-catalog \
      --cli-input-json '{
        "Name": "linkcontainer",
        "CatalogInput": {
          "TargetRedshiftCatalog": {
            "CatalogArn": "arn:aws-cn:glue:cn-north-1:123456789012:catalog/nscatalog/dev"
          },
          "CatalogProperties": {
            "DataLakeAccessProperties": {
              "DataLakeAccess": true,
              "DataTransferRole": "arn:aws-cn:iam::111122223333:role/DataTransferRole"
            }
          }
        }
      }'
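Inline JSON is easy to break with shell quoting, so one option is to write the request to a file and pass it with the file:// prefix. The sketch below only builds the payload; the function name and the output path are hypothetical, and the ARNs are supplied by the caller.

```shell
# Sketch only: write the create-catalog request to a file instead of quoting it inline.

write_link_container_payload() {
  # $1: link container name
  # $2: ARN of the target Redshift database-level catalog
  # $3: ARN of the IAM role used for data transfer
  # $4: output file path
  cat > "$4" <<EOF
{
  "Name": "$1",
  "CatalogInput": {
    "TargetRedshiftCatalog": { "CatalogArn": "$2" },
    "CatalogProperties": {
      "DataLakeAccessProperties": {
        "DataLakeAccess": true,
        "DataTransferRole": "$3"
      }
    }
  }
}
EOF
}

# Example use (not run here), with placeholder ARNs you would replace:
#   write_link_container_payload linkcontainer "<target-catalog-arn>" "<role-arn>" payload.json
#   aws glue create-catalog --cli-input-json file://payload.json
```

Keeping the payload in a file also makes it easier to review or version the request before it is sent.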
Creating resource links under the catalog link container
You can create resource links to databases and tables under a catalog link container. When you create database resource links or table resource links, you must specify a target resource that resides under the same target Amazon Redshift database-level catalog (Amazon Redshift database) that the link container points to.
You can create a resource link to a shared Amazon Redshift database or a table by using the Amazon Lake Formation console, API, or Amazon Command Line Interface (Amazon CLI).
For detailed instructions, see Creating a resource link to a shared Data Catalog database.
Following is an Amazon CLI example to create a database resource link under a catalog link container.

    aws glue create-database \
      --cli-input-json '{
        "CatalogId": "111122223333:linkcontainer",
        "DatabaseInput": {
          "Name": "dblink",
          "TargetDatabase": {
            "CatalogId": "123456789012:nscatalog/dev",
            "DatabaseName": "schema1"
          }
        }
      }'

To create a table resource link under a catalog link container, you need to first create an Amazon Glue database in the local Amazon Glue Data Catalog to contain the table resource link.
For more information on creating resource links to shared tables, see Creating a resource link to a shared Data Catalog table.
Create a database to contain the table resource link example
    aws glue create-database \
      --cli-input-json '{
        "CatalogId": "111122223333:linkcontainer",
        "DatabaseInput": {
          "Name": "db1",
          "Description": "creating parent database for table link"
        }
      }'

Create table resource link example
    aws glue create-table \
      --cli-input-json '{
        "CatalogId": "111122223333:linkcontainer",
        "DatabaseName": "db1",
        "TableInput": {
          "Name": "tablelink",
          "TargetTable": {
            "CatalogId": "123456789012:nscatalog/dev",
            "DatabaseName": "schema1",
            "Name": "table1"
          }
        }
      }'
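After creating the links, it can help to confirm that each one points at the intended target. The following sketch reads the links back with the Amazon CLI and prints only the target section of the response. The function names are hypothetical, and the arguments are the same IDs and names used in the examples above.

```shell
# Sketch only: read back resource links and show where they point.

show_database_link_target() {
  # $1: catalog ID of the link container, for example 111122223333:linkcontainer
  # $2: name of the database resource link, for example dblink
  aws glue get-database --catalog-id "$1" --name "$2" \
    --query 'Database.TargetDatabase'
}

show_table_link_target() {
  # $1: catalog ID of the link container
  # $2: parent database that holds the table link, for example db1
  # $3: name of the table resource link, for example tablelink
  aws glue get-table --catalog-id "$1" --database-name "$2" --name "$3" \
    --query 'Table.TargetTable'
}
```

If the link was created as intended, the printed target should match the TargetDatabase or TargetTable values supplied at creation time.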