Step 3: Create Lake Formation databases
In this step, you create two databases and attach LF-Tags to the databases and specific columns for testing purposes.
Create your database and table for database-level access
- First, create the database tag_database and the table source_data, and attach the appropriate LF-Tags.
On the Lake Formation console (https://console.amazonaws.cn/lakeformation/), under Data Catalog, choose Databases.
Choose Create database.
For Name, enter tag_database.
For Location, enter the Amazon S3 location created by the Amazon CloudFormation template (s3://lf-tagbased-demo-Account-ID/tag_database/).
Deselect Use only IAM access control for new tables in this database.
Choose Create database.
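The console steps above can also be sketched programmatically. The following is a minimal sketch, not the tutorial's own method: it assumes the boto3 Glue client's `create_database` call, and it assumes that an empty `CreateTableDefaultPermissions` list is the API counterpart of deselecting "Use only IAM access control for new tables in this database". The helper names and the `account_id` parameter are hypothetical.

```python
def build_database_input(name, location_uri):
    """Build keyword arguments for the Glue CreateDatabase API."""
    return {
        "DatabaseInput": {
            "Name": name,
            "LocationUri": location_uri,
            # Assumption: an empty list means new tables do not get the
            # default IAM-only permissions, so Lake Formation governs access.
            "CreateTableDefaultPermissions": [],
        }
    }


def create_database(glue_client, account_id, name="tag_database"):
    """Create the database at the bucket path used in this tutorial.

    glue_client is expected to behave like boto3.client("glue").
    account_id is a placeholder for your own account ID.
    """
    location = f"s3://lf-tagbased-demo-{account_id}/{name}/"
    kwargs = build_database_input(name, location)
    glue_client.create_database(**kwargs)
    return kwargs
```

To try it, pass `boto3.client("glue")` (with credentials and a Region already configured) and your own account ID, for example `create_database(boto3.client("glue"), "111122223333")`, where the account ID shown is a placeholder.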
- Next, create a new table within tag_database.
On the Databases page, select the database tag_database.
Choose View tables, and then choose Create table.
For Name, enter source_data.
For Database, choose the database tag_database.
For Table format, choose Standard Amazon Glue table.
For Data is located in, select Specified path in my account.
For Include path, enter the path to tag_database created by the Amazon CloudFormation template (s3://lf-tagbased-demo-Account-ID/tag_database/).
For Data format, select CSV.
Under Upload schema, enter the following JSON array of column structure to create a schema:
[ { "Name": "vendorid", "Type": "string" }, { "Name": "lpep_pickup_datetime", "Type": "string" }, { "Name": "lpep_dropoff_datetime", "Type": "string" }, { "Name": "store_and_fwd_flag", "Type": "string" }, { "Name": "ratecodeid", "Type": "string" }, { "Name": "pulocationid", "Type": "string" }, { "Name": "dolocationid", "Type": "string" }, { "Name": "passenger_count", "Type": "string" }, { "Name": "trip_distance", "Type": "string" }, { "Name": "fare_amount", "Type": "string" }, { "Name": "extra", "Type": "string" }, { "Name": "mta_tax", "Type": "string" }, { "Name": "tip_amount", "Type": "string" }, { "Name": "tolls_amount", "Type": "string" }, { "Name": "ehail_fee", "Type": "string" }, { "Name": "improvement_surcharge", "Type": "string" }, { "Name": "total_amount", "Type": "string" }, { "Name": "payment_type", "Type": "string" } ]
Choose Upload. After uploading the schema, the table schema should look like the following screenshot:
Choose Submit.
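For readers who prefer to script the table creation, here is a rough sketch of an equivalent Glue `TableInput`. It assumes the boto3 Glue client's `create_table` call. The input format, output format, and SerDe values are assumptions about what a comma-delimited CSV table typically uses, not values confirmed by this tutorial, and the helper names are hypothetical.

```python
# Column names taken from the schema in this step; every column is a string.
COLUMN_NAMES = [
    "vendorid", "lpep_pickup_datetime", "lpep_dropoff_datetime",
    "store_and_fwd_flag", "ratecodeid", "pulocationid", "dolocationid",
    "passenger_count", "trip_distance", "fare_amount", "extra", "mta_tax",
    "tip_amount", "tolls_amount", "ehail_fee", "improvement_surcharge",
    "total_amount", "payment_type",
]


def build_table_input(table_name, location):
    """Build a Glue TableInput dictionary for a CSV table."""
    columns = [{"Name": name, "Type": "string"} for name in COLUMN_NAMES]
    return {
        "Name": table_name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "csv"},
        "StorageDescriptor": {
            "Columns": columns,
            "Location": location,
            # Assumed text/CSV settings; adjust to match your data.
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
                "Parameters": {"field.delim": ","},
            },
        },
    }


def create_table(glue_client, database_name, table_input):
    """Create the table; glue_client should behave like boto3.client("glue")."""
    glue_client.create_table(DatabaseName=database_name, TableInput=table_input)
```

Generating the column list from the names keeps the script short; if the column types ever differ, replace the comprehension with an explicit list like the JSON array above.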
- Next, attach LF-Tags at the database level.
On the Databases page, find and select tag_database.
On the Actions menu, choose Edit LF-Tags.
Choose Assign new LF-Tag.
For Assigned keys, choose the Confidential LF-Tag you created earlier.
For Values, choose True.
Choose Save.
This completes the LF-Tag assignment to the tag_database database.
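The database-level assignment can be sketched with the Lake Formation API as well. This is an illustrative sketch that assumes the boto3 Lake Formation client's `add_lf_tags_to_resource` call and assumes the Confidential LF-Tag already exists; the helper names are hypothetical.

```python
def database_tag_request(database_name, tag_key, tag_value):
    """Build keyword arguments that assign one LF-Tag value to a database."""
    return {
        "Resource": {"Database": {"Name": database_name}},
        "LFTags": [{"TagKey": tag_key, "TagValues": [tag_value]}],
    }


def assign_database_tag(lf_client, database_name, tag_key, tag_value):
    """Assign the tag; lf_client should behave like boto3.client("lakeformation")."""
    request = database_tag_request(database_name, tag_key, tag_value)
    response = lf_client.add_lf_tags_to_resource(**request)
    # The API reports per-tag problems in a "Failures" list instead of raising.
    failures = response.get("Failures", [])
    if failures:
        raise RuntimeError(f"LF-Tag assignment failed: {failures}")
    return request
```

For this step, the call would be `assign_database_tag(lf_client, "tag_database", "Confidential", "True")`.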
Create your database and table for column-level access
Use the following steps to create the database col_tag_database and the table source_data_col_lvl, and attach LF-Tags at the column level.
On the Databases page, choose Create database.
For Name, enter col_tag_database.
For Location, enter the Amazon S3 location created by the Amazon CloudFormation template (s3://lf-tagbased-demo-Account-ID/col_tag_database/).
Deselect Use only IAM access control for new tables in this database.
Choose Create database.
On the Databases page, select your new database (col_tag_database).
Choose View tables, and then choose Create table.
For Name, enter source_data_col_lvl.
For Database, choose your new database (col_tag_database).
For Table format, choose Standard Amazon Glue table.
For Data is located in, select Specified path in my account.
Enter the Amazon S3 path for col_tag_database (s3://lf-tagbased-demo-Account-ID/col_tag_database/).
For Data format, select CSV.
Under Upload schema, enter the following schema JSON:
[ { "Name": "vendorid", "Type": "string" }, { "Name": "lpep_pickup_datetime", "Type": "string" }, { "Name": "lpep_dropoff_datetime", "Type": "string" }, { "Name": "store_and_fwd_flag", "Type": "string" }, { "Name": "ratecodeid", "Type": "string" }, { "Name": "pulocationid", "Type": "string" }, { "Name": "dolocationid", "Type": "string" }, { "Name": "passenger_count", "Type": "string" }, { "Name": "trip_distance", "Type": "string" }, { "Name": "fare_amount", "Type": "string" }, { "Name": "extra", "Type": "string" }, { "Name": "mta_tax", "Type": "string" }, { "Name": "tip_amount", "Type": "string" }, { "Name": "tolls_amount", "Type": "string" }, { "Name": "ehail_fee", "Type": "string" }, { "Name": "improvement_surcharge", "Type": "string" }, { "Name": "total_amount", "Type": "string" }, { "Name": "payment_type", "Type": "string" } ]
Choose Upload. After uploading the schema, the table schema should look like the following screenshot.
Choose Submit to complete the creation of the table.
- Now, associate the Sensitive=True LF-Tag with the columns vendorid and fare_amount.
On the Tables page, select the table you created (source_data_col_lvl).
On the Actions menu, choose Schema.
Select the column vendorid and choose Edit LF-Tags.
For Assigned keys, choose Sensitive.
For Values, choose True.
Choose Save.
Repeat these steps for the column fare_amount.
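Column-level tagging can be sketched through the API in a similar way. The sketch below assumes the boto3 Lake Formation client's `add_lf_tags_to_resource` call with a `TableWithColumns` resource, which accepts several column names at once, so both columns can be tagged in a single request. It also assumes the Sensitive LF-Tag already exists; the helper name is hypothetical.

```python
def column_tag_request(database_name, table_name, column_names, tag_key, tag_value):
    """Build keyword arguments that assign one LF-Tag value to specific columns."""
    return {
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database_name,
                "Name": table_name,
                # Copy into a list so tuples or other iterables are accepted.
                "ColumnNames": list(column_names),
            }
        },
        "LFTags": [{"TagKey": tag_key, "TagValues": [tag_value]}],
    }


# Request for the two columns named in this step.
SENSITIVE_COLUMNS_REQUEST = column_tag_request(
    "col_tag_database",
    "source_data_col_lvl",
    ("vendorid", "fare_amount"),
    "Sensitive",
    "True",
)
```

The request would then be sent with `lf_client.add_lf_tags_to_resource(**SENSITIVE_COLUMNS_REQUEST)`, where `lf_client` is a `boto3.client("lakeformation")` instance.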
- Next, associate the Confidential=False LF-Tag with col_tag_database. This is required for lf-data-analyst to be able to describe the database col_tag_database when logged in from Amazon Athena.
On the Databases page, find and select col_tag_database.
On the Actions menu, choose Edit LF-Tags.
Choose Assign new LF-Tag.
For Assigned keys, choose the Confidential LF-Tag you created earlier.
For Values, choose False.
Choose Save.
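After saving, it can be useful to confirm which LF-Tags are attached to the database. The following is a small sketch that assumes the boto3 Lake Formation client's `get_resource_lf_tags` call and its `LFTagOnDatabase` response field; the helper name is hypothetical.

```python
def get_database_lf_tags(lf_client, database_name):
    """Return the LF-Tags assigned to a database as {key: [values]}.

    lf_client should behave like boto3.client("lakeformation").
    """
    response = lf_client.get_resource_lf_tags(
        Resource={"Database": {"Name": database_name}},
        ShowAssignedLFTags=True,
    )
    # A database with no assigned tags may omit the field entirely.
    return {
        tag["TagKey"]: tag["TagValues"]
        for tag in response.get("LFTagOnDatabase", [])
    }
```

Calling `get_database_lf_tags(lf_client, "col_tag_database")` should then include the Confidential key if the assignment in this step succeeded.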