Step 3: Create Lake Formation databases - Amazon Lake Formation

Step 3: Create Lake Formation databases

In this step, you create two databases and attach LF-Tags to the databases and specific columns for testing purposes.

Create your database and table for database-level access
  1. First, create the database tag_database and the table source_data, and attach the appropriate LF-Tags.

    1. On the Lake Formation console (https://console.amazonaws.cn/lakeformation/), under Data Catalog, choose Databases.

    2. Choose Create database.

    3. For Name, enter tag_database.

    4. For Location, enter the Amazon S3 location created by the Amazon CloudFormation template (s3://lf-tagbased-demo-Account-ID/tag_database/).

    5. Deselect Use only IAM access control for new tables in this database.

    6. Choose Create database.

  2. Next, create a new table within tag_database.

    1. On the Databases page, select the database tag_database.

    2. Choose View tables, and then choose Create table.

    3. For Name, enter source_data.

    4. For Database, choose the database tag_database.

    5. For Table format, choose Standard Amazon Glue table.

    6. For Data is located in, select Specified path in my account.

    7. For Include path, enter the path to tag_database created by the Amazon CloudFormation template (s3://lf-tagbased-demo-Account-ID/tag_database/).

    8. For Data format, select CSV.

    9. Under Upload schema, enter the following JSON array of column structure to create a schema:

      [ { "Name": "vendorid", "Type": "string" }, { "Name": "lpep_pickup_datetime", "Type": "string" }, { "Name": "lpep_dropoff_datetime", "Type": "string" }, { "Name": "store_and_fwd_flag", "Type": "string" }, { "Name": "ratecodeid", "Type": "string" }, { "Name": "pulocationid", "Type": "string" }, { "Name": "dolocationid", "Type": "string" }, { "Name": "passenger_count", "Type": "string" }, { "Name": "trip_distance", "Type": "string" }, { "Name": "fare_amount", "Type": "string" }, { "Name": "extra", "Type": "string" }, { "Name": "mta_tax", "Type": "string" }, { "Name": "tip_amount", "Type": "string" }, { "Name": "tolls_amount", "Type": "string" }, { "Name": "ehail_fee", "Type": "string" }, { "Name": "improvement_surcharge", "Type": "string" }, { "Name": "total_amount", "Type": "string" }, { "Name": "payment_type", "Type": "string" } ]
    10. Choose Upload. After uploading the schema, the table schema should list the 18 columns from the JSON array above, each with the type string.

    11. Choose Submit.

  3. Next, attach LF-Tags at the database level.

    1. On the Databases page, find and select tag_database.

    2. On the Actions menu, choose Edit LF-Tags.

    3. Choose Assign new LF-Tag.

    4. For Assigned keys, choose the Confidential LF-Tag you created earlier.

    5. For Values, choose True.

    6. Choose Save.

    This completes the LF-Tag assignment to the tag_database database.
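The console steps above can also be expressed as API requests. The following is a minimal sketch, not part of the original tutorial, assuming Python with boto3 and credentials for a data lake administrator. The helper names (build_database_input, build_table_input, build_db_tag_request, apply) and the ACCOUNT_ID value are hypothetical. The CSV SerDe settings are an assumed approximation of what the console generates for the CSV data format, and the empty CreateTableDefaultPermissions list is assumed to correspond to deselecting Use only IAM access control for new tables in this database.

```python
import json

ACCOUNT_ID = "111122223333"  # hypothetical placeholder; use your own account ID
BUCKET = f"s3://lf-tagbased-demo-{ACCOUNT_ID}"

# Column names from the schema JSON in the tutorial; every column is a string.
COLUMN_NAMES = [
    "vendorid", "lpep_pickup_datetime", "lpep_dropoff_datetime",
    "store_and_fwd_flag", "ratecodeid", "pulocationid", "dolocationid",
    "passenger_count", "trip_distance", "fare_amount", "extra", "mta_tax",
    "tip_amount", "tolls_amount", "ehail_fee", "improvement_surcharge",
    "total_amount", "payment_type",
]


def build_database_input(name):
    """Build the DatabaseInput argument for glue.create_database."""
    return {
        "Name": name,
        "LocationUri": f"{BUCKET}/{name}/",
        # Assumption: an empty list mirrors deselecting
        # "Use only IAM access control for new tables in this database".
        "CreateTableDefaultPermissions": [],
    }


def build_table_input(table, database):
    """Build the TableInput argument for glue.create_table (CSV data)."""
    return {
        "Name": table,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "csv"},
        "StorageDescriptor": {
            "Columns": [{"Name": c, "Type": "string"} for c in COLUMN_NAMES],
            "Location": f"{BUCKET}/{database}/",
            # Assumed text/CSV formats; the console may choose different ones.
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
                "Parameters": {"field.delim": ","},
            },
        },
    }


def build_db_tag_request(database, key, value):
    """Build keyword arguments for lakeformation.add_lf_tags_to_resource."""
    return {
        "Resource": {"Database": {"Name": database}},
        "LFTags": [{"TagKey": key, "TagValues": [value]}],
    }


def apply(database="tag_database", table="source_data"):
    """Send the requests to AWS. Requires boto3 and valid credentials."""
    import boto3  # imported lazily so the payloads can be built without boto3

    glue = boto3.client("glue")
    lakeformation = boto3.client("lakeformation")
    glue.create_database(DatabaseInput=build_database_input(database))
    glue.create_table(DatabaseName=database,
                      TableInput=build_table_input(table, database))
    lakeformation.add_lf_tags_to_resource(
        **build_db_tag_request(database, "Confidential", "True"))


if __name__ == "__main__":
    # Print one payload for review instead of calling AWS.
    print(json.dumps(build_database_input("tag_database"), indent=2))
```

Keeping the payload builders separate from apply makes it possible to review or log each request before anything is created in the account.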

Create your database and table for column-level access

Use the following steps to create the database col_tag_database and the table source_data_col_lvl, and to attach LF-Tags at the column level.

  1. On the Databases page, choose Create database.

  2. For Name, enter col_tag_database.

  3. For Location, enter the Amazon S3 location created by the Amazon CloudFormation template (s3://lf-tagbased-demo-Account-ID/col_tag_database/).

  4. Deselect Use only IAM access control for new tables in this database.

  5. Choose Create database.

  6. On the Databases page, select your new database (col_tag_database).

  7. Choose View tables, and then choose Create table.

  8. For Name, enter source_data_col_lvl.

  9. For Database, choose your new database (col_tag_database).

  10. For Table format, choose Standard Amazon Glue table.

  11. For Data is located in, select Specified path in my account.

  12. For Include path, enter the Amazon S3 path for col_tag_database (s3://lf-tagbased-demo-Account-ID/col_tag_database/).

  13. For Data format, select CSV.

  14. Under Upload schema, enter the following schema JSON:

    [ { "Name": "vendorid", "Type": "string" }, { "Name": "lpep_pickup_datetime", "Type": "string" }, { "Name": "lpep_dropoff_datetime", "Type": "string" }, { "Name": "store_and_fwd_flag", "Type": "string" }, { "Name": "ratecodeid", "Type": "string" }, { "Name": "pulocationid", "Type": "string" }, { "Name": "dolocationid", "Type": "string" }, { "Name": "passenger_count", "Type": "string" }, { "Name": "trip_distance", "Type": "string" }, { "Name": "fare_amount", "Type": "string" }, { "Name": "extra", "Type": "string" }, { "Name": "mta_tax", "Type": "string" }, { "Name": "tip_amount", "Type": "string" }, { "Name": "tolls_amount", "Type": "string" }, { "Name": "ehail_fee", "Type": "string" }, { "Name": "improvement_surcharge", "Type": "string" }, { "Name": "total_amount", "Type": "string" }, { "Name": "payment_type", "Type": "string" } ]
  15. Choose Upload. After uploading the schema, the table schema should list the 18 columns from the schema JSON above, each with the type string.

  16. Choose Submit to complete the creation of the table.

  17. Now, associate the Sensitive=True LF-Tag with the columns vendorid and fare_amount.

    1. On the Tables page, select the table you created (source_data_col_lvl).

    2. On the Actions menu, choose Schema.

    3. Select the column vendorid and choose Edit LF-Tags.

    4. For Assigned keys, choose Sensitive.

    5. For Values, choose True.

    6. Choose Save.

    7. Repeat steps 3 through 6 for the column fare_amount.

  18. Next, associate the Confidential=False LF-Tag with col_tag_database. This is required so that lf-data-analyst can describe the database col_tag_database when signed in to Amazon Athena.

    1. On the Databases page, find and select col_tag_database.

    2. On the Actions menu, choose Edit LF-Tags.

    3. Choose Assign new LF-Tag.

    4. For Assigned keys, choose the Confidential LF-Tag you created earlier.

    5. For Values, choose False.

    6. Choose Save.
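The two tag assignments in this section (Sensitive=True on the columns vendorid and fare_amount, and Confidential=False on col_tag_database) can be sketched as Lake Formation API requests. This is an illustrative sketch rather than the tutorial's own method. It assumes Python with boto3 and that the Sensitive and Confidential LF-Tags already exist; the helper names (build_column_tag_request, build_database_tag_request, build_requests, apply_tags) are hypothetical.

```python
def build_column_tag_request(database, table, columns, key, value):
    """Keyword arguments for add_lf_tags_to_resource on specific columns."""
    return {
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "LFTags": [{"TagKey": key, "TagValues": [value]}],
    }


def build_database_tag_request(database, key, value):
    """Keyword arguments for add_lf_tags_to_resource on a database."""
    return {
        "Resource": {"Database": {"Name": database}},
        "LFTags": [{"TagKey": key, "TagValues": [value]}],
    }


def build_requests():
    """Return the requests for this section, in the order of the console steps."""
    return [
        # Sensitive=True on the two columns named in the tutorial.
        build_column_tag_request(
            "col_tag_database", "source_data_col_lvl",
            ["vendorid", "fare_amount"], "Sensitive", "True"),
        # Confidential=False on the database itself.
        build_database_tag_request(
            "col_tag_database", "Confidential", "False"),
    ]


def apply_tags(requests):
    """Send each request to Lake Formation. Requires boto3 and credentials."""
    import boto3  # imported lazily so the requests can be built without boto3

    lakeformation = boto3.client("lakeformation")
    for request in requests:
        response = lakeformation.add_lf_tags_to_resource(**request)
        # The response lists any tags that could not be attached.
        if response.get("Failures"):
            raise RuntimeError(f"LF-Tag assignment failed: {response['Failures']}")


if __name__ == "__main__":
    for request in build_requests():
        print(request)
```

Tagging both columns in a single request replaces the per-column repetition of the console flow; sending one request per column would work equally well.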