Step 3: Set up permissions for a Hudi table - Amazon Lake Formation

Step 3: Set up permissions for a Hudi table

In this section, you'll learn how to create a Hudi table in the Amazon Glue Data Catalog, set up data permissions in Amazon Lake Formation, and query data using Amazon Athena.

To create a Hudi table

In this step, you'll run an Amazon Glue job that creates a Hudi transactional table in the Data Catalog.

  1. Sign in to the Amazon Glue console at https://console.amazonaws.cn/glue/ in the US East (N. Virginia) Region as the data lake administrator user.

  2. Choose Jobs from the left navigation pane.

  3. Select native-hudi-create.

  4. Under Actions, choose Edit job.

  5. Under Job details, expand Advanced properties, and select the check box next to Use Amazon Glue Data Catalog as the Hive metastore so that the table metadata is added to the Amazon Glue Data Catalog. This setting makes the Data Catalog the metastore for the catalog resources used in the job, which lets Lake Formation permissions be applied to those resources later.

  6. Choose Save.

  7. Choose Run. You can view the status of the job while it is running.

    For more information on Amazon Glue jobs, see Working with jobs on the Amazon Glue console in the Amazon Glue Developer Guide.

    This job creates a Hudi copy-on-write (CoW) table in the lfhudidb database. Verify that the product table appears in the Lake Formation console.
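
The console steps above can also be approximated with the AWS SDK for Python (boto3). The following is a minimal sketch, not part of the original tutorial. It assumes the job name native-hudi-create and the Region code us-east-1 from the steps above, and it passes the special job parameter --enable-glue-datacatalog for this run instead of saving the setting on the job. The helper names build_job_run_request and run_job_and_wait are hypothetical.

```python
import time


def build_job_run_request(job_name: str) -> dict:
    """Build keyword arguments for the Glue StartJobRun API call."""
    return {
        "JobName": job_name,
        # Special Glue job parameter: use the Data Catalog as the Hive metastore.
        "Arguments": {"--enable-glue-datacatalog": "true"},
    }


def run_job_and_wait(job_name: str, region: str, poll_seconds: int = 30) -> str:
    """Start the job, poll until it stops running, and return the final state."""
    import boto3  # imported lazily so the request builder works without boto3

    glue = boto3.client("glue", region_name=region)
    run_id = glue.start_job_run(**build_job_run_request(job_name))["JobRunId"]
    finished = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}
    while True:
        run = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        if run["JobRunState"] in finished:
            return run["JobRunState"]
        time.sleep(poll_seconds)
```

Calling run_job_and_wait("native-hudi-create", "us-east-1") with data lake administrator credentials would start the job and block until it finishes. Arguments passed at run time apply to that run only, so the saved job definition is unchanged.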

To register the data location with Lake Formation

Next, register an Amazon S3 path as the root location of your data lake.

  1. Sign in to the Lake Formation console at https://console.amazonaws.cn/lakeformation/ as the data lake administrator user.

  2. In the navigation pane, under Register and ingest, choose Data location.

  3. On the upper right of the console, choose Register location.

  4. On the Register location page, enter the following:

    • Amazon S3 path – Choose Browse and select lf-otf-datalake-123456789012. Click on the right arrow (>) next to the Amazon S3 root location to navigate to the s3/buckets/lf-otf-datalake-123456789012/transactionaldata/native-hudi location.

    • IAM role – Choose LF-OTF-RegisterRole as the IAM role.

  5. Choose Register location.
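
As a rough programmatic equivalent of the registration steps above, the sketch below calls the Lake Formation RegisterResource API through boto3. It is an illustration under assumptions: the account ID 123456789012 is the placeholder that appears in the bucket name, the role ARN is inferred from the role name LF-OTF-RegisterRole, and the ARN partition defaults to aws (the China Regions use aws-cn). The helper names are hypothetical.

```python
def s3_location_arn(bucket: str, prefix: str, partition: str = "aws") -> str:
    """Return the S3 ARN for a bucket prefix, as RegisterResource expects."""
    return f"arn:{partition}:s3:::{bucket}/{prefix.strip('/')}"


def build_register_request(bucket: str, prefix: str, role_arn: str,
                           partition: str = "aws") -> dict:
    """Build keyword arguments for the Lake Formation RegisterResource call."""
    return {
        "ResourceArn": s3_location_arn(bucket, prefix, partition),
        # Register with a custom IAM role rather than the service-linked role.
        "UseServiceLinkedRole": False,
        "RoleArn": role_arn,
    }


def register_location(region: str, **request) -> None:
    """Register the S3 location with Lake Formation."""
    import boto3  # imported lazily so the builders work without boto3

    boto3.client("lakeformation", region_name=region).register_resource(**request)


# Placeholder values taken from the tutorial; replace them with your own.
REQUEST = build_register_request(
    bucket="lf-otf-datalake-123456789012",
    prefix="transactionaldata/native-hudi",
    role_arn="arn:aws:iam::123456789012:role/LF-OTF-RegisterRole",
)
```

Calling register_location("us-east-1", **REQUEST) as the data lake administrator would register the same path that the console steps select.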

To grant data lake permissions on the Hudi table

In this step, you'll grant data lake permissions to the business analyst user.

  1. Under Data lake permissions, choose Grant.

  2. On the Grant data permissions screen, choose IAM users and roles.

  3. Choose lf-consumer-analystuser from the drop-down list.

  4. Choose Named data catalog resource.

  5. For Databases, choose lfhudidb.

  6. For Tables, choose product.

  7. Next, you can grant column-based access by specifying columns.

    1. Under Table permissions, choose Select.

    2. Under Data permissions, choose Column-based access, and then choose Include columns.

    3. Choose the product_name, price, and category columns.

    4. Choose Grant.
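
The column-level grant above corresponds to a single Lake Formation GrantPermissions call. The sketch below shows one way to express it with boto3. It assumes that lf-consumer-analystuser is an IAM user in the placeholder account 123456789012, so the principal ARN is a guess to replace with the real one. The helper names are hypothetical.

```python
def build_grant_request(principal_arn: str, database: str, table: str,
                        columns: list) -> dict:
    """Build keyword arguments for a column-level SELECT grant."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                # Only the listed columns are included in the grant.
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def grant_select(region: str, **request) -> None:
    """Apply the grant through the Lake Formation API."""
    import boto3  # imported lazily so the builder works without boto3

    boto3.client("lakeformation", region_name=region).grant_permissions(**request)


# Placeholder principal ARN; the database, table, and columns come from the steps above.
REQUEST = build_grant_request(
    principal_arn="arn:aws:iam::123456789012:user/lf-consumer-analystuser",
    database="lfhudidb",
    table="product",
    columns=["product_name", "price", "category"],
)
```

Calling grant_select("us-east-1", **REQUEST) as the data lake administrator would grant SELECT on the three included columns only.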

To query the Hudi table using Athena

Now start querying the Hudi table you created using Athena. If it is your first time running queries in Athena, you need to configure a query result location. For more information, see Specifying a query result location.

  1. Sign out as the data lake administrator user, and then sign in as lf-consumer-analystuser in the US East (N. Virginia) Region using the password that you noted earlier from the Amazon CloudFormation output.

  2. Open the Athena console at https://console.amazonaws.cn/athena/.

  3. Choose Settings and select Manage.

  4. In the Location of query result box, enter the path to the bucket that the Amazon CloudFormation stack created. Copy the value of AthenaQueryResultLocation from the stack outputs (s3://lf-otf-tutorial-123456789012/athena-results/), and then choose Save.

  5. Run the following query to preview 10 records stored in the Hudi table:

    select * from lfhudidb.product limit 10;

    For more information on querying Hudi tables, see the Querying Hudi tables section in the Amazon Athena User Guide.
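
The same preview query can be submitted outside the console through the Athena StartQueryExecution API. The sketch below is illustrative only. It reuses the database, table, and result location shown above, assumes the Region code us-east-1, and uses hypothetical helper names. It must run with the credentials of lf-consumer-analystuser for the Lake Formation grant to apply.

```python
def build_query_request(database: str, table: str, limit: int,
                        output_location: str) -> dict:
    """Build keyword arguments for the Athena StartQueryExecution call."""
    return {
        "QueryString": f"SELECT * FROM {database}.{table} LIMIT {int(limit)}",
        "QueryExecutionContext": {"Database": database},
        # S3 path where Athena writes the query results.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def start_query(region: str, **request) -> str:
    """Submit the query and return its execution ID."""
    import boto3  # imported lazily so the builder works without boto3

    athena = boto3.client("athena", region_name=region)
    return athena.start_query_execution(**request)["QueryExecutionId"]


# Values taken from the tutorial steps above.
REQUEST = build_query_request(
    database="lfhudidb",
    table="product",
    limit=10,
    output_location="s3://lf-otf-tutorial-123456789012/athena-results/",
)
```

Calling start_query("us-east-1", **REQUEST) returns an execution ID that can be passed to the Athena GetQueryExecution and GetQueryResults APIs. Because the grant was column-based, the result should contain only the product_name, price, and category columns.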