
Considerations for Amazon EMR with Lake Formation

Consider the following when using Amazon EMR with Amazon Lake Formation.

  • Table-level access control is available on clusters with Amazon EMR releases 6.13 and higher.

  • Fine-grained access control at row, column, and cell level is available on clusters with Amazon EMR releases 6.15 and higher.

  • Users with access to a table can access all the properties of that table. If you have Lake Formation based access control on a table, review the table to make sure that the properties don't contain any sensitive data or information.

  • Amazon EMR clusters with Lake Formation don't support Spark's fallback to HDFS when Spark collects table statistics. This fallback ordinarily helps optimize query performance, so it isn't available for that purpose on these clusters.

  • Operations that support access controls based on Lake Formation with non-governed Apache Spark tables include INSERT INTO and INSERT OVERWRITE.

  • Operations that support access controls based on Lake Formation with Apache Spark and Apache Hive include SELECT, DESCRIBE, SHOW DATABASE, SHOW TABLE, SHOW COLUMN, and SHOW PARTITION.

  • Amazon EMR doesn't support Lake Formation based access control for the following operations:

    • Writes to governed tables.

    • CREATE TABLE statements. ALTER TABLE is supported on Amazon EMR releases 6.10.0 and higher.

    • DML statements other than INSERT commands.

  • The same query can perform differently depending on whether it runs with or without Lake Formation based access control.
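
The release requirements in the list above can be checked before a cluster is launched. The following is a minimal sketch, not part of Amazon EMR or any AWS SDK: the function names `parse_release` and `lake_formation_features` are hypothetical, and the sketch assumes release labels of the form `emr-6.15.0`. It only encodes the two thresholds stated above (6.13 for table-level access control, 6.15 for row, column, and cell level access control).

```python
import re

# Minimum (major, minor) releases, taken from the considerations listed above.
TABLE_LEVEL_MIN = (6, 13)   # table-level access control
FINE_GRAINED_MIN = (6, 15)  # row, column, and cell level access control


def parse_release(label: str) -> tuple[int, int]:
    """Extract (major, minor) from a label such as 'emr-6.15.0'.

    The 'emr-X.Y.Z' label format is an assumption of this sketch.
    """
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", label.strip())
    if match is None:
        raise ValueError(f"unrecognized release label: {label!r}")
    return int(match.group(1)), int(match.group(2))


def lake_formation_features(label: str) -> dict[str, bool]:
    """Report which access control levels the given release meets."""
    version = parse_release(label)
    # Tuples compare element by element, so (7, 0) >= (6, 13) is True.
    return {
        "table_level": version >= TABLE_LEVEL_MIN,
        "fine_grained": version >= FINE_GRAINED_MIN,
    }
```

For example, `lake_formation_features("emr-6.13.0")` reports table-level access control as available and fine-grained access control as unavailable. The check covers release thresholds only; it says nothing about the unsupported operations listed above.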