Considerations when using Zeppelin on Amazon EMR - Amazon EMR

Considerations when using Zeppelin on Amazon EMR

  • Connect to Zeppelin using the same SSH tunneling method that you use to connect to other web servers on the master node. The Zeppelin server listens on port 8890.

  • Zeppelin on Amazon EMR release versions 5.0.0 and later supports Shiro authentication.

  • Zeppelin on Amazon EMR release versions 5.8.0 and later supports using Amazon Glue Data Catalog as the metastore for Spark SQL. For more information, see Using Amazon Glue Data Catalog as the metastore for Spark SQL.

  • Zeppelin does not use some of the settings defined in your cluster's spark-defaults.conf configuration file, although it does instruct YARN to allocate executors dynamically if you have set spark.dynamicAllocation.enabled to true. You must set executor settings, such as memory and cores, on the Zeppelin Interpreter tab, and then restart the interpreter for the settings to take effect.

  • Amazon EMR releases 6.10.0 and higher support Apache Zeppelin integration with Apache Flink. See Working with Flink jobs from Zeppelin in Amazon EMR for more information.

  • Zeppelin on Amazon EMR does not support the SparkR interpreter.
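
The first consideration above mentions reaching Zeppelin through an SSH tunnel on port 8890. As a minimal sketch of one way to do that, the Python helper below assembles an ssh command that uses local port forwarding. The function name zeppelin_tunnel_command, the key file name, the placeholder host name, and the hadoop login user are assumptions for illustration and do not come from this page. Dynamic port forwarding with a proxy is another common option.

```python
import shlex

ZEPPELIN_PORT = 8890  # Zeppelin port stated in the list above


def zeppelin_tunnel_command(master_dns, key_path, local_port=ZEPPELIN_PORT):
    """Return the argv list for an SSH local port forward to Zeppelin.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    return [
        "ssh",
        "-i", key_path,  # private key file of the cluster's EC2 key pair
        "-N",            # forward ports only, do not run a remote command
        "-L", f"{local_port}:{master_dns}:{ZEPPELIN_PORT}",
        f"hadoop@{master_dns}",  # assumption: the usual EMR login user
    ]


if __name__ == "__main__":
    # Placeholder values; substitute your own key file and master public DNS name.
    argv = zeppelin_tunnel_command("master-public-dns-name", "mykeypair.pem")
    print(shlex.join(argv))
```

With the tunnel open, Zeppelin should be reachable in a local browser at http://localhost:8890, assuming the local port was left at its default.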
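
Because executor settings from spark-defaults.conf have to be entered again on the Zeppelin Interpreter tab, it can help to list them first. The sketch below is one possible way to do so. It assumes that the file uses whitespace-separated key and value pairs, as shown in the Spark documentation, and that the settings of interest start with spark.executor. (for example, spark.executor.memory and spark.executor.cores). The function name executor_settings and the sample values are hypothetical.

```python
def executor_settings(conf_text):
    """Return the spark.executor.* properties found in spark-defaults.conf text.

    Hypothetical helper: assumes one "key <whitespace> value" pair per line.
    """
    settings = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # split the key from the rest of the line
        if len(parts) == 2 and parts[0].startswith("spark.executor."):
            settings[parts[0]] = parts[1].strip()
    return settings


if __name__ == "__main__":
    # Illustrative values only, not taken from a real cluster.
    sample = """\
spark.dynamicAllocation.enabled  true
spark.executor.memory            4g
spark.executor.cores             2
"""
    for key, value in executor_settings(sample).items():
        print(f"{key} = {value}")
```

The printed pairs are candidates to copy into the Spark interpreter properties. The interpreter still has to be restarted before Zeppelin uses them.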