Building Amazon Glue jobs with interactive sessions
Data engineers can author Amazon Glue jobs faster and more easily than before using interactive sessions in Amazon Glue.
Topics
- Overview of Amazon Glue interactive sessions
- Getting started with Amazon Glue interactive sessions
- Configuring Amazon Glue interactive sessions for Jupyter and Amazon Glue Studio notebooks
- Getting started with Amazon Glue for Ray interactive sessions (preview)
- Converting a script or notebook into an Amazon Glue job
- Working with streaming operations in Amazon Glue interactive sessions
- Developing and testing Amazon Glue job scripts locally
- Development endpoints
Overview of Amazon Glue interactive sessions
With Amazon Glue interactive sessions, you can rapidly build, test, and run data preparation and analytics applications. Interactive sessions provide a programmatic and visual interface for building and testing extract, transform, and load (ETL) scripts for data preparation. Interactive sessions run Apache Spark analytics applications and provide on-demand access to a remote Spark runtime environment. Amazon Glue transparently manages serverless Spark for these interactive sessions.
Interactive sessions are flexible, so you can build and test your applications from the environment of your choice. You can create and work with interactive sessions through the Amazon Command Line Interface and the API. You can use Jupyter-compatible notebooks to visually author and test your notebook scripts. Interactive sessions provide an open-source Jupyter kernel that integrates almost anywhere that Jupyter does, including IDEs such as PyCharm, IntelliJ, and VS Code. This enables you to author code in your local environment and run it on the interactive sessions backend.
Using the interactive sessions API, you can programmatically run applications that use Apache Spark analytics without having to manage Spark infrastructure. You can run one or more Spark statements within a single interactive session.
Interactive sessions therefore provide a faster, cheaper, and more flexible way to build and run data preparation and analytics applications. To learn how to use interactive sessions, see the documentation in this section.
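The programmatic flow described above (create a session, then run one or more Spark statements in it through the API) might look roughly like the following sketch. It assumes a boto3 Glue client is passed in as `glue`. The helper names `poll_until` and `run_statements`, the Glue version, and the polling intervals are illustrative choices, not values taken from this guide.

```python
import time


def poll_until(fetch, get_state, done_states, poll_seconds=2.0, timeout_seconds=600.0):
    """Call fetch() repeatedly until get_state(record) is in done_states.

    Returns the last record fetched; raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        record = fetch()
        if get_state(record) in done_states:
            return record
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {get_state(record)!r} after {timeout_seconds}s")
        time.sleep(poll_seconds)


def run_statements(glue, session_id, role_arn, statements, poll_seconds=2.0):
    """Create an interactive session, run each statement in order, then stop it.

    `glue` is expected to behave like boto3.client("glue").
    Returns the list of Statement records, one per input statement.
    """
    glue.create_session(
        Id=session_id,
        Role=role_arn,
        Command={"Name": "glueetl", "PythonVersion": "3"},  # Spark ETL session
        GlueVersion="4.0",  # assumption: pick the version your account uses
    )

    # Wait for the session to leave PROVISIONING before sending code to it.
    session = poll_until(
        lambda: glue.get_session(Id=session_id)["Session"],
        lambda s: s["Status"],
        {"READY", "FAILED", "TIMEOUT", "STOPPED"},
        poll_seconds,
    )
    if session["Status"] != "READY":
        raise RuntimeError(f"session ended in state {session['Status']}")

    results = []
    try:
        for code in statements:
            statement_id = glue.run_statement(SessionId=session_id, Code=code)["Id"]
            # Each statement runs asynchronously, so poll until it settles.
            statement = poll_until(
                lambda: glue.get_statement(SessionId=session_id, Id=statement_id)["Statement"],
                lambda s: s["State"],
                {"AVAILABLE", "ERROR", "CANCELLED"},
                poll_seconds,
            )
            results.append(statement)
    finally:
        # Stop the session so it does not keep running after the work is done.
        glue.stop_session(Id=session_id)
    return results
```

To try it, pass `boto3.client("glue")`, a session ID of your choosing, and the ARN of an IAM role that Amazon Glue can assume. Taking the client as a parameter keeps the sketch easy to exercise with a stub object before pointing it at a real account.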
Limitations
- Job bookmarks are not supported in interactive sessions.
- Creating notebook jobs using the Amazon Command Line Interface is not supported.
- Amazon Glue Studio notebooks do not support Scala.