Getting started with Amazon Glue interactive sessions

These sections describe how to run Amazon Glue interactive sessions locally.

Prerequisites for setting up interactive sessions locally

The following are prerequisites for installing interactive sessions:

  • Python 3.6 or later

  • See sections below for MacOS/Linux and Windows instructions.
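As a quick way to confirm the Python prerequisite before installing anything, the following sketch compares the running interpreter against the 3.6 minimum listed above. The helper name meets_minimum is illustrative only and is not part of any Amazon Glue tooling.

```python
import sys

# Minimum version taken from the prerequisites list above.
MIN_VERSION = (3, 6)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) pair is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print("Python " + current + " satisfies the 3.6+ prerequisite.")
    else:
        print("Python " + current + " is too old; install 3.6 or later.")
```

Run it with the same interpreter that pip3 uses, so that the check reflects the environment where aws-glue-sessions will be installed.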

MacOS/Linux instructions

Installing Jupyter and Amazon Glue interactive sessions Jupyter kernels

  1. Install jupyter, boto3, and aws-glue-sessions with pip. Jupyter Lab is also compatible and can be installed instead of Jupyter.

    pip3 install --upgrade jupyter boto3 aws-glue-sessions
  2. The following commands use pip to identify the installation location for aws-glue-sessions, then install the Amazon Glue PySpark and Amazon Glue Scala Jupyter kernels from that location.

    SITE_PACKAGES=$(pip3 show aws-glue-sessions | grep Location | awk '{print $2}')
    jupyter kernelspec install $SITE_PACKAGES/aws_glue_interactive_sessions_kernel/glue_pyspark
    jupyter kernelspec install $SITE_PACKAGES/aws_glue_interactive_sessions_kernel/glue_spark
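To make the lookup step easier to follow, here is a minimal Python sketch of what the shell pipeline does: it reads the Location field from pip3 show output and builds the two kernel directory paths. The function name kernel_dirs_from_pip_show and the sample path are hypothetical; the directory and kernel names come from the commands above.

```python
import os

# Directory and kernel names as they appear in the commands above.
KERNEL_PACKAGE = "aws_glue_interactive_sessions_kernel"
KERNEL_NAMES = ("glue_pyspark", "glue_spark")


def kernel_dirs_from_pip_show(pip_show_output):
    """Extract the Location field and build the two kernel directory paths."""
    for line in pip_show_output.splitlines():
        if line.startswith("Location:"):
            # Split only on the first colon so drive letters and spaces survive.
            site_packages = line.split(":", 1)[1].strip()
            return [
                os.path.join(site_packages, KERNEL_PACKAGE, name)
                for name in KERNEL_NAMES
            ]
    raise ValueError("no 'Location:' line found in pip show output")


if __name__ == "__main__":
    # Sample text standing in for real `pip3 show aws-glue-sessions` output.
    sample = (
        "Name: aws-glue-sessions\n"
        "Location: /opt/venv/lib/python3.9/site-packages\n"
    )
    for path in kernel_dirs_from_pip_show(sample):
        print("jupyter kernelspec install " + path)
```

Unlike awk '{print $2}', which keeps only the second whitespace-separated field, this version keeps the whole path, so it also handles a site-packages location that contains spaces.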

Configuring session credentials and region

Amazon Glue interactive sessions require the same IAM permissions as Amazon Glue Jobs and Dev Endpoints. Specify the role used with interactive sessions in one of two ways:

  1. With the %iam_role and %region magics

  2. With an additional line in ~/.aws/config

Configuring a session role with magic

Type %iam_role <YourGlueServiceRole> in the first cell that you execute.

Configuring a session role with ~/.aws/config

The Amazon Glue service role for interactive sessions can either be specified in the notebook itself or stored alongside the Amazon CLI config. If you already have a role that you typically use with Amazon Glue jobs, use that role. If you do not, follow the guide Setting up IAM permissions for Amazon Glue to set one up.

To set this role as the default role for interactive sessions:

  1. With a text editor, open ~/.aws/config.

  2. Look for the profile you use for Amazon Glue. If you don't use a profile, use the [default] profile.

  3. Add a line to the profile for the role you intend to use, such as glue_role_arn=<AWSGlueServiceRole>.

  4. [Optional]: If your profile does not have a default region set, we recommend adding one with region=us-east-1, replacing us-east-1 with your desired region.

  5. Save the config.

For more information, see Interactive sessions with IAM.
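The manual edit described in the steps above can also be sketched in Python with the standard configparser module. This is only an illustration under stated assumptions: the function name set_glue_role and the example role ARN are hypothetical, and named profiles are assumed to use the "[profile NAME]" section form that the CLI config file uses. The demo writes to a temporary file rather than to your real ~/.aws/config.

```python
import configparser
import os
import tempfile


def set_glue_role(config_path, role_arn, profile="default", region=None):
    """Add glue_role_arn (and optionally region) to a profile in a config file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)  # A missing file is simply treated as empty.

    # Named profiles use "[profile NAME]"; the default profile is "[default]".
    section = "default" if profile == "default" else "profile " + profile
    if not parser.has_section(section):
        parser.add_section(section)

    parser.set(section, "glue_role_arn", role_arn)

    # Step 4 is optional: only add a region when the profile has none.
    if region is not None and not parser.has_option(section, "region"):
        parser.set(section, "region", region)

    # Caution: configparser rewrites the whole file and drops comments.
    with open(config_path, "w") as handle:
        parser.write(handle)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config")
        with open(path, "w") as handle:
            handle.write("[default]\noutput = json\n")
        # Hypothetical ARN used purely for illustration.
        set_glue_role(
            path,
            "arn:aws:iam::123456789012:role/ExampleGlueRole",
            region="us-east-1",
        )
        with open(path) as handle:
            print(handle.read())
```

Because the file is rewritten in full, back up the real config before pointing a script like this at it; editing by hand, as the steps describe, preserves comments and layout.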

Running Jupyter notebook

To run Jupyter Notebook, complete the following steps.

  1. Run the following command to launch Jupyter Notebook.

    jupyter notebook
  2. Choose New, and then choose one of the Amazon Glue kernels to begin coding against Amazon Glue.

Windows instructions

Installing Jupyter and Amazon Glue interactive sessions kernels

  1. Use pip to install Jupyter, boto3, and aws-glue-sessions. Jupyter Lab is also compatible and can be installed instead of Jupyter.

    pip3 install --upgrade jupyter boto3 aws-glue-sessions
  2. (Optional) Run the following command to list the installed packages. If jupyter and aws-glue-sessions were successfully installed, you should see a long list of packages, including jupyter 1.0.0 (or later).

    pip3 list
  3. Install the sessions kernels into Jupyter by running the following commands. These commands look up the installation location of aws-glue-sessions from pip and install the Jupyter kernels found there.

    1. Change the directory to the aws-glue-sessions install directory within Python's site-packages directory.

      Windows PowerShell:

      cd ((pip3 show aws-glue-sessions | Select-String Location | % {$_ -replace("Location: ","")})+"\aws_glue_interactive_sessions_kernel")
    2. Install the Amazon Glue PySpark and Amazon Glue Scala kernels.

      jupyter-kernelspec install glue_pyspark
      jupyter-kernelspec install glue_spark
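After the install commands finish, one way to confirm that both kernels are registered is to inspect the output of jupyter kernelspec list --json. The sketch below parses that JSON and reports any missing kernel. It assumes the kernels are registered under their directory names, glue_pyspark and glue_spark; the function name missing_glue_kernels and the sample data are hypothetical.

```python
import json

# Assumed registration names: the directory names used in the install commands.
EXPECTED_KERNELS = ("glue_pyspark", "glue_spark")


def missing_glue_kernels(kernelspec_json, expected=EXPECTED_KERNELS):
    """Return the expected kernel names absent from the kernelspec listing."""
    installed = json.loads(kernelspec_json).get("kernelspecs", {})
    return [name for name in expected if name not in installed]


if __name__ == "__main__":
    # Sample text standing in for real `jupyter kernelspec list --json` output.
    sample = json.dumps(
        {
            "kernelspecs": {
                "python3": {"resource_dir": "/example/python3", "spec": {}},
                "glue_pyspark": {"resource_dir": "/example/glue_pyspark", "spec": {}},
            }
        }
    )
    missing = missing_glue_kernels(sample)
    if missing:
        print("Missing kernels: " + ", ".join(missing))
    else:
        print("Both Amazon Glue kernels are registered.")
```

To check a real installation, pass the function the text printed by jupyter kernelspec list --json instead of the sample string. The same check applies on MacOS/Linux.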

Configuring session credentials and region

Amazon Glue interactive sessions require the same IAM permissions as Amazon Glue Jobs and Dev Endpoints. Specify the role used with interactive sessions in one of two ways:

  1. With the %iam_role and %region magics

  2. With an additional line in ~/.aws/config

Configuring a session role with magic

Type %iam_role <YourGlueServiceRole> in the first cell that you execute.

Configuring a session role with ~/.aws/config

The Amazon Glue service role for interactive sessions can either be specified in the notebook itself or stored alongside the Amazon CLI config. If you already have a role that you typically use with Amazon Glue jobs, use that role. If you do not, follow the guide Setting up IAM permissions for Amazon Glue to set one up.

To set this role as the default role for interactive sessions:

  1. With a text editor, open ~/.aws/config.

  2. Look for the profile you use for Amazon Glue. If you don't use a profile, use the [default] profile.

  3. Add a line to the profile for the role you intend to use, such as glue_role_arn=<AWSGlueServiceRole>.

  4. [Optional]: If your profile does not have a default region set, we recommend adding one with region=us-east-1, replacing us-east-1 with your desired region.

  5. Save the config.

For more information, see Interactive sessions with IAM.
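On Windows, the ~ in ~/.aws/config refers to the user profile folder. As a hedged illustration, the following read-only sketch resolves that path in a platform-independent way and reports whether a profile already carries glue_role_arn and region. The function names are hypothetical, and named profiles are assumed to use the "[profile NAME]" section form.

```python
import configparser
import os


def default_config_path():
    """Resolve ~/.aws/config; on Windows, ~ expands to the user profile folder."""
    return os.path.join(os.path.expanduser("~"), ".aws", "config")


def read_glue_settings(config_path, profile="default"):
    """Return glue_role_arn and region for a profile, or None where unset."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)  # A missing file is simply treated as empty.

    # Named profiles use "[profile NAME]"; the default profile is "[default]".
    section = "default" if profile == "default" else "profile " + profile
    if not parser.has_section(section):
        return {"glue_role_arn": None, "region": None}
    return {
        "glue_role_arn": parser.get(section, "glue_role_arn", fallback=None),
        "region": parser.get(section, "region", fallback=None),
    }


if __name__ == "__main__":
    path = default_config_path()
    print("Reading " + path)
    for key, value in read_glue_settings(path).items():
        print(key + ": " + (value if value is not None else "(not set)"))
```

The script only reads the file, so it is safe to run before and after editing the config to confirm that the role and region were saved.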

Running Jupyter

To run Jupyter Notebook, complete the following steps.

  1. Run the following command to launch Jupyter Notebook.

    jupyter notebook
  2. Choose New, and then choose one of the Amazon Glue kernels to begin coding against Amazon Glue.

Upgrading from the interactive sessions preview

The kernel was upgraded with new names when it was released with version 0.27. To clean up preview versions of the kernels, run the following commands from a terminal or PowerShell.

Note

If you are part of any other Amazon Glue preview that requires a custom service model, removing the kernel will remove the custom service model.

    # Remove Old Glue Kernels
    jupyter kernelspec remove glue_python_kernel
    jupyter kernelspec remove glue_scala_kernel

    # Remove Custom Model
    cd ~/.aws/models
    rm -rf glue/
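The rm -rf step is a Unix-style command and may not behave the same way in every Windows shell. As a platform-independent alternative for the custom model cleanup only, the following sketch removes the glue folder with Python's standard library. The function name remove_custom_glue_model is hypothetical; the ~/.aws/models location comes from the commands above. The demo only reports what it finds and does not delete anything.

```python
import os
import shutil


def remove_custom_glue_model(models_dir):
    """Delete the glue/ folder under models_dir; return True if it was removed."""
    target = os.path.join(models_dir, "glue")
    if os.path.isdir(target):
        shutil.rmtree(target)
        return True
    return False


if __name__ == "__main__":
    # Report only; call remove_custom_glue_model(models_dir) to actually delete.
    models_dir = os.path.join(os.path.expanduser("~"), ".aws", "models")
    target = os.path.join(models_dir, "glue")
    if os.path.isdir(target):
        print("Custom model found at " + target)
    else:
        print("No custom model at " + target)
```

As the note above warns, skip this step if another Amazon Glue preview still depends on the custom service model.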