Using interactive sessions with SageMaker notebooks - Amazon Glue

Using interactive sessions with SageMaker notebooks

To use interactive sessions with SageMaker notebooks, follow the steps below to create a new lifecycle configuration and a new SageMaker notebook instance.

Installing on an existing SageMaker notebook instance

Note

Installing interactive sessions on an existing SageMaker notebook instance is not currently supported. Although a lifecycle configuration can be run on an existing notebook instance, doing so modifies the kernelspec manager class, which may cause issues with other workloads on that instance.

Create a new lifecycle configuration and SageMaker notebook instance with interactive sessions

SageMaker console setup

  1. Create the Lifecycle Configuration.

    1. Open the Lifecycle configurations page on the SageMaker console.

      
                    The screenshot shows the Lifecycle configurations option selected.
    2. Choose Create Configuration.

    3. In the Name field, type 'AWSGlueInteractiveSessionsPreview'.

      
                    The screenshot shows the Create lifecycle configuration page.
    4. In the Scripts section, choose the Create notebook tab. Copy and paste the following into the text box and choose Create configuration.

      #!/bin/bash
      set -ex
      sudo -u ec2-user -i <<'EOF'

      ANACONDA_DIR=/home/ec2-user/anaconda3

      # Create and activate the conda env
      echo "Creating glue_pyspark conda environment"
      conda create --name glue_pyspark python=3.7 ipykernel jupyter nb_conda -y

      echo "Activating glue_pyspark"
      source activate glue_pyspark

      # Install Glue Sessions into the env
      echo "Installing AWS Glue Sessions with pip"
      pip install aws-glue-sessions

      # Clone glue_pyspark to glue_scala. This is required because kernel names
      # must match their environments, and one conda env cannot hold two kernels.
      echo "Cloning glue_pyspark to glue_scala"
      conda create --name glue_scala --clone glue_pyspark

      # Remove the python3 kernel from glue_pyspark and glue_scala
      rm -r ${ANACONDA_DIR}/envs/glue_pyspark/share/jupyter/kernels/python3
      rm -r ${ANACONDA_DIR}/envs/glue_scala/share/jupyter/kernels/python3

      # Copy kernels to Jupyter kernel env (Discoverable by conda_nb_kernel)
      echo "Copying Glue PySpark Kernel"
      cp -r ${ANACONDA_DIR}/envs/glue_pyspark/lib/python3.7/site-packages/aws_glue_interactive_sessions_kernel/glue_pyspark/ ${ANACONDA_DIR}/envs/glue_pyspark/share/jupyter/kernels/glue_pyspark/

      echo "Copying Glue Spark Kernel"
      # -p avoids an error if the kernels directory already exists
      mkdir -p ${ANACONDA_DIR}/envs/glue_scala/share/jupyter/kernels
      cp -r ${ANACONDA_DIR}/envs/glue_scala/lib/python3.7/site-packages/aws_glue_interactive_sessions_kernel/glue_spark/ ${ANACONDA_DIR}/envs/glue_scala/share/jupyter/kernels/glue_spark/

      echo "Changing Jupyter kernel manager from EnvironmentKernelSpecManager to CondaKernelSpecManager"
      JUPYTER_CONFIG=/home/ec2-user/.jupyter/jupyter_notebook_config.py

      # Comment out the existing EnvironmentKernelSpecManager settings
      sed -i '/EnvironmentKernelSpecManager/ s/^/#/' ${JUPYTER_CONFIG}

      # Name kernels conda_<environment> and hide the base environments
      echo "c.CondaKernelSpecManager.name_format='conda_{environment}'" >> ${JUPYTER_CONFIG}
      echo "c.CondaKernelSpecManager.env_filter='anaconda3$|JupyterSystemEnv$|/R$'" >> ${JUPYTER_CONFIG}

      EOF

      systemctl restart jupyter-server
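
The last lines of the lifecycle script switch the Jupyter kernel spec manager by editing jupyter_notebook_config.py. The sketch below replays those edits on a scratch file, so the effect can be inspected without touching a real notebook instance. The sample config contents are an assumption for illustration, not the actual file shipped on SageMaker, and the sketch uses sed with a redirect instead of sed -i so that it behaves the same on systems without GNU sed.

```shell
#!/bin/bash
set -eu

# Scratch stand-in for /home/ec2-user/.jupyter/jupyter_notebook_config.py.
# The two sample lines below are hypothetical.
JUPYTER_CONFIG=$(mktemp)
cat > "$JUPYTER_CONFIG" <<'PY'
c.NotebookApp.kernel_spec_manager_class = 'environment_kernels.EnvironmentKernelSpecManager'
c.NotebookApp.open_browser = False
PY

# Same sed expression as the lifecycle script: prefix every line that
# mentions EnvironmentKernelSpecManager with '#'.
sed '/EnvironmentKernelSpecManager/ s/^/#/' "$JUPYTER_CONFIG" > "$JUPYTER_CONFIG.new"
mv "$JUPYTER_CONFIG.new" "$JUPYTER_CONFIG"

# Same two settings the lifecycle script appends.
echo "c.CondaKernelSpecManager.name_format='conda_{environment}'" >> "$JUPYTER_CONFIG"
echo "c.CondaKernelSpecManager.env_filter='anaconda3$|JupyterSystemEnv$|/R$'" >> "$JUPYTER_CONFIG"

# Show the resulting file.
cat "$JUPYTER_CONFIG"
```

Lines that do not mention EnvironmentKernelSpecManager are left unchanged. The name_format setting is what produces kernel names such as conda_glue_pyspark and conda_glue_scala.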
  2. Create a SageMaker notebook instance with the new lifecycle configuration.

    1. Choose Notebook instances.

      
                    The screenshot shows the Notebook instances option selected.
    2. Choose Create notebook instance.

    3. Enter a name in the 'Notebook instance name' field.

      
                    The screenshot shows the Create notebook instance page.
    4. Choose 'notebook-al2-v1' as the Platform identifier.

    5. Select Additional configuration to expose the Lifecycle configuration - optional drop-down menu.

    6. Select AWSGlueInteractiveSessionsPreview, the lifecycle configuration created in step 1.

    7. In the Permissions and encryption section, choose an IAM role that has the permissions needed to run Amazon Glue interactive sessions. For more information, see Securing Amazon Glue interactive sessions with IAM.

    8. Complete selection of additional options as needed. When done, choose Create notebook instance.
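
The console steps in 1 and 2 can also be scripted. The following is a rough sketch using the AWS CLI sagemaker commands, assuming the CLI is installed and configured for your account. The notebook instance name, instance type, role ARN, and script file name are hypothetical placeholders. By default the sketch only prints the commands (DRY_RUN=1); set DRY_RUN=0 to run them.

```shell
#!/bin/bash
set -eu

CONFIG_NAME="AWSGlueInteractiveSessionsPreview"  # name used in step 1
NOTEBOOK_NAME="glue-interactive-sessions"        # hypothetical
INSTANCE_TYPE="ml.t3.medium"                     # hypothetical
ROLE_ARN="arn:aws:iam::111122223333:role/ExampleGlueNotebookRole"  # hypothetical
SCRIPT_FILE="on-create.sh"                       # hypothetical: the script from step 1 saved to a file

# Print the command when DRY_RUN is 1 (the default), otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: register the script as the "Create notebook" (on-create) hook.
# The script content is passed base64-encoded.
create_lifecycle_config() {
  content=$(base64 < "$SCRIPT_FILE" | tr -d '\n')
  run aws sagemaker create-notebook-instance-lifecycle-config \
    --notebook-instance-lifecycle-config-name "$CONFIG_NAME" \
    --on-create "Content=$content"
}

# Step 2: create the notebook instance with that lifecycle configuration.
create_notebook_instance() {
  run aws sagemaker create-notebook-instance \
    --notebook-instance-name "$NOTEBOOK_NAME" \
    --instance-type "$INSTANCE_TYPE" \
    --role-arn "$ROLE_ARN" \
    --lifecycle-config-name "$CONFIG_NAME" \
    --platform-identifier "notebook-al2-v1"
}

if [ -f "$SCRIPT_FILE" ]; then
  create_lifecycle_config
else
  echo "Skipping lifecycle configuration: $SCRIPT_FILE not found" >&2
fi
create_notebook_instance
```

The role passed with --role-arn plays the same part as the IAM role chosen in the Permissions and encryption section of the console.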

  3. Start an Interactive Session on your SageMaker Notebook instance.

    1. After the notebook has been provisioned, choose Jupyter Lab on the Notebook instances page.

    2. Select the conda_glue_pyspark or conda_glue_scala icon in the Notebook section to create a new notebook.

  4. Configure your Amazon Glue session by running magics in the cells before your code. To list all available magics, run %help in your first cell. If you have not configured an IAM role for the session in ~/.aws/config, run %iam_role with the ARN of the role to use.

  5. For more information on configuring and using interactive sessions, see Configuring Amazon Glue interactive sessions.
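
Step 4 mentions setting the session role in the AWS config file instead of running %iam_role in each notebook. The following is a minimal sketch of such an entry, assuming the kernel reads a glue_role_arn key from the active profile; check that key name against the interactive sessions configuration page before relying on it. The role ARN and Region are placeholders, and the sketch writes to a scratch directory rather than the real ~/.aws/config.

```shell
#!/bin/bash
set -eu

ROLE_ARN="arn:aws:iam::111122223333:role/ExampleGlueSessionRole"  # hypothetical
REGION="us-east-1"                                                # hypothetical

# Scratch stand-in for ~/.aws, so no real configuration is overwritten.
AWS_DIR=$(mktemp -d)
CONFIG_FILE="$AWS_DIR/config"

# glue_role_arn is the assumed key name for the session role.
cat > "$CONFIG_FILE" <<EOF
[default]
region=$REGION
glue_role_arn=$ROLE_ARN
EOF

cat "$CONFIG_FILE"
```

If no role is configured this way, running %iam_role with the same ARN in the first cell of the notebook sets it for that session.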