Overview of using notebooks
Amazon Glue Studio allows you to interactively author jobs in a notebook interface based on Jupyter Notebooks. Through notebooks in Amazon Glue Studio, you can edit job scripts and data integration code and view the output without having to run a full job. You can also add markdown and save notebooks as .ipynb files and job scripts. You can start a notebook without installing software locally or managing servers. When you are satisfied with your code, Amazon Glue Studio can convert your notebook to an Amazon Glue job with the click of a button.
Some benefits of using notebooks include:
- No cluster to provision or manage
- No idle clusters to pay for
- No up-front configuration required
- No installation of Jupyter notebooks required
- The same runtime/platform as Amazon Glue ETL
When you start a notebook through Amazon Glue Studio, all the configuration steps are done for you so that you can explore your data and start developing your job script after only a few seconds. Amazon Glue Studio configures a Jupyter notebook with the Amazon Glue Jupyter kernel. You don’t have to configure VPCs, network connections, or development endpoints to use this notebook.
To create jobs using the notebook interface:
- Configure the necessary IAM permissions.
- Start a notebook session to create a job.
- Write code in the cells in the notebook.
- Run and test the code to view the output.
- Save the job.
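To make the relationship between a saved notebook and a job script more concrete, the following minimal sketch builds a small notebook in the standard Jupyter `.ipynb` (nbformat 4) JSON layout and joins its code cells into a single script. This is only an illustration of the idea, not how Amazon Glue Studio performs the conversion. The helper `notebook_to_script`, the cell contents, and the choice to drop `%` magic lines are all assumptions made for the example.

```python
import json

# A minimal nbformat-4 notebook built in memory. The cell contents are
# illustrative; a notebook saved from Amazon Glue Studio carries more metadata.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Explore orders data\n"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            # Session magics configure the interactive session, not the script.
            "source": ["%idle_timeout 30\n", "%number_of_workers 2\n"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": [
                "from awsglue.context import GlueContext\n",
                "from pyspark.context import SparkContext\n",
                "\n",
                "glue_context = GlueContext(SparkContext.getOrCreate())\n",
            ],
        },
    ],
}


def notebook_to_script(nb: dict) -> str:
    """Join code cells into one script, skipping markdown cells and magic lines.

    Hypothetical helper for illustration only.
    """
    chunks = []
    for cell in nb["cells"]:
        if cell["cell_type"] != "code":
            continue
        source = cell["source"]
        # nbformat allows "source" to be a single string or a list of strings.
        text = source if isinstance(source, str) else "".join(source)
        kept = [line for line in text.splitlines() if not line.lstrip().startswith("%")]
        body = "\n".join(kept).strip()
        if body:
            chunks.append(body)
    return "\n\n".join(chunks) + "\n"


# Round-trip through JSON to mimic reading a saved .ipynb file from disk.
script = notebook_to_script(json.loads(json.dumps(notebook)))
print(script)
```

The markdown cell and the magics-only cell contribute nothing to the script, so only the code that sets up the `GlueContext` is printed.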
After your notebook is saved, it is a full Amazon Glue job. You can manage all aspects of the job, such as scheduling job runs, setting job parameters, and viewing the job run history, right alongside your notebook.
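Scheduling and job parameters can also be managed programmatically. As a rough sketch, the snippet below assembles the keyword arguments for a scheduled trigger in the shape expected by the Glue `CreateTrigger` API (`Name`, `Type`, `Schedule`, `Actions`, `StartOnCreation`). The helper `build_schedule_trigger`, the job name, the trigger naming scheme, the S3 path, and the `--input_path` parameter are hypothetical. The check for a `--` prefix reflects the usual convention for job parameter keys and is a choice made for this example.

```python
import json


def build_schedule_trigger(job_name: str, cron: str, arguments: dict) -> dict:
    """Build keyword arguments for a SCHEDULED Glue trigger.

    Hypothetical helper; it only assembles a dictionary and calls no AWS API.
    """
    for key in arguments:
        # Job parameter keys are conventionally prefixed with "--".
        if not key.startswith("--"):
            raise ValueError(f"job argument {key!r} should start with '--'")
    return {
        "Name": f"{job_name}-schedule",  # hypothetical naming scheme
        "Type": "SCHEDULED",
        "Schedule": f"cron({cron})",  # six-field cron expression, evaluated in UTC
        "Actions": [{"JobName": job_name, "Arguments": dict(arguments)}],
        "StartOnCreation": True,
    }


request = build_schedule_trigger(
    job_name="orders-notebook-job",  # hypothetical job saved from a notebook
    cron="0 6 * * ? *",  # every day at 06:00 UTC
    arguments={"--input_path": "s3://example-bucket/orders/"},  # hypothetical parameter
)
print(json.dumps(request, indent=2))

# With boto3 installed and AWS credentials configured, the request could be
# submitted as:
#   import boto3
#   boto3.client("glue").create_trigger(**request)
```

Keeping the request as a plain dictionary makes it easy to review or test before anything is sent to the service.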