Creating an ETL job using notebooks in Amazon Glue Studio
To start using notebooks in the Amazon Glue Studio console
-
Attach Amazon Identity and Access Management policies to the Amazon Glue Studio user and create an IAM role for your ETL job and notebook.
-
Configure additional IAM security for notebooks, as described in Granting permissions for the IAM role.
-
Open the Amazon Glue Studio console at https://console.amazonaws.cn/gluestudio/.
Note
Check that your browser does not block third-party cookies. Any browser that blocks third-party cookies, either by default or as a user-enabled setting, will prevent notebooks from launching. For more information on managing cookies, see your browser's documentation.
-
Choose the Jobs link in the left-side navigation menu.
-
Choose Jupyter notebook and then choose Create to start a new notebook session.
-
On the Create job in Jupyter notebook page, provide the job name, and choose the IAM role to use. Choose Create job.
After a short time period, the notebook editor appears.
-
After you add code to a cell, you must run the cell to initiate a session. There are multiple ways to run the cell:
-
Press the play button.
-
Use a keyboard shortcut:
-
On macOS, press Command + Enter to run the cell.
-
On Windows, press Shift + Enter to run the cell.
-
For information about writing code using a Jupyter notebook interface, see The Jupyter Notebook User Documentation.
-
To test your script, run the entire script or individual cells. Any command output is displayed in the area beneath the cell.
-
After you have finished developing your notebook, you can save the job and then run it. You can find the script in the Script tab. Any magics you added to the notebook are stripped out and are not saved as part of the generated Amazon Glue job script. Amazon Glue Studio automatically adds job.commit() to the end of the script generated from the notebook contents. For more information about running jobs, see Start a job run.
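To make the notebook-to-script conversion more concrete, here is a minimal sketch in plain Python of what that step might look like. It is an illustration only and not how Amazon Glue Studio is actually implemented: the function name cells_to_script is hypothetical, the sample cell contents are placeholders, and it assumes a magic is any line whose first non-blank character is "%".

```python
from typing import List


def cells_to_script(cells: List[str]) -> str:
    """Hypothetical helper: join notebook cells into one script,
    drop magic lines, and append job.commit() at the end."""
    kept = []
    for cell in cells:
        for line in cell.splitlines():
            # Assumption: a magic is any line starting with "%".
            if line.lstrip().startswith("%"):
                continue
            kept.append(line)
    script = "\n".join(kept).strip()
    # Mirror the documented behavior of adding job.commit() last.
    return script + "\njob.commit()"


# Placeholder cells; the strings are only text and are never executed here.
cells = [
    "%idle_timeout 60\n%number_of_workers 2",
    "import sys\nfrom awsglue.job import Job",
    "print('hello')",
]

print(cells_to_script(cells))
```

In this sketch, the two magic lines in the first cell do not appear in the printed script, which matches the behavior described above: magics configure the notebook session and are not saved as part of the job script.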