Link Git-based repositories to an EMR Studio Workspace - Amazon EMR

Link Git-based repositories to an EMR Studio Workspace

About Git repositories for EMR Studio

You can associate a maximum of three Git repositories with an EMR Studio Workspace. By default, each Workspace lets you choose from a list of Git repositories that are associated with the same Amazon account as the Studio. You can also create a new Git repository as a resource for a Workspace.

While connected to the primary node of a cluster, you can run Git commands such as the following. The leading ! is the notebook syntax for running a line as a shell command.

!git pull origin <branch-name>
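If you prefer to script the same pull from Python, the following is a minimal sketch. It assumes that the git executable is on the PATH of the environment where the code runs. The function names build_pull_command and pull are hypothetical helpers for illustration and are not part of EMR Studio.

```python
import subprocess
from typing import List


def build_pull_command(branch: str, remote: str = "origin") -> List[str]:
    """Return the argument list for `git pull <remote> <branch>`."""
    # Reject empty names and names that Git could mistake for an option.
    if not branch or branch.startswith("-"):
        raise ValueError("branch must be a non-empty name that does not start with '-'")
    return ["git", "pull", remote, branch]


def pull(repo_dir: str, branch: str) -> str:
    """Run the pull inside repo_dir and return Git's standard output."""
    result = subprocess.run(
        build_pull_command(branch),
        cwd=repo_dir,          # folder of the linked repository (assumed to exist)
        capture_output=True,
        text=True,
        check=True,            # raise CalledProcessError if Git exits with an error
    )
    return result.stdout
```

Passing the command as a list avoids shell quoting problems when a branch name contains unusual characters.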

Alternatively, you can use the jupyterlab-git extension. Open it from the left sidebar by choosing the Git icon. For information about the jupyterlab-git extension for JupyterLab, see jupyterlab-git.

Prerequisites

To link an associated Git repository to a Workspace
  1. From the Workspaces list in the Studio, open the Workspace that you want to link to a repository.

  2. In the left sidebar, choose the Amazon EMR Git Repository icon to open the Git repository tool panel.

  3. Under Git repositories, expand the dropdown list and select a maximum of three repositories to link to the Workspace. EMR Studio registers your selection and begins linking each repository.

It might take some time for the linking process to complete. You can see the status for each repository that you selected in the Git repository tool panel. After EMR Studio links a repository to a Workspace, you should see the files that belong to that repository appear in the File browser panel.
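The three-repository limit can be checked before you make a selection, for example when you keep a list of repository names in a script. The following is a sketch only. The constant and function names are hypothetical, and the check runs locally without calling any EMR Studio API.

```python
MAX_LINKED_REPOSITORIES = 3  # limit stated in this topic


def check_selection(repository_names):
    """Return the selection without duplicates, or raise if it exceeds the limit."""
    # dict.fromkeys drops duplicates while keeping the original order.
    unique = list(dict.fromkeys(repository_names))
    if len(unique) > MAX_LINKED_REPOSITORIES:
        raise ValueError(
            f"A Workspace can link at most {MAX_LINKED_REPOSITORIES} repositories; "
            f"got {len(unique)}."
        )
    return unique
```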

To add a new Git repository to a Workspace as a resource
  1. From the Workspaces list in your Studio, open the Workspace that you want to link to a repository.

  2. In the left sidebar, choose the Amazon EMR Git Repository icon to open the Git repository tool panel.

  3. Choose Add new Git repository.

  4. For Repository name, enter a descriptive name for the repository in EMR Studio. Names may only contain alphanumeric characters, hyphens, and underscores.

  5. For Git repository URL, enter the URL for the repository. When you use a CodeCommit repository, this is the URL that is copied when you choose Clone URL and then Clone HTTPS. For example, https://git-codecommit.us-west-2.amazonaws.com/v1/repos/[MyCodeCommitRepoName].

  6. For Branch, enter the name of an existing branch that you want to check out.

  7. For Git credentials, choose an option according to the following guidelines. EMR Studio accesses your Git credentials using secrets stored in Secrets Manager.

    Note

    If you use a GitHub repository, we recommend that you use a personal access token (PAT) to authenticate. Since August 13, 2021, GitHub requires token-based authentication and no longer accepts passwords when authenticating Git operations. For more information, see the Token authentication requirements for Git operations post in The GitHub Blog.

    Option
        Description

    Create a new secret
        Choose this option to associate existing Git credentials with a new secret that will be created in Amazon Secrets Manager for you. Do one of the following based on the Git credentials that you use for the repository.

        If you use a Git user name and password to access the repository, select Username and password, enter the Secret name to use in Secrets Manager, and then enter the Username and Password to associate with the secret.

        –OR–

        If you use a personal access token to access the repository, select Personal access token (PAT), enter the Secret name to use in Secrets Manager, and then enter your personal access token. For more information, see Creating a personal access token for the command line for GitHub and Personal access tokens for Bitbucket. CodeCommit repositories do not support this option.

    Use a public repository without credentials
        Choose this option to access a public repository.

    Use an existing Amazon secret
        Choose this option if you already saved your credentials as a secret in Secrets Manager, and then select the secret name from the list.

        If you select a secret associated with a Git user name and password, the secret must be in the format {"gitUsername": "MyUserName", "gitPassword": "MyPassword"}.

  8. Choose Add repository to create the new repository. After EMR Studio creates the new repository, you will see a success message. The new repository appears in the dropdown list under Git repositories.

  9. To link the new repository to your Workspace, choose it from the dropdown list under Git repositories.

It might take some time for the linking process to complete. After EMR Studio links the new repository to the Workspace, you should see a new folder with the same name as your repository appear in the File browser panel.
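The naming rule in step 4 and the CodeCommit URL pattern in step 5 can be checked before you fill in the form. The following is a minimal sketch. It assumes that "alphanumeric" means ASCII letters and digits, and it follows the URL pattern of the example in step 5. Both function names are hypothetical.

```python
import re

# Assumption: ASCII letters, digits, hyphens, and underscores only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_repository_name(name: str) -> bool:
    """Return True if the whole name uses only the permitted characters."""
    return NAME_PATTERN.fullmatch(name) is not None


def codecommit_https_url(region: str, repo_name: str) -> str:
    """Build a Clone HTTPS URL that follows the pattern of the example in step 5."""
    return f"https://git-codecommit.{region}.amazonaws.com/v1/repos/{repo_name}"


if __name__ == "__main__":
    print(is_valid_repository_name("my-repo_1"))
    print(codecommit_https_url("us-west-2", "MyCodeCommitRepoName"))
```

The URL that you copy from the CodeCommit console remains the authoritative value. The helper only illustrates its structure.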

To open a different linked repository, navigate to its folder in the File browser.
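If you plan to use an existing secret for a Git user name and password, the following sketch shows one way to produce and check a secret string in the format described under Use an existing Amazon secret. It uses only the Python standard library and does not call Secrets Manager. The function names are hypothetical, and the check that both values are non-empty strings is an assumption rather than a documented requirement.

```python
import json

REQUIRED_KEYS = ("gitUsername", "gitPassword")


def build_git_secret(username: str, password: str) -> str:
    """Return the secret string with the two keys that this topic describes."""
    return json.dumps({"gitUsername": username, "gitPassword": password})


def is_valid_git_secret(secret_string: str) -> bool:
    """Return True if the string is a JSON object with both keys set to non-empty strings."""
    try:
        data = json.loads(secret_string)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return all(isinstance(data.get(key), str) and data[key] != "" for key in REQUIRED_KEYS)


if __name__ == "__main__":
    secret = build_git_secret("MyUserName", "MyPassword")
    print(secret)
    print(is_valid_git_secret(secret))
```

Avoid printing or logging real credentials. The print calls here use the placeholder values from this topic.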