

# Connecting to MongoDB in Amazon Glue Studio

 Amazon Glue provides built-in support for MongoDB. Amazon Glue Studio provides a visual interface to connect to MongoDB, author data integration jobs, and run them on the Amazon Glue Studio serverless Spark runtime. 

**Topics**
+ [Creating a MongoDB connection](creating-mongodb-connection.md)
+ [Creating a MongoDB source node](creating-mongodb-source-node.md)
+ [Creating a MongoDB target node](creating-mongodb-target-node.md)
+ [Advanced options](#creating-mongodb-connection-advanced-options)

# Creating a MongoDB connection


**Prerequisites**:
+ If your MongoDB instance is in an Amazon VPC, configure Amazon VPC to allow your Amazon Glue job to communicate with the MongoDB instance without traffic traversing the public internet. 

  In Amazon VPC, identify or create a **VPC**, **Subnet** and **Security group** that Amazon Glue will use while executing the job. Additionally, you need to ensure Amazon VPC is configured to permit network traffic between your MongoDB instance and this location. Based on your network layout, this may require changes to security group rules, Network ACLs, NAT Gateways and Peering connections.

**To configure a connection to MongoDB:**

1. Optionally, in Amazon Secrets Manager, create a secret using your MongoDB credentials. To create a secret in Secrets Manager, follow the tutorial in [Create an Amazon Secrets Manager secret](https://docs.amazonaws.cn//secretsmanager/latest/userguide/create_secret.html) in the Amazon Secrets Manager documentation. After creating the secret, keep the secret name, *secretName*, for the next step.
   + When selecting **Key/value pairs**, create a pair for the key `username` with the value *mongodbUser*.
   + When selecting **Key/value pairs**, create a pair for the key `password` with the value *mongodbPass*.

1. In the Amazon Glue console, create a connection by following the steps in [Adding an Amazon Glue connection](console-connections.md). After creating the connection, keep the connection name, *connectionName*, for future use in Amazon Glue. 
   + When selecting a **Connection type**, select **MongoDB** or **MongoDB Atlas**.
   + When selecting **MongoDB URL** or **MongoDB Atlas URL**, provide the URL of your MongoDB instance.

     A MongoDB URL is provided in the format `mongodb://mongoHost:mongoPort/mongoDBname`.

     A MongoDB Atlas URL is provided in the format `mongodb+srv://mongoHost/mongoDBname`.
   + If you chose to create a Secrets Manager secret, choose the Amazon Secrets Manager **Credential type**.

     Then, in **Amazon Secret** provide *secretName*.
   + If you choose to provide **Username and password**, provide *mongodbUser* and *mongodbPass*.

1. In the following situations, you may require additional configuration:
   + For MongoDB instances hosted on Amazon in an Amazon VPC:
     + You will need to provide Amazon VPC connection information to the Amazon Glue connection that defines your MongoDB security credentials. When creating or updating your connection, set **VPC**, **Subnet** and **Security groups** in **Network options**.
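
The URL formats and secret key/value pairs described in the steps above can be sketched in Python. This is a minimal illustration, not part of Amazon Glue: the helper names `mongodb_url`, `mongodb_atlas_url`, and `secret_key_value_pairs` are hypothetical, and the host, port, and database values are placeholders.

```python
import json


def mongodb_url(host: str, port: int, database: str) -> str:
    """Build a URL in the mongodb://mongoHost:mongoPort/mongoDBname format."""
    return f"mongodb://{host}:{port}/{database}"


def mongodb_atlas_url(host: str, database: str) -> str:
    """Build a URL in the mongodb+srv://mongoHost/mongoDBname format."""
    return f"mongodb+srv://{host}/{database}"


def secret_key_value_pairs(username: str, password: str) -> str:
    """Return JSON holding the `username` and `password` keys used by the secret."""
    return json.dumps({"username": username, "password": password})


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(mongodb_url("mongo.example.internal", 27017, "sales"))
    print(mongodb_atlas_url("cluster0.example.mongodb.net", "sales"))
```

The resulting strings are what you would paste into the **MongoDB URL** or **MongoDB Atlas URL** field, and the JSON mirrors the two key/value pairs stored in the secret.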

After creating an Amazon Glue MongoDB connection, you will need to perform the following steps before running your Amazon Glue job:
+ When working with Amazon Glue jobs in the visual editor, you must provide Amazon VPC connection information for your job to connect to MongoDB. Identify a suitable location in Amazon VPC and provide it to your Amazon Glue MongoDB connection.
+ If you chose to create a Secrets Manager secret, grant the IAM role associated with your Amazon Glue job permission to read *secretName*.
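
As an illustration of the permission described above, the following sketch builds an IAM policy document that allows reading one secret. It assumes that `secretsmanager:GetSecretValue` is sufficient for your setup, which may not hold in every configuration. The helper name `secret_read_policy` is hypothetical, and the ARN is a placeholder to replace with the ARN of *secretName*.

```python
import json


def secret_read_policy(secret_arn: str) -> dict:
    """Return an IAM policy document that allows reading a single secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: reading the secret value is all the job needs.
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN; substitute the ARN of your own secret.
    arn = "arn:aws-cn:secretsmanager:cn-north-1:111122223333:secret:secretName-AbCdEf"
    print(json.dumps(secret_read_policy(arn), indent=2))
```

You could attach the printed JSON as an inline policy on the IAM role that your Amazon Glue job uses.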

# Creating a MongoDB source node


## Prerequisites needed

+ An Amazon Glue MongoDB connection, as described in the previous section, [Creating a MongoDB connection](creating-mongodb-connection.md).
+ If you chose to create a Secrets Manager secret, appropriate permissions on your job to read the secret used by the connection.
+ A MongoDB collection you would like to read from. You will need identification information for the collection.

  A MongoDB collection is identified by a database name and a collection name, *mongodbName*, *mongodbCollection*.

## Adding a MongoDB data source


**To add a **Data source – MongoDB** node:**

1. Choose the connection for your MongoDB data source. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create MongoDB connection**. For more information, see the previous section, [Creating a MongoDB connection](creating-mongodb-connection.md).

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Choose a **Database**. Enter *mongodbName*.

1. Choose a **Collection**. Enter *mongodbCollection*.

1. Choose your **Partitioner**, **Partition size (MB)** and **Partition key**. For more information about partition parameters, see ["connectionType": "mongodb" as source](aws-glue-programming-etl-connect-mongodb-home.md#etl-connect-mongodb-as-source).

1.  In **Custom MongoDB properties**, enter parameters and values as needed. 
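
For comparison, the source node settings above roughly correspond to connection options in an Amazon Glue for Spark script. The sketch below only assembles the options dictionary. The key names (`connectionName`, `database`, `collection`, `partitioner`, `partitionerOptions.partitionSizeMB`, `partitionerOptions.partitionKey`) are assumptions to verify against the linked connection option reference, the helper `mongodb_source_options` is hypothetical, and the example values are placeholders.

```python
from typing import Dict, Optional


def mongodb_source_options(
    connection_name: str,
    database: str,
    collection: str,
    partitioner: Optional[str] = None,
    partition_size_mb: Optional[int] = None,
    partition_key: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble read options mirroring the source node fields."""
    options = {
        "connectionName": connection_name,
        "database": database,
        "collection": collection,
    }
    # Partition settings are optional; include only the ones provided.
    if partitioner is not None:
        options["partitioner"] = partitioner
    if partition_size_mb is not None:
        options["partitionerOptions.partitionSizeMB"] = str(partition_size_mb)
    if partition_key is not None:
        options["partitionerOptions.partitionKey"] = partition_key
    return options


if __name__ == "__main__":
    opts = mongodb_source_options(
        "connectionName", "mongodbName", "mongodbCollection",
        partitioner="MongoSamplePartitioner", partition_size_mb=10, partition_key="_id",
    )
    print(opts)
    # Inside a Glue job, where `glueContext` is a GlueContext, the dictionary
    # would be passed along these lines:
    # dyf = glueContext.create_dynamic_frame.from_options(
    #     connection_type="mongodb", connection_options=opts)
```

Keeping the partition settings optional reflects that the defaults are often adequate. Which partitioner names are valid depends on your Amazon Glue version.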

# Creating a MongoDB target node


## Prerequisites needed

+ An Amazon Glue MongoDB connection, configured with an Amazon Secrets Manager secret, as described in the previous section, [Creating a MongoDB connection](creating-mongodb-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ A MongoDB collection you would like to write to, identified by a database name and a collection name, *mongodbName*, *mongodbCollection*.

## Adding a MongoDB data target


**To add a **Data target – MongoDB** node:**

1. Choose the connection for your MongoDB data target. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create MongoDB connection**. For more information, see the previous section, [Creating a MongoDB connection](creating-mongodb-connection.md).

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Choose a **Database**. Enter *mongodbName*.

1. Choose a **Collection**. Enter *mongodbCollection*.

1. Choose your **Partitioner**, **Partition size (MB)** and **Partition key**. For more information about partition parameters, see ["connectionType": "mongodb" as source](aws-glue-programming-etl-connect-mongodb-home.md#etl-connect-mongodb-as-source).

1. Choose **Retry Writes** if desired.

1.  In **Custom MongoDB properties**, enter parameters and values as needed. 
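
A similar sketch applies to the target node. It assembles write options including the **Retry Writes** choice. The `retryWrites` key and its string value are assumptions to check against the connection option reference. The helper `mongodb_target_options` is hypothetical, and the values shown are placeholders.

```python
from typing import Dict


def mongodb_target_options(
    connection_name: str,
    database: str,
    collection: str,
    retry_writes: bool = False,
) -> Dict[str, str]:
    """Assemble write options mirroring the target node fields."""
    return {
        "connectionName": connection_name,
        "database": database,
        "collection": collection,
        # Assumption: the option takes the strings "true" or "false".
        "retryWrites": "true" if retry_writes else "false",
    }


if __name__ == "__main__":
    opts = mongodb_target_options(
        "connectionName", "mongodbName", "mongodbCollection", retry_writes=True
    )
    print(opts)
    # Inside a Glue job, where `glueContext` is a GlueContext and `dyf` is a
    # DynamicFrame, the dictionary would be passed along these lines:
    # glueContext.write_dynamic_frame.from_options(
    #     frame=dyf, connection_type="mongodb", connection_options=opts)
```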

## Advanced options


You can provide advanced options when creating a MongoDB node. These options are the same as those available when programming Amazon Glue for Spark scripts.

See [MongoDB connection option reference](aws-glue-programming-etl-connect-mongodb-home.md#aws-glue-programming-etl-connect-mongodb). 