Teradata Vantage connections
You can use Amazon Glue for Spark to read from and write to existing tables in Teradata Vantage in Amazon Glue 4.0 and later versions. You can define what to read from Teradata with a SQL query. You can connect to Teradata using username and password credentials stored in Amazon Secrets Manager through an Amazon Glue connection.
For more information about Teradata, consult the Teradata documentation.
Configuring Teradata connections
To connect to Teradata from Amazon Glue, you will need to create and store your Teradata credentials in an Amazon Secrets Manager secret, then associate that secret with an Amazon Glue Teradata connection. If your Teradata instance is in an Amazon VPC, you will also need to provide networking options to your Amazon Glue Teradata connection.
To connect to Teradata from Amazon Glue, you may need some prerequisites:
- If you are accessing your Teradata environment through Amazon VPC, configure Amazon VPC to allow your Amazon Glue job to communicate with the Teradata environment. We discourage accessing the Teradata environment over the public internet.
  In Amazon VPC, identify or create a VPC, subnet, and security group that Amazon Glue will use while executing the job. You also need to ensure that Amazon VPC is configured to permit network traffic between your Teradata instance and this location. Your job will need to establish a TCP connection with your Teradata client port. For more information about Teradata ports, see the Teradata documentation.
  Based on your network layout, secure VPC connectivity may require changes in Amazon VPC and other networking services. For more information about Amazon connectivity, consult Amazon Connectivity Options in the Teradata documentation.
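As a rough way to confirm the TCP requirement above, the following sketch attempts a connection to the Teradata client port from a host in the same VPC and subnet. The function name `can_reach` is hypothetical, and the default port of 1025 is an assumption about your Teradata environment; use the `DBS_PORT` value configured for your instance if it differs.

```python
import socket


def can_reach(host: str, port: int = 1025, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    The default port (1025) is an assumption; pass your own DBS_PORT value.
    """
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_reach("teradataHostname")` returning False from inside the VPC suggests that a security group, route, or firewall rule is still blocking traffic. It does not validate credentials or the JDBC driver.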
To configure an Amazon Glue Teradata connection:
- In your Teradata configuration, identify or create a user and password that Amazon Glue will connect with, teradataUsername and teradataPassword. For more information, consult Vantage Security Overview in the Teradata documentation.
- In Amazon Secrets Manager, create a secret using your Teradata credentials. To create a secret in Secrets Manager, follow the tutorial available in Create an Amazon Secrets Manager secret in the Amazon Secrets Manager documentation. After creating the secret, keep the secret name, secretName, for the next step.
  - When selecting Key/value pairs, create a pair for the key user with the value teradataUsername.
  - When selecting Key/value pairs, create a pair for the key password with the value teradataPassword.
- In the Amazon Glue console, create a connection by following the steps in Adding an Amazon Glue connection. After creating the connection, keep the connection name, connectionName, for the next step.
  - When selecting a Connection type, select Teradata.
  - When providing JDBC URL, provide the URL for your instance. You can also hardcode certain comma-separated connection parameters in your JDBC URL. The URL must conform to the following format:
    jdbc:teradata://teradataHostname/ParameterName=ParameterValue,ParameterName=ParameterValue
    Supported URL parameters include:
    - DATABASE – name of the database on the host to access by default.
    - DBS_PORT – the database port, used when running on a nonstandard port.
  - When selecting a Credential type, select Amazon Secrets Manager, then set Amazon Secret to secretName.
- In the following situations, you may require additional configuration:
  - For Teradata instances hosted on Amazon in an Amazon VPC:
    - You will need to provide Amazon VPC connection information to the Amazon Glue connection that defines your Teradata security credentials. When creating or updating your connection, set VPC, Subnet, and Security groups in Network options.
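The two values that the steps above ask you to assemble by hand, the secret's key/value pairs and the JDBC URL, can be sketched in Python as follows. The helper names `build_secret_string` and `build_jdbc_url` are hypothetical and are not part of any Amazon or Teradata library. The sketch only builds strings, so you still create the secret and the connection through the console or the API of your choice.

```python
import json


def build_secret_string(username: str, password: str) -> str:
    """Serialize credentials under the `user` and `password` keys used in the steps above."""
    return json.dumps({"user": username, "password": password})


def build_jdbc_url(hostname: str, **params: str) -> str:
    """Build jdbc:teradata://hostname/Name=Value,Name=Value from keyword parameters."""
    url = f"jdbc:teradata://{hostname}"
    if params:
        # Parameters are comma separated and appended after a single slash.
        url += "/" + ",".join(f"{name}={value}" for name, value in params.items())
    return url
```

For example, `build_jdbc_url("teradataHostname", DATABASE="sales", DBS_PORT="1025")` produces `jdbc:teradata://teradataHostname/DATABASE=sales,DBS_PORT=1025`, where `sales` and `1025` are placeholder values.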
After creating an Amazon Glue Teradata connection, you will need to perform the following steps before calling your connection method:
- Grant the IAM role associated with your Amazon Glue job permission to read secretName.
- In your Amazon Glue job configuration, provide connectionName as an Additional network connection.
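One possible shape for the permission in the first step is an IAM policy statement that allows `secretsmanager:GetSecretValue` on the secret. The sketch below builds that policy document as JSON. The function name `secret_read_policy` is hypothetical, and the assumption that this single action is sufficient may not hold in your account (for example, a secret encrypted with a customer managed key also needs key permissions).

```python
import json


def secret_read_policy(secret_arn: str) -> str:
    """Return an IAM policy document (JSON) allowing reads of one secret.

    Assumes secretsmanager:GetSecretValue is the only action the job role needs.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                # Scope the statement to the single secret, not "*".
                "Resource": secret_arn,
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

Pass the full ARN of secretName and attach the resulting document to the role associated with your Amazon Glue job.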
Reading from Teradata
Prerequisites:
- A Teradata table you would like to read from. You will need the table name, tableName.
- An Amazon Glue Teradata connection configured to provide auth information. Complete the steps in the previous section, Configuring Teradata connections, to configure your auth information. You will need the name of the Amazon Glue connection, connectionName.
For example:
    # Replace connectionName and tableName with your own values.
    teradata_read_table = glueContext.create_dynamic_frame.from_options(
        connection_type="teradata",
        connection_options={
            "connectionName": "connectionName",
            "dbtable": "tableName"
        }
    )
You can also provide a SELECT SQL query to filter the results returned to your DynamicFrame. To do so, set the query option.
For example:
    # Replace connectionName and query with your own values.
    teradata_read_query = glueContext.create_dynamic_frame.from_options(
        connection_type="teradata",
        connection_options={
            "connectionName": "connectionName",
            "query": "query"
        }
    )
Writing to Teradata tables
Prerequisites: A Teradata table you would like to write to, tableName. You must create the table before calling the connection method.
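For example, where dynamicFrame is the DynamicFrame you want to write:
    teradata_write = glueContext.write_dynamic_frame.from_options(
        frame=dynamicFrame,  # required: the DynamicFrame to write
        connection_type="teradata",
        connection_options={
            "connectionName": "connectionName",
            "dbtable": "tableName"
        }
    )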
Teradata connection option reference
- connectionName – Required. Used for Read/Write. The name of an Amazon Glue Teradata connection configured to provide auth and networking information to your connection method.
- dbtable – Required for writing, required for reading unless query is provided. Used for Read/Write. The name of a table your connection method will interact with.
- query – Used for Read. A SELECT SQL query defining what should be retrieved when reading from Teradata.
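To show how the options in this reference fit into a job script, here is a minimal sketch that validates the options and then reads with a query and writes to an existing table. The names `teradata_options` and `run_job` are hypothetical helpers, not part of the Amazon Glue API. The validation rules are a reading of the reference above, and this section does not specify the behavior when both dbtable and query are supplied for a read, so the sketch passes both through unchanged.

```python
import sys
from typing import Optional


def teradata_options(
    connection_name: str,
    dbtable: Optional[str] = None,
    query: Optional[str] = None,
    for_write: bool = False,
) -> dict:
    """Build connection_options following the option reference above."""
    if not connection_name:
        raise ValueError("connectionName is required")
    if for_write:
        if not dbtable:
            raise ValueError("dbtable is required for writing")
        if query:
            raise ValueError("query is used for reads only")
    elif not dbtable and not query:
        raise ValueError("reading requires dbtable or query")

    options = {"connectionName": connection_name}
    if dbtable:
        options["dbtable"] = dbtable
    if query:
        options["query"] = query
    return options


def run_job(connection_name: str, source_query: str, target_table: str) -> None:
    """Read from Teradata with a query and write the result to an existing table."""
    # Imported lazily: these modules are only available in the Glue runtime.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    frame = glue_context.create_dynamic_frame.from_options(
        connection_type="teradata",
        connection_options=teradata_options(connection_name, query=source_query),
    )
    glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="teradata",
        connection_options=teradata_options(
            connection_name, dbtable=target_table, for_write=True
        ),
    )
    job.commit()
```

In a job script you would call `run_job("connectionName", "query", "tableName")` at the end of the file. Checking the options before calling the connection method lets a missing dbtable or query fail before any Spark work starts.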