Azure SQL connections
You can use Amazon Glue for Spark to read from and write to tables on Azure SQL Managed Instances in Amazon Glue 4.0 and later versions. You can define what to read from Azure SQL with a SQL query. You connect to Azure SQL using user and password credentials stored in Amazon Secrets Manager through an Amazon Glue connection.
For more information about Azure SQL, consult the Azure SQL documentation.
Configuring Azure SQL connections
To connect to Azure SQL from Amazon Glue, create and store your Azure SQL credentials in an Amazon Secrets Manager secret, then associate that secret with an Azure SQL Amazon Glue connection.
To configure a connection to Azure SQL:
- In Amazon Secrets Manager, create a secret using your Azure SQL credentials. To create a secret in Secrets Manager, follow the tutorial available in Create an Amazon Secrets Manager secret in the Amazon Secrets Manager documentation. After creating the secret, keep the secret name, secretName, for the next step.
  - When selecting Key/value pairs, create a pair for the key user with the value azuresqlUsername.
  - When selecting Key/value pairs, create a pair for the key password with the value azuresqlPassword.
- In the Amazon Glue console, create a connection by following the steps in Adding an Amazon Glue connection. After creating the connection, keep the connection name, connectionName, for future use in Amazon Glue.
  - When selecting a Connection type, select Azure SQL.
  - When providing Azure SQL URL, provide a JDBC endpoint URL. The URL must be in the following format: jdbc:sqlserver://databaseServerName:databasePort;databaseName=azuresqlDBname;
    Amazon Glue requires the following URL properties:
    databaseName: A default database in Azure SQL to connect to.
    For more information about JDBC URLs for Azure SQL Managed Instances, see the Microsoft documentation.
  - When selecting an Amazon Secret, provide secretName.
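The two inputs this procedure asks for, the secret's key/value pairs and the JDBC endpoint URL, can be sketched in Python. This is a minimal illustration rather than part of Amazon Glue: the helper names build_secret_string, build_jdbc_url, and create_secret are hypothetical, and create_secret assumes that boto3 is installed and that credentials with permission to create secrets are configured.

```python
import json


def build_secret_string(username: str, password: str) -> str:
    """Serialize the two key/value pairs the connection expects: user and password."""
    return json.dumps({"user": username, "password": password})


def build_jdbc_url(server: str, port: int, database: str) -> str:
    """Format jdbc:sqlserver://databaseServerName:databasePort;databaseName=azuresqlDBname;"""
    return f"jdbc:sqlserver://{server}:{port};databaseName={database};"


def create_secret(secret_name: str, username: str, password: str) -> str:
    """Store the credentials in Secrets Manager and return the new secret's ARN.

    Assumption: boto3 is installed and AWS credentials are configured.
    """
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("secretsmanager")
    response = client.create_secret(
        Name=secret_name,
        SecretString=build_secret_string(username, password),
    )
    return response["ARN"]
```

The server name, port, and database passed to build_jdbc_url come from your Azure SQL Managed Instance; the sketch does not validate them.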
After creating an Amazon Glue Azure SQL connection, perform the following steps before running your Amazon Glue job:
- Grant the IAM role associated with your Amazon Glue job permission to read secretName.
- In your Amazon Glue job configuration, provide connectionName as an Additional network connection.
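As an illustration of the first step, the following sketch builds an IAM policy document that allows reading a single secret. It assumes that the secretsmanager:GetSecretValue action is what the job role needs; your setup may require more, for example access to a customer managed encryption key. The function name secret_read_policy and the ARN shown are placeholders.

```python
import json


def secret_read_policy(secret_arn: str) -> dict:
    """Build an IAM policy document that allows reading one secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: reading the secret value is the permission required.
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN; substitute the ARN of the secret named secretName.
    placeholder_arn = "arn:aws:secretsmanager:REGION:ACCOUNT_ID:secret:secretName"
    print(json.dumps(secret_read_policy(placeholder_arn), indent=2))
```

Attach the resulting document to the IAM role associated with your job, using whichever tool you normally use to manage IAM policies.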
Reading from Azure SQL tables
Prerequisites:
- An Azure SQL table you would like to read from. You will need identification information for the table: databaseName and tableIdentifier.
  An Azure SQL table is identified by its database, schema, and table name. You must provide the database name and table name when connecting to Azure SQL. You must also provide the schema if it is not the default, "public". The database is provided through a URL property in connectionName; the schema and table name are provided through dbtable.
- An Amazon Glue Azure SQL connection configured to provide auth information. Complete the steps in the previous procedure, To configure a connection to Azure SQL, to configure your auth information. You will need the name of the Amazon Glue connection, connectionName.
For example:
azuresql_read_table = glueContext.create_dynamic_frame.from_options(
    connection_type="azuresql",
    connection_options={
        "connectionName": "connectionName",
        "dbtable": "tableIdentifier"
    }
)
You can also provide a SELECT SQL query to filter the results returned to your DynamicFrame. To do so, set the query option in place of dbtable.
For example:
azuresql_read_query = glueContext.create_dynamic_frame.from_options(
    connection_type="azuresql",
    connection_options={
        "connectionName": "connectionName",
        "query": "query"
    }
)
Writing to Azure SQL tables
This example writes information from an existing DynamicFrame, dynamicFrame, to Azure SQL. If the table already has information, Amazon Glue will append data from your DynamicFrame.
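The write described above can be wrapped in a small helper. The following is a minimal sketch, not an Amazon Glue API: write_azuresql and table_identifier are hypothetical names, glue_context is assumed to be a GlueContext from a running Amazon Glue job, and dynamic_frame is assumed to be an existing DynamicFrame. The schema, when given, is joined to the table name with a period.

```python
from typing import Optional


def table_identifier(table: str, schema: Optional[str] = None) -> str:
    """Return tableName, or schemaName.tableName when a schema is given."""
    return f"{schema}.{table}" if schema else table


def write_azuresql(glue_context, dynamic_frame, connection_name: str,
                   table: str, schema: Optional[str] = None):
    """Write dynamic_frame to the target table through the named Glue connection.

    Assumption: glue_context is an awsglue GlueContext; nothing here checks that.
    """
    return glue_context.write_dynamic_frame.from_options(
        frame=dynamic_frame,
        connection_type="azuresql",
        connection_options={
            "connectionName": connection_name,
            "dbtable": table_identifier(table, schema),
        },
    )
```

Inside a job you would call it as write_azuresql(glueContext, dynamicFrame, "connectionName", "tableName", schema="schemaName").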
Prerequisites:
- An Azure SQL table you would like to write to. You will need identification information for the table: databaseName and tableIdentifier.
  An Azure SQL table is identified by its database, schema, and table name. You must provide the database name and table name when connecting to Azure SQL. You must also provide the schema if it is not the default, "public". The database is provided through a URL property in connectionName; the schema and table name are provided through dbtable.
- Azure SQL auth information. Complete the steps in the previous procedure, To configure a connection to Azure SQL, to configure your auth information. You will need the name of the Amazon Glue connection, connectionName.
For example:
azuresql_write = glueContext.write_dynamic_frame.from_options(
    frame=dynamicFrame,
    connection_type="azuresql",
    connection_options={
        "connectionName": "connectionName",
        "dbtable": "tableIdentifier"
    }
)
Azure SQL connection option reference
- connectionName: Required. Used for Read/Write. The name of an Amazon Glue Azure SQL connection configured to provide auth information to your connection method.
- databaseName: Used for Read/Write. Valid Values: Azure SQL database names. The name of the database in Azure SQL to connect to.
- dbtable: Required for writing, required for reading unless query is provided. Used for Read/Write. Valid Values: Names of Azure SQL tables, or period separated schema/table name combinations. Used to specify the table and schema that identify the table to connect to. The default schema is "public". If your table is in a non-default schema, provide this information in the form schemaName.tableName.
- query: Used for Read. A Transact-SQL SELECT query defining what should be retrieved when reading from Azure SQL. For more information, see the Microsoft documentation.
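The rule that dbtable is required for reading unless query is provided can be checked before a job calls Amazon Glue. The following is a minimal sketch with hypothetical helper names (azuresql_read_options, read_azuresql); glue_context is assumed to be a GlueContext from a running job. Rejecting a call that supplies both dbtable and query is a conservative choice made for this sketch, not a documented rule.

```python
from typing import Optional


def azuresql_read_options(connection_name: str,
                          dbtable: Optional[str] = None,
                          query: Optional[str] = None) -> dict:
    """Build connection_options for a read: connectionName plus dbtable or query."""
    if dbtable is None and query is None:
        raise ValueError("provide either dbtable or query")
    if dbtable is not None and query is not None:
        # Assumption of this sketch: treat the two options as alternatives.
        raise ValueError("provide only one of dbtable or query")
    options = {"connectionName": connection_name}
    if dbtable is not None:
        options["dbtable"] = dbtable
    else:
        options["query"] = query
    return options


def read_azuresql(glue_context, connection_name: str,
                  dbtable: Optional[str] = None, query: Optional[str] = None):
    """Read a DynamicFrame through the named Glue connection."""
    return glue_context.create_dynamic_frame.from_options(
        connection_type="azuresql",
        connection_options=azuresql_read_options(connection_name, dbtable, query),
    )
```

Validating the options up front surfaces a missing table or query as a ValueError in your script instead of as a failure after the job has started.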