Azure SQL connections

You can use Amazon Glue for Spark to read from and write to tables on Azure SQL Managed Instances in Amazon Glue 4.0 and later versions. You can define what to read from Azure SQL with a SQL query. You connect to Azure SQL using user and password credentials stored in Amazon Secrets Manager through an Amazon Glue connection.

For more information about Azure SQL, consult the Azure SQL documentation.

Configuring Azure SQL connections

To connect to Azure SQL from Amazon Glue, you will need to create and store your Azure SQL credentials in an Amazon Secrets Manager secret, then associate that secret with an Azure SQL Amazon Glue connection.

To configure a connection to Azure SQL:
  1. In Amazon Secrets Manager, create a secret using your Azure SQL credentials. To create a secret in Secrets Manager, follow the tutorial available in Create an Amazon Secrets Manager secret in the Amazon Secrets Manager documentation. After creating the secret, keep the secret name, secretName, for the next step.

    • When selecting Key/value pairs, create a pair for the key user with the value azuresqlUsername.

    • When selecting Key/value pairs, create a pair for the key password with the value azuresqlPassword.

  2. In the Amazon Glue console, create a connection by following the steps in Adding an Amazon Glue connection. After creating the connection, keep the connection name, connectionName, for future use in Amazon Glue.

    • When selecting a Connection type, select Azure SQL.

    • When providing Azure SQL URL, provide a JDBC endpoint URL.

      The URL must be in the following format: jdbc:sqlserver://databaseServerName:databasePort;databaseName=azuresqlDBname;.

      Amazon Glue requires the following URL properties:

      • databaseName – A default database in Azure SQL to connect to.

      For more information about JDBC URLs for Azure SQL Managed Instances, see the Microsoft documentation.

    • When selecting an Amazon Secret, provide secretName.
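The two values prepared in this procedure, the secret's key/value pairs and the JDBC endpoint URL, can be sketched in Python as follows. This is a minimal illustration only: the helper names `build_secret_string` and `build_azuresql_jdbc_url`, the host name, and the port are hypothetical and are not part of Amazon Glue or Secrets Manager. It assumes the secret is stored as a JSON object of key/value pairs.

```python
import json


def build_secret_string(username: str, password: str) -> str:
    """Return a JSON payload with the 'user' and 'password' keys from step 1."""
    return json.dumps({"user": username, "password": password})


def build_azuresql_jdbc_url(server: str, port: int, database: str) -> str:
    """Format the URL as jdbc:sqlserver://server:port;databaseName=database;"""
    if not database:
        # databaseName is the URL property this page lists as required.
        raise ValueError("databaseName is required")
    return f"jdbc:sqlserver://{server}:{port};databaseName={database};"


# Hypothetical values, for illustration only.
secret_string = build_secret_string("azuresqlUsername", "azuresqlPassword")
jdbc_url = build_azuresql_jdbc_url(
    "myinstance.example.database.windows.net", 1433, "azuresqlDBname"
)
print(jdbc_url)
```

The resulting URL string is what you would paste into the Azure SQL URL field, and the key/value pairs are what you would enter when creating the secret.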

After creating an Amazon Glue Azure SQL connection, you will need to perform the following steps before running your Amazon Glue job:

  • Grant the IAM role associated with your Amazon Glue job permission to read secretName.

  • In your Amazon Glue job configuration, provide connectionName as an Additional network connection.
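As a rough sketch of the first step, the following Python builds an IAM policy document that allows reading a single secret. The function name `secret_read_policy` and the ARN are hypothetical. It assumes that `secretsmanager:GetSecretValue` is the permission your job role needs; your setup may require additional actions.

```python
import json


def secret_read_policy(secret_arn: str) -> dict:
    """Build an IAM policy document that allows reading one secret's value."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: reading the secret value is sufficient for the job.
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            }
        ],
    }


# Hypothetical ARN; the China Regions use the aws-cn partition.
policy = secret_read_policy(
    "arn:aws-cn:secretsmanager:cn-north-1:111122223333:secret:secretName-AbCdEf"
)
print(json.dumps(policy, indent=2))
```

You would attach a policy like this to the IAM role associated with your Amazon Glue job.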

Reading from Azure SQL tables

Prerequisites:

  • An Azure SQL table you would like to read from. You will need identification information for the table: databaseName and tableIdentifier.

    An Azure SQL table is identified by its database, schema, and table name. You must provide the database name and table name when connecting to Azure SQL. You must also provide the schema if it is not the default, "public". The database is provided through a URL property in connectionName; the schema and table name are provided through dbtable.

  • An Amazon Glue Azure SQL connection configured to provide auth information. To configure your auth information, complete the steps in the previous procedure, To configure a connection to Azure SQL. You will need the name of the Amazon Glue connection, connectionName.

For example:

azuresql_read_table = glueContext.create_dynamic_frame.from_options(
    connection_type="azuresql",
    connection_options={
        "connectionName": "connectionName",
        "dbtable": "tableIdentifier"
    }
)

You can also provide a SQL SELECT query to filter the results returned to your DynamicFrame. To do this, configure the query option.

For example:

azuresql_read_query = glueContext.create_dynamic_frame.from_options(
    connection_type="azuresql",
    connection_options={
        "connectionName": "connectionName",
        "query": "query"
    }
)
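To make the query placeholder more concrete, the following sketch wraps the same call in a small helper and passes a Transact-SQL SELECT statement. The helper name `read_with_query`, the table `sales.orders`, and its columns are hypothetical. The helper takes the GlueContext as an argument, so it assumes you call it from inside a job where glueContext already exists.

```python
def read_with_query(glue_context, connection_name: str, query: str):
    """Read from Azure SQL using a SELECT query instead of dbtable."""
    return glue_context.create_dynamic_frame.from_options(
        connection_type="azuresql",
        connection_options={
            "connectionName": connection_name,
            "query": query,
        },
    )


# Hypothetical table and columns, for illustration only.
ORDERS_QUERY = (
    "SELECT order_id, customer_id, total "
    "FROM sales.orders "
    "WHERE total > 100"
)

# Inside a Glue job you might call:
#   large_orders = read_with_query(glueContext, "connectionName", ORDERS_QUERY)
```

Keeping the query in a named constant makes it easier to review the SQL separately from the connection options.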

Writing to Azure SQL tables

This example writes information from an existing DynamicFrame, dynamicFrame, to Azure SQL. If the table already contains data, Amazon Glue appends the data from your DynamicFrame.

Prerequisites:

  • An Azure SQL table you would like to write to. You will need identification information for the table: databaseName and tableIdentifier.

    An Azure SQL table is identified by its database, schema, and table name. You must provide the database name and table name when connecting to Azure SQL. You must also provide the schema if it is not the default, "public". The database is provided through a URL property in connectionName; the schema and table name are provided through dbtable.

  • Azure SQL auth information. To configure your auth information, complete the steps in the previous procedure, To configure a connection to Azure SQL. You will need the name of the Amazon Glue connection, connectionName.

For example:

azuresql_write = glueContext.write_dynamic_frame.from_options(
    frame=dynamicFrame,
    connection_type="azuresql",
    connection_options={
        "connectionName": "connectionName",
        "dbtable": "tableIdentifier"
    }
)
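When the target table is in a non-default schema, dbtable takes the form schemaName.tableName. The following sketch shows one way to build that identifier and pass it to the write call. The helper names `qualified_table` and `append_to_azuresql` are hypothetical, and the code assumes it is called from a job that already has a GlueContext and a DynamicFrame.

```python
from typing import Optional


def qualified_table(table: str, schema: Optional[str] = None) -> str:
    """Return schemaName.tableName, or just the table name for the default schema."""
    return f"{schema}.{table}" if schema else table


def append_to_azuresql(glue_context, frame, connection_name: str,
                       table: str, schema: Optional[str] = None):
    """Write a DynamicFrame to an Azure SQL table through a Glue connection."""
    return glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="azuresql",
        connection_options={
            "connectionName": connection_name,
            "dbtable": qualified_table(table, schema),
        },
    )


# Inside a Glue job you might call:
#   append_to_azuresql(glueContext, dynamicFrame, "connectionName",
#                      "tableName", schema="schemaName")
```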

Azure SQL connection option reference

  • connectionName – Required. Used for Read/Write. The name of an Amazon Glue Azure SQL connection configured to provide auth information to your connection method.

  • databaseName – Used for Read/Write. Valid Values: Azure SQL database names. The name of the database in Azure SQL to connect to.

  • dbtable – Required for writing; required for reading unless query is provided. Used for Read/Write. Valid Values: Names of Azure SQL tables, or period-separated schema and table name combinations. Specifies the table, and optionally the schema, to connect to. The default schema is "public". If your table is in a non-default schema, provide this information in the form schemaName.tableName.

  • query – Used for Read. A Transact-SQL SELECT query defining what should be retrieved when reading from Azure SQL. For more information, see the Microsoft documentation.
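The requirements in this reference can be expressed as a small pre-flight check that runs before a job calls Amazon Glue. This is an illustrative sketch, not part of the Amazon Glue API: the function name `check_azuresql_options` is hypothetical, and treating query on a write as a problem is an assumption based on query being listed as used for Read only.

```python
def check_azuresql_options(options: dict, mode: str) -> list:
    """Return a list of problems; an empty list means the options look complete."""
    if mode not in ("read", "write"):
        raise ValueError("mode must be 'read' or 'write'")

    problems = []
    if not options.get("connectionName"):
        problems.append("connectionName is required")

    if mode == "write":
        if not options.get("dbtable"):
            problems.append("dbtable is required for writing")
        if "query" in options:
            # Assumption: query is documented for reads only.
            problems.append("query is only used for reading")
    elif not options.get("dbtable") and not options.get("query"):
        problems.append("provide dbtable or query for reading")

    return problems


print(check_azuresql_options({"connectionName": "connectionName"}, "read"))
```

For the options shown in the final line, the check reports that a read needs either dbtable or query.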