Using a Microsoft SQL Server database as a source for Amazon DMS
Migrate data from one or many Microsoft SQL Server databases using Amazon DMS. With a SQL Server database as a source, you can migrate data to another SQL Server database, or to one of the other Amazon DMS supported databases.
For information about versions of SQL Server that Amazon DMS supports as a source, see Sources for Amazon DMS.
The source SQL Server database can be installed on any computer in your network. A SQL Server account with appropriate access privileges to the source database for the type of task you chose is required for use with Amazon DMS. For more information, see Permissions for SQL Server tasks.
Amazon DMS supports migrating data from named instances of SQL Server. You can use the following notation in the server name when you create the source endpoint.
IPAddress\InstanceName
For example, the following is a correct source endpoint server name. Here, the first part of the name is the IP address of the server, and the second part is the SQL Server instance name (in this example, SQLTest).
10.0.0.25\SQLTest
Also, obtain the port number that your named instance of SQL Server listens on, and use it to configure your Amazon DMS source endpoint.
Note
Port 1433 is the default for Microsoft SQL Server. But dynamic ports that change each time SQL Server is started, and specific static port numbers used to connect to SQL Server through a firewall are also often used. So, you want to know the actual port number of your named instance of SQL Server when you create the Amazon DMS source endpoint.
You can use SSL to encrypt connections between your SQL Server endpoint and the replication instance. For more information on using SSL with a SQL Server endpoint, see Using SSL with Amazon Database Migration Service.
You can use CDC for ongoing migration from a SQL Server database. For information about configuring your source SQL server database for CDC, see Capturing data changes for ongoing replication from SQL Server.
For additional details on working with SQL Server source databases and Amazon DMS, see the following.
Topics
- Limitations on using SQL Server as a source for Amazon DMS
- Permissions for SQL Server tasks
- Prerequisites for using ongoing replication (CDC) from a SQL Server source
- Supported compression methods for SQL Server
- Working with self-managed SQL Server AlwaysOn availability groups
- Endpoint settings when using SQL Server as a source for Amazon DMS
- Source data types for SQL Server
- Capturing data changes for ongoing replication from SQL Server
Limitations on using SQL Server as a source for Amazon DMS
The following limitations apply when using a SQL Server database as a source for Amazon DMS:
-
The identity property for a column isn't migrated to a target database column.
-
The SQL Server endpoint doesn't support the use of tables with sparse columns.
-
Windows Authentication isn't supported.
-
Changes to computed fields in a SQL Server aren't replicated.
-
Temporal tables aren't supported.
-
SQL Server partition switching isn't supported.
-
When using the WRITETEXT and UPDATETEXT utilities, Amazon DMS doesn't capture events applied on the source database.
-
The following data manipulation language (DML) pattern isn't supported.
SELECT * INTO
new_table
FROMexisting_table
-
When using SQL Server as a source, column-level encryption isn't supported.
-
Amazon DMS doesn't support server level audits on SQL Server 2008 or SQL Server 2008 R2 as sources. This is because of a known issue with SQL Server 2008 and 2008 R2. For example, running the following command causes Amazon DMS to fail.
USE [master] GO ALTER SERVER AUDIT [my_audit_test-20140710] WITH (STATE=on) GO
-
Geometry columns aren't supported in full lob mode when using SQL Server as a source. Instead, use limited lob mode or set the
InlineLobMaxSize
task setting to use inline lob mode. -
When using a Microsoft SQL Server source database in a replication task, the SQL Server Replication Publisher definitions aren't removed if you remove the task. A Microsoft SQL Server system administrator must delete those definitions from Microsoft SQL Server.
-
Migrating data from schema-bound and non-schema-bound views is supported for full-load only tasks.
-
Renaming tables using sp_rename isn't supported (for example,
sp_rename 'Sales.SalesRegion', 'SalesReg;)
-
Renaming columns using sp_rename isn't supported (for example,
sp_rename 'Sales.Sales.Region', 'RegID', 'COLUMN';
) Amazon DMS doesn't support change processing to set and unset column default values (using the
ALTER COLUMN SET DEFAULT
clause withALTER TABLE
statements).-
Amazon DMS doesn't support change processing to set column nullability (using the
ALTER COLUMN [SET|DROP] NOT NULL
clause withALTER TABLE
statements). -
With SQL Server 2012 and SQL Server 2014, when using DMS replication with Availability Groups, the distribution database can't be placed in an availability group. SQL 2016 supports placing the distribution database into an availability group, except for distribution databases used in merge, bidirectional, or peer-to-peer replication topologies.
-
For partitioned tables, Amazon DMS doesn't support different data compression settings for each partition.
-
When inserting a value into SQL Server spatial data types (GEOGRAPHY and GEOMETRY), you can either ignore the spatial reference system identifier (SRID) property or specify a different number. When replicating tables with spatial data types, Amazon DMS replaces the SRID with the default SRID (0 for GEOMETRY and 4326 for GEOGRAPHY).
-
If your database isn't configured for MS-REPLICATION or MS-CDC, you can still capture tables that do not have a Primary Key, but only INSERT/DELETE DML events are captured. UPDATE and TRUNCATE TABLE events are ignored.
-
Columnstore indexes aren't supported.
-
Memory-optimized tables (using In-Memory OLTP) aren't supported.
-
When replicating a table with a primary key that consists of multiple columns, updating the primary key columns during full load isn't supported.
-
Delayed durability isn't supported.
-
The
readBackupOnly=Y
endpoint setting (extra connection attribute) doesn't work on RDS for SQL Server source instances because of the way RDS performs backups. -
EXCLUSIVE_AUTOMATIC_TRUNCATION
doesn’t work on Amazon RDS SQL Server source instances because RDS users don't have access to run the SQL Server stored procedure,sp_repldone
. Amazon DMS doesn't capture truncate commands.
-
Amazon DMS doesn't support replication from databases with accelerated database recovery (ADR) turned on.
-
Amazon DMS doesn't support capturing data definition language (DDL) and data manipulation language (DML) statements within a single transaction.
-
Amazon DMS doesn't support the replication of data-tier application packages (DACPAC).
-
UPDATE statements that involve primary keys or unique indexes and update multiple data rows, can cause conflicts when you apply changes to the target database. This might happen, for example, when the target database applies updates as INSERT and DELETE statements instead of a single UPDATE statement. With the batch optimized apply mode, the table might be ignored. With the transactional apply mode, the UPDATE operation might result in constraint violations. To avoid this issue, reload the relevant table. Alternatively, locate the problematic records in the Apply Exceptions control table (
dmslogs.awsdms_apply_exceptions
) and edit them manually in the target database. For more information, see Change processing tuning settings. -
Amazon DMS doesn't support the replication of tables and schemas, where the name includes a special character from the following set.
\\ -- \n \" \b \r ' \t ;
-
Data masking isn't supported. Amazon DMS migrates masked data without masking.
-
Amazon DMS replicates up to 32,767 tables with primary keys and up to 1,000 columns for each table. This is because Amazon DMS creates a SQL Server replication article for each replicated table, and SQL Server replication articles have these limitations.
-
When using Change Data Capture (CDC), you must define all columns that make up a unique index as
NOT NULL
. If this requirement is not met, SQL Server system error 22838 will result. You may lose events if SQL Server archives from the active transaction log to the backup log, or truncates them from the active transaction log.
The following limitations apply when accessing the backup transaction logs:
-
Encrypted backups aren't supported.
-
Backups stored at a URL or on Windows Azure aren't supported.
-
Amazon DMS doesn't support direct processing of transaction log backups at the file level from alternative shared folders.
For Cloud SQL Server sources other than Amazon RDS for Microsoft SQL Server, Amazon DMS supports ongoing replication (CDC) with the active transaction log only. You can't use the backup log with CDC. You may lose events if SQL server archives them from the active transaction log to the backup log, or truncates them from the active transaction log before DMS can read it.
For Amazon RDS for Microsoft SQL Server sources, Amazon DMS 3.5.2 and below supports ongoing replication (CDC) with the active transaction log only, because DMS can’t access the backup log with CDC. You may lose events if RDS for SQL Server archives them from the active transaction log to the backup log, or truncate them from the active transaction log before DMS can read it. This limitation doesn’t apply to Amazon DMS version 3.5.3 and above.
Permissions for SQL Server tasks
Permissions for full load only tasks
The following permissions are required to perform full load only tasks. Note that Amazon DMS does not create the
dms_user
login. For information about creating a login for SQL Server, see Creating a database user with Microsoft SQL Server.
USE db_name; CREATE USER dms_user FOR LOGIN dms_user; ALTER ROLE [db_datareader] ADD MEMBER dms_user; GRANT VIEW DATABASE STATE to dms_user; GRANT VIEW DEFINITION to dms_user; USE master; GRANT VIEW SERVER STATE TO dms_user;
Permissions for tasks with ongoing replication
Self-managed SQL Server instances can be configured for ongoing replication using DMS with or without
using the sysadmin
role. For SQL Server instances, where you can't grant the sysadmin
role,
ensure that the DMS user has the privileges described as follows.
Set up permissions for ongoing replication from a self-managed SQL Server database
Create a new SQL Server account with password authentication using SQL Server Management Studio (SSMS) or as described previously in Permissions for full load only tasks, for example,
self_managed_user
.Run the following
GRANT
commands:GRANT VIEW SERVER STATE TO
self_managed_user
; USE MSDB; GRANT SELECT ON MSDB.DBO.BACKUPSET TOself_managed_user
; GRANT SELECT ON MSDB.DBO.BACKUPMEDIAFAMILY TOself_managed_user
; GRANT SELECT ON MSDB.DBO.BACKUPFILE TOself_managed_user
; USE db_name; CREATE USERself_managed_user
FOR LOGINself_managed_user
; ALTER ROLE [db_owner] ADD MEMBERself_managed_user
; GRANT VIEW DEFINITION toself_managed_user
;In addition to the preceding permissions, the user needs one of the following:
The user must be a member of the
sysadmin
fixed server roleConfigurations and permissions as described in Setting up ongoing replication on a SQL Server in an availability group environment: Without sysadmin role or Setting up ongoing replication on a standalone SQL Server: Without sysadmin role, depending on your source configuration.
Set up permissions for ongoing replication from a cloud SQL Server database
A cloud-hosted SQL server instance is an instance running on Amazon RDS for Microsoft SQL Server, an Azure SQL Managed Instance, or any other managed cloud SQL Server instance supported by DMS.
Create a new SQL Server account with password authentication using SQL Server Management Studio (SSMS) or
as described previously in Permissions for full load only
tasks, for example, rds_user
.
Run the following grant commands.
GRANT VIEW SERVER STATE TO rds_user; USE MSDB; GRANT SELECT ON MSDB.DBO.BACKUPSET TO rds_user; GRANT SELECT ON MSDB.DBO.BACKUPMEDIAFAMILY TO rds_user; GRANT SELECT ON MSDB.DBO.BACKUPFILE TO rds_user; USE db_name; CREATE USER rds_user FOR LOGIN rds_user; ALTER ROLE [db_owner] ADD MEMBER rds_user; GRANT VIEW DEFINITION to rds_user;
For Amazon RDS for Microsoft SQL Server sources, DMS version 3.5.3 and above support reading from transaction
log backups. To ensure that DMS is able to access the log backups, in addition to the above,
either grant master
user privileges, or the following privileges on an RDS SQL Server source:
//DMS 3.5.3 version onwards GRANT EXEC ON msdb.dbo.rds_dms_tlog_download TO rds_user; GRANT EXEC ON msdb.dbo.rds_dms_tlog_read TO rds_user; GRANT EXEC ON msdb.dbo.rds_dms_tlog_list_current_lsn TO rds_user; GRANT EXEC ON msdb.dbo.rds_task_status TO rds_user;
Prerequisites for using ongoing replication (CDC) from a SQL Server source
You can use ongoing replication (change data capture, or CDC) for a self-managed SQL Server database on-premises or on Amazon EC2, or a cloud database such as Amazon RDS or a Microsoft Azure SQL managed instance.
The following requirements apply specifically when using ongoing replication with a SQL Server database as a source for Amazon DMS:
-
SQL Server must be configured for full backups, and you must perform a backup before beginning to replicate data.
-
The recovery model must be set to Bulk logged or Full.
-
SQL Server backup to multiple disks isn't supported. If the backup is defined to write the database backup to multiple files over different disks, Amazon DMS can't read the data and the Amazon DMS task fails.
-
For self-managed SQL Server sources, SQL Server Replication Publisher definitions for the source used in a DMS CDC task aren't removed when you remove the task. A SQL Server system administrator must delete these definitions from SQL Server for self-managed sources.
-
During CDC, Amazon DMS needs to look up SQL Server transaction log backups to read changes. Amazon DMS doesn't support SQL Server transaction log backups created using third-party backup software that aren't in native format. To support transaction log backups that are in native format and created using third-party backup software, add the
use3rdPartyBackupDevice=Y
connection attribute to the source endpoint. -
For self-managed SQL Server sources, be aware that SQL Server doesn't capture changes on newly created tables until they've been published. When tables are added to a SQL Server source, Amazon DMS manages creating the publication. However, this process might take several minutes. Operations made to newly created tables during this delay aren't captured or replicated to the target.
-
Amazon DMS change data capture requires full transaction logging to be turned on in SQL Server. To turn on full transaction logging in SQL Server, either enable MS-REPLICATION or CHANGE DATA CAPTURE (CDC).
-
SQL Server tlog entries won't be marked for re-use until the MS CDC capture job processes those changes.
-
CDC operations aren't supported on memory-optimized tables. This limitation applies to SQL Server 2014 (when the feature was first introduced) and higher.
Amazon DMS change data capture requires a distribution database by default on Amazon EC2 or On-Prem SQL server as source. So, ensure that you have activated the distributor while configuring MS replication for tables with primary keys.
Supported compression methods for SQL Server
Note the following about support for SQL Server compression methods in Amazon DMS:
Amazon DMS supports Row/Page compression in SQL Server version 2008 and later.
Amazon DMS doesn't support the Vardecimal storage format.
Amazon DMS doesn't support sparse columns and columnar structure compression.
Working with self-managed SQL Server AlwaysOn availability groups
SQL Server Always On availability groups provide high availability and disaster recovery as an enterprise-level alternative to database mirroring.
In Amazon DMS, you can migrate changes from a single primary or secondary availability group replica.
Working with the primary availability group replica
To use the primary availability group as a source in Amazon DMS, do the following:
Turn on the distribution option for all SQL Server instances in your availability replicas. For more information, see Setting up ongoing replication on a self-managed SQL Server.
In the Amazon DMS console, open the SQL Server source database settings. For Server Name, specify the Domain Name Service (DNS) name or IP address that was configured for your availability group listener.
When you start an Amazon DMS task for the first time, it might take longer than usual to start. This slowness occurs because the creation of the table articles is being duplicated by the availability group server.
Working with a secondary availability group replica
To use a secondary availability group as a source in Amazon DMS, do the following:
-
Use the same credentials for connecting to individual replicas as those used by the Amazon DMS source endpoint user.
-
Ensure that your Amazon DMS replication instance can resolve DNS names for all existing replicas, and connect to them. You can use the following SQL query to get DNS names for all of your replicas.
select ar.replica_server_name, ar.endpoint_url from sys.availability_replicas ar JOIN sys.availability_databases_cluster adc ON adc.group_id = ar.group_id AND adc.database_name = '<source_database_name>';
When you create the source endpoint, specify the DNS name of the availability group listener for the endpoint's Server name or for the endpoint secret's Server address. For more information about availability group listeners, see What is an availability group listener?
in the SQL Server documentation. You can use either a public DNS server or an on-premises DNS server to resolve the availability group listener, the primary replica, and the secondary replicas. To use an on-premises DNS server, configure the Amazon Route 53 Resolver. For more information, see Using your own on-premises name server.
Add the following extra connection attributes to your source endpoint.
Extra connection attribute Value Notes applicationIntent
ReadOnly
Without this ODBC setting, the replication task is routed to the primary availability group replica. For more information, see SQL Server Native Client Support for High Availability, Disaster Recovery in the SQL Server documentation. multiSubnetFailover
yes
For more information, see SQL Server Native Client Support for High Availability, Disaster Recovery in the SQL Server documentation. alwaysOnSharedSynchedBackupIsEnabled
false
For more information, see Endpoint settings when using SQL Server as a source for Amazon DMS. activateSafeguard
false
For more information, see Limitations following. setUpMsCdcForTables
false
For more information, see Limitations following. Enable the distribution option on all replicas in your availability group. Add all nodes to the distributors list. For more information, see To set up distribution.
Run the following query on the primary read-write replica to enable publication of your database. You run this query only once for your database.
sp_replicationdboption @dbname = N'<source DB name>', @optname = N'publish', @value = N'true';
Limitations
Following are limitations for working with a secondary availability group replica:
Amazon DMS doesn't support Safeguard when using a read-only availability group replica as a source. For more information, see Endpoint settings when using SQL Server as a source for Amazon DMS.
Amazon DMS doesn't support the
setUpMsCdcForTables
extra connection attribute when using a read-only availability group replica as a source. For more information, see Endpoint settings when using SQL Server as a source for Amazon DMS.-
Amazon DMS can use a self-managed secondary availability group replica as a source database for ongoing replication (change data capture, or CDC) starting from version 3.4.7. Cloud SQL Server Multi-AZ read replicas are not supported. If you use previous versions of Amazon DMS, make sure that you use the primary availability group replica as a source database for CDC.
Failover to other nodes
If you set the ApplicationIntent
extra connection attribute
for your endpoint to ReadOnly
, your Amazon DMS task connects to the
read-only node with the highest read-only routing priority. It then fails over
to other read-only nodes in your availability group when the highest priority
read-only node is unavailable. If you don't set ApplicationIntent
,
your Amazon DMS task only connects to the primary (read/write) node in your
availability group.
Endpoint settings when using SQL Server as a source for Amazon DMS
You can use endpoint settings to configure your SQL Server source database similar to using
extra connection attributes. You specify the settings when you create the source
endpoint using the Amazon DMS console, or by using the create-endpoint
command in the
Amazon CLI--microsoft-sql-server-settings '{"
JSON syntax.EndpointSetting"
:
"value"
, ...
}'
The following table shows the endpoint settings that you can use with SQL Server as a source.
Name | Description |
---|---|
|
This attribute turns Safeguard on or off. For information
about Safeguard, see Default value:
Valid values: { Example: |
AlwaysOnSharedSynchedBackupIsEnabled |
This attribute adjusts the behavior of Amazon DMS when migrating from an SQL Server source database that is hosted as part of an Always On availability group cluster. Amazon DMS has enhanced support for SQL Server source databases that are configured to run in an Always On cluster. In this case, Amazon DMS attempts to track if transaction backups are happening from nodes in the Always On cluster other than the node where the source database instance is hosted. At migration task start-up, Amazon DMS tries to connect to each node in the cluster, but fails if it can't connect to any one of the nodes. If you need Amazon DMS to poll all the nodes in the Always On
cluster for transaction backups, set this attribute to
Default value: Valid values: Example: |
|
This ODBC driver attribute setting causes SQL Server to route your replication task to the highest priority read-only node. Without this setting, SQL Server routes your replication task to the primary read-write node. |
|
Use this endpoint setting when you are setting up ongoing replication on a standalone SQL server without a sysadmin user. This parameter is supported on Amazon DMS version 3.4.7 and higher. For information about setting up ongoing replication on a standalone SQL server, see Capturing data changes for ongoing replication from SQL Server. Default value:
Valid values: Example: |
|
Use this extra connection attribute (ECA) to set the client statement timeout for the SQL Server instance, in seconds. The default value is 60 seconds. Example: |
|
When set to Default value: Valid values: Example: |
|
Forces LOB lookup on inline LOB. Default value:
Valid values: Example: |
|
This ODBC driver attribute helps DMS to connect to the new primary in case of an Availability Group failover. This attribute is designed for situations when the connection is broken or the listener IP address is incorrect. In these situations, Amazon DMS attempts to connect to all IP addresses associated with the Availability Group listener. |
|
Use of this attribute requires sysadmin privileges. When this attribute is set
to Valid values: Example: Note: This parameter doesn't work on Amazon RDS SQL Server source instances because of the way RDS performs backups. |
|
For optimal performance, Amazon DMS tries to capture all unread changes from the active transaction log (TLOG). However, sometimes due to truncation, the active TLOG might not contain all the unread changes. When this occurs, Amazon DMS accesses the log backup to capture the missing changes. To minimize the need to access the log backup, Amazon DMS prevents truncation using one of the following methods:
Default value:
Valid values: { Example: |
|
This attribute turns on MS-CDC for the source database and for tables in
the task mapping that don't have MS-Replication enabled. Setting this value to
Valid values: { Example: |
|
Indicates the mode used to fetch CDC data. Default value:
Valid values: Example: |
|
When this attribute is set to |
Source data types for SQL Server
Data migration that uses SQL Server as a source for Amazon DMS supports most SQL Server data types. The following table shows the SQL Server source data types that are supported when using Amazon DMS and the default mapping from Amazon DMS data types.
For information on how to view the data type that is mapped in the target, see the section for the target endpoint you are using.
For additional information about Amazon DMS data types, see Data types for Amazon Database Migration Service.
SQL Server data types |
Amazon DMS data types |
---|---|
BIGINT |
INT8 |
BIT |
BOOLEAN |
DECIMAL |
NUMERIC |
INT |
INT4 |
MONEY |
NUMERIC |
NUMERIC (p,s) |
NUMERIC |
SMALLINT |
INT2 |
SMALLMONEY |
NUMERIC |
TINYINT |
UINT1 |
REAL |
REAL4 |
FLOAT |
REAL8 |
DATETIME |
DATETIME |
DATETIME2 (SQL Server 2008 and higher) |
DATETIME |
SMALLDATETIME |
DATETIME |
DATE |
DATE |
TIME |
TIME |
DATETIMEOFFSET |
WSTRING |
CHAR |
STRING |
VARCHAR |
STRING |
VARCHAR (max) |
CLOB TEXT To use this data type with Amazon DMS, you must enable the use of CLOB data types for a specific task. For SQL Server tables, Amazon DMS updates LOB columns in the target even for UPDATE statements that don't change the value of the LOB column in SQL Server. During CDC, Amazon DMS supports CLOB data types only in tables that include a primary key. |
NCHAR |
WSTRING |
NVARCHAR (length) |
WSTRING |
NVARCHAR (max) |
NCLOB NTEXT To use this data type with Amazon DMS, you must enable the use of SupportLobs for a specific task. For more information about enabling Lob support, see Setting LOB support for source databases in an Amazon DMS task. For SQL Server tables, Amazon DMS updates LOB columns in the target even for UPDATE statements that don't change the value of the LOB column in SQL Server. During CDC, Amazon DMS supports CLOB data types only in tables that include a primary key. |
BINARY |
BYTES |
VARBINARY |
BYTES |
VARBINARY (max) |
BLOB IMAGE For SQL Server tables, Amazon DMS updates LOB columns in the target even for UPDATE statements that don't change the value of the LOB column in SQL Server. To use this data type with Amazon DMS, you must enable the use of BLOB data types for a specific task. Amazon DMS supports BLOB data types only in tables that include a primary key. |
TIMESTAMP |
BYTES |
UNIQUEIDENTIFIER |
STRING |
HIERARCHYID |
Use HIERARCHYID when replicating to a SQL Server target endpoint. Use WSTRING (250) when replicating to all other target endpoints. |
XML |
NCLOB For SQL Server tables, Amazon DMS updates LOB columns in the target even for UPDATE statements that don't change the value of the LOB column in SQL Server. To use this data type with Amazon DMS, you must enable the use of NCLOB data types for a specific task. During CDC, Amazon DMS supports NCLOB data types only in tables that include a primary key. |
GEOMETRY |
Use GEOMETRY when replicating to target endpoints that support this data type. Use CLOB when replicating to target endpoints that don't support this data type. |
GEOGRAPHY |
Use GEOGRAPHY when replicating to target endpoints that support this data type. Use CLOB when replicating to target endpoints that don't support this data type. |
Amazon DMS doesn't support tables that include fields with the following data types.
-
CURSOR
-
SQL_VARIANT
-
TABLE
Note
User-defined data types are supported according to their base type. For example, a user-defined data type based on DATETIME is handled as a DATETIME data type.