Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Deploying your Amazon DataSync agent

The first step to creating your Amazon DataSync agent is deploying the agent in your storage environment. You can deploy an agent as a virtual machine (VM) on VMware ESXi, Linux Kernel-based Virtual Machine (KVM), and Microsoft Hyper-V hypervisors. You also can deploy an agent as an Amazon EC2 instance in a virtual private cloud (VPC) within Amazon.

Tip

We recommend using separate agents for DataSync Discovery and DataSync transfers.

Deploying your agent on VMware

You can download an agent from the DataSync console and deploy it in your VMware environment.

Before you begin: Make sure that your storage environment can support a DataSync agent. For more information, see Virtual machine requirements.

To deploy an agent on VMware
  1. Open the Amazon DataSync console at https://console.amazonaws.cn/datasync/.

  2. In the left navigation pane, choose Agents, and then choose Create agent.

  3. For Hypervisor, choose VMWare ESXi, and then choose Download the image.

    The agent downloads in a .zip file that contains an .ova image file.

  4. To minimize network latency, deploy the agent as close as possible to the storage system that DataSync needs to access (the same local network if possible). For more information, see Network requirements for on-premises, other cloud, and edge storage.

    If needed, see your hypervisor's documentation on how to deploy an .ova file in a VMware host.

  5. Power on your hypervisor, log in to the agent VM, and get the agent's IP address. You need this IP address to activate the agent.

    The agent VM's default credentials are login admin and password password. If needed, change the password through the VM's local console.

Deploying your agent on KVM

You can download an agent from the DataSync console and deploy it in your KVM environment.

Before you begin: Make sure that your storage environment can support a DataSync agent. For more information, see Virtual machine requirements.

To deploy an agent on KVM
  1. Open the Amazon DataSync console at https://console.amazonaws.cn/datasync/.

  2. In the left navigation pane, choose Agents, and then choose Create agent.

  3. For Hypervisor, choose Kernel-based Virtual Machine (KVM), and then choose Download the image.

    The agent downloads in a .zip file that contains a .qcow2 image file.

  4. To minimize network latency, deploy the agent as close as possible to the storage system that DataSync needs to access (the same local network if possible). For more information, see Network requirements for on-premises, other cloud, and edge storage.

  5. Run the following command to install your .qcow2 image.

    virt-install \
        --name "datasync" \
        --description "DataSync agent" \
        --os-type=generic \
        --ram=32768 \
        --vcpus=4 \
        --disk path=datasync-yyyymmdd-x86_64.qcow2,bus=virtio,size=80 \
        --network default,model=virtio \
        --graphics none \
        --import

    For information about how to manage this VM and your KVM host, see your hypervisor's documentation.

  6. Power on your hypervisor, log in to your VM, and get the IP address of the agent. You need this IP address to activate the agent.

    The agent VM's default credentials are login admin and password password. If needed, change the password through the VM's local console.
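As an alternative to reading the address from the VM's local console, the KVM host can often report it. The following is a minimal sketch, not part of the documented procedure. It assumes that the VM is named datasync and is attached to libvirt's default network (as in the virt-install command above) and that virsh is available on the host. The helper name kvm_agent_ip is hypothetical.

```shell
# Hypothetical helper: print the IPv4 address(es) that libvirt reports for a VM.
# Assumes the VM is attached to a libvirt-managed network whose DHCP leases
# `virsh domifaddr` can read; bridged setups may need `--source agent` or
# `--source arp` instead.
kvm_agent_ip() {
    vm_name="${1:-datasync}"
    # Data rows look like: vnet0  52:54:00:aa:bb:cc  ipv4  192.168.122.45/24
    virsh domifaddr "$vm_name" |
        awk '$3 == "ipv4" { sub(/\/.*/, "", $4); print $4 }'
}

# Usage (run on the KVM host):
#   kvm_agent_ip datasync
```

If the command prints nothing, the VM may not have obtained a lease yet, or the network may not be managed by libvirt. In that case, get the address from the VM's local console as described in the step above.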

Deploying your agent on Microsoft Hyper-V

You can download an agent from the DataSync console and deploy it in your Microsoft Hyper-V environment.

Before you begin: Make sure that your storage environment can support a DataSync agent. For more information, see Virtual machine requirements.

To deploy an agent on Hyper-V
  1. Open the Amazon DataSync console at https://console.amazonaws.cn/datasync/.

  2. In the left navigation pane, choose Agents, and then choose Create agent.

  3. For Hypervisor, choose Microsoft Hyper-V, and then choose Download the image.

    The agent downloads in a .zip file that contains a .vhdx image file.

  4. To minimize network latency, deploy the agent as close as possible to the storage system that DataSync needs to access (the same local network if possible). For more information, see Network requirements for on-premises, other cloud, and edge storage.

    If needed, see your hypervisor's documentation on how to deploy a .vhdx file in a Hyper-V host.

    Warning

    You may notice poor network performance if you enable virtual machine queue (VMQ) on a Hyper-V host that's using a Broadcom network adapter. For information about a workaround, see the Microsoft documentation.

  5. Power on your hypervisor, log in to your VM, and get the IP address of the agent. You need this IP address to activate the agent.

    The agent VM's default credentials are login admin and password password. If needed, change the password through the VM's local console.

Deploying your Amazon EC2 agent

You might deploy a DataSync agent as an Amazon EC2 instance when transferring data between:

  • A self-managed cloud storage system (for example, an NFS file server in Amazon) and an Amazon storage service.

  • A cloud storage provider (such as Microsoft Azure Blob Storage) and an Amazon storage service.

  • Amazon S3 on Amazon Outposts and an Amazon storage service.

Warning

We don't recommend using an Amazon EC2 agent with on-premises storage because of increased network latency. Instead, deploy the agent as a VMware, KVM, or Hyper-V virtual machine in your data center as close to your on-premises storage as possible.

To choose the agent AMI for your Amazon Web Services Region
  • Use the following Amazon CLI command to get the latest DataSync Amazon Machine Image (AMI) ID for your Amazon Web Services Region.

    aws ssm get-parameter --name /aws/service/datasync/ami --region region

    Example command and output

    aws ssm get-parameter --name /aws/service/datasync/ami --region us-east-1

    {
        "Parameter": {
            "Name": "/aws/service/datasync/ami",
            "Type": "String",
            "Value": "ami-id",
            "Version": 6,
            "LastModifiedDate": 1569946277.996,
            "ARN": "arn:aws-cn:ssm:us-east-1::parameter/aws/service/datasync/ami"
        }
    }
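
When only the AMI ID is needed (for example, to pass it to another command), the JSON can be reduced to the Value field. The following is a minimal sketch that uses the Amazon CLI's global --query and --output options. The helper name datasync_ami_id is hypothetical, and the Region that you pass is your own.

```shell
# Hypothetical helper: print only the DataSync agent AMI ID for a Region.
# --query and --output are global AWS CLI options; the parameter name is the
# one used in the documented get-parameter command.
datasync_ami_id() {
    region="$1"
    aws ssm get-parameter \
        --name /aws/service/datasync/ami \
        --region "$region" \
        --query 'Parameter.Value' \
        --output text
}

# Usage:
#   ami_id=$(datasync_ami_id us-east-1)
```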

To deploy your DataSync agent as an Amazon EC2 instance

Important

To avoid charges, deploy your agent so that it doesn't require network traffic between Availability Zones. For example, deploy your agent in the Availability Zone where your self-managed file system resides.

To learn more about data transfer prices for all Amazon Web Services Regions, see Amazon EC2 On-Demand pricing.

  1. From the Amazon Web Services account where the source file system resides, launch the agent by using your AMI from the Amazon EC2 launch wizard. Use the following URL to launch the AMI.

    https://console.amazonaws.cn/ec2/v2/home?region=source-file-system-region#LaunchInstanceWizard:ami=ami-id

    In the URL, replace source-file-system-region and ami-id with your own source Amazon Web Services Region and AMI ID.

  2. For Instance type, choose one of the recommended Amazon EC2 instances for DataSync.

  3. For Network settings, choose Edit and then do the following:

    1. For VPC, choose the virtual private cloud (VPC) where the storage system you're transferring data to or from is located.

    2. For Auto-assign public IP, choose whether you want your agent to be accessible from the public internet.

      You use the instance's public or private IP address later to activate your agent.

    3. For Firewall (security groups), create or select a security group that does the following:

      Note

      You will need to configure additional ports depending on the type of service endpoint that you use to connect the agent with Amazon.

  4. (Recommended) To increase performance when transferring from a cloud-based file system, expand Advanced details and choose a Placement group value where your storage resides.

  5. Choose Launch to launch your instance.

  6. Once your instance status is Running, choose the instance.

  7. If you configured your instance to be accessible from the public internet, make note of the instance's public IP address. If you didn't, make note of the private IP address.

    You need this IP address when activating your agent.
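
Two parts of this procedure can be scripted: building the launch wizard URL in step 1 and looking up the instance's IP address in step 7. The following is a minimal sketch, assuming the Amazon CLI is configured for the account that owns the instance. The helper names ec2_launch_url and ec2_agent_ip are hypothetical, and the instance ID is whatever the launch wizard reports for your instance.

```shell
# Hypothetical helper: build the launch wizard URL from step 1.
ec2_launch_url() {
    region="$1"
    ami_id="$2"
    printf 'https://console.amazonaws.cn/ec2/v2/home?region=%s#LaunchInstanceWizard:ami=%s\n' \
        "$region" "$ami_id"
}

# Hypothetical helper: print the address to note in step 7. Prefers the
# public IP address when the instance has one, otherwise the private one.
ec2_agent_ip() {
    instance_id="$1"
    region="$2"
    ips=$(aws ec2 describe-instances \
        --instance-ids "$instance_id" \
        --region "$region" \
        --query 'Reservations[0].Instances[0].[PublicIpAddress,PrivateIpAddress]' \
        --output text)
    public_ip=$(printf '%s\n' "$ips" | awk '{ print $1 }')
    private_ip=$(printf '%s\n' "$ips" | awk '{ print $2 }')
    # The CLI's text output prints "None" for a missing public address.
    if [ -n "$public_ip" ] && [ "$public_ip" != "None" ]; then
        printf '%s\n' "$public_ip"
    else
        printf '%s\n' "$private_ip"
    fi
}

# Usage:
#   ec2_launch_url source-file-system-region ami-id
#   ec2_agent_ip instance-id source-file-system-region
```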

The following guidance can help with common scenarios if you deploy a DataSync agent in an Amazon Web Services Region.

Deploying your agent for transfers between cloud file systems and Amazon S3

To transfer data between Amazon Web Services accounts, or from a cloud file system, the DataSync agent must be located in the same Amazon Web Services Region and Amazon Web Services account where the source file system resides. This type of transfer includes the following:

  • Transfers from Amazon EFS or FSx for Windows File Server file systems to Amazon storage in a different Amazon Web Services account.

  • Transfers from self-managed file systems to Amazon storage services.

Important

Deploy your agent such that it doesn't require network traffic between Availability Zones (to avoid charges for such traffic).

  • To access your Amazon EFS or FSx for Windows File Server file system, deploy the agent in an Availability Zone that has a mount target to your file system.

  • For self-managed file systems, deploy the agent in the Availability Zone where your file system resides.

To learn more about data transfer prices for all Amazon Web Services Regions, see Amazon EC2 On-Demand pricing.
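
Before launching the agent, you can check whether an Amazon EFS file system has a mount target in the Availability Zone you plan to use. The following is a minimal sketch, assuming the Amazon CLI is configured for the account that owns the file system. The helper name az_has_mount_target is hypothetical, and the file system ID, Availability Zone, and Region are placeholders for your own values.

```shell
# Hypothetical helper: succeed (exit status 0) if the EFS file system has a
# mount target in the given Availability Zone, and fail otherwise.
az_has_mount_target() {
    fs_id="$1"
    az="$2"
    region="$3"
    aws efs describe-mount-targets \
        --file-system-id "$fs_id" \
        --region "$region" \
        --query 'MountTargets[].AvailabilityZoneName' \
        --output text |
        tr '\t' '\n' |
        grep -qx "$az"   # -x matches the whole line, so "a" does not match "ab"
}

# Usage:
#   if az_has_mount_target file-system-id availability-zone region; then
#       echo "Deploy the agent in this Availability Zone"
#   fi
```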

For example, the following diagram shows a high-level view of the DataSync architecture for transferring data from in-cloud Network File System (NFS) to in-cloud NFS or Amazon S3.

Diagram showing data transfer between source Region containing a virtual private cloud (VPC) with an EFS file system and DataSync agent, and a destination Region with a DataSync endpoint and EFS file system.

Remember the following when transferring between Amazon storage services across Amazon Web Services accounts:

  • When you're copying between Amazon EFS file systems, we recommend that you configure the transfer as NFS (source) to Amazon EFS (destination).

  • When you're transferring between Amazon FSx file systems, we recommend that you use the Server Message Block (SMB) (source) to Amazon FSx (destination) transfer.

Deploying your agent for transfers from Amazon S3 to Amazon file systems

The following diagram provides a high-level view of the DataSync architecture for transferring data from Amazon S3 to an Amazon file system, such as Amazon EFS or Amazon FSx. You can use this architecture to transfer data from one Amazon Web Services account to another, or to transfer data from Amazon S3 to a self-managed in-cloud file system.

Diagram showing data transfer between source Region containing an S3 bucket and DataSync endpoint, and a destination Region containing a VPC with an EFS file system and DataSync agent.

Deploying your agent on Amazon Snowball Edge

For more information and instructions, see Creating a DataSync agent in your on-premises storage environment for Amazon S3 compatible storage.

Deploying your agent on Amazon Snowcone

The DataSync agent AMI is pre-installed on your Snowcone device. Launch the agent with one of the following tools:

Once your agent is deployed and activated, you can configure your transfer location.

Deploying your agent on Amazon Outposts

You can launch a DataSync Amazon EC2 instance on your Outpost. To learn more about launching an AMI on Amazon Outposts, see Launch an instance on your Outpost in the Amazon Outposts User Guide.

When using DataSync to access Amazon S3 on Outposts, you must launch the agent in a VPC that's allowed to access your Amazon S3 access point, and activate the agent in the parent Region of the Outpost. The agent must also be able to route to the Amazon S3 on Outposts endpoint for the bucket. To learn more about working with Amazon S3 on Outposts endpoints, see Working with Amazon S3 on Outposts in the Amazon S3 User Guide.