Planning your large transfer - Amazon Snowball Edge Developer Guide

Planning your large transfer

We recommend that you plan and calibrate large data transfers between the Amazon Snowball Edge devices that you have on site and your servers using the guidelines in the following sections.

Step 1: Understand what you're moving to the cloud

Before you create your first job using the Amazon Snow Family Management Console, assess the volume of data you need to transfer, where it is currently stored, and the destination that you want to transfer it to. For data transfers of a petabyte or more, this upfront planning makes the process much easier once your Snow Family devices arrive.

If you're migrating data into the Amazon Web Services Cloud for the first time, we recommend that you design a cloud migration model. Cloud migration doesn’t happen overnight. It requires a careful planning process to ensure that all systems work as expected.

When you're done with this step, you should know the total amount of data that you're going to move into the cloud.

Step 2: Calculate your target transfer rate

It's important to estimate how quickly you can transfer data to the Snow Family devices that are connected to each of your servers. This estimated speed, in MB/s, determines how fast you can move data from your data source to the Snowball Edge devices over your local network infrastructure.

Note

For large data transfers, we recommend using the Amazon S3 data transfer method. You must select this option when you order devices in the Amazon Snow Family Management Console.

To determine a baseline transfer rate, transfer a small subset of your data to the Snowball Edge device, or transfer a 10 GB sample file and observe the throughput.

While determining your target transfer speed, keep in mind that you can improve throughput by tuning your environment. Factors that you can adjust include the network configuration and network speed, the size of the files being transferred, and the speed at which data can be read from your local servers. The Amazon S3 adapter copies data to Snow Family devices as quickly as your conditions allow.
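
As a rough illustration of the baseline measurement described above, the following sketch converts a timed sample transfer into a rate in MB/s. The function name, the 40-second timing, and the use of decimal megabytes (10^6 bytes) are assumptions made for this example; they are not part of any Snow Family tool.

```python
def throughput_mb_per_s(bytes_transferred: int, elapsed_seconds: float) -> float:
    """Return the average throughput in MB/s (decimal megabytes, 10**6 bytes)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return bytes_transferred / 1_000_000 / elapsed_seconds


# Hypothetical measurement: a 10 GB sample file that took 40 seconds to copy.
sample_bytes = 10 * 1_000_000_000
rate = throughput_mb_per_s(sample_bytes, 40.0)
print(f"{rate:.0f} MB/s")  # 250 MB/s
```

Repeating the measurement after each tuning change (for example, a different file size or network setting) shows whether the change actually improved throughput.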

Step 3: Determine how many Snow Family devices you need

Using the total amount of data that you plan to move into the cloud, the estimated transfer speed, and the number of days that you want to allow for moving the data into Amazon, determine how many Snow Family devices you need for your large-scale data migration. Depending on the device type, Snowball Edge devices have approximately 39.5 TB, 80 TB, or 210 TB of usable storage space. For example, if you want to move 300 TB of data to Amazon over 10 days and you have a transfer speed of 250 MB/s, you need 4 Snowball Edge devices, a count that corresponds to devices with 80 TB of usable storage (300 TB divided by 80 TB, rounded up). When less than 40 TB of data remains to be transferred, Amazon Snowcone devices (with 14 TB of usable space) are recommended.
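
The arithmetic behind this kind of estimate can be sketched as follows. This is a simplified model, not the calculation that the Snow Family tooling performs: it assumes decimal units (1 TB = 10^6 MB), that the measured rate is sustained per device, and that devices are loaded in parallel. The function names are hypothetical.

```python
import math


def devices_needed(total_tb: float, usable_tb_per_device: float) -> int:
    """Number of devices required by capacity alone (rounded up)."""
    return math.ceil(total_tb / usable_tb_per_device)


def transfer_days(total_tb: float, rate_mb_per_s: float, parallel_devices: int = 1) -> float:
    """Days needed to copy total_tb, assuming each device sustains rate_mb_per_s."""
    total_mb = total_tb * 1_000_000  # decimal units: 1 TB = 10**6 MB
    seconds = total_mb / (rate_mb_per_s * parallel_devices)
    return seconds / 86_400


# Capacity check for the 300 TB example against each usable-storage size.
for usable_tb in (39.5, 80, 210):
    print(usable_tb, devices_needed(300, usable_tb))  # 8, 4, and 2 devices

# Time check: one device at a time versus four devices in parallel.
print(f"{transfer_days(300, 250, 1):.1f} days")  # 13.9 days
print(f"{transfer_days(300, 250, 4):.1f} days")  # 3.5 days
```

Under these assumptions, a single 250 MB/s stream would not finish 300 TB within 10 days, so both the capacity check and the time check are worth running before you settle on a device count.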

Note

The Amazon Snow Family Large Data Migration Manager (LDMM) provides a wizard to estimate the number of Amazon Snow Family devices that can be supported concurrently. For more information, see Creating a large data migration plan.

Step 4: Create your jobs

After you know how many Snow Family devices you need, create an import job for each device. The Snow Family LDMM simplifies the creation of multiple jobs. For more information, see Placing your next job order.

Note

You can place your next job order and automatically add it to your plan directly from the Recommended job ordering schedule. For more information, see Recommended job ordering schedule.

Step 5: Separate your data into transfer segments

As a best practice for large data transfers involving multiple jobs, we recommend that you logically split your data into a number of smaller, more manageable data sets. This allows you to transfer one partition at a time, or multiple partitions in parallel. When planning your partitions, make sure that the combined data for the partitions fits on the Snow Family devices for the job. For example, you can separate your transfer into partitions in any of the following ways:

  • You can create 10 partitions of 8 TB each for a Snowball Edge.

  • For large files, each file can be an individual partition up to the 5 TB size limit for objects in Amazon S3.

  • Each partition can be a different size, and each individual partition can be made up of the same kind of data—for example, small files in one partition, compressed archives in another, large files in another partition, and so on. This approach can help you to determine your average transfer rate for different types of files.
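
One way to plan partitions like the ones listed above is a simple sequential packing pass, sketched below. The function name, the item names, and the 8 TB cap are illustrative assumptions; in practice you would probably also group items by data type, as the last bullet suggests.

```python
def plan_partitions(items, max_partition_bytes):
    """Group (name, size_in_bytes) items, in order, into partitions under a size cap."""
    partitions = []
    current, current_size = [], 0
    for name, size in items:
        if size > max_partition_bytes:
            raise ValueError(f"{name} is larger than the partition size limit")
        # Start a new partition when adding this item would exceed the cap.
        if current and current_size + size > max_partition_bytes:
            partitions.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        partitions.append(current)
    return partitions


TB = 10**12  # decimal terabyte
# Hypothetical data sets and sizes.
data_sets = [("logs", 5 * TB), ("images", 4 * TB), ("archives", 3 * TB), ("docs", 1 * TB)]
print(plan_partitions(data_sets, 8 * TB))
# [['logs'], ['images', 'archives', 'docs']]
```

The sketch does not check that all partitions together fit on the devices for a job; that check still needs to be made against the usable storage of the device type you ordered.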

Note

Metadata operations are performed for each file that's transferred. Regardless of a file's size, this overhead remains the same. Therefore, you get faster performance by compressing small files into a larger bundle, batching your files, or transferring larger individual files.
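
As an illustration of batching, the following sketch bundles small files into one compressed archive by using Python's standard tarfile module, so that the per-file overhead is paid once for the bundle. The function name and the 1 MB threshold are assumptions for this example, not values recommended by the guide.

```python
import tarfile
from pathlib import Path


def bundle_small_files(source_dir: str, archive_path: str, max_file_bytes: int = 1_000_000) -> int:
    """Add every file of at most max_file_bytes under source_dir to a .tar.gz archive.

    Returns the number of files added. Write the archive outside source_dir
    so that the archive is not picked up by the directory scan.
    """
    root = Path(source_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file() and path.stat().st_size <= max_file_bytes:
                # Store paths relative to source_dir inside the archive.
                archive.add(str(path), arcname=str(path.relative_to(root)))
                count += 1
    return count
```

Files larger than the threshold are left out of the bundle and can be transferred individually.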

Creating data transfer segments can make it easier for you to quickly resolve transfer issues because trying to troubleshoot a large, heterogeneous transfer after the transfer runs for a day or more can be complex.

When you've finished planning your petabyte-scale data transfer, we recommend that you transfer a few segments onto the Snow Family device from your server to calibrate your speed and total transfer time.
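
A calibration run like this can be turned into a projected total transfer time with a small calculation, sketched below. The function name and the sample figures are hypothetical, and the model assumes that the calibration segments are representative of the full data set.

```python
def projected_days(total_tb: float, calibration_runs) -> float:
    """Project total transfer time in days from (bytes_transferred, elapsed_seconds) runs."""
    total_bytes = sum(size for size, _ in calibration_runs)
    total_seconds = sum(seconds for _, seconds in calibration_runs)
    if total_bytes <= 0 or total_seconds <= 0:
        raise ValueError("calibration runs must cover a positive size and duration")
    bytes_per_second = total_bytes / total_seconds  # aggregate observed rate
    return total_tb * 10**12 / bytes_per_second / 86_400


# Hypothetical calibration: two 1 TB segments that took 4,000 s and 6,000 s.
runs = [(10**12, 4_000), (10**12, 6_000)]
print(f"{projected_days(100, runs):.2f} days")  # 5.79 days
```

Dividing total bytes by total time weights each segment by its duration, which avoids overstating the rate when fast and slow segments are mixed.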