Before You Order a Snowball Edge device - Amazon Snowball Edge Developer Guide
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China.

Before You Order a Snowball Edge device

Amazon Snowball Edge is a Region-specific service, so before you plan your job, be sure that the service is available in your Region. Also ensure that your location and your Amazon S3 bucket are in the same Amazon Web Services Region or the same country, because this affects your ability to order the device.

As part of the order process, you create an Amazon Identity and Access Management (IAM) role and an Amazon Key Management Service (Amazon KMS) key. The KMS key is used for encrypting the data during transit and at rest on the Snowball Edge device. For more information about creating IAM roles and KMS keys, see Creating an Amazon Snowball Edge Job.

Questions about the Local Environment

Understanding your dataset and how the local environment is set up will help you complete your data transfer. Consider the following before placing your order.

What data are you transferring?

Transferring a large number of small files does not work well with Amazon Snowball Edge, because Snowball Edge encrypts each object individually. Small files are those under 1 MB in size. We recommend that you zip them up before transferring them onto the Amazon Snowball Edge device. We also recommend that you have no more than 500,000 files or directories within each directory.
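As a sketch of the recommendation above, you can bundle a directory of small files into a single archive before copying it to the device, so the device encrypts one object instead of thousands. The directory and file names here are placeholders.

```shell
# Create a sample directory of small files (a stand-in for your
# real dataset).
mkdir -p small-files
echo "sample" > small-files/a.txt
echo "sample" > small-files/b.txt

# Bundle the directory into one compressed archive before copying
# it onto the Snowball Edge device.
tar -czf small-files.tar.gz -C small-files .

# Confirm the archive contents.
tar -tzf small-files.tar.gz
```

After import, you can extract the archive in place in Amazon S3-backed storage or on an EC2 instance.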

Will the data be accessed during the transfer?

It is important to have a static dataset (that is, no users or systems are accessing the data during the transfer). Otherwise, the transfer can fail due to a checksum mismatch; the affected files are not transferred and are marked as Failed.

We recommend that if you are using the file interface, you only use one method of transferring data to the Amazon Snowball Edge. Copying data with both the file interface and the Amazon S3 interface can result in read/write conflicts.

To prevent corrupting your data, don't disconnect an Amazon Snowball Edge device or change its network settings while transferring data. Files should be in a static state while being written to the device. Files that are modified while they are being written to the device can result in read/write conflicts.

Will the network support Amazon Snowball data transfer?

Snowball Edge supports RJ45, SFP+, and QSFP+ networking adapters. Verify that your switch is a gigabit switch; depending on the brand, it might be labeled gigabit or 10/100/1000. Snowball Edge devices do not support megabit (10/100) switches.

Working with Files That Contain Special Characters

If your objects contain special characters, you might encounter errors. Although Amazon S3 allows special characters, we highly recommend that you avoid the following characters:

  • Backslash ("\")

  • Left curly brace ("{")

  • Right curly brace ("}")

  • Left square bracket ("[")

  • Right square bracket ("]")

  • 'Less Than' symbol ("<")

  • 'Greater Than' symbol (">")

  • Non-printable ASCII characters (128–255 decimal characters)

  • Caret ("^")

  • Percent character ("%")

  • Grave accent / back tick ("`")

  • Quotation marks

  • Tilde ("~")

  • 'Pound' character ("#")

  • Vertical bar / pipe ("|")

If your files have one or more of these characters, rename them before you copy them to the Amazon Snowball Edge device. Windows users who have spaces in their file names should be careful when copying individual objects or running a recursive command. Surround the names of individual objects that contain spaces with quotation marks. The following are examples of such files.

The file name test file.txt is referenced as follows on each operating system.

Windows

"C:\Users\<username>\desktop\test file.txt"

macOS

/Users/<username>/test\ file.txt

Linux

/home/<username>/test\ file.txt
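As a quick illustration on Linux or macOS (the file names are placeholders), quoting the name lets a copy command handle the space correctly:

```shell
# Create a file whose name contains a space.
echo "data" > "test file.txt"

# Quoted form: the shell treats the whole name as one argument.
cp "test file.txt" "test file copy.txt"

# Escaped form works as well.
ls test\ file.txt
```

Without the quotes or the backslash, the shell would split the name into two arguments and the copy would fail or target the wrong file.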

Note

The only object metadata that is transferred is the object name and size. If you want additional metadata to be copied, you can use the file interface or other tools to copy the data to Amazon S3.

Using Amazon EC2 on Snowball

This section provides an overview of using Amazon EC2 compute instances on an Amazon Snowball Edge device. It includes conceptual information, procedures, and examples.

Note

These Amazon EC2 features on Amazon Snowball are not supported in the Asia Pacific (Mumbai) and Europe (Paris) Amazon Web Services Regions.

You can run Amazon EC2 compute instances hosted on an Amazon Snowball Edge with the sbe1, sbe-c, and sbe-g instance types:

  • The sbe1 instance type works on devices with the Snowball Edge Storage Optimized option.

  • The sbe-c instance type works on devices with the Snowball Edge Compute Optimized option.

  • Both the sbe-c and sbe-g instance types work on devices with the Snowball Edge Compute Optimized with GPU option.

All the compute instance types supported on Snowball Edge device options are unique to Amazon Snowball Edge devices. Like their cloud-based counterparts, these instances require Amazon Machine Images (AMIs) to launch. You choose the AMI for an instance before you create your Snowball Edge job.

To use a compute instance on a Snowball Edge, create a job and specify your AMIs. You can do this using the Amazon Snowball Management Console, the Amazon Command Line Interface (Amazon CLI), or one of the Amazon SDKs. Typically, there are some housekeeping prerequisites that you must complete before creating your job.

After your device arrives, you can start managing your AMIs and instances. You can manage your compute instances on a Snowball Edge through an Amazon EC2-compatible endpoint. This type of endpoint supports many of the Amazon EC2 CLI commands and actions for the Amazon SDKs. You can't use the Amazon Web Services Management Console on the Snowball Edge to manage your AMIs and compute instances.
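As a sketch of working with the Amazon EC2-compatible endpoint, you can point the Amazon CLI at the device. The IP address below is a placeholder for your unlocked device's address; the endpoint listens on port 8008 over HTTP, and the instance ID shown is a hypothetical example (Snowball Edge instance IDs carry an "s.i-" prefix).

```shell
# Placeholder address of the unlocked Snowball Edge device.
ENDPOINT="http://192.0.2.10:8008"

# List the compute instances running on the device.
aws ec2 describe-instances --endpoint "$ENDPOINT"

# Start a previously launched instance the same way.
aws ec2 start-instances --instance-ids s.i-01234567890123456 --endpoint "$ENDPOINT"
```

Only the subset of EC2 actions supported by the device endpoint is available; anything else must wait until you are working against the cloud-based EC2 service.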

When you're done with your device, return it to Amazon. If the device was used in an import job, the data transferred using the Amazon S3 interface or the file interface is imported into Amazon S3. Otherwise, we perform a complete erasure of the device when it is returned to Amazon. This erasure follows the National Institute of Standards and Technology (NIST) 800-88 standards.

Important

Data in compute instances running on a Snowball Edge isn't imported into Amazon.

Using Compute Instances on Clusters

You can use compute instances on clusters of Snowball Edge devices. The procedures and guidance for doing so are the same as for using compute instances on a standalone device.

When you create a cluster job with AMIs, a copy of each AMI exists on each node in the cluster. You can have no more than 10 AMIs associated with a cluster of devices, regardless of the number of nodes in the cluster. When you launch an instance in a cluster, you declare in your command which node hosts the instance, and the instance runs on that single node.

Clusters must be either compute-optimized or storage-optimized. You can have a cluster of compute-optimized nodes, and some number of them can have GPUs. You can have a cluster made entirely of storage-optimized nodes. A cluster can't be made of a combination of compute-optimized nodes and storage-optimized nodes.

Pricing for Compute Instances on Snowball Edge

There are additional costs associated with using compute instances. For more information, see Amazon Snowball Edge Pricing.

Prerequisites

Before creating your job, keep the following information in mind:

Creating a Linux AMI from an Instance

You can create an AMI using the Amazon Web Services Management Console or the command line. Start with an existing AMI, launch an instance, customize it, create a new AMI from it, and finally, launch an instance of your new AMI.

To create an AMI from an instance using the console

  1. Select an appropriate EBS-backed AMI as a starting point for your new AMI, and configure it as needed before launch. For more information, see Launching an instance using the Launch Instance Wizard in the Amazon EC2 User Guide for Linux Instances.

  2. Choose Launch to launch an instance of the EBS-backed AMI that you selected. Accept the default values as you step through the wizard. For more information, see Launching an instance using the Launch Instance Wizard.

  3. While the instance is running, connect to it. You can perform the following actions on your instance to customize it for your needs:

    • Install software and applications.

    • Copy data.

    • Reduce start time by deleting temporary files, defragmenting your hard drive, and zeroing out free space.

    • Attach additional Amazon EBS volumes.

  4. (Optional) Create snapshots of all the volumes attached to your instance. For more information about creating snapshots, see Creating Amazon EBS snapshots in the Amazon EC2 User Guide for Linux Instances.

  5. In the navigation pane, choose Instances, and choose your instance. Choose Actions, choose Image, and then choose Create image.

    Tip

    If this option isn't available, your instance isn't an Amazon EBS-backed instance.

  6. In the Create Image dialog box, specify the following information, and then choose Create image.

    • Image name - A unique name for the image.

    • Image description - An optional description of the image, up to 255 characters.

    • No reboot - This option is not selected by default. Amazon EC2 shuts down the instance, takes snapshots of any attached volumes, creates and registers the AMI, and then reboots the instance. Select No reboot to avoid having your instance shut down.

      Warning

      If you select No reboot, we can't guarantee the file system integrity of the created image.

    • Instance Volumes - The fields in this section enable you to modify the root volume, and add more Amazon EBS and instance store volumes. For information about each field, pause on the i icon next to each field to display field tooltips. Some important points are listed following:

      • To change the size of the root volume, locate Root in the Volume Type column. For Size (GiB), enter the required value.

      • If you select Delete on Termination, when you terminate the instance created from this AMI, the Amazon EBS volume is deleted. If you clear Delete on Termination, when you terminate the instance, the Amazon EBS volume is not deleted. For more information, see Preserving Amazon EBS volumes on instance termination in the Amazon EC2 User Guide for Linux Instances.

      • To add an Amazon EBS volume, choose Add New Volume (which adds a new row). For Volume Type, choose EBS, and fill in the fields in the row. When you launch an instance from your new AMI, additional volumes are automatically attached to the instance. Empty volumes must be formatted and mounted. Volumes based on a snapshot must be mounted.

      • To add an instance store volume, see Adding instance store volumes to an AMI in the Amazon EC2 User Guide for Linux Instances. When you launch an instance from your new AMI, additional volumes are automatically initialized and mounted. These volumes don't contain data from the instance store volumes of the running instance on which you based your AMI.

  7. To view the status of your AMI while it is being created, in the navigation pane, choose AMIs. Initially, the status is pending but should change to available after a few minutes.

    (Optional) To view the snapshot that was created for the new AMI, choose Snapshots. When you launch an instance from this AMI, we use this snapshot to create its root device volume.

  8. Launch an instance from your new AMI. For more information, see Launching an instance using the Launch Instance Wizard in the Amazon EC2 User Guide for Linux Instances.

  9. The new running instance contains all of the customizations that you applied in previous steps.

To Create an AMI from an Instance Using the Command Line

You can use one of the following commands. For more information about these command line interfaces, see Accessing Amazon EC2 in the Amazon EC2 User Guide for Linux Instances.
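One such command is the Amazon CLI create-image command. The sketch below uses placeholder IDs and names; substitute the ID of the EBS-backed instance you customized.

```shell
# Create a new AMI from a running (or stopped) Amazon EBS-backed
# instance. The instance ID, name, and description are placeholders.
aws ec2 create-image \
    --instance-id i-1234567890abcdef0 \
    --name "my-server-ami" \
    --description "AMI customized before a Snowball Edge job"
```

Omitting --no-reboot lets Amazon EC2 shut the instance down before snapshotting, which preserves file system integrity, the same trade-off described for the console's No reboot option.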

Creating a Linux AMI from a Snapshot

If you have a snapshot of the root device volume of an instance, you can create an AMI from this snapshot using the Amazon Web Services Management Console or the command line.

To create an AMI from a snapshot using the console

  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

  2. In the navigation pane, under Elastic Block Store, choose Snapshots.

  3. Choose the snapshot, choose Actions, and then choose Create image.

  4. In the Create image from EBS snapshot dialog box, complete the fields to create your AMI. Then choose Create. If you're re-creating a parent instance, choose the same options as the parent instance.

    • Architecture – Choose i386 for 32-bit or x86_64 for 64-bit.

    • Root device name – Enter the appropriate name for the root volume. For more information, see Device naming on Linux instances in the Amazon EC2 User Guide for Linux Instances.

    • Virtualization type – Choose whether instances launched from this AMI use paravirtual (PV) or hardware virtual machine (HVM) virtualization. For more information, see Linux AMI virtualization types.

    • (PV virtualization type only) Kernel ID and RAM disk ID – Choose the AKI and ARI from the lists. If you choose the default AKI, or you don't choose an AKI, you must specify an AKI every time you launch an instance using this AMI. In addition, your instance might fail the health checks if the default AKI is incompatible with the instance.

    • (Optional) Block Device Mappings – Add volumes or expand the default size of the root volume for the AMI. For more information about resizing the file system on your instance for a larger volume, see Extending a Linux File system after resizing a volume in the Amazon EC2 User Guide for Linux Instances.

To Create an AMI from a Snapshot Using the Command Line

To create an AMI from a snapshot, you can use one of the following commands. For more information about these command line interfaces, see Accessing Amazon EC2 in the Amazon EC2 User Guide for Linux Instances.
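One such command is the Amazon CLI register-image command. The sketch below uses a placeholder snapshot ID; match the architecture, root device name, and virtualization type to the parent instance if you are re-creating one.

```shell
# Register an AMI directly from a root-volume snapshot.
aws ec2 register-image \
    --name "my-ami-from-snapshot" \
    --architecture x86_64 \
    --root-device-name /dev/xvda \
    --virtualization-type hvm \
    --block-device-mappings 'DeviceName=/dev/xvda,Ebs={SnapshotId=snap-1234567890abcdef0}'
```

The command returns the new AMI ID, which you can then reference when creating your Snowball Edge job.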

Using Amazon S3 on Snowball

As part of the order process, you are asked to create an Amazon Identity and Access Management (IAM) role and an Amazon Key Management Service (Amazon KMS) key. The KMS key is used for encrypting the data during transit and at rest on the Snowball Edge device. For more information about creating IAM roles and KMS keys, see Creating an Amazon Snowball Edge Job.

How Import Works

Each import job uses a single Snowball Edge device. After you create a job, we ship a Snowball Edge device to you. When it arrives, you connect the Snowball Edge device to your network and transfer the data that you want to import to Amazon S3 onto that Snowball Edge. When you’re done transferring data, ship the Snowball Edge back to Amazon. We then import your data into Amazon S3.

Important

Snowball Edge cannot write to buckets if you have turned on S3 Object Lock. We also cannot write to your bucket if IAM policies on the bucket prevent writing to the bucket.

How Export Works

Each export job can use any number of Amazon Snowball Edge devices. After you create a job, a listing operation starts in Amazon S3. This listing operation splits your job into parts. Each job part has exactly one device associated with it. After your job parts are created, your first job part enters the Preparing Snowball status.

Note

The listing operation that splits your job into parts is a function of Amazon S3, and you are billed for it at the same rate as other Amazon S3 operations.

We then start exporting your data onto a device. Typically, exporting data takes one business day. However, this process can take longer. When the export is done, Amazon gets the device ready for your regional carrier to pick up.

When the device arrives at your site, you connect it to your network and transfer the exported data from the device to your local storage. When you're done transferring the data, ship the device back to Amazon. When we receive the returned device, we erase it completely. This erasure follows the National Institute of Standards and Technology (NIST) 800-88 standards.

This step marks the completion of that particular job part. If there are more job parts, the next job part now is prepared for shipping.

Important

Snowball Edge can't export objects in the S3 Glacier storage class. These objects must be restored before we can export them. If we encounter objects in the S3 Glacier storage class, we contact you to let you know, but this might delay your export job.

Amazon S3 Encryption with Amazon KMS

You can use the default Amazon managed or customer managed encryption keys to protect your data when importing or exporting data.

Using Amazon S3 Default Bucket Encryption with Amazon KMS Managed Keys

To enable Amazon managed encryption with Amazon KMS

  1. Open the Amazon S3 console at https://console.amazonaws.cn/s3/.

  2. Choose the Amazon S3 bucket that you want to encrypt.

  3. In the wizard that appears on the right side, choose Properties.

  4. In the Default encryption box, choose Disabled (this option is grayed out) to enable default encryption.

  5. Choose Amazon-KMS as the encryption method, and then choose the KMS key that you want to use. This key is used to encrypt objects that are PUT into the bucket.

  6. Choose Save.

After the Snowball Edge job is created, and before the data is imported, add a statement to the existing IAM role policy. This is the role you created during the ordering process. Depending on the job type, the default role name looks similar to Snowball-import-s3-only-role or Snowball-export-s3-only-role.

The following are examples of such a statement.

For importing data

If you use server-side encryption with Amazon KMS managed keys (SSE-KMS) to encrypt the Amazon S3 buckets associated with your import job, you also need to add the following statement to your IAM role.

Example: Snowball import IAM role

{
  "Effect": "Allow",
  "Action": [
    "kms:GenerateDataKey",
    "kms:Decrypt"
  ],
  "Resource": "arn:aws:kms:us-west-2:123456789012:key/abc123a1-abcd-1234-efgh-111111111111"
}

For exporting data

If you use server-side encryption with Amazon KMS managed keys to encrypt the Amazon S3 buckets associated with your export job, you also must add the following statement to your IAM role.

Example: Snowball export IAM role

{
  "Effect": "Allow",
  "Action": [
    "kms:Decrypt"
  ],
  "Resource": "arn:aws:kms:us-west-2:123456789012:key/abc123a1-abcd-1234-efgh-111111111111"
}

Using S3 Default Bucket Encryption with Amazon KMS Customer Keys

You can use the default Amazon S3 bucket encryption with your own KMS keys to protect data you are importing and exporting.

For importing data

To enable customer managed encryption with Amazon KMS

  1. Sign in to the Amazon Web Services Management Console and open the Amazon Key Management Service (Amazon KMS) console at https://console.amazonaws.cn/kms.

  2. To change the Amazon Web Services Region, use the Region selector in the upper-right corner of the page.

  3. In the left navigation pane, choose Customer managed keys, and then choose the KMS key associated with the buckets that you want to use.

  4. Expand Key Policy if it is not already expanded.

  5. In the Key Users section, choose Add and search for the IAM role. Choose the IAM role, and then choose Add.

  6. Alternatively, you can choose Switch to Policy view to display the key policy document and add a statement to the key policy. The following is an example of the policy.

Example: Policy for the Amazon KMS customer managed key

{
  "Sid": "Allow use of the key",
  "Effect": "Allow",
  "Principal": {
    "AWS": [
      "arn:aws:iam::111122223333:role/snowball-import-s3-only-role"
    ]
  },
  "Action": [
    "kms:Decrypt",
    "kms:GenerateDataKey"
  ],
  "Resource": "*"
}

After this policy has been added to the Amazon KMS customer managed key, you must also update the IAM role associated with the Snowball job. By default, the role is snowball-import-s3-only-role.

Example: The Snowball import IAM role

{
  "Effect": "Allow",
  "Action": [
    "kms:GenerateDataKey",
    "kms:Decrypt"
  ],
  "Resource": "arn:aws:kms:us-west-2:123456789012:key/abc123a1-abcd-1234-efgh-111111111111"
}

For more information, see Using Identity-Based Policies (IAM Policies) for Amazon Snowball.

The KMS key that is being used looks like the following:

"Resource": "arn:aws:kms:region:AccountID:key/*"

For exporting data

Example: Policy for the Amazon KMS customer managed key

{
  "Sid": "Allow use of the key",
  "Effect": "Allow",
  "Principal": {
    "AWS": [
      "arn:aws:iam::111122223333:role/snowball-export-s3-only-role"
    ]
  },
  "Action": [
    "kms:Decrypt",
    "kms:GenerateDataKey"
  ],
  "Resource": "*"
}

After this policy has been added to the Amazon KMS customer managed key, you must also update the IAM role associated with the Snowball job. By default, the role is snowball-export-s3-only-role.

Example: The Snowball export IAM role

{
  "Effect": "Allow",
  "Action": [
    "kms:GenerateDataKey",
    "kms:Decrypt"
  ],
  "Resource": "arn:aws:kms:us-west-2:123456789012:key/abc123a1-abcd-1234-efgh-111111111111"
}


Snowball Edge Clusters

A cluster is a logical grouping of Amazon Snowball Edge devices, in groups of 5–10 devices. A cluster is created with a single job. A cluster offers increased durability and storage capacity. This section provides information about Snowball Edge clusters.

For the Amazon Snowball service, a cluster is a collective of Snowball Edge devices used as a single logical unit for local storage and compute purposes.

A cluster offers these primary benefits over a standalone Snowball Edge used for local storage and compute purposes:

  • Increased durability – Data stored in a cluster of Snowball Edge devices has increased durability, and it remains safe and viable even if individual Snowball Edge devices in the cluster experience outages. A cluster can withstand the loss of two nodes before data durability becomes a concern. You can also add or replace nodes.

  • Increased storage – The total available storage is 45 terabytes of data per node in the cluster. So in a five-node cluster, there are 225 terabytes of available storage space. In contrast, there are about 80 terabytes of available storage space in a standalone Snowball Edge. Clusters that have more than five nodes have even more storage space.
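The storage figures above scale linearly, which is easy to check:

```shell
# Available cluster storage is roughly 45 TB per node.
NODES=5
PER_NODE_TB=45
echo "$((NODES * PER_NODE_TB)) TB total"   # prints "225 TB total"
```

A larger cluster simply multiplies the per-node figure by the node count.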

A cluster of Snowball Edge devices is made of leaderless nodes. Any node can write data to and read data from the entire cluster, and all nodes can perform the behind-the-scenes management of the cluster.

Snowball Edge Cluster Quorums

A quorum represents the minimum number of Snowball Edge devices in a cluster that must be communicating with each other to maintain some level of operation. There are two levels of quorum for Snowball Edge clusters—a read/write quorum and a read quorum.

Suppose that you upload your data to a cluster of Snowball Edge devices. With all devices healthy, you have a read/write quorum for your cluster. If one of those nodes goes offline, you reduce the operational capacity of the cluster, but you can still read and write to the cluster. In that case, the cluster still has a read/write quorum.

If two nodes in your cluster go offline, any additional or ongoing write operations fail, but any data that was successfully written to the cluster can be accessed and read. This is called a read quorum.

Finally, if a third node goes offline, the cluster is offline and the data in the cluster becomes unavailable. In this case, you might be able to fix it, but the data could be permanently lost, depending on the severity of the event. If the outage was a temporary external power event and you can bring the three Snowball Edge devices back online and unlock all the nodes in the cluster, your data becomes available again.

Important

If a minimum quorum of healthy nodes doesn't exist, contact Amazon Web Services Support.

You can determine the quorum state of your cluster by determining your node's lock state and network reachability. The snowballEdge describe-cluster command reports back the lock and network reachability state for every node in an unlocked cluster. Ensuring that the devices in your cluster are healthy and connected is an administrative responsibility that you take on when you create the cluster job. For more information about the different client commands, see Commands for the Snowball Edge Client.
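As a minimal sketch, assuming the Snowball Edge client is installed and already configured with your endpoint, manifest file, and unlock code (via snowballEdge configure), a single command reports per-node state:

```shell
# Reports the lock state and network reachability for every node
# in an unlocked cluster; run it periodically as part of your
# cluster health checks.
snowballEdge describe-cluster
```

If any node is reported unreachable, restore its power and network connectivity before the cluster drops below the read quorum.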

Considerations for Cluster Jobs for Amazon Snowball Edge

Keep the following considerations in mind when planning to use a cluster of Snowball Edge devices:

  • We recommend that you have a redundant power supply to reduce potential performance and stability issues for your cluster.

  • Like standalone local storage and compute jobs, the data stored in a cluster can't be imported into Amazon S3 without ordering additional devices as a part of separate import jobs. If you order these devices, you can transfer the data from the cluster to the devices and import the data when you return the devices for the import jobs.

  • To get data onto a cluster from Amazon S3, create a separate export job and copy the data from the devices of the export job onto the cluster.

  • You can use the console, the Amazon CLI, or Amazon SDK to create a cluster job.

  • Cluster nodes have node IDs. A node ID is the same as the job ID for a device, which you can get from the console, the Amazon CLI, the Amazon SDKs, or the Snowball Edge client. You can use node IDs to remove old nodes from clusters. You can get a list of node IDs by using the snowballEdge describe-device command on an unlocked device or the snowballEdge describe-cluster command on an unlocked cluster.

  • The lifespan of a cluster is limited by the security certificate granted to the cluster devices when the cluster is provisioned.

  • When Amazon receives a returned device that was part of a cluster, we perform a complete erasure of the device. This erasure follows the National Institute of Standards and Technology (NIST) 800-88 standards.