Metadata copied by Amazon DataSync
How Amazon DataSync handles your file or object metadata during a transfer depends on the storage systems that you're transferring between.
DataSync doesn't copy system-level settings. For example, when copying objects, DataSync doesn't copy your storage system's encryption setting. If you're copying from an SMB share, DataSync doesn't copy the permissions you configured at the file system level.
Metadata copied between systems with similar metadata structures
DataSync preserves metadata between storage systems that have a similar metadata structure.
NFS transfers
The following table describes what metadata DataSync can copy between locations that use Network File System (NFS).
When copying between these locations | DataSync can copy |
---|---|
|
|
SMB transfers
The following table describes what metadata DataSync can copy between locations that use Server Message Block (SMB).
When copying between these locations | DataSync can copy |
---|---|
|
|
HDFS transfers
The following table describes what metadata DataSync can copy when a transfer involves a Hadoop Distributed File System (HDFS) location.
When copying from this location | To one of these locations | DataSync can copy |
---|---|---|
|
|
HDFS uses strings to store file and folder user and group ownership, rather than numeric identifiers (such as UIDs and GIDs). Default values for UIDs and GIDs are applied on the destination file system. For more information about default values, see Default POSIX metadata applied by DataSync. |
Amazon S3 transfers
The following tables describe what metadata DataSync can copy when a transfer involves an Amazon S3 location.
To Amazon S3
When copying from one of these locations | To this location | DataSync can copy |
---|---|---|
|
|
The following as Amazon S3 user metadata:
The file metadata stored in Amazon S3 user metadata is interoperable with NFS shares on file gateways using Amazon Storage Gateway. A file gateway enables low-latency access from on-premises networks to data that was copied to Amazon S3 by DataSync. This metadata is also interoperable with FSx for Lustre. When DataSync copies objects that contain this metadata back to an NFS server, the file metadata is restored. Restoring metadata requires granting elevated permissions to the NFS server. For more information, see Creating an NFS location for Amazon DataSync. |
Between HDFS and Amazon S3
When copying between these locations | DataSync can copy |
---|---|
|
The following as Amazon S3 user metadata:
|
Between object storage and Amazon S3
When copying between these locations | DataSync can copy |
---|---|
|
DataSync doesn't copy other object metadata, such as object access control lists (ACLs) or prior object versions. Important: If you're transferring objects from a Google Cloud Storage bucket, copying object tags may cause your DataSync task to fail. To prevent this, deselect the Copy object tags option when configuring your task settings. For more information, see Managing how Amazon DataSync transfers files, objects, and metadata. |
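As a rough illustration of the setting described above, the following sketch builds a task options fragment with object tag copying turned off. It assumes that the Copy object tags console option corresponds to the `ObjectTags` field of the DataSync task `Options` (values `PRESERVE` or `NONE`). The helper name `build_task_options` is hypothetical, and the commented-out boto3 call uses placeholder variables for the location ARNs.

```python
def build_task_options(copy_object_tags: bool) -> dict:
    """Return a DataSync task Options fragment (hypothetical helper)."""
    # "NONE" asks DataSync to skip object tags; "PRESERVE" copies them.
    return {"ObjectTags": "PRESERVE" if copy_object_tags else "NONE"}


# For a Google Cloud Storage source, turn tag copying off.
gcs_options = build_task_options(copy_object_tags=False)
print(gcs_options)

# Assumed usage with boto3 (not executed here; the ARN variables are placeholders):
# import boto3
# boto3.client("datasync").create_task(
#     SourceLocationArn=source_location_arn,
#     DestinationLocationArn=destination_location_arn,
#     Options=gcs_options,
# )
```

Keeping the options in a small helper makes it easy to reuse the same setting across several tasks that read from the same bucket.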
Metadata copied between systems with different metadata structures
When copying between storage systems that don't have a similar metadata structure, DataSync handles metadata using the following rules.
When copying from these locations | To these locations | DataSync can copy |
---|---|---|
|
|
Default POSIX metadata for all files and folders on the destination file system or objects in the destination S3 bucket. This includes the default POSIX user ID and group ID values. Windows-based metadata (such as ACLs) is not preserved. |
|
|
|
|
|
File and folder timestamps from the source location. The file or folder owner is set based on the HDFS user or Kerberos principal you specified when creating the HDFS location. The Groups Mapping configuration on the Hadoop cluster determines the group. |
|
|
File and folder timestamps from the source location. Ownership is set based on the Windows user that was specified in DataSync to access the Amazon FSx or SMB share. Permissions are inherited from the parent directory. |
|
|
Default POSIX metadata applied by DataSync
When your source and destination locations don't have a similar metadata structure, or when source metadata is missing, DataSync applies default POSIX metadata.
DataSync applies default POSIX metadata in the following situations:
- When transferring from Amazon S3 or object storage (in cases where Amazon S3 objects don't have DataSync POSIX metadata) to Amazon EFS, FSx for Lustre, FSx for OpenZFS, FSx for ONTAP (using NFS), NFS, or HDFS
- When transferring from SMB to NFS, HDFS, Amazon S3, FSx for Lustre, FSx for OpenZFS, FSx for ONTAP (using NFS), or Amazon EFS
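The two situations listed above can be sketched as a small decision function. This is only an illustration of the rules as stated here, not DataSync's implementation. The location labels (such as `"FSX_ONTAP_NFS"`) and the function name `applies_default_posix` are hypothetical and aren't DataSync API values.

```python
# Illustrative labels for location types (assumptions, not API values).
POSIX_DESTINATIONS = {
    "EFS", "FSX_LUSTRE", "FSX_OPENZFS", "FSX_ONTAP_NFS", "NFS", "HDFS",
}
OBJECT_SOURCES = {"S3", "OBJECT_STORAGE"}


def applies_default_posix(source: str, destination: str,
                          has_datasync_posix_metadata: bool = False) -> bool:
    """Return True if one of the two listed situations matches."""
    # Situation 1: object source without DataSync POSIX metadata,
    # going to a POSIX-style destination.
    if source in OBJECT_SOURCES and destination in POSIX_DESTINATIONS:
        return not has_datasync_posix_metadata
    # Situation 2: SMB source going to a POSIX-style destination or Amazon S3.
    if source == "SMB":
        return destination in POSIX_DESTINATIONS | {"S3"}
    return False


print(applies_default_posix("S3", "NFS"))
print(applies_default_posix("SMB", "S3"))
```

The function covers only the two listed situations. Other cases, such as missing source metadata in general, would need their own checks.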
The following table describes the default POSIX metadata and permissions that DataSync applies.
Permission | Value |
---|---|
UID | 65534 |
GID | 65534 |
Folder Permission | 0755 |
File Permission | 0644 |
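To make the default values in the table above concrete, the following sketch records them as constants and renders the two permission values in the symbolic form that `ls -l` shows. The constant names and the `describe_mode` helper are hypothetical. Only the numeric values come from the table.

```python
import stat

# Default values from the table (constant names are illustrative).
DEFAULT_UID = 65534
DEFAULT_GID = 65534
DEFAULT_FOLDER_MODE = 0o755
DEFAULT_FILE_MODE = 0o644


def describe_mode(mode: int, is_folder: bool) -> str:
    """Render permission bits as an ls-style string."""
    # stat.filemode needs the file-type bits as well as the permission bits.
    file_type = stat.S_IFDIR if is_folder else stat.S_IFREG
    return stat.filemode(file_type | mode)


print(describe_mode(DEFAULT_FOLDER_MODE, is_folder=True))   # drwxr-xr-x
print(describe_mode(DEFAULT_FILE_MODE, is_folder=False))    # -rw-r--r--
```

With these defaults, the owner can read and write, while the group and others can only read. On folders, all three also get search (execute) access. On many Linux systems, ID 65534 maps to the `nobody` user and the `nobody` or `nogroup` group, although that mapping depends on the destination system.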
HDFS stores file and folder user and group ownership using strings rather than numeric identifiers (such as UIDs and GIDs). When there's no equivalent metadata on the source location, file and folder ownership is set based on the HDFS user or Kerberos principal that you specified when creating the DataSync location. The group is determined by the Groups Mapping configuration on the Hadoop cluster.