Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

Monitoring your Amazon DataSync transfers with task reports

Task reports provide detailed information about what Amazon DataSync attempts to transfer, skip, verify, and delete during a task execution.

Task reports are generated in JSON format. You can customize the level of detail in your reports:

  • Summary only task reports give you the necessary details about your task execution, such as how many files transferred and whether DataSync could verify the data integrity of those files.

  • Standard task reports include a summary plus detailed reports that list each file, object, or folder that DataSync attempts to transfer, skip, verify, and delete. With a standard task report, you can also specify the report level to show only the task execution's errors or its successes and errors.

Use cases

Here are some situations where task reports can help you monitor and audit your data transfers:

  • When migrating millions of files, quickly identify files that DataSync has issues transferring.

  • Verify chain-of-custody processes for your files.

Summary only task reports

A report that's only a summary of a task execution includes the following details:

  • The Amazon Web Services account that ran the task execution

  • The source and destination locations

  • The total number of files, objects, and folders that were skipped, transferred, verified, and deleted

  • The total bytes (logical and physical) that were transferred

  • Whether the task execution completed, was canceled, or encountered an error

  • The start and end times (including the total time of the transfer)

  • The task's settings (such as bandwidth limits, data integrity verification, and other options for your DataSync transfer)

Standard task reports

A standard task report includes a summary of your task execution plus detailed reports of what DataSync attempts to transfer, skip, verify, and delete.

Report level

With standard task reports, you can choose one of the following report levels:

  • Errors only

  • Successes and errors (essentially a list of everything that happened during your task execution)

For example, you might want to see which files DataSync skipped successfully during your transfer and which ones it didn't. Files that DataSync skipped successfully might be ones that you purposely want DataSync to exclude because they already exist in your destination location. A skip error, however, might indicate that DataSync doesn't have the right permissions to read a file.

Transferred reports

A list of files, objects, and directories that DataSync attempted to transfer during your task execution. A transferred report includes the following details:

  • The paths for the transferred data

  • What was transferred (content, metadata, or both)

  • The metadata, which includes the data type, content size (objects and files only), and more

  • The time when an item was transferred

  • The object version (if the destination is an Amazon S3 bucket that has versioning enabled)

  • Whether anything was overwritten in the destination

  • Whether an item transferred successfully

Note

When moving data between S3 buckets, the prefix that you specify in your source location can show up in your report (or in Amazon CloudWatch logs), even if that prefix doesn't exist as an object in your destination location. (In the DataSync console, you might also notice this prefix showing up as skipped or verified data.)

Skipped reports

A list of files, objects, and directories that DataSync discovered in your source location but didn't attempt to transfer. The reasons DataSync skips data can depend on several factors, such as how you configure your task and file permissions. Here are some examples:

  • There's a file that exists in your source and destination locations. The file in the source hasn't been modified since the previous task execution. Since you're only transferring data that has changed, DataSync skips that file and doesn't transfer it during your next task execution.

  • An object that exists in your source and destination locations changes in your source. When you run your task, DataSync skips this object in your destination because your task doesn't overwrite data in the destination.

  • DataSync skips a directory in your source location because it can't read it.

    If this happens and isn't expected, check your access permissions and make sure that DataSync can read what was skipped.

A skipped report includes the following details:

  • The paths for skipped data

  • The time when an item was skipped

  • The reason it was skipped

  • Whether an item was skipped successfully

Note

Skipped reports can be large when all of the following are true: the report level includes successes and errors, your task transfers only the data that has changed, and the source data already exists in the destination.

Verified reports

A list of files, objects, and directories that DataSync attempted to verify the integrity of during your task execution. A verified data report includes the following details:

  • The paths for verified data

  • The time when an item was verified

  • The reason for the verification error (if any)

  • The source and destination SHA256 checksums (files only)

  • Whether an item was successfully verified

Note

When you configure your task to verify only the data that's transferred, DataSync doesn't verify files that fail to transfer and, in some situations, doesn't verify directories. In either case, DataSync doesn't include unverified data in this report.

Deleted reports

A list of files, directories, and objects that were deleted during your task execution. DataSync generates this report only if you configure your task to delete data in the destination location that isn't in the source. A deleted data report includes the following details:

  • The paths for deleted data

  • Whether an item was successfully deleted

  • The time when an item was deleted

Example task reports

The level of detail in your task report is up to you. Here are some example transferred data reports with the following configuration:

  • Report type – Standard

  • Report level – Successes and errors

Note

Reports use the ISO-8601 standard for the timestamp format. Times are in UTC and measured in nanoseconds. This behavior differs from how some other task report metrics are measured. For example, task execution details, such as TransferDuration and VerifyDuration, are measured in milliseconds.
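Because the timestamps carry nine fractional digits, some date parsers reject or silently truncate them. The following is a minimal Python sketch of one way to read them while keeping the full nanosecond value. The function names parse_report_timestamp and elapsed_ms are hypothetical helpers for illustration and are not part of DataSync.

```python
from datetime import datetime, timezone


def parse_report_timestamp(value: str) -> tuple[datetime, int]:
    """Return (UTC datetime at microsecond precision, nanosecond fraction)."""
    if not value.endswith("Z"):
        raise ValueError(f"expected a UTC timestamp ending in 'Z': {value!r}")
    seconds_part, _, fraction = value[:-1].partition(".")
    # Pad or trim the fraction to exactly nine digits (nanoseconds).
    nanos = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    base = datetime.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S")
    base = base.replace(tzinfo=timezone.utc)
    # datetime stores microseconds only, so the nanoseconds are returned separately.
    return base.replace(microsecond=nanos // 1000), nanos


def elapsed_ms(start: str, end: str) -> float:
    """Elapsed time between two report timestamps, in milliseconds."""
    start_dt, start_ns = parse_report_timestamp(start)
    end_dt, end_ns = parse_report_timestamp(end)
    whole = end_dt.replace(microsecond=0) - start_dt.replace(microsecond=0)
    total_ns = int(whole.total_seconds()) * 1_000_000_000 + end_ns - start_ns
    return total_ns / 1_000_000


mtime, mtime_ns = parse_report_timestamp("2022-01-07T16:59:26.136114671Z")
print(mtime.isoformat(), mtime_ns)
```

Converting to milliseconds, as elapsed_ms does, makes the values comparable with metrics such as TransferDuration.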

Example transferred data report with success status

This report shows that an object named object1.txt successfully transferred.

{
  "TaskExecutionId": "exec-abcdefgh12345678",
  "Transferred": [{
    "RelativePath": "/object1.txt",
    "SrcMetadata": {
      "Type": "Regular",
      "ContentSize": 6,
      "Mtime": "2022-01-07T16:59:26.136114671Z",
      "Atime": "2022-01-07T16:59:26.136114671Z",
      "Uid": 0,
      "Gid": 0,
      "Mode": "0644"
    },
    "Overwrite": "False",
    "DstS3VersionId": "jtqRtX3jN4J2G8k0sFSGYK1f35KqpAVP",
    "TransferTimestamp": "2022-01-07T16:59:45.747270957Z",
    "TransferType": "CONTENT_AND_METADATA",
    "TransferStatus": "SUCCESS"
  }]
}

Example transferred data report with error status

This report shows that an object named object1.txt didn't transfer because of an S3 bucket permissions issue. (If you get an error like this, see Providing DataSync access to S3 buckets.)

{
  "TaskExecutionId": "exec-abcdefgh12345678",
  "Transferred": [{
    "RelativePath": "/object1.txt",
    "SrcMetadata": {
      "Type": "Regular",
      "ContentSize": 6,
      "Mtime": "2022-01-07T16:59:26.136114671Z",
      "Atime": "2022-01-07T16:59:26.136114671Z",
      "Uid": 0,
      "Gid": 0,
      "Mode": "0644"
    },
    "Overwrite": "False",
    "DstS3VersionId": "jtqRtX3jN4J2G8k0sFSGYK1f35KqpAVP",
    "TransferTimestamp": "2022-01-07T16:59:45.747270957Z",
    "TransferType": "CONTENT_AND_METADATA",
    "TransferStatus": "FAILED",
    "FailureReason": "S3 Get Object Failed",
    "FailureCode": 40974
  }]
}
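When a report lists many items, you might want to pull out only the failures. The following Python sketch shows one possible way to do that, relying only on the fields shown in the example reports above (Transferred, RelativePath, TransferStatus, FailureReason, and FailureCode). The function name failed_transfers and the abbreviated sample report are illustrative assumptions.

```python
import json


def failed_transfers(report_text: str) -> list[dict]:
    """Return path, reason, and code for every item that didn't transfer."""
    report = json.loads(report_text)
    failures = []
    for item in report.get("Transferred", []):
        if item.get("TransferStatus") != "SUCCESS":
            failures.append({
                "path": item.get("RelativePath"),
                "reason": item.get("FailureReason"),
                "code": item.get("FailureCode"),
            })
    return failures


# Abbreviated sample that reuses field names from the example reports.
sample_report = json.dumps({
    "TaskExecutionId": "exec-abcdefgh12345678",
    "Transferred": [
        {"RelativePath": "/object1.txt", "TransferStatus": "FAILED",
         "FailureReason": "S3 Get Object Failed", "FailureCode": 40974},
        {"RelativePath": "/object2.txt", "TransferStatus": "SUCCESS"},
    ],
})

for failure in failed_transfers(sample_report):
    print(failure["path"], failure["reason"], failure["code"])
```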

Prerequisites

Before you can create a task report, you must do the following.

Create an S3 bucket for your task reports

If you don't already have one, create an S3 bucket where DataSync can upload your task report. Reports are stored in the S3 Standard storage class.

We recommend the following for this bucket:

  • If you're planning to transfer data to an S3 bucket, don't use the same bucket for your task report if you disable the Keep deleted files option. Otherwise, DataSync will delete any previous task reports each time you execute a task since those reports don't exist in your source location.

  • To avoid a complex access permissions setup, make sure that your task report bucket is in the same Amazon Web Services account and Region as your DataSync transfer task.

Allow DataSync to upload task reports to your S3 bucket

You must configure an Amazon Identity and Access Management (IAM) role that allows DataSync to upload a task report to your S3 bucket.

In the DataSync console, you can create an IAM role that in most cases automatically includes the permissions to upload a task report to your bucket. Keep in mind that this automatically generated role might not meet your needs from a least-privilege standpoint. This role also won't work if your bucket is encrypted with a customer managed Amazon Key Management Service (Amazon KMS) key (SSE-KMS). In these cases, you can create the role manually as long as the role does at least the following:

  • Prevents the cross-service confused deputy problem in the role's trusted entity.

    The following full example shows how you can use the aws:SourceArn and aws:SourceAccount global condition context keys to prevent the confused deputy problem with DataSync.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "datasync.amazonaws.com"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "aws:SourceAccount": "123456789012"
            },
            "StringLike": {
              "aws:SourceArn": "arn:aws-cn:datasync:us-east-2:123456789012:*"
            }
          }
        }
      ]
    }

  • Allows DataSync to upload a task report to your S3 bucket.

    The following example does this by including the s3:PutObject action only for a specific prefix (reports/) in your bucket.

    {
      "Version": "2012-10-17",
      "Statement": [{
        "Action": [
          "s3:PutObject"
        ],
        "Effect": "Allow",
        "Resource": "arn:aws-cn:s3:::your-task-reports-bucket/reports/*"
      }]
    }

  • If your S3 bucket is encrypted with a customer managed SSE-KMS key, the key's policy must include the IAM role that DataSync uses to access the bucket.

    For more information, see Accessing S3 buckets using server-side encryption.
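If you create the role manually for several accounts or buckets, generating the two policy documents from parameters can help keep them consistent. The following Python sketch builds the same trust policy and upload policy shown above. The function names, parameter names, and the default aws-cn partition are assumptions made for illustration; the sketch only produces JSON and doesn't create any IAM resources.

```python
import json


def trust_policy(account_id: str, region: str, partition: str = "aws-cn") -> dict:
    """Trust policy that lets DataSync assume the role, scoped to one account and Region."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "datasync.amazonaws.com"},
            "Action": "sts:AssumeRole",
            "Condition": {
                # Both keys guard against the cross-service confused deputy problem.
                "StringEquals": {"aws:SourceAccount": account_id},
                "StringLike": {
                    "aws:SourceArn": f"arn:{partition}:datasync:{region}:{account_id}:*"
                },
            },
        }],
    }


def upload_policy(bucket: str, prefix: str = "reports/", partition: str = "aws-cn") -> dict:
    """Permissions policy that allows s3:PutObject only under one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Action": ["s3:PutObject"],
            "Effect": "Allow",
            "Resource": f"arn:{partition}:s3:::{bucket}/{prefix}*",
        }],
    }


print(json.dumps(trust_policy("123456789012", "us-east-2"), indent=2))
print(json.dumps(upload_policy("your-task-reports-bucket"), indent=2))
```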

Creating a summary only task report

You can configure a task report that includes a summary only when creating your DataSync task, starting your task, or updating your task.

The following steps show how to configure a summary only task report when creating a task.

  1. Open the Amazon DataSync console at https://console.amazonaws.cn/datasync/.

  2. In the left navigation pane, expand Data transfer, then choose Tasks, and then choose Create task.

  3. Configure your task's source and destination locations.

    For more information, see Where can I transfer my data with Amazon DataSync?

  4. Scroll down to the Task report section. For Report type, choose Summary only.

  5. For S3 bucket for reports, choose an S3 bucket where you want DataSync to upload your task report.

    Tip

    If you're planning to transfer data to an S3 bucket, don't use the same bucket for your task report if you disable the Keep deleted files option. Otherwise, DataSync will delete any previous task reports each time you execute a task since those reports don't exist in your source location.

  6. For Folder, enter a prefix to use for your task report when DataSync uploads the report to your S3 bucket (for example, reports/).

    Make sure to include the appropriate delimiter character at the end of your prefix. This character is usually a forward slash (/). For more information, see Organizing objects by using prefixes in the Amazon S3 User Guide.

  7. For IAM role, do one of the following:

    • Choose Autogenerate to have DataSync automatically create an IAM role with the permissions that are required to access the S3 bucket.

      If DataSync previously created an IAM role for this S3 bucket, that role is chosen by default.

    • Choose a custom IAM role that you created.

      In some cases, you might need to create the role yourself. For more information, see Allow DataSync to upload task reports to your S3 bucket.

      Important

      If your S3 bucket is encrypted with a customer managed SSE-KMS key, the key's policy must include the IAM role that DataSync uses to access the bucket.

      For more information, see Accessing S3 buckets using server-side encryption.

  8. Finish creating your task, and then start the task to begin transferring your data.

When your transfer is complete, you can view your task report.

To configure the same report by using the Amazon Command Line Interface (Amazon CLI), do the following:

  1. Copy the following create-task command:

    aws datasync create-task \
      --source-location-arn arn:aws-cn:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh \
      --destination-location-arn arn:aws-cn:datasync:us-east-1:123456789012:location/loc-abcdefgh12345678 \
      --task-report-config '{
        "Destination":{
          "S3":{
            "Subdirectory":"reports/",
            "S3BucketArn":"arn:aws-cn:s3:::your-task-reports-bucket",
            "BucketAccessRoleArn":"arn:aws-cn:iam::123456789012:role/bucket-iam-role"
          }
        },
        "OutputType":"SUMMARY_ONLY"
      }'

  2. For the --source-location-arn parameter, specify the Amazon Resource Name (ARN) of the source location in your transfer. Replace us-east-1 with the appropriate Amazon Web Services Region, replace 123456789012 with the appropriate Amazon Web Services account number, and replace 12345678abcdefgh with the appropriate source location ID.

  3. For the --destination-location-arn parameter, specify the ARN of the destination location in your transfer. Replace us-east-1 with the appropriate Amazon Web Services Region, replace 123456789012 with the appropriate Amazon Web Services account number, and replace abcdefgh12345678 with the appropriate destination location ID.

  4. For the --task-report-config parameter, do the following:

    • Subdirectory – Replace reports/ with the prefix in your S3 bucket where you want DataSync to upload your task reports.

      Make sure to include the appropriate delimiter character at the end of your prefix. This character is usually a forward slash (/). For more information, see Organizing objects by using prefixes in the Amazon S3 User Guide.

    • S3BucketArn – Specify the ARN of the S3 bucket where you want to upload your task report.

      Tip

      If you're planning to transfer data to an S3 bucket, don't use the same bucket for your task report if you disable the Keep deleted files option. Otherwise, DataSync will delete any previous task reports each time you execute a task since those reports don't exist in your source location.

    • BucketAccessRoleArn – Specify the IAM role that allows DataSync to upload a task report to your S3 bucket.

      For more information, see Allow DataSync to upload task reports to your S3 bucket.

      Important

      If your S3 bucket is encrypted with a customer managed SSE-KMS key, the key's policy must include the IAM role that DataSync uses to access the bucket.

      For more information, see Accessing S3 buckets using server-side encryption.

    • OutputType – Specify SUMMARY_ONLY.

      For more information, see Summary only task reports.

  5. Run the create-task command to create your task.

    You get a response like the following that shows you the ARN of the task that you created. You will need this ARN to run the start-task-execution command.

    {
      "TaskArn": "arn:aws-cn:datasync:us-east-1:123456789012:task/task-12345678abcdefgh"
    }

  6. Copy the following start-task-execution command.

    aws datasync start-task-execution \
      --task-arn arn:aws-cn:datasync:us-east-1:123456789012:task/task-12345678abcdefgh

  7. For the --task-arn parameter, specify the ARN of the task that you're starting. Use the ARN that you received from running the create-task command.

  8. Run the start-task-execution command.

When your transfer is complete, you can view your task report.
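If you script task creation, it can be less error-prone to generate the --task-report-config value than to edit the JSON by hand. The following Python sketch builds the summary only configuration used in the CLI steps above and checks that the prefix ends with the delimiter. The function name summary_report_config and its parameters are hypothetical; the sketch only prints JSON and doesn't call DataSync.

```python
import json


def summary_report_config(bucket_arn: str, role_arn: str,
                          prefix: str = "reports/", delimiter: str = "/") -> dict:
    """Build a summary only task report configuration."""
    if not prefix.endswith(delimiter):
        raise ValueError(f"prefix {prefix!r} must end with {delimiter!r}")
    return {
        "Destination": {
            "S3": {
                "Subdirectory": prefix,
                "S3BucketArn": bucket_arn,
                "BucketAccessRoleArn": role_arn,
            }
        },
        "OutputType": "SUMMARY_ONLY",
    }


config = summary_report_config(
    "arn:aws-cn:s3:::your-task-reports-bucket",
    "arn:aws-cn:iam::123456789012:role/bucket-iam-role",
)
# The printed JSON can be passed as the value of --task-report-config.
print(json.dumps(config))
```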

Creating a standard task report

You can configure a standard task report when creating your DataSync task, starting your task, or updating your task.

The following steps show how to configure a standard task report when creating a task.

  1. Open the Amazon DataSync console at https://console.amazonaws.cn/datasync/.

  2. In the left navigation pane, expand Data transfer, then choose Tasks, and then choose Create task.

  3. Configure your task's source and destination locations.

    For more information, see Where can I transfer my data with Amazon DataSync?

  4. Scroll down to the Task report section. For Report type, choose Standard report.

  5. For Report level, choose one of the following:

    • Errors only – Your task report includes only issues with what DataSync tried to transfer, skip, verify, and delete.

    • Successes and errors – Your task report includes what DataSync successfully transferred, skipped, verified, and deleted and what it didn't.

    • Custom – Allows you to choose whether you want to see errors only or successes and errors for specific aspects of your task report.

      For example, you can choose Successes and errors for the transferred files list but Errors only for the rest of the report.

  6. If you're transferring to an S3 bucket that uses object versioning, keep Include Amazon S3 object versions selected if you want your report to include the new version for each transferred object.

  7. For S3 bucket for reports, choose an S3 bucket where you want DataSync to upload your task report.

    Tip

    If you're planning to transfer data to an S3 bucket, don't use the same bucket for your task report if you disable the Keep deleted files option. Otherwise, DataSync will delete any previous task reports each time you execute a task since those reports don't exist in your source location.

  8. For Folder, enter a prefix to use for your task report when DataSync uploads the report to your S3 bucket (for example, reports/). Make sure to include the appropriate delimiter character at the end of your prefix. This character is usually a forward slash (/). For more information, see Organizing objects by using prefixes in the Amazon S3 User Guide.

  9. For IAM role, do one of the following:

    • Choose Autogenerate to have DataSync automatically create an IAM role with the permissions that are required to access the S3 bucket.

      If DataSync previously created an IAM role for this S3 bucket, that role is chosen by default.

    • Choose a custom IAM role that you created.

      In some cases, you might need to create the role yourself. For more information, see Allow DataSync to upload task reports to your S3 bucket.

      Important

      If your S3 bucket is encrypted with a customer managed SSE-KMS key, the key's policy must include the IAM role that DataSync uses to access the bucket.

      For more information, see Accessing S3 buckets using server-side encryption.

  10. Finish creating your task and start the task to begin transferring your data.

When your transfer is complete, you can view your task report.

To configure the same report by using the Amazon CLI, do the following:

  1. Copy the following create-task command:

    aws datasync create-task \
      --source-location-arn arn:aws-cn:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh \
      --destination-location-arn arn:aws-cn:datasync:us-east-1:123456789012:location/loc-abcdefgh12345678 \
      --task-report-config '{
        "Destination":{
          "S3":{
            "Subdirectory":"reports/",
            "S3BucketArn":"arn:aws-cn:s3:::your-task-reports-bucket",
            "BucketAccessRoleArn":"arn:aws-cn:iam::123456789012:role/bucket-iam-role"
          }
        },
        "OutputType":"STANDARD",
        "ReportLevel":"level-of-detail",
        "ObjectVersionIds":"include-or-not"
      }'

  2. For the --source-location-arn parameter, specify the ARN of the source location in your transfer. Replace us-east-1 with the appropriate Amazon Web Services Region, replace 123456789012 with the appropriate Amazon Web Services account number, and replace 12345678abcdefgh with the appropriate source location ID.

  3. For the --destination-location-arn parameter, specify the ARN of the destination location in your transfer. Replace us-east-1 with the appropriate Amazon Web Services Region, replace 123456789012 with the appropriate Amazon Web Services account number, and replace abcdefgh12345678 with the appropriate destination location ID.

  4. For the --task-report-config parameter, do the following:

    • Subdirectory – Replace reports/ with the prefix in your S3 bucket where you want DataSync to upload your task reports. Make sure to include the appropriate delimiter character at the end of your prefix. This character is usually a forward slash (/). For more information, see Organizing objects by using prefixes in the Amazon S3 User Guide.

    • S3BucketArn – Specify the ARN of the S3 bucket where you want to upload your task report.

      Tip

      If you're planning to transfer data to an S3 bucket, don't use the same bucket for your task report if you disable the Keep deleted files option. Otherwise, DataSync will delete any previous task reports each time you execute a task since those reports don't exist in your source location.

    • BucketAccessRoleArn – Specify the IAM role that allows DataSync to upload a task report to your S3 bucket.

      For more information, see Allow DataSync to upload task reports to your S3 bucket.

      Important

      If your S3 bucket is encrypted with a customer managed SSE-KMS key, the key's policy must include the IAM role that DataSync uses to access the bucket.

      For more information, see Accessing S3 buckets using server-side encryption.

    • OutputType – Specify STANDARD.

      For more information, see Standard task reports.

    • (Optional) ReportLevel – Specify whether you want ERRORS_ONLY (the default) or SUCCESSES_AND_ERRORS in your report.

    • (Optional) ObjectVersionIds – If you're transferring to an S3 bucket that uses object versioning, specify NONE if you don't want to include the new version for each transferred object in the report.

      By default, this option is set to INCLUDE.

    • (Optional) Overrides – Customize the ReportLevel of a particular aspect of your report.

      For example, you might want to see SUCCESSES_AND_ERRORS for the list of what DataSync deletes in your destination location, but you want ERRORS_ONLY for everything else. In this example, you would add the following Overrides option to the --task-report-config parameter:

      "Overrides":{ "Deleted":{ "ReportLevel":"SUCCESSES_AND_ERRORS" } }

      If you don't use Overrides, your entire report uses the ReportLevel that you specify.

  5. Run the create-task command to create your task.

    You get a response like the following that shows you the ARN of the task that you created. You will need this ARN to run the start-task-execution command.

    {
      "TaskArn": "arn:aws-cn:datasync:us-east-1:123456789012:task/task-12345678abcdefgh"
    }

  6. Copy the following start-task-execution command.

    aws datasync start-task-execution \
      --task-arn arn:aws-cn:datasync:us-east-1:123456789012:task/task-12345678abcdefgh

  7. For the --task-arn parameter, specify the ARN of the task you're running. Use the ARN that you received from running the create-task command.

  8. Run the start-task-execution command.

When your transfer is complete, you can view your task report.
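The optional settings for a standard report are easy to mistype, so validating them before you run create-task can save a failed call. The following Python sketch builds a standard configuration with ReportLevel, ObjectVersionIds, and Overrides. The function name and parameters are hypothetical. The set of override sections is an assumption that mirrors the four detailed report types (only Deleted appears in the steps above), so check it against the DataSync API reference before relying on it.

```python
import json

REPORT_LEVELS = {"ERRORS_ONLY", "SUCCESSES_AND_ERRORS"}
OBJECT_VERSION_OPTIONS = {"INCLUDE", "NONE"}
# Assumption: one override section per detailed report type.
OVERRIDE_SECTIONS = {"Transferred", "Skipped", "Verified", "Deleted"}


def standard_report_config(bucket_arn, role_arn, prefix="reports/",
                           report_level="ERRORS_ONLY",
                           object_version_ids="INCLUDE",
                           overrides=None):
    """Build a standard task report configuration with optional overrides."""
    if report_level not in REPORT_LEVELS:
        raise ValueError(f"unknown report level: {report_level!r}")
    if object_version_ids not in OBJECT_VERSION_OPTIONS:
        raise ValueError(f"unknown ObjectVersionIds value: {object_version_ids!r}")
    config = {
        "Destination": {"S3": {
            "Subdirectory": prefix,
            "S3BucketArn": bucket_arn,
            "BucketAccessRoleArn": role_arn,
        }},
        "OutputType": "STANDARD",
        "ReportLevel": report_level,
        "ObjectVersionIds": object_version_ids,
    }
    for section, level in (overrides or {}).items():
        if section not in OVERRIDE_SECTIONS or level not in REPORT_LEVELS:
            raise ValueError(f"invalid override: {section!r} -> {level!r}")
        config.setdefault("Overrides", {})[section] = {"ReportLevel": level}
    return config


# Errors only for everything except the list of deleted items.
standard_config = standard_report_config(
    "arn:aws-cn:s3:::your-task-reports-bucket",
    "arn:aws-cn:iam::123456789012:role/bucket-iam-role",
    overrides={"Deleted": "SUCCESSES_AND_ERRORS"},
)
print(json.dumps(standard_config, indent=2))
```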

Viewing your task reports

DataSync creates task reports for every task execution. When your execution completes, you can find the related task reports in your S3 bucket. Task reports are organized under prefixes that include the IDs of your tasks and their executions.

To help locate task reports in your S3 bucket, use these examples:

  • Summary only task report – reports-prefix/Summary-Reports/task-id-folder/task-execution-id-folder

  • Standard task report – reports-prefix/Detailed-Reports/task-id-folder/task-execution-id-folder
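The two layouts above can be expressed as a small helper that assembles the S3 prefix to search under. This Python sketch is illustrative only: the function name report_prefix is hypothetical, and the caller supplies the task and task execution folder names exactly as they appear in the bucket, because their precise format isn't specified here.

```python
REPORT_FOLDERS = {
    "SUMMARY_ONLY": "Summary-Reports",
    "STANDARD": "Detailed-Reports",
}


def report_prefix(reports_prefix: str, output_type: str,
                  task_id_folder: str, task_execution_id_folder: str,
                  delimiter: str = "/") -> str:
    """Join the report prefix, report folder, and ID folders into one S3 prefix."""
    parts = [
        reports_prefix.strip(delimiter),
        REPORT_FOLDERS[output_type],
        task_id_folder,
        task_execution_id_folder,
    ]
    # Skip empty parts so an empty reports prefix doesn't produce a leading delimiter.
    return delimiter.join(part for part in parts if part) + delimiter


print(report_prefix("reports/", "STANDARD", "task-id-folder", "task-execution-id-folder"))
```

The resulting string can be used as the prefix filter when you list objects in the report bucket.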

Because task reports are in JSON format, you have several options for viewing your reports:

  • View a report by using Amazon S3 Select.

  • Visualize reports by using Amazon services such as Amazon Glue, Amazon Athena, and Amazon QuickSight. For more information about visualizing your task reports, see the Amazon Storage Blog.

Limitations

  • Individual task reports can't exceed 5 MB. If you're copying a large number of files, your task report might be split into multiple reports.

  • There are situations when creating task reports can affect the performance of your data transfer. For example, you might notice this when your network connection has high latency and the files you're transferring are small or you're copying only metadata changes.
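When a report is split because of the size limit, you might want to recombine the parts before analyzing them. The following Python sketch merges the item lists from several report files. It assumes that every part repeats the top-level shape shown in the example transferred reports (a TaskExecutionId plus a list such as Transferred); that assumption isn't confirmed here, so verify it against your own reports. The function name merge_report_parts is hypothetical.

```python
import json


def merge_report_parts(parts: list[str], section: str = "Transferred") -> dict:
    """Combine the `section` lists from several report parts of one task execution."""
    items = []
    execution_ids = set()
    for text in parts:
        part = json.loads(text)
        execution_ids.add(part.get("TaskExecutionId"))
        items.extend(part.get(section, []))
    if len(execution_ids) > 1:
        raise ValueError("report parts belong to different task executions")
    return {
        "TaskExecutionId": next(iter(execution_ids), None),
        section: items,
    }


part_one = json.dumps({
    "TaskExecutionId": "exec-abcdefgh12345678",
    "Transferred": [{"RelativePath": "/object1.txt", "TransferStatus": "SUCCESS"}],
})
part_two = json.dumps({
    "TaskExecutionId": "exec-abcdefgh12345678",
    "Transferred": [{"RelativePath": "/object2.txt", "TransferStatus": "SUCCESS"}],
})

merged = merge_report_parts([part_one, part_two])
print(len(merged["Transferred"]), "items in", merged["TaskExecutionId"])
```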