Overview of data repositories - FSx for Lustre

When you use Amazon FSx for Lustre with data repositories, you can ingest and process large volumes of file data in a high-performance file system by using automatic import and import data repository tasks. At the same time, you can write results to your data repositories by using automatic export or export data repository tasks. With these features, you can restart your workload at any time using the latest data stored in your data repository.

Note

Data repository associations, automatic export, and support for multiple data repositories aren't available on FSx for Lustre 2.10 file systems or Scratch 1 file systems.

FSx for Lustre integrates with Amazon S3, so applications that mount your FSx for Lustre file system can access the objects stored in your Amazon S3 buckets as files. You can also run compute-intensive workloads on Amazon EC2 instances in the Amazon Web Services Cloud and export the results to your data repository after the workload is complete.

To access objects in the Amazon S3 data repository as files and directories on the file system, file and directory metadata must first be loaded into the file system. You can load metadata from a linked data repository when you create a data repository association.

Additionally, you can import file and directory metadata from your linked data repositories to the file system by using automatic import or an import data repository task. When you turn on automatic import for a data repository association, your file system automatically imports file metadata as files are created, modified, or deleted in the S3 data repository. Alternatively, you can import metadata for new or changed files and directories by using an import data repository task.

Note

Automatic import and import data repository tasks can be used simultaneously on a file system.
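
As an illustration of the options above, the following Python sketch builds the parameters for the FSx CreateDataRepositoryAssociation API call with an automatic import policy. The helper names build_dra_request and create_dra are hypothetical, and the file system ID, file system path, and bucket name in the usage note are placeholders. The boto3 call itself requires credentials and an existing file system, so it is kept separate from the request builder.

```python
from typing import Any, Dict, List, Optional

# Event names accepted by the automatic import policy.
VALID_EVENTS = {"NEW", "CHANGED", "DELETED"}


def build_dra_request(
    file_system_id: str,
    file_system_path: str,
    data_repository_path: str,
    auto_import_events: Optional[List[str]] = None,
    load_metadata_on_create: bool = True,
) -> Dict[str, Any]:
    """Build keyword arguments for create_data_repository_association."""
    events = list(auto_import_events or [])
    unknown = set(events) - VALID_EVENTS
    if unknown:
        raise ValueError(f"unsupported auto-import events: {sorted(unknown)}")

    request: Dict[str, Any] = {
        "FileSystemId": file_system_id,
        "FileSystemPath": file_system_path,
        "DataRepositoryPath": data_repository_path,
        # Load metadata from the linked repository when the association is created.
        "BatchImportMetaDataOnCreate": load_metadata_on_create,
    }
    if events:
        # Automatic import is on only when at least one event type is listed.
        request["S3"] = {"AutoImportPolicy": {"Events": events}}
    return request


def create_dra(request: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request with boto3 (assumes credentials are configured)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    return boto3.client("fsx").create_data_repository_association(**request)
```

For example, build_dra_request("fs-0123456789abcdef0", "/ns1", "s3://amzn-s3-demo-bucket/input/", ["NEW", "CHANGED", "DELETED"]) produces a request that turns on automatic import for created, modified, and deleted objects. Omitting the event list leaves automatic import off, in which case metadata for later changes would be imported with an import data repository task.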

You can also export files and their associated metadata in your file system to your data repository using automatic export or using an export data repository task. When you turn on automatic export on a data repository association, your file system automatically exports file data and metadata as files are created, modified, or deleted. Alternatively, you can export files or directories using an export data repository task. When you use an export data repository task, file data and metadata that were created or modified since the last such task are exported.
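
The task-based alternative can be sketched in Python as follows. The helper build_task_request is hypothetical; it assembles parameters for the FSx CreateDataRepositoryTask API call for either an export task or a metadata import task. The file system ID and paths in the usage note are placeholders, and the report settings shown are one possible choice rather than a recommendation.

```python
from typing import Any, Dict, List, Optional

# Task types covered by this sketch (the API defines others as well).
TASK_TYPES = {
    "export": "EXPORT_TO_REPOSITORY",
    "import": "IMPORT_METADATA_FROM_REPOSITORY",
}


def build_task_request(
    file_system_id: str,
    kind: str,
    paths: Optional[List[str]] = None,
    report_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for create_data_repository_task."""
    if kind not in TASK_TYPES:
        raise ValueError(f"kind must be one of {sorted(TASK_TYPES)}")

    request: Dict[str, Any] = {
        "FileSystemId": file_system_id,
        "Type": TASK_TYPES[kind],
        # The Report parameter is required; disable it unless a path is given.
        "Report": {"Enabled": False},
    }
    if paths:
        # Limit the task to specific files or directories.
        request["Paths"] = list(paths)
    if report_path:
        request["Report"] = {
            "Enabled": True,
            "Path": report_path,
            "Format": "REPORT_CSV_20191124",
            "Scope": "FAILED_FILES_ONLY",
        }
    return request


def start_task(request: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request with boto3 (assumes credentials are configured)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    return boto3.client("fsx").create_data_repository_task(**request)
```

For example, build_task_request("fs-0123456789abcdef0", "export", paths=["results/"]) describes an export task limited to one directory, and passing "import" instead describes a metadata import task.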

Note
  • Automatic export and export data repository tasks can't be used simultaneously on a file system.

  • Data repository associations export only regular files, symbolic links, and directories. Other file types (FIFO special, block special, character special, and socket) aren't exported by automatic export or by export data repository tasks.
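
To see in advance which entries fall outside the exported file types, a directory tree on a mounted file system can be scanned with the Python standard library. This is a minimal sketch run on a client, not part of FSx for Lustre itself; the function names is_exportable and find_skipped are hypothetical.

```python
import os
import stat
from typing import List


def is_exportable(path: str) -> bool:
    """Return True for regular files, symbolic links, and directories."""
    # lstat inspects a symbolic link itself instead of following it.
    mode = os.lstat(path).st_mode
    return stat.S_ISREG(mode) or stat.S_ISLNK(mode) or stat.S_ISDIR(mode)


def find_skipped(root: str) -> List[str]:
    """List paths under root whose file type would not be exported."""
    skipped = []
    for dirpath, dirnames, filenames in os.walk(root):
        # FIFOs, sockets, and device files are reported among filenames.
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if not is_exportable(full):
                skipped.append(full)
    return sorted(skipped)
```

Calling find_skipped on a mount point or subdirectory returns the FIFO, socket, block special, and character special entries beneath it.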

FSx for Lustre also supports cloud bursting workloads with on-premises file systems by enabling you to copy data from on-premises clients using Amazon Direct Connect or VPN.

Important

If you have linked one or more FSx for Lustre file systems to a data repository on Amazon S3, don't delete the Amazon S3 bucket until you have deleted or unlinked all linked file systems.
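
One way to check for remaining links before deleting a bucket is to list data repository associations and match their repository paths against the bucket name. The Python sketch below assumes links made through data repository associations only; file systems linked by other means are not covered by this check. The helper names are hypothetical, and the listing function needs boto3 and credentials.

```python
from typing import Any, Dict, Iterable, List
from urllib.parse import urlparse


def file_systems_linked_to_bucket(
    associations: Iterable[Dict[str, Any]], bucket: str
) -> List[str]:
    """Return IDs of file systems whose association points at the bucket."""
    linked = set()
    for assoc in associations:
        # DataRepositoryPath has the form s3://bucket-name/optional-prefix/
        parsed = urlparse(assoc.get("DataRepositoryPath", ""))
        if parsed.scheme == "s3" and parsed.netloc == bucket:
            linked.add(assoc["FileSystemId"])
    return sorted(linked)


def list_associations() -> List[Dict[str, Any]]:
    """Fetch all associations in the current Region (assumes credentials)."""
    import boto3  # imported lazily so the matcher works without boto3 installed

    client = boto3.client("fsx")
    associations: List[Dict[str, Any]] = []
    kwargs: Dict[str, Any] = {}
    while True:
        page = client.describe_data_repository_associations(**kwargs)
        associations.extend(page.get("Associations", []))
        token = page.get("NextToken")
        if not token:
            return associations
        kwargs["NextToken"] = token
```

If file_systems_linked_to_bucket(list_associations(), "amzn-s3-demo-bucket") returns an empty list, no association in that Region references the placeholder bucket. Other Regions and accounts would need to be checked separately.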