[fsx] section
Defines configuration settings for an attached FSx for Lustre file system. For more information, see Amazon FSx CreateFileSystem in the Amazon FSx API Reference.
If the base_os is alinux2, centos7, ubuntu1804, or ubuntu2004, FSx for Lustre is supported. When using Amazon Linux, the kernel must be 4.14.104-78.84.amzn1.x86_64 or a later version. For instructions, see Installing the Lustre client in the Amazon FSx for Lustre User Guide.
Note
FSx for Lustre isn't currently supported when using awsbatch as the scheduler.
Note
Support for FSx for Lustre on ubuntu2004 was added in Amazon ParallelCluster version 2.11.0. Support for FSx for Lustre on centos8 was added in Amazon ParallelCluster version 2.10.0 and removed in version 2.10.4. Support for FSx for Lustre on alinux2, ubuntu1604, and ubuntu1804 was added in Amazon ParallelCluster version 2.6.0. Support for FSx for Lustre on centos7 was added in Amazon ParallelCluster version 2.4.0.
If using an existing file system, it must be associated with a security group that allows inbound TCP traffic to port 988. Setting the source to 0.0.0.0/0 on a security group rule provides client access from all of the IP ranges within your VPC security group for the protocol and port range for that rule. To further limit access to your file systems, we recommend using more restrictive sources for your security group rules. For example, you can use more specific CIDR ranges, IP addresses, or security group IDs. This is done automatically when vpc_security_group_id isn't used.
To use an existing Amazon FSx file system for long-term permanent storage that's independent of the cluster life cycle, specify fsx_fs_id. If you don't specify fsx_fs_id, Amazon ParallelCluster creates the FSx for Lustre file system from the [fsx] settings when it creates the cluster, and deletes the file system and data when the cluster is deleted. For more information, see Best practices: moving a cluster to a new Amazon ParallelCluster minor or patch version.
The format is [fsx fsx-name]. fsx-name must start with a letter, contain no more than 30 characters, and only contain letters, numbers, hyphens (-), and underscores (_).
To attach an existing file system, use the following parameters:
[fsx fs]
shared_dir = /fsx
fsx_fs_id = fs-073c3803dca3e28a6
To create and configure a new file system, use the following parameters:
[fsx fs]
shared_dir = /fsx
storage_capacity = 3600
imported_file_chunk_size = 1024
export_path = s3://bucket/folder
import_path = s3://bucket
weekly_maintenance_start_time = 1:00:00
Topics
- auto_import_policy
- automatic_backup_retention_days
- copy_tags_to_backups
- daily_automatic_backup_start_time
- data_compression_type
- deployment_type
- drive_cache_type
- export_path
- fsx_backup_id
- fsx_fs_id
- fsx_kms_key_id
- import_path
- imported_file_chunk_size
- per_unit_storage_throughput
- shared_dir
- storage_capacity
- storage_type
- weekly_maintenance_start_time
auto_import_policy
(Optional) Specifies the automatic import policy for reflecting changes in the S3 bucket used to create the FSx for Lustre file system. The possible values are the following:
NEW - FSx for Lustre automatically imports directory listings of any new objects that are added to the linked S3 bucket that don't currently exist in the FSx for Lustre file system.
NEW_CHANGED - FSx for Lustre automatically imports file and directory listings of any new objects that are added to the S3 bucket and any existing objects that are changed in the S3 bucket.
This corresponds to the AutoImportPolicy property. For more information, see Automatically import updates from your S3 bucket in the Amazon FSx for Lustre User Guide. When the auto_import_policy parameter is specified, the automatic_backup_retention_days, copy_tags_to_backups, daily_automatic_backup_start_time, and fsx_backup_id parameters must not be specified.
If the auto_import_policy setting isn't specified, automatic imports are disabled. FSx for Lustre only updates file and directory listings from the linked S3 bucket when the file system is created.
auto_import_policy = NEW_CHANGED
Note
Support for auto_import_policy was added in Amazon ParallelCluster version 2.10.0.
Update policy: If this setting is changed, the update is not allowed.
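As a sketch of how this setting combines with an S3 data repository, the following hypothetical [fsx] section links a scratch file system to a bucket and enables automatic imports (the bucket name and capacity value are illustrative):

```ini
[fsx fs]
shared_dir = /fsx
deployment_type = SCRATCH_2
storage_capacity = 1200
import_path = s3://bucket
auto_import_policy = NEW_CHANGED
```

Because auto_import_policy must not be combined with the backup-related settings, none of them appear here.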
automatic_backup_retention_days
(Optional) Specifies the number of days to retain automatic backups. This is only valid for use with PERSISTENT_1 deployment types. When the automatic_backup_retention_days parameter is specified, the auto_import_policy, export_path, import_path, and imported_file_chunk_size parameters must not be specified. This corresponds to the AutomaticBackupRetentionDays property.
The default value is 0, which disables automatic backups. The possible values are integers between 0 and 35, inclusive.
automatic_backup_retention_days = 35
Note
Support for automatic_backup_retention_days was added in Amazon ParallelCluster version 2.8.0.
Update policy: This setting can be changed during an update.
copy_tags_to_backups
(Optional) Specifies whether tags for the file system are copied to the backups. This is only valid for use with PERSISTENT_1 deployment types. When the copy_tags_to_backups parameter is specified, the automatic_backup_retention_days parameter must be specified with a value greater than 0, and the auto_import_policy, export_path, import_path, and imported_file_chunk_size parameters must not be specified. This corresponds to the CopyTagsToBackups property.
The default value is false.
copy_tags_to_backups = true
Note
Support for copy_tags_to_backups was added in Amazon ParallelCluster version 2.8.0.
Update policy: If this setting is changed, the update is not allowed.
daily_automatic_backup_start_time
(Optional) Specifies the time of day (UTC) to start automatic backups. This is only valid for use with PERSISTENT_1 deployment types. When the daily_automatic_backup_start_time parameter is specified, the automatic_backup_retention_days parameter must be specified with a value greater than 0, and the auto_import_policy, export_path, import_path, and imported_file_chunk_size parameters must not be specified. This corresponds to the DailyAutomaticBackupStartTime property.
The format is HH:MM, where HH is the zero-padded hour of the day (00-23) and MM is the zero-padded minute of the hour. For example, 1:03 A.M. UTC is the following.
daily_automatic_backup_start_time = 01:03
The default value is a random time between 00:00 and 23:59.
Note
Support for daily_automatic_backup_start_time was added in Amazon ParallelCluster version 2.8.0.
Update policy: This setting can be changed during an update.
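The three backup settings are typically used together. The following hypothetical section enables daily backups on a persistent file system (the retention period, start time, and capacity and throughput values are illustrative):

```ini
[fsx fs]
shared_dir = /fsx
deployment_type = PERSISTENT_1
storage_capacity = 1200
per_unit_storage_throughput = 50
automatic_backup_retention_days = 7
copy_tags_to_backups = true
daily_automatic_backup_start_time = 01:03
```

Note that automatic_backup_retention_days is greater than 0, as copy_tags_to_backups and daily_automatic_backup_start_time require, and that no import or export settings are present, because they conflict with the backup settings.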
data_compression_type
(Optional) Specifies the FSx for Lustre data compression type. This corresponds to the DataCompressionType property. For more information, see FSx for Lustre data compression in the Amazon FSx for Lustre User Guide.
The only valid value is LZ4. To disable data compression, remove the data_compression_type parameter.
data_compression_type = LZ4
Note
Support for data_compression_type was added in Amazon ParallelCluster version 2.11.0.
Update policy: This setting can be changed during an update.
deployment_type
(Optional) Specifies the FSx for Lustre deployment type. This corresponds to the DeploymentType property. For more information, see FSx for Lustre deployment options in the Amazon FSx for Lustre User Guide. Choose a scratch deployment type for temporary storage and shorter-term processing of data. SCRATCH_2 is the latest generation of scratch file systems. It offers higher burst throughput over baseline throughput and in-transit encryption of data.
The valid values are SCRATCH_1, SCRATCH_2, and PERSISTENT_1.
SCRATCH_1 - The default deployment type for FSx for Lustre. With this deployment type, the storage_capacity setting has possible values of 1200, 2400, and any multiple of 3600. Support for SCRATCH_1 was added in Amazon ParallelCluster version 2.4.0.
SCRATCH_2 - The latest generation of scratch file systems. It supports up to six times the baseline throughput for spiky workloads. It also supports in-transit encryption of data for supported instance types in supported Amazon Web Services Regions. For more information, see Encrypting data in transit in the Amazon FSx for Lustre User Guide. With this deployment type, the storage_capacity setting has possible values of 1200 and any multiple of 2400. Support for SCRATCH_2 was added in Amazon ParallelCluster version 2.6.0.
PERSISTENT_1 - Designed for longer-term storage. The file servers are highly available, and the data is replicated within the file system's Availability Zone. It supports in-transit encryption of data for supported instance types. With this deployment type, the storage_capacity setting has possible values of 1200 and any multiple of 2400. Support for PERSISTENT_1 was added in Amazon ParallelCluster version 2.6.0.
The default value is SCRATCH_1.
deployment_type = SCRATCH_2
Note
Support for deployment_type was added in Amazon ParallelCluster version 2.6.0.
Update policy: If this setting is changed, the update is not allowed.
drive_cache_type
(Optional) Specifies that the file system has an SSD drive cache. This can only be set if the storage_type setting is set to HDD. This corresponds to the DriveCacheType property. For more information, see FSx for Lustre deployment options in the Amazon FSx for Lustre User Guide.
The only valid value is READ. To disable the SSD drive cache, don't specify the drive_cache_type setting.
drive_cache_type = READ
Note
Support for drive_cache_type was added in Amazon ParallelCluster version 2.10.0.
Update policy: If this setting is changed, the update is not allowed.
export_path
(Optional) Specifies the Amazon S3 path where the root of your file system is exported. When the export_path parameter is specified, the automatic_backup_retention_days, copy_tags_to_backups, daily_automatic_backup_start_time, and fsx_backup_id parameters must not be specified. This corresponds to the ExportPath property. File data and metadata aren't automatically exported to the export_path. For information about exporting data and metadata, see Exporting changes to the data repository in the Amazon FSx for Lustre User Guide.
The default value is s3://import-bucket/FSxLustre[creation-timestamp], where import-bucket is the bucket provided in the import_path parameter.
export_path = s3://bucket/folder
Update policy: If this setting is changed, the update is not allowed.
fsx_backup_id
(Optional) Specifies the ID of the backup to use for restoring the file system from an existing backup. When the fsx_backup_id parameter is specified, the auto_import_policy, deployment_type, export_path, fsx_kms_key_id, import_path, imported_file_chunk_size, storage_capacity, and per_unit_storage_throughput parameters must not be specified; these parameters are read from the backup.
This corresponds to the BackupId property.
fsx_backup_id = backup-fedcba98
Note
Support for fsx_backup_id was added in Amazon ParallelCluster version 2.8.0.
Update policy: If this setting is changed, the update is not allowed.
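As a sketch, restoring from a backup needs little else in the section, because the deployment, storage, and throughput settings are read from the backup itself (the backup ID shown is illustrative):

```ini
[fsx fs]
shared_dir = /fsx
fsx_backup_id = backup-fedcba98
```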
fsx_fs_id
(Optional) Attaches an existing FSx for Lustre file system.
If this option is specified, only the shared_dir and fsx_fs_id settings in the [fsx] section are used and any other settings in the [fsx] section are ignored.
fsx_fs_id = fs-073c3803dca3e28a6
Update policy: If this setting is changed, the update is not allowed.
fsx_kms_key_id
(Optional) Specifies the key ID of your Amazon Key Management Service (Amazon KMS) customer managed key.
This key is used to encrypt the data in your file system at rest.
This must be used with a custom ec2_iam_role. For more information, see Disk encryption with a custom KMS Key. This corresponds to the KmsKeyId parameter in the Amazon FSx API Reference.
fsx_kms_key_id = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
Note
Support for fsx_kms_key_id was added in Amazon ParallelCluster version 2.6.0.
Update policy: If this setting is changed, the update is not allowed.
import_path
(Optional) Specifies the S3 bucket to load data from into the file system and serve as the export bucket. For more information, see export_path. If you specify the import_path parameter, the automatic_backup_retention_days, copy_tags_to_backups, daily_automatic_backup_start_time, and fsx_backup_id parameters must not be specified. This corresponds to the ImportPath parameter in the Amazon FSx API Reference.
Import occurs on cluster creation. For more information, see Importing data from your data repository in the Amazon FSx for Lustre User Guide. On import, only file metadata (name, ownership, timestamp, and permissions) is imported. File data isn't imported from the S3 bucket until the file is first accessed. For information about preloading the file contents, see Preloading files into your file system in the Amazon FSx for Lustre User Guide.
If a value isn't provided, the file system is empty.
import_path = s3://bucket
Update policy: If this setting is changed, the update is not allowed.
imported_file_chunk_size
(Optional) Determines the stripe count and the maximum amount of data for each file (in MiB) stored on a single physical disk for files that are imported from a data repository (using import_path). The maximum number of disks that a single file can be striped across is limited by the total number of disks that make up the file system. When the imported_file_chunk_size parameter is specified, the automatic_backup_retention_days, copy_tags_to_backups, daily_automatic_backup_start_time, and fsx_backup_id parameters must not be specified. This corresponds to the ImportedFileChunkSize property.
The chunk size default is 1024 MiB (1 GiB), and it can go as high as 512,000 MiB (500 GiB). Amazon S3 objects have a maximum size of 5 TB.
imported_file_chunk_size = 1024
Update policy: If this setting is changed, the update is not allowed.
per_unit_storage_throughput
(Required for PERSISTENT_1 deployment types) For the deployment_type = PERSISTENT_1 deployment type, describes the amount of read and write throughput for each 1 tebibyte (TiB) of storage, in MB/s/TiB. File system throughput capacity is calculated by multiplying the file system storage capacity (TiB) by the per_unit_storage_throughput (MB/s/TiB). For a 2.4 TiB file system, provisioning 50 MB/s/TiB of per_unit_storage_throughput yields 120 MB/s of file system throughput. You pay for the amount of throughput that you provision. This corresponds to the PerUnitStorageThroughput property.
The possible values depend on the value of the storage_type setting.
storage_type = SSD - The possible values are 50, 100, and 200.
storage_type = HDD - The possible values are 12 and 40.
per_unit_storage_throughput = 200
Note
Support for per_unit_storage_throughput was added in Amazon ParallelCluster version 2.6.0.
Update policy: If this setting is changed, the update is not allowed.
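Putting the throughput arithmetic together, the following hypothetical persistent SSD file system provisions 2400 GiB (about 2.4 TiB) at 200 MB/s/TiB, for roughly 480 MB/s of aggregate throughput (the values are illustrative):

```ini
[fsx fs]
shared_dir = /fsx
deployment_type = PERSISTENT_1
storage_capacity = 2400
per_unit_storage_throughput = 200
```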
shared_dir
(Required) Defines the mount point for the FSx for Lustre file system on the head and compute nodes.
Don't use NONE or /NONE as the shared directory.
The following example mounts the file system at /fsx.
shared_dir = /fsx
Update policy: If this setting is changed, the update is not allowed.
storage_capacity
(Required) Specifies the storage capacity of the file system, in GiB. This corresponds to the StorageCapacity property.
The storage capacity possible values vary based on the deployment_type setting.
SCRATCH_1 - The possible values are 1200, 2400, and any multiple of 3600.
SCRATCH_2 - The possible values are 1200 and any multiple of 2400.
PERSISTENT_1 - The possible values vary based on the values of other settings.
storage_type = SSD - The possible values are 1200 and any multiple of 2400.
storage_type = HDD - The possible values vary based on the value of the per_unit_storage_throughput setting.
per_unit_storage_throughput = 12 - The possible values are any multiple of 6000.
per_unit_storage_throughput = 40 - The possible values are any multiple of 1800.
storage_capacity = 7200
Note
For Amazon ParallelCluster version 2.5.0 and 2.5.1, storage_capacity supported possible values of 1200, 2400, and any multiple of 3600. For versions earlier than Amazon ParallelCluster version 2.5.0, storage_capacity had a minimum size of 3600.
Update policy: If this setting is changed, the update is not allowed.
storage_type
(Optional) Specifies the storage type of the file system. This corresponds to the StorageType property. The possible values are SSD and HDD. The default is SSD.
The storage type changes the possible values of other settings.
storage_type = SSD - Specifies a solid-state drive (SSD) storage type. storage_type = SSD changes the possible values of several other settings.
- drive_cache_type - This setting cannot be specified.
- deployment_type - This setting can be set to SCRATCH_1, SCRATCH_2, or PERSISTENT_1.
- per_unit_storage_throughput - This setting must be specified if deployment_type is set to PERSISTENT_1. The possible values are 50, 100, or 200.
- storage_capacity - This setting must be specified. The possible values vary based on deployment_type.
  deployment_type = SCRATCH_1 - storage_capacity can be 1200, 2400, or any multiple of 3600.
  deployment_type = SCRATCH_2 or deployment_type = PERSISTENT_1 - storage_capacity can be 1200 or any multiple of 2400.
storage_type = HDD - Specifies a hard disk drive (HDD) storage type. storage_type = HDD changes the possible values of other settings.
- drive_cache_type - This setting can be specified.
- deployment_type - This setting must be set to PERSISTENT_1.
- per_unit_storage_throughput - This setting must be specified. The possible values are 12 or 40.
- storage_capacity - This setting must be specified. The possible values vary based on the per_unit_storage_throughput setting.
  per_unit_storage_throughput = 12 - storage_capacity can be any multiple of 6000.
  per_unit_storage_throughput = 40 - storage_capacity can be any multiple of 1800.
storage_type = SSD
Note
Support for the storage_type setting was added in Amazon ParallelCluster version 2.10.0.
Update policy: If this setting is changed, the update is not allowed.
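Combining the HDD constraints, a hypothetical HDD-based persistent file system with an SSD read cache might look like the following (the capacity and throughput values are illustrative, chosen so that storage_capacity is a multiple valid for the selected per_unit_storage_throughput):

```ini
[fsx fs]
shared_dir = /fsx
deployment_type = PERSISTENT_1
storage_type = HDD
storage_capacity = 6000
per_unit_storage_throughput = 12
drive_cache_type = READ
```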
weekly_maintenance_start_time
(Optional) Specifies a preferred time to perform weekly maintenance, in the UTC time zone. This corresponds to the WeeklyMaintenanceStartTime property.
The format is [day of week]:[hour of day]:[minute of hour]. For example, Monday at midnight is as follows.
weekly_maintenance_start_time = 1:00:00
Update policy: This setting can be changed during an update.