Configuring and using Mountpoint - Amazon Simple Storage Service

Configuring and using Mountpoint

To use Mountpoint for Amazon S3, your host needs valid Amazon credentials with access to the bucket or buckets that you would like to mount. For different ways to authenticate, see Mountpoint Amazon Credentials on GitHub.

For example, you can create a new Amazon Identity and Access Management (IAM) user and role for this purpose. Make sure that this role has access to the bucket or buckets that you would like to mount. You can pass the IAM role to your Amazon EC2 instance with an instance profile.
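
As a rough illustration of what "access to the bucket" can look like, the sketch below writes a minimal read-only IAM policy document to a local file. The variable names (BUCKET, PARTITION, POLICY_FILE) and the output file name are assumptions made for this example, and the policy covers only listing and reading. Writing or deleting through the mount would need additional actions (for example, s3:PutObject); check the Mountpoint documentation on GitHub for the full list before relying on it.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal read-only IAM policy document for one bucket.
# BUCKET, PARTITION, and POLICY_FILE are illustrative names, not fixed values.
BUCKET="${BUCKET:-DOC-EXAMPLE-BUCKET}"
PARTITION="${PARTITION:-aws}"   # assumption: use "aws-cn" in the China Regions
POLICY_FILE="${POLICY_FILE:-mountpoint-read-policy.json}"

# The first statement allows listing the bucket; the second allows reading
# the objects inside it.
cat > "$POLICY_FILE" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListBucket",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:${PARTITION}:s3:::${BUCKET}"]
    },
    {
      "Sid": "ReadObjects",
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:${PARTITION}:s3:::${BUCKET}/*"]
    }
  ]
}
EOF

echo "Wrote $POLICY_FILE"
```

The resulting file could then be attached to the IAM role that your instance profile passes to the host.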

Using Mountpoint for Amazon S3

Use Mountpoint for Amazon S3 to do the following:

  1. Mount buckets with the mount-s3 command.

    In the following example, replace DOC-EXAMPLE-BUCKET with the name of your S3 bucket, and replace ~/mnt with the directory on your host where you want your S3 bucket to be mounted.

    mkdir ~/mnt
    mount-s3 DOC-EXAMPLE-BUCKET ~/mnt

    Because the Mountpoint client runs in the background by default, the ~/mnt directory now gives you access to the objects in your S3 bucket.

  2. Access the objects in your bucket through Mountpoint.

    After you mount your bucket locally, you can use common Linux commands, such as cat or ls, to work with your S3 objects. Mountpoint for Amazon S3 interprets keys in your S3 bucket as file system paths by splitting them on the forward slash (/) character. For example, if you have the object key Data/2023-01-01.csv in your bucket, you will have a directory named Data in your Mountpoint file system, with a file named 2023-01-01.csv inside it.

  3. Unmount your bucket by using the umount command. This command unmounts your S3 bucket and exits Mountpoint.

    To use the following example command, replace ~/mnt with the directory on your host where your S3 bucket is mounted.

    umount ~/mnt

    Note

    To get a list of options for this command, run umount --help.
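
The key-to-path mapping described in step 2 can be sketched in a few lines of shell. The helper name key_to_local_path is hypothetical and not part of Mountpoint. It only models the simple case from the example above, where a key is split on the forward slash (/) character, and does not cover unusual keys.

```shell
#!/usr/bin/env bash
# key_to_local_path is a hypothetical helper, not part of Mountpoint.
# It prints the local path at which an object key would appear under a
# given mount directory, assuming a simple key split on "/".
key_to_local_path() {
  local mount_dir="$1" key="$2"
  # Strip one trailing slash from the mount directory, then append the key.
  printf '%s/%s\n' "${mount_dir%/}" "$key"
}

local_path="$(key_to_local_path "$HOME/mnt" "Data/2023-01-01.csv")"
echo "directory: $(dirname "$local_path")"
echo "file name: $(basename "$local_path")"   # prints "file name: 2023-01-01.csv"
```

With the bucket mounted at ~/mnt, a command such as cat on the resulting path would read the object with the key Data/2023-01-01.csv.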

For additional Mountpoint configuration details, see S3 bucket configuration and file system configuration on GitHub.

Configuring caching in Mountpoint

When you use Mountpoint for Amazon S3, you can configure it to cache the most recently accessed data from your S3 buckets on Amazon EC2 instance storage or an attached Amazon EBS volume. Caching this data can help to accelerate performance and reduce the cost of repeated data access. Caching in Mountpoint works best when you repeatedly read the same data and that data doesn’t change between reads. For example, you can use caching with machine learning training jobs that need to read a training dataset multiple times to improve model accuracy.

When you mount an S3 bucket, you can optionally enable caching through flags. You can configure the location and size of the data cache and the amount of time metadata is retained in the cache. When you mount a bucket and caching is enabled, Mountpoint creates an empty sub-directory at the configured cache location, if that sub-directory doesn’t already exist. When you first mount a bucket and when you unmount, Mountpoint deletes the contents of the cache location. For more information about configuring and using caching in Mountpoint, see Mountpoint for Amazon S3 Caching configuration on GitHub.
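
The sketch below shows how the cache location, cache size, and metadata retention time might be combined into a single mount command. It is a dry run: it only prints the command and never calls mount-s3. The function name build_mount_command is hypothetical, and the flag names --max-cache-size (in MiB) and --metadata-ttl (in seconds) are taken from the Mountpoint configuration documentation on GitHub as an assumption; confirm them with mount-s3 --help for your installed version.

```shell
#!/usr/bin/env bash
# Dry-run sketch: assemble a mount-s3 invocation with caching options and
# print it. build_mount_command is a hypothetical helper. The flag names
# --max-cache-size and --metadata-ttl are assumptions to verify against
# `mount-s3 --help`.
build_mount_command() {
  local bucket="$1" mount_dir="$2" cache_dir="$3"
  local max_cache_mib="${4:-}" metadata_ttl_s="${5:-}"

  local cmd=(mount-s3 --cache "$cache_dir")
  if [ -n "$max_cache_mib" ]; then
    cmd+=(--max-cache-size "$max_cache_mib")
  fi
  if [ -n "$metadata_ttl_s" ]; then
    cmd+=(--metadata-ttl "$metadata_ttl_s")
  fi
  cmd+=("$bucket" "$mount_dir")

  # Join the arguments with single spaces for display only.
  printf '%s\n' "${cmd[*]}"
}

# Example: a 1024 MiB cache with metadata retained for 300 seconds.
build_mount_command DOC-EXAMPLE-BUCKET "$HOME/mnt" /path/to/cache 1024 300
```

Printing the command first makes it easy to review the options before running the real mount.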

When you mount an S3 bucket, you can enable caching with the --cache CACHE_PATH flag. In the following example, replace CACHE_PATH with the path to the directory where you want to cache your data. Replace DOC-EXAMPLE-BUCKET with the name of your S3 bucket, and replace ~/mnt with the directory on your host where you want your S3 bucket to be mounted.

mkdir ~/mnt
mount-s3 --cache CACHE_PATH DOC-EXAMPLE-BUCKET ~/mnt

Important

If you enable caching, Mountpoint will persist unencrypted object content from your S3 bucket at the caching location configured at mount. In order to protect your data, we recommend that you restrict access to the data cache location.
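
One way to restrict access to the cache location is to create the directory with owner-only permissions before you mount. The following is a minimal sketch using standard mkdir and chmod. The function name make_private_cache_dir and the example path are hypothetical, and your environment may need stricter controls (for example, an encrypted volume).

```shell
#!/usr/bin/env bash
# make_private_cache_dir is a hypothetical helper, not part of Mountpoint.
# It creates the directory if needed and limits access to the owning user
# (mode 700: read, write, and traverse for the owner only).
make_private_cache_dir() {
  local cache_dir="$1"
  mkdir -p -- "$cache_dir" && chmod 700 -- "$cache_dir"
}

# Example usage (the path is illustrative):
#   make_private_cache_dir "$HOME/mountpoint-cache"
#   mount-s3 --cache "$HOME/mountpoint-cache" DOC-EXAMPLE-BUCKET ~/mnt
```

Running chmod after mkdir -p also tightens the permissions when the directory already exists.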