
Enable requester pays Amazon S3 buckets in Athena for Spark

When an Amazon S3 bucket is configured as requester pays, the account of the user running the query is charged for data access and data transfer fees associated with the query. For more information, see Using Requester Pays buckets for storage transfers and usage in the Amazon S3 User Guide.

In Athena for Spark, requester pays buckets are enabled per session, not per workgroup. At a high level, enabling requester pays buckets includes the following steps:

In the Amazon S3 console, enable requester pays on the properties for the bucket and add a bucket policy to specify access.
In the IAM console, create an IAM policy to allow access to the bucket, and then attach the policy to the IAM role that will be used to access the requester pays bucket.
In Athena for Spark, add a session property to enable the requester pays feature.

Step 1: Enable requester pays on an Amazon S3 bucket and add a bucket policy

To enable requester pays on an Amazon S3 bucket

Open the Amazon S3 console at https://console.amazonaws.cn/s3/.
In the list of buckets, choose the link for the bucket that you want to enable requester pays for.
On the bucket page, choose the Properties tab.
Scroll down to the Requester pays section, and then choose Edit.
On the Edit requester pays page, choose Enable, and then choose Save changes.
Choose the Permissions tab.
In the Bucket policy section, choose Edit.

On the Edit bucket policy page, enter the bucket policy that you want to apply to the source bucket. The following example policy grants all Amazon S3 actions ("Action": "s3:*") to all principals ("AWS": "*"), but your policy can be more granular. For example, you might grant access only to a specific IAM role in another account.


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::account_number-us-east-1-my-s3-requester-pays-bucket",
                "arn:aws:s3:::account_number-us-east-1-my-s3-requester-pays-bucket/*"
            ]
        }
    ]
}
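
As a rough illustration of the more granular option mentioned above, the following Python sketch builds a bucket policy that grants read access to a single IAM role instead of all principals. The role ARN, the account ID 111122223333, the Sid, and the choice of s3:GetObject and s3:ListBucket are assumptions made for this example, not values from this guide; add write actions if your session also writes to the bucket. The optional apply_requester_pays helper assumes that boto3 is installed and that credentials are configured.

```python
import json

# Hypothetical values for illustration only; replace them with your own.
BUCKET = "account_number-us-east-1-my-s3-requester-pays-bucket"
ROLE_ARN = "arn:aws:iam::111122223333:role/my-athena-spark-role"


def build_bucket_policy(bucket: str, role_arn: str) -> dict:
    """Return a bucket policy that lets one IAM role list and read the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowRoleRead",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }


def apply_requester_pays(bucket: str, policy: dict) -> None:
    """Enable requester pays and set the bucket policy (needs boto3 and credentials)."""
    import boto3  # imported lazily so the policy builder works without boto3

    s3 = boto3.client("s3")
    # Equivalent to choosing Enable on the Edit requester pays page.
    s3.put_bucket_request_payment(
        Bucket=bucket,
        RequestPaymentConfiguration={"Payer": "Requester"},
    )
    # Equivalent to saving the policy on the Edit bucket policy page.
    s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))


if __name__ == "__main__":
    # Print the policy so that it can be reviewed or pasted into the console.
    print(json.dumps(build_bucket_policy(BUCKET, ROLE_ARN), indent=4))
```

Running the script only prints the policy. Calling apply_requester_pays(BUCKET, build_bucket_policy(BUCKET, ROLE_ARN)) would change the bucket, so review the printed output first.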

Step 2: Create an IAM policy and attach it to an IAM role

Next, you create an IAM policy to allow access to the bucket. Then you attach the policy to the role that will be used to access the requester pays bucket.

To create an IAM policy for the requester pays bucket and attach the policy to a role

Open the IAM console at https://console.amazonaws.cn/iam/.
In the IAM console navigation pane, choose Policies.
Choose Create policy.
Choose JSON.

In the Policy editor, add a policy like the following:


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:*"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::account_number-us-east-1-my-s3-requester-pays-bucket",
                "arn:aws:s3:::account_number-us-east-1-my-s3-requester-pays-bucket/*"
            ]
        }
    ]
}

Choose Next.
On the Review and create page, enter a name for the policy and an optional description, and then choose Create policy.
In the navigation pane, choose Roles.
On the Roles page, find the role that you want to use, and then choose the role name link.
In the Permissions policies section, choose Add permissions, Attach policies.
In the Other permissions policies section, select the check box for the policy that you created, and then choose Add permissions.
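
The console steps above can also be sketched in Python. The example below is a minimal sketch, not a definitive implementation: it builds a narrower policy than the s3:* example (read-only access, which is an assumption about what the session needs), and the policy name and role name in the usage note are hypothetical. The create_and_attach helper uses the boto3 IAM client and assumes that boto3 is installed and that credentials with IAM permissions are configured.

```python
import json


def build_role_policy(bucket: str) -> dict:
    """Return an identity policy that allows listing and reading one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading applies to the objects in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


def create_and_attach(policy_name: str, role_name: str, policy: dict) -> str:
    """Create a managed policy and attach it to a role; return the policy ARN."""
    import boto3  # imported lazily so the policy builder works without boto3

    iam = boto3.client("iam")
    response = iam.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy),
    )
    policy_arn = response["Policy"]["Arn"]
    iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return policy_arn
```

For example, create_and_attach("my-requester-pays-access", "my-athena-spark-role", build_role_policy("account_number-us-east-1-my-s3-requester-pays-bucket")) would create the policy and attach it to the role, where both names are placeholders.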

Step 3: Add an Athena for Spark session property

After you have configured the Amazon S3 bucket and associated permissions for requester pays, you can enable the feature in an Athena for Spark session.

To enable requester pays buckets in an Athena for Spark session

In the notebook editor, from the Session menu on the upper right, choose Edit session.
Expand Spark properties.
Choose Edit in JSON.

In the JSON text editor, enter the following:


{
  "spark.hadoop.fs.s3.useRequesterPaysHeader":"true"
}

Choose Save.
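
If a session already has other Spark properties, the requester pays property has to be merged with them rather than replace them. The following Python sketch shows one way to do that and to produce the JSON for the editor. The spark.executor.memory entry is only a placeholder for an existing property. The start_session_with_requester_pays helper is an assumption about how the same property could be passed programmatically through the SparkProperties field of the boto3 Athena start_session call; the workgroup name and the DPU limit of 20 are hypothetical, so verify the call against the API reference before relying on it.

```python
import json

# Property name taken from the JSON shown in the step above.
REQUESTER_PAYS_PROPERTY = "spark.hadoop.fs.s3.useRequesterPaysHeader"


def with_requester_pays(spark_properties: dict) -> dict:
    """Return a copy of the Spark properties with requester pays enabled."""
    merged = dict(spark_properties)
    merged[REQUESTER_PAYS_PROPERTY] = "true"  # property values are strings
    return merged


def start_session_with_requester_pays(workgroup: str, max_dpus: int = 20) -> str:
    """Start an Athena for Spark session (needs boto3 and credentials)."""
    import boto3  # imported lazily so the merge helper works without boto3

    athena = boto3.client("athena")
    response = athena.start_session(
        WorkGroup=workgroup,
        EngineConfiguration={
            "MaxConcurrentDpus": max_dpus,
            "SparkProperties": with_requester_pays({}),
        },
    )
    return response["SessionId"]


if __name__ == "__main__":
    # Hypothetical existing property, kept alongside the new one.
    existing = {"spark.executor.memory": "4g"}
    print(json.dumps(with_requester_pays(existing), indent=2))
```

Because the property is set per session, it has to be supplied each time a new session starts.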
