Using Amazon S3 server access logs to identify requests
You can identify Amazon S3 requests by using Amazon S3 server access logs.
Note
-
To identify Amazon S3 requests, we recommend that you use Amazon CloudTrail data events instead of Amazon S3 server access logs. CloudTrail data events are easier to set up and contain more information. For more information, see Identifying Amazon S3 requests using CloudTrail.
-
Depending on how many access requests you get, analyzing your logs might require more resources or time than using CloudTrail data events.
Topics
Querying access logs for requests by using Amazon Athena
You can identify Amazon S3 requests with Amazon S3 access logs by using Amazon Athena.
Amazon S3 stores server access logs as objects in an S3 bucket. It is often easier to use a tool that can analyze the logs in Amazon S3. Athena supports analysis of S3 objects and can be used to query Amazon S3 access logs.
Example
The following example shows how you can query Amazon S3 server access logs in Amazon Athena.
Replace the
used in the
following examples with your own information.user input placeholders
Note
To specify an Amazon S3 location in an Athena query, you must provide an S3 URI for the
bucket where your logs are delivered to. This URI must include the bucket name and
prefix in the following format:
s3://
amzn-s3-demo-bucket1
-logs/prefix
/
Open the Athena console at https://console.amazonaws.cn/athena/
. -
In the Query Editor, run a command similar to the following. Replace
with the name that you want to give to your database.s3_access_logs_db
CREATE DATABASE
s3_access_logs_db
Note
It's a best practice to create the database in the same Amazon Web Services Region as your S3 bucket.
-
In the Query Editor, run a command similar to the following to create a table schema in the database that you created in step 2. Replace
with the name that you want to give to your table. Thes3_access_logs_db.mybucket_logs
STRING
andBIGINT
data type values are the access log properties. You can query these properties in Athena. ForLOCATION
, enter the S3 bucket and prefix path as noted earlier.CREATE EXTERNAL TABLE `
s3_access_logs_db.mybucket_logs
`( `bucketowner` STRING, `bucket_name` STRING, `requestdatetime` STRING, `remoteip` STRING, `requester` STRING, `requestid` STRING, `operation` STRING, `key` STRING, `request_uri` STRING, `httpstatus` STRING, `errorcode` STRING, `bytessent` BIGINT, `objectsize` BIGINT, `totaltime` STRING, `turnaroundtime` STRING, `referrer` STRING, `useragent` STRING, `versionid` STRING, `hostid` STRING, `sigv` STRING, `ciphersuite` STRING, `authtype` STRING, `endpoint` STRING, `tlsversion` STRING, `accesspointarn` STRING, `aclrequired` STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe' WITH SERDEPROPERTIES ( 'input.regex'='([^ ]*) ([^ ]*) \\[(.*?)\\] ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\"[^\"]*\"|-) (-|[0-9]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\"[^\"]*\"|-) ([^ ]*)(?: ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*))?.*$') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 's3://amzn-s3-demo-bucket1-logs/prefix/
' -
In the navigation pane, under Database, choose your database.
-
Under Tables, choose Preview table next to your table name.
In the Results pane, you should see data from the server access logs, such as
bucketowner
,bucket
,requestdatetime
, and so on. This means that you successfully created the Athena table. You can now query the Amazon S3 server access logs.
Example — Show who deleted an object and when (timestamp, IP address, and IAM user)
SELECT requestdatetime, remoteip, requester, key FROM
s3_access_logs_db.mybucket_logs
WHERE key = 'images/picture.jpg
' AND operation like '%DELETE%';
Example — Show all operations that were performed by an IAM user
SELECT * FROM
s3_access_logs_db.mybucket_logs
WHERE requester='arn:aws-cn:iam::123456789123
:user/user_name
';
Example — Show all operations that were performed on an object in a specific time period
SELECT * FROM
s3_access_logs_db.mybucket_logs
WHERE Key='prefix/images/picture.jpg
' AND parse_datetime(requestdatetime,'dd/MMM/yyyy:HH:mm:ss Z') BETWEEN parse_datetime('2017-02-18:07:00:00
','yyyy-MM-dd:HH:mm:ss') AND parse_datetime('2017-02-18:08:00:00
','yyyy-MM-dd:HH:mm:ss');
Example — Show how much data was transferred to a specific IP address in a specific time period
SELECT coalesce(SUM(bytessent), 0) AS bytessenttotal FROM
s3_access_logs_db.mybucket_logs
WHERE remoteip='192.0.2.1' AND parse_datetime(requestdatetime,'dd/MMM/yyyy:HH:mm:ss Z') BETWEEN parse_datetime('2022-06-01
','yyyy-MM-dd') AND parse_datetime('2022-07-01
','yyyy-MM-dd');
Note
To reduce the time that you retain your logs, you can create an S3 Lifecycle configuration for your server access logs bucket. Create lifecycle configuration rules to remove log files periodically. Doing so reduces the amount of data that Athena analyzes for each query. For more information, see Setting an S3 Lifecycle configuration on a bucket.
Identifying Signature Version 2 requests by using Amazon S3 access logs
Amazon S3 support for Signature Version 2 will be turned off (deprecated). After that, Amazon S3 will no longer accept requests that use Signature Version 2, and all requests must use Signature Version 4 signing. You can identify Signature Version 2 access requests by using Amazon S3 access logs.
Note
To identify Signature Version 2 requests, we recommend that you use Amazon CloudTrail data events instead of Amazon S3 server access logs. CloudTrail data events are easier to set up and contain more information than server access logs. For more information, see Identifying Amazon S3 Signature Version 2 requests by using CloudTrail.
Example — Show all requesters that are sending Signature Version 2 traffic
SELECT requester, sigv, Count(sigv) as sigcount FROM
s3_access_logs_db.mybucket_logs
GROUP BY requester, sigv;
Identifying object access requests by using Amazon S3 access logs
You can use queries on Amazon S3 server access logs to identify Amazon S3 object access requests,
for operations such as GET
, PUT
, and DELETE
, and
discover further information about those requests.
The following Amazon Athena query example shows how to get all PUT
object
requests for Amazon S3 from a server access log.
Example — Show all requesters that are sending PUT
object requests in a
certain period
SELECT bucket_name, requester, remoteip, key, httpstatus, errorcode, requestdatetime FROM
s3_access_logs_db
WHERE operation='REST.PUT.OBJECT' AND parse_datetime(requestdatetime,'dd/MMM/yyyy:HH:mm:ss Z') BETWEEN parse_datetime('2019-07-01:00:42:42
','yyyy-MM-dd:HH:mm:ss') AND parse_datetime('2019-07-02:00:42:42'
,'yyyy-MM-dd:HH:mm:ss')
The following Amazon Athena query example shows how to get all GET
object
requests for Amazon S3 from the server access log.
Example — Show all requesters that are sending GET
object requests in a
certain period
SELECT bucket_name, requester, remoteip, key, httpstatus, errorcode, requestdatetime FROM
s3_access_logs_db
WHERE operation='REST.GET.OBJECT' AND parse_datetime(requestdatetime,'dd/MMM/yyyy:HH:mm:ss Z') BETWEEN parse_datetime('2019-07-01:00:42:42
','yyyy-MM-dd:HH:mm:ss') AND parse_datetime('2019-07-02:00:42:42'
,'yyyy-MM-dd:HH:mm:ss')
The following Amazon Athena query example shows how to get all anonymous requests to your S3 buckets from the server access log.
Example — Show all anonymous requesters that are making requests to a bucket during a certain period
SELECT bucket_name, requester, remoteip, key, httpstatus, errorcode, requestdatetime FROM
s3_access_logs_db.mybucket_logs
WHERE requester IS NULL AND parse_datetime(requestdatetime,'dd/MMM/yyyy:HH:mm:ss Z') BETWEEN parse_datetime('2019-07-01:00:42:42
','yyyy-MM-dd:HH:mm:ss') AND parse_datetime('2019-07-02:00:42:42'
,'yyyy-MM-dd:HH:mm:ss')
The following Amazon Athena query shows how to identify all requests to your S3 buckets that required an access control list (ACL) for authorization. You can use this information to migrate those ACL permissions to the appropriate bucket policies and disable ACLs. After you've created these bucket policies, you can disable ACLs for these buckets. For more information about disabling ACLs, see Prerequisites for disabling ACLs.
Example — Identify all requests that required an ACL for authorization
SELECT bucket_name, requester, key, operation, aclrequired, requestdatetime FROM
s3_access_logs_db
WHERE aclrequired = 'Yes' AND parse_datetime(requestdatetime,'dd/MMM/yyyy:HH:mm:ss Z') BETWEEN parse_datetime('2022-05-10:00:00:00
','yyyy-MM-dd:HH:mm:ss') AND parse_datetime('2022-08-10:00:00:00'
,'yyyy-MM-dd:HH:mm:ss')
Note
-
You can modify the date range as needed to suit your needs.
-
These query examples might also be useful for security monitoring. You can review the results for
PutObject
orGetObject
calls from unexpected or unauthorized IP addresses or requesters and for identifying any anonymous requests to your buckets. -
This query only retrieves information from the time at which logging was enabled.
-
If you are using Amazon CloudTrail logs, see Identifying access to S3 objects by using CloudTrail.