Use Amazon S3 ARNs instead of passing large payloads
Executions that pass large payloads of data between states can be terminated. If the data you are passing between states might grow to over 262,144 bytes, use Amazon Simple Storage Service (Amazon S3) to store the data, and pass the Amazon Resource Name (ARN) of the bucket in the Payload parameter to get the bucket name and key value. Alternatively, adjust your implementation so that you pass smaller payloads in your executions.
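For example, a caller could upload the large JSON object to Amazon S3 first and then start the execution with only a small reference to it. The following is a minimal Python (boto3) sketch; the bucket name large-payload-json, the local file data.json, and the state machine ARN are placeholders that match the example later in this topic.

import json
import boto3

s3 = boto3.client("s3")
sfn = boto3.client("stepfunctions")

# Store the large payload in Amazon S3 instead of passing it between states.
s3.upload_file("data.json", "large-payload-json", "data.json")

# Start the execution with a small input that only references the object.
sfn.start_execution(
    stateMachineArn="arn:aws:states:us-west-2:123456789012:stateMachine:pass-large-payload",  # placeholder ARN
    input=json.dumps({
        "key": "data.json",
        "bucket": "arn:aws:s3:::large-payload-json"
    })
)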
In the following example, a state machine passes input to an AWS Lambda function, which processes a JSON file in an Amazon S3 bucket. When you run this state machine, the Lambda function reads the contents of the JSON file and returns them as output.
Create the Lambda function
The following Lambda function, named pass-large-payload, reads the contents of a JSON file stored in a specific Amazon S3 bucket.
Note
After you create this Lambda function, make sure you give its IAM role the appropriate permissions to read from an Amazon S3 bucket. For example, attach the AmazonS3ReadOnlyAccess managed policy to the Lambda function's role.
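If you manage that role programmatically, one way to attach the managed policy is through the IAM API. The following is a minimal sketch; the role name pass-large-payload-role is a placeholder for your function's actual execution role.

import boto3

iam = boto3.client("iam")

# Attach the AmazonS3ReadOnlyAccess managed policy to the function's execution role.
# "pass-large-payload-role" is a placeholder; use your Lambda function's role name.
iam.attach_role_policy(
    RoleName="pass-large-payload-role",
    PolicyArn="arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
)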
import json
import os

import boto3

# Reuse the S3 resource across invocations.
s3 = boto3.resource('s3')

def lambda_handler(event, context):
    # The state machine passes the execution input under the "Input" key of Payload.
    event = event['Input']

    # The bucket is passed as an ARN; its name is the last colon-separated segment.
    bucket = event['bucket'].split(':')[-1]
    filename = event['key']

    # Download the object to Lambda's writable /tmp directory and parse it.
    directory = "/tmp/{}".format(filename)
    s3.Bucket(bucket).download_file(filename, directory)
    with open(directory, "r") as jsonfile:
        final_json = json.load(jsonfile)

    # Remove the temporary file before returning the parsed contents.
    os.remove(directory)
    return final_json
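To try the handler outside of Step Functions, you can call it directly with an event shaped like the Payload the state machine sends. This is only a local sketch: it assumes the bucket and object from this example already exist and that your environment has credentials that can read them.

# Local test sketch: the event mirrors the "Payload" built by the state machine.
test_event = {
    "Input": {
        "key": "data.json",
        "bucket": "arn:aws:s3:::large-payload-json"
    }
}
print(lambda_handler(test_event, None))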
Create the state machine
The following state machine invokes the Lambda function you previously created.
{ "StartAt":"Invoke Lambda function", "States":{ "Invoke Lambda function":{ "Type":"Task", "Resource":"arn:aws:states:::lambda:invoke", "Parameters":{ "FunctionName":"arn:aws-cn:lambda:us-west-2:123456789012:function:
pass-large-payload
", "Payload":{ "Input.$":"$" } }, "OutputPath": "$.Payload", "End":true } } }
Rather than pass a large amount of data in the input, you could save that data in an Amazon S3 bucket, and pass the Amazon Resource Name (ARN) of the bucket in the Payload parameter to get the bucket name and key value. Your Lambda function can then use that ARN to access the data directly. The following is example input for the state machine execution, where the data is stored in data.json in an Amazon S3 bucket named large-payload-json.
{
   "key": "data.json",
   "bucket": "arn:aws:s3:::large-payload-json"
}
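With this input, the handler in pass-large-payload only needs the last segment of the bucket ARN to call Amazon S3; the following lines show what the event['bucket'].split(':')[-1] expression in the function resolves to.

# Resolving the bucket name from the ARN passed in the execution input.
bucket_arn = "arn:aws:s3:::large-payload-json"
bucket_name = bucket_arn.split(":")[-1]   # "large-payload-json"
key = "data.json"                         # object key passed alongside the ARN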