Using DynamoDB as a checkpoint store for LangGraph agents - Amazon DynamoDB

LangGraph is a framework for building stateful, multi-actor AI applications with large language models (LLMs). LangGraph agents need persistent storage to maintain conversation state, enable human-in-the-loop workflows, support fault tolerance, and provide time-travel debugging. DynamoDB's serverless architecture, single-digit millisecond latency, and automatic scaling make it well suited as a checkpoint store for production LangGraph deployments on AWS.

The langgraph-checkpoint-aws package provides a DynamoDBSaver class that implements the LangGraph checkpoint interface. It persists agent state in DynamoDB and can optionally offload large checkpoints to Amazon Simple Storage Service (Amazon S3).

Key features

State persistence

Automatically saves agent state after each step, enabling agents to resume from interruptions and recover from failures.

Time to Live-based cleanup

Automatically expires old checkpoints using DynamoDB Time to Live (TTL) to manage storage costs.

Compression

Optionally compress checkpoint data with gzip to reduce storage costs and improve throughput.

Amazon S3 offloading

Automatically offloads large checkpoints (greater than 350 KB) to Amazon S3 so that items stay within the DynamoDB 400 KB item size limit.

Sync and async support

Provides both synchronous and asynchronous APIs, so the same checkpoint store fits different application architectures.
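
The compression and Amazon S3 offloading features described above can be pictured with a small standard-library sketch. It is an illustration only, not the package's implementation: the helper names compress_checkpoint, decompress_checkpoint, and needs_s3_offload are hypothetical, JSON stands in for whatever serializer the package uses, and reading the 350 KB threshold as 350 * 1024 bytes is an assumption.

```python
import gzip
import json

# Assumption: the 350 KB offload threshold is read as 350 * 1024 bytes.
OFFLOAD_THRESHOLD_BYTES = 350 * 1024


def compress_checkpoint(state: dict) -> bytes:
    """Serialize a state dict to JSON and gzip it (hypothetical helper)."""
    raw = json.dumps(state).encode("utf-8")
    return gzip.compress(raw)


def decompress_checkpoint(blob: bytes) -> dict:
    """Reverse compress_checkpoint: gunzip, then parse the JSON."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


def needs_s3_offload(payload: bytes) -> bool:
    """Return True when the payload is larger than the offload threshold."""
    return len(payload) > OFFLOAD_THRESHOLD_BYTES


# A long, repetitive conversation history compresses very well.
state = {"messages": ["What is the status of order 42?"] * 20000}
raw = json.dumps(state).encode("utf-8")
blob = compress_checkpoint(state)

print("uncompressed bytes:", len(raw), "offload:", needs_s3_offload(raw))
print("compressed bytes:", len(blob), "offload:", needs_s3_offload(blob))
```

In this sketch the uncompressed payload is above the threshold while the compressed payload is far below it, which shows why enabling compression can keep some checkpoints in DynamoDB that would otherwise be offloaded.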

Prerequisites

  • Python 3.10 or later

  • An Amazon Web Services account with permissions to create DynamoDB tables (and optionally Amazon S3 buckets)

  • AWS credentials configured (see the AWS documentation for credential setup options)

Important

This guide creates Amazon resources that may incur charges. DynamoDB uses pay-per-request billing by default, and Amazon S3 charges apply if you enable large checkpoint offloading. Follow the Clean up section to delete resources when you are done.

Installation

Install the checkpoint package from PyPI:

pip install langgraph-checkpoint-aws

Basic usage

The following example demonstrates how to configure DynamoDB as a checkpoint store for a LangGraph agent:

    from langgraph.graph import StateGraph
    from langgraph_checkpoint_aws import DynamoDBSaver
    from typing import TypedDict

    # Define your state schema
    class State(TypedDict):
        input: str
        result: str

    # Initialize the DynamoDB checkpoint saver
    checkpointer = DynamoDBSaver(
        table_name="langgraph-checkpoints",
        region_name="us-east-1"
    )

    # Build your LangGraph workflow
    builder = StateGraph(State)
    builder.add_node("process", lambda state: {"result": "processed"})
    builder.set_entry_point("process")
    builder.set_finish_point("process")

    # Compile the graph with the DynamoDB checkpointer
    graph = builder.compile(checkpointer=checkpointer)

    # Invoke the graph with a thread ID to enable state persistence
    config = {"configurable": {"thread_id": "session-123"}}
    result = graph.invoke({"input": "data"}, config)

The thread_id in the configuration acts as the partition key in DynamoDB, allowing you to maintain separate conversation threads and retrieve historical states for any thread.

Production configuration

For production deployments, you can enable Time to Live, compression, and Amazon S3 offloading. You can also use the endpoint_url parameter to point to a local DynamoDB instance for testing:

    import boto3
    from botocore.config import Config
    from langgraph_checkpoint_aws import DynamoDBSaver

    # Production configuration
    session = boto3.Session(
        profile_name="production",
        region_name="us-east-1"
    )

    checkpointer = DynamoDBSaver(
        table_name="langgraph-checkpoints",
        session=session,
        ttl_seconds=86400 * 7,  # Expire checkpoints after 7 days
        enable_checkpoint_compression=True,  # Enable gzip compression
        boto_config=Config(
            retries={"mode": "adaptive", "max_attempts": 6},
            max_pool_connections=50
        ),
        s3_offload_config={
            "bucket_name": "my-checkpoint-bucket"
        }
    )

    # Local testing with DynamoDB Local
    local_checkpointer = DynamoDBSaver(
        table_name="langgraph-checkpoints",
        region_name="us-east-1",
        endpoint_url="http://localhost:8000"
    )

DynamoDB table configuration

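The checkpoint saver requires a DynamoDB table with a composite primary key: a partition key named PK and a sort key named SK, both of type String. You can create the table using the following AWS CloudFormation template, which also enables Time to Live on the ttl attribute, point-in-time recovery, and server-side encryption: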

    AWSTemplateFormatVersion: '2010-09-09'
    Description: 'DynamoDB table for LangGraph checkpoint storage'

    Parameters:
      TableName:
        Type: String
        Default: langgraph-checkpoints

    Resources:
      CheckpointTable:
        Type: AWS::DynamoDB::Table
        DeletionPolicy: Retain
        UpdateReplacePolicy: Retain
        Properties:
          TableName: !Ref TableName
          BillingMode: PAY_PER_REQUEST
          AttributeDefinitions:
            - AttributeName: PK
              AttributeType: S
            - AttributeName: SK
              AttributeType: S
          KeySchema:
            - AttributeName: PK
              KeyType: HASH
            - AttributeName: SK
              KeyType: RANGE
          TimeToLiveSpecification:
            AttributeName: ttl
            Enabled: true
          PointInTimeRecoverySpecification:
            PointInTimeRecoveryEnabled: true
          SSESpecification:
            SSEEnabled: true

Save the template as template.yaml, then deploy it with the AWS CLI:

    aws cloudformation deploy \
      --template-file template.yaml \
      --stack-name langgraph-checkpoint \
      --parameter-overrides TableName=langgraph-checkpoints
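
After deployment, it can be useful to confirm that the table has the expected key schema before pointing an agent at it. The sketch below checks a DescribeTable response against the PK and SK schema from the template. The function key_schema_problems is a hypothetical helper, and the sample dictionary is hand-written and trimmed to the fields the check reads.

```python
# Key schema taken from the CloudFormation template in this section.
EXPECTED_KEY_SCHEMA = {"PK": "HASH", "SK": "RANGE"}


def key_schema_problems(description: dict) -> list[str]:
    """Compare a DescribeTable response with the expected PK/SK key schema."""
    table = description.get("Table", {})
    actual = {
        key["AttributeName"]: key["KeyType"]
        for key in table.get("KeySchema", [])
    }
    problems = []
    for name, key_type in EXPECTED_KEY_SCHEMA.items():
        if actual.get(name) != key_type:
            problems.append(
                f"expected {name} as {key_type}, found {actual.get(name)}"
            )
    if table.get("TableStatus") != "ACTIVE":
        problems.append(f"table status is {table.get('TableStatus')}, not ACTIVE")
    return problems


# Hand-written sample shaped like a DescribeTable response.
sample = {
    "Table": {
        "TableName": "langgraph-checkpoints",
        "TableStatus": "ACTIVE",
        "KeySchema": [
            {"AttributeName": "PK", "KeyType": "HASH"},
            {"AttributeName": "SK", "KeyType": "RANGE"},
        ],
    }
}
print(key_schema_problems(sample))  # []
```

With boto3, the input would come from boto3.client("dynamodb", region_name="us-east-1").describe_table(TableName="langgraph-checkpoints"). That call needs the dynamodb:DescribeTable permission, which is not part of the minimum policy shown in this guide.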

Required IAM permissions

The following IAM policy provides the minimum permissions required for the DynamoDB checkpoint saver. Replace 111122223333 with your Amazon Web Services account ID and update the Region to match your environment.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "dynamodb:GetItem",
            "dynamodb:PutItem",
            "dynamodb:Query",
            "dynamodb:BatchGetItem",
            "dynamodb:BatchWriteItem"
          ],
          "Resource": "arn:aws:dynamodb:us-east-1:111122223333:table/langgraph-checkpoints"
        }
      ]
    }

If you enable Amazon S3 offloading, add the following two statements to the Statement array of the policy:

    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObjectTagging"
      ],
      "Resource": "arn:aws:s3:::my-checkpoint-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLifecycleConfiguration",
        "s3:PutBucketLifecycleConfiguration"
      ],
      "Resource": "arn:aws:s3:::my-checkpoint-bucket"
    }

Asynchronous usage

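For asynchronous applications, compile the graph with the same DynamoDBSaver and call the graph's asynchronous methods, such as ainvoke, from a coroutine. LangGraph then uses the asynchronous methods provided by the checkpoint saver: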

    import asyncio
    from langgraph.graph import StateGraph
    from langgraph_checkpoint_aws import DynamoDBSaver
    from typing import TypedDict

    class State(TypedDict):
        input: str
        result: str

    async def main():
        checkpointer = DynamoDBSaver(
            table_name="langgraph-checkpoints",
            region_name="us-east-1"
        )

        builder = StateGraph(State)
        builder.add_node("process", lambda state: {"result": "processed"})
        builder.set_entry_point("process")
        builder.set_finish_point("process")

        graph = builder.compile(checkpointer=checkpointer)

        config = {"configurable": {"thread_id": "async-session-123"}}
        result = await graph.ainvoke({"input": "data"}, config)
        return result

    asyncio.run(main())

Clean up

To avoid ongoing charges, delete the resources you created:

    # Delete the DynamoDB table. The template sets DeletionPolicy: Retain,
    # so deleting the stack alone does not remove the table.
    aws dynamodb delete-table --table-name langgraph-checkpoints

    # Delete the CloudFormation stack (if you used the template above)
    aws cloudformation delete-stack --stack-name langgraph-checkpoint

    # If you created an S3 bucket for large checkpoint offloading, empty and delete it
    aws s3 rm s3://my-checkpoint-bucket --recursive
    aws s3 rb s3://my-checkpoint-bucket

Error handling

Common error scenarios:

  • Table not found: Verify the table_name and region_name match your DynamoDB table.

  • Throttling: If you see ProvisionedThroughputExceededException, consider switching to on-demand billing mode or increasing provisioned capacity.

  • Item size exceeded: If checkpoints exceed 350 KB, enable Amazon S3 offloading (see Production configuration).

  • Credential errors: Verify that your AWS credentials are valid and have the required permissions.
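
The scenarios above can be told apart by the error code in a botocore-style error response. The following sketch maps common codes to the guidance in this list. The function advice_for_error is a hypothetical helper, the set of codes is illustrative rather than exhaustive, and it is an assumption that DynamoDBSaver surfaces the underlying botocore ClientError unchanged.

```python
def advice_for_error(error_response: dict) -> str:
    """Map a botocore-style error response to the guidance in this section."""
    error = error_response.get("Error", {})
    code = error.get("Code", "")
    message = error.get("Message", "")

    if code == "ResourceNotFoundException":
        return "Table not found: check table_name and region_name."
    if code in ("ProvisionedThroughputExceededException", "ThrottlingException"):
        return "Throttling: switch to on-demand billing or raise provisioned capacity."
    # Assumption: oversized items are reported as a ValidationException
    # whose message mentions the item size.
    if code == "ValidationException" and "size" in message.lower():
        return "Item size exceeded: enable Amazon S3 offloading."
    if code in (
        "AccessDeniedException",
        "UnrecognizedClientException",
        "ExpiredTokenException",
    ):
        return "Credential error: check credentials and IAM permissions."
    return f"Unhandled error code: {code or 'unknown'}"


# Hand-written sample shaped like the `response` attribute of a ClientError.
sample_error = {
    "Error": {
        "Code": "ResourceNotFoundException",
        "Message": "Requested resource not found",
    }
}
print(advice_for_error(sample_error))
```

In application code, the input would typically be the response attribute of a caught botocore.exceptions.ClientError. Missing credentials are raised by botocore as a separate exception type (NoCredentialsError) and would need their own except clause.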

Additional resources