Map workflow state
Use the Map state to run a set of workflow steps for each item in a dataset. The Map state's iterations run in parallel, which makes it possible to process a dataset quickly. Map states can use a variety of input types, including a JSON array, a list of Amazon S3 objects, or a CSV file.
Step Functions provides two types of processing modes for using the Map state in your workflows: Inline mode and Distributed mode.
Tip
To deploy an example of a workflow that uses a Map state to your Amazon Web Services account, see Module 5 - Choice State and Map State.
Map state processing modes
Step Functions provides the following processing modes for the Map state, depending on how you want to process the items in a dataset.
- Inline – Limited-concurrency mode. In this mode, each iteration of the Map state runs in the context of the workflow that contains the Map state. Step Functions adds the execution history of these iterations to the parent workflow's execution history. By default, Map states run in Inline mode. In this mode, the Map state accepts only a JSON array as input and supports up to 40 concurrent iterations. For more information, see Using Map state in Inline mode in Step Functions workflows.
- Distributed – High-concurrency mode. In this mode, the Map state runs each iteration as a child workflow execution, which enables high concurrency of up to 10,000 parallel child workflow executions. Each child workflow execution has its own execution history, separate from that of the parent workflow. In this mode, the Map state can accept either a JSON array or an Amazon S3 data source, such as a CSV file, as its input. For more information, see Distributed mode.
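To make the two modes concrete, here is a minimal sketch that builds one Map state definition for each mode as a Python dictionary and prints it as JSON. The field names (`ItemsPath`, `ItemReader`, `ItemProcessor`, `ProcessorConfig`, `MaxConcurrency`) follow the Amazon States Language as commonly documented, not this page, so check them against the current reference. The state name `ProcessItem`, the `$.orders` path, the bucket `amzn-s3-demo-bucket`, and the key `orders.csv` are hypothetical placeholders.

```python
import json


def item_processor(mode: str) -> dict:
    """Build an ItemProcessor containing one pass-through step (hypothetical)."""
    config = {"Mode": mode}
    if mode == "DISTRIBUTED":
        # Assumption: Distributed mode also needs an execution type for the
        # child workflow executions.
        config["ExecutionType"] = "STANDARD"
    return {
        "ProcessorConfig": config,
        "StartAt": "ProcessItem",
        "States": {"ProcessItem": {"Type": "Pass", "End": True}},
    }


# Inline mode: the input is a JSON array taken from the state input.
inline_map = {
    "Type": "Map",
    "ItemsPath": "$.orders",  # hypothetical path to the JSON array
    "MaxConcurrency": 40,     # Inline mode supports up to 40 concurrent iterations
    "ItemProcessor": item_processor("INLINE"),
    "End": True,
}

# Distributed mode: the input is read from an Amazon S3 data source (a CSV file).
distributed_map = {
    "Type": "Map",
    "ItemReader": {
        "Resource": "arn:aws:states:::s3:getObject",
        "ReaderConfig": {"InputType": "CSV", "CSVHeaderLocation": "FIRST_ROW"},
        "Parameters": {"Bucket": "amzn-s3-demo-bucket", "Key": "orders.csv"},
    },
    "MaxConcurrency": 1000,   # any value up to the 10,000 limit
    "ItemProcessor": item_processor("DISTRIBUTED"),
    "End": True,
}

print(json.dumps({"InlineMap": inline_map, "DistributedMap": distributed_map}, indent=2))
```

The only structural differences in this sketch are the processor mode and where the items come from: a path into the state input for Inline mode, and an S3 reader for Distributed mode.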
The mode you should use depends on how you want to process the items in a dataset. Use the Map state in Inline mode if your workflow's execution history won't exceed 25,000 entries, or if you don't require more than 40 concurrent iterations.
Use the Map state in Distributed mode when you need to orchestrate large-scale parallel workloads that meet any combination of the following conditions:
- The size of your dataset exceeds 256 KB.
- The workflow's execution event history exceeds 25,000 entries.
- You need a concurrency of more than 40 parallel iterations.
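The conditions above amount to a simple threshold check. The following sketch encodes them in a hypothetical helper, `choose_map_mode`, which is not part of any AWS SDK. It assumes that 256 KB means 256 × 1024 bytes and that you can estimate the dataset size, event count, and required concurrency ahead of time.

```python
# Thresholds taken from the conditions listed above.
INLINE_MAX_CONCURRENCY = 40
MAX_PAYLOAD_BYTES = 256 * 1024  # assumption: 1 KB = 1024 bytes
MAX_HISTORY_EVENTS = 25_000


def choose_map_mode(dataset_bytes: int,
                    expected_history_events: int,
                    required_concurrency: int) -> str:
    """Return "DISTRIBUTED" if any Inline-mode limit would be exceeded."""
    needs_distributed = (
        dataset_bytes > MAX_PAYLOAD_BYTES
        or expected_history_events > MAX_HISTORY_EVENTS
        or required_concurrency > INLINE_MAX_CONCURRENCY
    )
    return "DISTRIBUTED" if needs_distributed else "INLINE"


# A small job fits in Inline mode; a large CSV file does not.
print(choose_map_mode(10_000, 500, 10))
print(choose_map_mode(5 * 1024 * 1024, 500, 10))
```

Any single exceeded limit is enough to select Distributed mode, which matches the "any combination" wording of the conditions.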
Inline mode and Distributed mode differences
The following table highlights the differences between the Inline and Distributed modes.
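Aspect | Inline mode | Distributed mode |
---|---|---|
Supported data sources | Accepts a JSON array passed from a previous step in the workflow as input. | Accepts a JSON array passed from a previous step in the workflow, or an Amazon S3 data source such as a list of Amazon S3 objects or a CSV file, as input. |
Map iterations | In this mode, each iteration of the Map state runs in the context of the workflow that contains the Map state. Step Functions adds the execution history of these iterations to the parent workflow's execution history. | In this mode, the Map state runs each iteration as a child workflow execution. Each child workflow execution has its own execution history, separate from that of the parent workflow. |
Maximum concurrency for parallel iterations | Lets you run up to 40 iterations as concurrently as possible. | Lets you run up to 10,000 parallel child workflow executions to process millions of data items at one time. |
Input payload and event history sizes | Enforces a limit of 256 KB on the input payload size and 25,000 entries in the execution event history. | Lets you overcome the payload size limitation because the Map state can read input directly from Amazon S3 data sources. In this mode, you can also overcome execution history limitations because the child workflow executions started by the Map state maintain their own execution histories, separate from the parent workflow's execution history. |
Monitoring and observability | You can review the workflow's execution history from the console or by invoking the GetExecutionHistory API action. You can also view the execution history through CloudWatch and X-Ray. | When you run a Map state in Distributed mode, Step Functions creates a Map Run resource, which you can view in the Step Functions console. For more information, see Viewing a Distributed Map Run execution in Step Functions. |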