AWS Step Functions
Serverless workflow orchestration service for coordinating distributed applications and multi-step processes using visual state machines.
Definition
AWS Step Functions coordinates distributed work as state machines defined in Amazon States Language (ASL), a JSON-based language (some AWS tooling also accepts YAML definitions). Each state represents a step: invoke Lambda, call an AWS service integration, wait, branch, parallelize, or map over a collection. Step Functions owns retries, error routing, timeouts, and execution history so application code focuses on business logic. It is widely used for order processing, ETL, ML pipelines, microservice sagas, and multi-step GenAI agent workflows that chain Bedrock, Lambda tools, and human approval.
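To make the definition concrete, here is a minimal sketch of a state machine with a Task, declarative Retry and Catch, and a Choice branch. It is built as a Python dict and serialized to ASL JSON. The Lambda ARN, state names, and the `$.valid` field are assumptions chosen for illustration, and the `transition_targets` helper is a hypothetical sanity check, not part of any AWS SDK.

```python
import json

# Hypothetical ARN, used only for illustration.
VALIDATE_ORDER_LAMBDA = "arn:aws:lambda:us-east-1:123456789012:function:validate-order"

definition = {
    "Comment": "Illustrative order workflow (all names are placeholders)",
    "StartAt": "ValidateOrder",
    "TimeoutSeconds": 300,
    "States": {
        "ValidateOrder": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": VALIDATE_ORDER_LAMBDA, "Payload.$": "$"},
            "OutputPath": "$.Payload",
            # Retries live in the state machine, not in the function code.
            "Retry": [
                {
                    "ErrorEquals": ["Lambda.ServiceException", "Lambda.TooManyRequestsException"],
                    "IntervalSeconds": 2,
                    "MaxAttempts": 3,
                    "BackoffRate": 2.0,
                }
            ],
            # Any remaining error is routed to a failure state.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "OrderFailed"}],
            "Next": "IsValid",
        },
        "IsValid": {
            "Type": "Choice",
            "Choices": [{"Variable": "$.valid", "BooleanEquals": True, "Next": "OrderAccepted"}],
            "Default": "OrderFailed",
        },
        "OrderAccepted": {"Type": "Succeed"},
        "OrderFailed": {"Type": "Fail", "Error": "OrderRejected", "Cause": "Validation failed or errored"},
    },
}


def transition_targets(defn):
    """Collect every state name referenced by StartAt, Next, Default, Choices, or Catch."""
    targets = {defn["StartAt"]}
    for state in defn["States"].values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for rule in state.get("Choices", []) + state.get("Catch", []):
            targets.add(rule["Next"])
    return targets


# Local sanity check: every transition must point at a defined state.
missing = transition_targets(definition) - set(definition["States"])
asl_json = json.dumps(definition, indent=2)
```

The resulting `asl_json` string is what would be supplied as the state machine definition at deploy time. The local check only catches dangling transitions and does not replace service-side validation.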
When to use it
- Multi-step processes needing declarative retry, catch, and parallel logic without custom orchestration code
- Long-running workflows (Standard, up to one year) with auditable execution history — approvals, batch jobs, provisioning pipelines
- High-volume, short workflows (Express, up to five minutes) for event-driven processing at scale
- AI agent orchestration where you want a visible audit trail of model calls, tool invocations, and human checkpoints
- Direct AWS SDK integrations (220+ services) to avoid Lambda wrappers for simple service calls
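As a sketch of the last point, the two Task states below call DynamoDB and S3 directly from the state machine, with no Lambda function in between. The table name, bucket name, and state names are hypothetical, and `task_state` is a small helper written for this example rather than an AWS API.

```python
import json


def task_state(resource, parameters, next_state=None, result_path=None):
    """Build an ASL Task state; the state ends the workflow when next_state is None."""
    state = {"Type": "Task", "Resource": resource, "Parameters": parameters}
    if result_path is not None:
        state["ResultPath"] = result_path
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    return state


definition = {
    "StartAt": "SaveOrder",
    "States": {
        # Optimized DynamoDB integration: the state machine writes the item itself.
        "SaveOrder": task_state(
            "arn:aws:states:::dynamodb:putItem",
            {
                "TableName": "Orders",  # hypothetical table name
                "Item": {
                    "orderId": {"S.$": "$.orderId"},
                    "status": {"S": "RECEIVED"},
                },
            },
            next_state="ArchiveRequest",
            # Keep the original input and attach the service response beside it.
            result_path="$.saveResult",
        ),
        # AWS SDK integration, in the form arn:aws:states:::aws-sdk:<service>:<apiAction>.
        "ArchiveRequest": task_state(
            "arn:aws:states:::aws-sdk:s3:putObject",
            {
                "Bucket": "example-order-archive",  # hypothetical bucket name
                "Key.$": "States.Format('orders/{}.json', $.orderId)",
                "Body.$": "States.JsonToString($)",
            },
        ),
    },
}

asl_json = json.dumps(definition, indent=2)
```

Setting `ResultPath` on the first state matters here: without it the DynamoDB response would replace the state input, and `$.orderId` would no longer be available to the second state.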
When not to use it
- Single synchronous Lambda or API call with no branching — Step Functions adds latency and cost
- Sub-second latency requirements on high-QPS paths — Express still adds orchestration overhead
- Teams that prefer code-centric durable execution in Lambda alone — Lambda Durable Functions may fit developer-centric workflows
- Workflows dominated by complex data transformation better expressed in application code than ASL
Tips
- Put orchestration in the state machine — retries, waits, parallel branches, and error handlers belong in ASL, not buried in Lambda
- Use Standard for auditability and waitForTaskToken; use Express for throughput, since Standard per-transition pricing hurts at high volume
- waitForTaskToken pauses until an external system returns a token; prefer this over polling loops for human approval or third-party callbacks
- For Bedrock agent flows, choose Express or Standard based on duration and audit requirements
- Enable CloudWatch Logs logging for Express workflows; unlike Standard, Step Functions does not retain their execution history for you
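The callback tip can be sketched as a Task state that sends a message to SQS and pauses until the task token comes back. The queue URL, state names, and message fields are assumptions for illustration. `callback_kwargs` is a hypothetical helper that only builds the arguments an approver service might pass to the Step Functions `SendTaskSuccess` API (for example through boto3's `send_task_success`); it does not call AWS.

```python
import json

# Hypothetical queue URL, used only for illustration.
APPROVAL_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/approvals"

approval_state = {
    "Type": "Task",
    # The .waitForTaskToken suffix pauses the execution until the token is returned.
    "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
    "Parameters": {
        "QueueUrl": APPROVAL_QUEUE_URL,
        "MessageBody": {
            "request.$": "$.request",
            # The token comes from the context object, addressed with $$.
            "taskToken.$": "$$.Task.Token",
        },
    },
    # Bound the wait so an unanswered approval does not stall the execution.
    "TimeoutSeconds": 86400,
    "Catch": [{"ErrorEquals": ["States.Timeout"], "Next": "ApprovalTimedOut"}],
    "Next": "ApplyChange",
}


def callback_kwargs(task_token, approved_by):
    """Build keyword arguments for a SendTaskSuccess call (the output must be a JSON string)."""
    return {"taskToken": task_token, "output": json.dumps({"approvedBy": approved_by})}


example_call = callback_kwargs("example-token", "reviewer@example.com")
```

The consumer of the queue reads `taskToken` from the message body and returns it once the human or third-party step completes, which resumes the workflow at `ApplyChange`.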
Gotchas
Serious
- Wrong workflow type: Standard workflows on high-frequency short jobs inflate bills; Express on processes needing long audit trails loses durable history in-console.
- Lambda-as-glue anti-pattern: Wrapping every AWS call in Lambda when an optimized integration exists adds failure points and cold starts.
- Non-idempotent Express tasks: Asynchronous Express workflows run at-least-once, so side effects without idempotency keys can duplicate charges or records.
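The idempotency concern can be illustrated with a minimal sketch of a task handler that performs its side effect at most once per key. The in-memory dict stands in for a durable store (a real system would more likely use something like a conditional database write), and the key format, `handle_charge`, and `charge_card` are hypothetical names for this example.

```python
class IdempotentExecutor:
    """Run an action at most once per key and replay the stored result on duplicates."""

    def __init__(self):
        # In-memory stand-in for a durable store; an assumption for this sketch.
        self._results = {}

    def run(self, key, action):
        if key in self._results:
            # Duplicate delivery: return the earlier result without repeating the side effect.
            return self._results[key]
        result = action()
        self._results[key] = result
        return result


charges = []  # records every real side effect so duplicates are visible


def charge_card(order_id, amount_cents):
    """Hypothetical side effect: record a charge and return a receipt."""
    charges.append((order_id, amount_cents))
    return {"orderId": order_id, "charged": amount_cents}


executor = IdempotentExecutor()


def handle_charge(event):
    """Task handler: derive the key from business data rather than from the delivery."""
    key = f"charge:{event['orderId']}"
    return executor.run(key, lambda: charge_card(event["orderId"], event["amountCents"]))


# Simulate the same task being delivered twice.
event = {"orderId": "o-123", "amountCents": 4200}
first = handle_charge(event)
second = handle_charge(event)
```

Deriving the key from business data such as the order ID, rather than from anything tied to a single delivery attempt, is what lets the second call recognize the duplicate.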
Regular
- ASL JSON errors fail at deploy time with cryptic line references; validate with aws stepfunctions validate-state-machine-definition
- Map state concurrency defaults can overwhelm downstream APIs; tune MaxConcurrency
- Large payloads between states hit Step Functions input/output size limits; store blobs in S3 and pass references
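The last two gotchas can be sketched together: a Map state with an explicit MaxConcurrency, and a helper that swaps an oversized payload for a reference. The 256 KiB figure is an assumed limit that should be confirmed against current Step Functions quotas. The function name, `put_blob` callable, `s3Key` field, and Lambda function name are hypothetical.

```python
import json

# Assumed payload limit for this sketch; confirm against current service quotas.
MAX_STATE_PAYLOAD_BYTES = 256 * 1024


def to_state_input(payload, put_blob):
    """Return the payload unchanged if it fits, otherwise store it and return a reference.

    put_blob is a caller-supplied function (hypothetical) that stores the
    serialized body, for example in S3, and returns the object key.
    """
    body = json.dumps(payload)
    if len(body.encode("utf-8")) <= MAX_STATE_PAYLOAD_BYTES:
        return payload
    return {"s3Key": put_blob(body)}


process_items = {
    "Type": "Map",
    "ItemsPath": "$.items",
    # Cap parallel iterations so the downstream API is not flooded.
    "MaxConcurrency": 5,
    "ItemProcessor": {
        "ProcessorConfig": {"Mode": "INLINE"},
        "StartAt": "CallDownstream",
        "States": {
            "CallDownstream": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {
                    "FunctionName": "call-downstream-api",  # hypothetical function name
                    "Payload.$": "$",
                },
                "End": True,
            }
        },
    },
    "End": True,
}

small = to_state_input({"id": 1}, lambda body: "blobs/unused.json")
large = to_state_input({"data": "x" * 300000}, lambda body: "blobs/1.json")
```

States downstream of a reference then fetch the blob by key when they need it, so the state input stays small regardless of the blob size.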
Official references
- What is Step Functions? — state types, integrations, and ASL
- Standard vs Express workflows — choosing workflow type