
Workflow orchestration on AWS

An overview of workflow orchestration and the associated services on AWS

A workflow is a defined process or sequence of tasks that leads to the completion of a particular job. Typically, workflows are used to make jobs more predictable and efficient. Workflow orchestration refers to the coordination and management of these tasks, focusing on who is responsible for managing each step towards the end product.


In essence, the orchestrator plays a managerial role: it oversees the execution of the workflow's components, while the components themselves carry out the specific tasks that contribute to the larger goal of the workflow.


On AWS, the primary services for orchestration are AWS Step Functions and Amazon Simple Workflow Service (SWF). Additionally, there are more specialized services for workflow orchestration tailored to specific use cases, such as AWS Batch for batch computing workflows, AWS Data Pipeline for data processing workflows, and AWS Glue for data integration and ETL (Extract, Transform, Load) workflows.



AWS Step Functions


AWS Step Functions is a serverless orchestration service for building and coordinating AWS Lambda functions and other AWS services in business applications. It uses state machines and tasks to represent workflows, visualizing each step in an application's workflow through its graphical console.


Key Aspects:


State Machines


Workflows, called state machines, consist of states (steps) that perform tasks, make choices, and pass parameters to one another.


Task State: Performs specific tasks.


Choice State: Decides between different execution branches.


Fail or Succeed State: Terminates the execution, indicating either failure or success.


Pass State: Transfers input directly to output or introduces fixed data into the workflow.


Wait State: Delays the process for a set period or until a certain date and time.


Parallel State: Initiates concurrent execution branches.


Map State: Runs the same set of steps once for each item in an input array (dynamic iteration).
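To make the state types above more concrete, here is a minimal sketch of a state machine definition written as a Python dictionary and serialized to JSON in the Amazon States Language. The workflow itself, the state names, and the Lambda ARN (with its placeholder account ID) are illustrative assumptions, not something prescribed by the service. The small helper only checks that every transition points at a state that exists.

```python
import json

# Hypothetical workflow: state names, the ARN, and the wait time are
# illustrative assumptions.
definition = {
    "Comment": "Illustrative order check",
    "StartAt": "CheckOrder",
    "States": {
        # Task state: performs a unit of work (here, a placeholder Lambda).
        "CheckOrder": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:check-order",
            "Next": "IsValid",
        },
        # Choice state: picks a branch based on the task's output.
        "IsValid": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.valid", "BooleanEquals": True, "Next": "CoolDown"}
            ],
            "Default": "Rejected",
        },
        # Wait state: pauses for a fixed number of seconds.
        "CoolDown": {"Type": "Wait", "Seconds": 10, "Next": "Done"},
        # Succeed and Fail states: terminate the execution.
        "Done": {"Type": "Succeed"},
        "Rejected": {
            "Type": "Fail",
            "Error": "InvalidOrder",
            "Cause": "Order failed validation",
        },
    },
}


def referenced_states(defn):
    """Collect every state name that is used as a transition target."""
    targets = {defn["StartAt"]}
    for state in defn["States"].values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for rule in state.get("Choices", []):
            targets.add(rule["Next"])
    return targets


# Any name in this set would be a dangling transition.
missing = referenced_states(definition) - set(definition["States"])
asl_json = json.dumps(definition, indent=2)
print(asl_json)
```

The resulting JSON string is what would be supplied as the state machine definition when creating it in the console or through an SDK.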



Graphical Workflow Design


The visual console aids in designing these state machines, streamlining the creation of multi-step applications.



AWS Service Integration


Tasks within states can invoke various AWS services, such as AWS Lambda, Amazon ECS / AWS Fargate, Amazon EKS, Amazon SNS, Amazon API Gateway, Amazon SQS, AWS Glue, Amazon EventBridge, AWS CodeBuild, AWS Batch, and Amazon DynamoDB.



Third-Party Integration


Beyond its integrations with AWS services, Step Functions can also invoke any HTTP endpoint, which lets a workflow call third-party APIs.



Error Handling


Built-in controls for error handling and retries ensure smooth application execution, complemented by detailed logging for debugging.
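As an illustration of these built-in controls, the sketch below shows a Task state with a `Retry` and a `Catch` block, again expressed as a Python dictionary. The state names and the function ARN are hypothetical. The helper computes the delay before each retry from `IntervalSeconds`, `BackoffRate`, and `MaxAttempts`; it is a simplification that ignores optional settings such as `MaxDelaySeconds` and jitter.

```python
# Hypothetical Task state; the function ARN and state names are placeholders.
charge_card = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:charge-card",
    # Retry: re-run the task on the listed errors with exponential backoff.
    "Retry": [
        {
            "ErrorEquals": ["States.Timeout", "States.TaskFailed"],
            "IntervalSeconds": 2,
            "MaxAttempts": 3,
            "BackoffRate": 2.0,
        }
    ],
    # Catch: once retries are exhausted, route to a fallback state and keep
    # the error details under $.error.
    "Catch": [
        {"ErrorEquals": ["States.ALL"], "ResultPath": "$.error", "Next": "NotifyFailure"}
    ],
    "Next": "Confirm",
}


def retry_delays(retrier):
    """Return the wait in seconds before each retry attempt.

    Simplified model: interval * backoff_rate ** attempt_index, without
    MaxDelaySeconds or jitter.
    """
    interval = retrier.get("IntervalSeconds", 1)
    rate = retrier.get("BackoffRate", 2.0)
    attempts = retrier.get("MaxAttempts", 3)
    return [interval * rate ** n for n in range(attempts)]


delays = retry_delays(charge_card["Retry"][0])
print(delays)  # [2.0, 4.0, 8.0]
```

With these settings the task is retried after roughly 2, 4, and 8 seconds before the `Catch` block takes over.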



Workflow Types:


Standard Workflows


Suited for long-running workflows, offering detailed execution history and visual debugging. An execution can run for up to one year, and each step is executed exactly once.



Express Workflows


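Designed for high-event-rate applications (e.g., IoT data ingestion). An execution can run for up to five minutes. Asynchronous Express Workflows provide an at-least-once execution guarantee, while synchronous Express Workflows provide at-most-once.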
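The workflow type is chosen when the state machine is created. The following sketch builds the request parameters that could be passed to the `create_state_machine` call of the boto3 Step Functions client, without actually calling AWS. The helper name `build_create_request`, the state machine name, and the role ARN are assumptions made for this example.

```python
import json


def build_create_request(name, definition, role_arn, high_volume_short_lived):
    """Build keyword arguments for a create_state_machine request.

    EXPRESS is picked for short, high-event-rate workloads; STANDARD for
    long-running workflows that need exactly-once step execution.
    """
    workflow_type = "EXPRESS" if high_volume_short_lived else "STANDARD"
    return {
        "name": name,
        "definition": json.dumps(definition),
        "roleArn": role_arn,
        "type": workflow_type,
    }


# Minimal placeholder definition with a single Pass state.
minimal_definition = {
    "StartAt": "Ingest",
    "States": {"Ingest": {"Type": "Pass", "End": True}},
}

request = build_create_request(
    name="iot-ingestion",  # hypothetical name
    definition=minimal_definition,
    role_arn="arn:aws:iam::123456789012:role/example-stepfunctions-role",  # placeholder
    high_volume_short_lived=True,
)
print(request["type"])  # EXPRESS
```

In a real deployment the dictionary would be unpacked into the client call, for example `client.create_state_machine(**request)`, assuming boto3 is installed and credentials are configured.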



Use Cases:


Function Orchestration


Chain Lambda functions where the output of one is the input to the next.





Branching


Manage decision-based workflows, like credit limit increase requests.



Error Handling


Implement Retry/Catch patterns for robust error management.



Human in the Loop


Integrate human interaction, such as confirming transactions in a banking app.



Parallel Processing


Concurrently process tasks, like video file conversion into multiple formats.



Dynamic Parallelism


Handle parallel tasks dynamically, such as processing customer orders.
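The parallel processing and dynamic parallelism use cases map onto the Parallel and Map states. The sketch below is one possible way to express them. The function names, the `$.orders` input path, and the placeholder ARNs are assumptions, and the two states are chained into one definition only to keep the example short.

```python
import json


def lambda_task(function_name):
    """Build a minimal terminal Task state (placeholder account and region)."""
    return {
        "Type": "Task",
        "Resource": f"arn:aws:lambda:us-east-1:123456789012:function:{function_name}",
        "End": True,
    }


# Parallel state: a fixed set of branches that run concurrently,
# for example converting one video into two formats.
convert_video = {
    "Type": "Parallel",
    "Branches": [
        {"StartAt": "ToMp4", "States": {"ToMp4": lambda_task("to-mp4")}},
        {"StartAt": "ToWebm", "States": {"ToWebm": lambda_task("to-webm")}},
    ],
    "Next": "ProcessOrders",
}

# Map state: the same steps run once per element of the array at $.orders,
# so the degree of parallelism depends on the input.
process_orders = {
    "Type": "Map",
    "ItemsPath": "$.orders",
    "MaxConcurrency": 5,
    "ItemProcessor": {
        "ProcessorConfig": {"Mode": "INLINE"},
        "StartAt": "ProcessOrder",
        "States": {"ProcessOrder": lambda_task("process-order")},
    },
    "End": True,
}

definition = {
    "StartAt": "ConvertVideo",
    "States": {"ConvertVideo": convert_video, "ProcessOrders": process_orders},
}
print(json.dumps(definition, indent=2))
```

The difference between the two is that the number of Parallel branches is fixed in the definition, whereas the number of Map iterations is decided at run time by the length of the input array.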


Service Integration Patterns:


Step Functions integrates with AWS services using patterns like Request-Response, Run-a-job (.sync), and Wait-for-a-callback (.waitForTaskToken), facilitating diverse application needs.


Request-Response (Default)


Invoke an AWS service and allow Step Functions to proceed to the next state once an HTTP response is received.


Run a Job (.sync)


Trigger an AWS service to execute a job, with Step Functions pausing until the job's completion is confirmed.



Wait for Callback with Task Token (.waitForTaskToken)


Initiate a service call that includes a task token, instructing Step Functions to wait for the return of this token via a callback before proceeding.
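In a state machine definition, the pattern is selected by a suffix on the task's `Resource` value. The sketch below shows one task per pattern. The topic, job, and queue names are placeholders, and the `integration_pattern` helper is a hypothetical convenience for this example that handles only the plain `.sync` and `.waitForTaskToken` suffixes.

```python
# One Task state per integration pattern; all resource names are placeholders.
states = {
    # Request-Response: continue as soon as SNS acknowledges the call.
    "Publish": {
        "Type": "Task",
        "Resource": "arn:aws:states:::sns:publish",
        "Parameters": {
            "TopicArn": "arn:aws:sns:us-east-1:123456789012:example-topic",
            "Message.$": "$.message",
        },
        "Next": "RunBatchJob",
    },
    # Run a Job (.sync): wait until the Batch job has finished.
    "RunBatchJob": {
        "Type": "Task",
        "Resource": "arn:aws:states:::batch:submitJob.sync",
        "Parameters": {
            "JobName": "example-job",
            "JobDefinition": "example-job-def",
            "JobQueue": "example-queue",
        },
        "Next": "WaitForApproval",
    },
    # Wait for Callback: pass the task token along and pause until it is returned.
    "WaitForApproval": {
        "Type": "Task",
        "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
        "Parameters": {
            "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue",
            "MessageBody": {"input.$": "$", "taskToken.$": "$$.Task.Token"},
        },
        "End": True,
    },
}


def integration_pattern(resource):
    """Classify a Resource ARN by its suffix (simplified)."""
    if resource.endswith(".waitForTaskToken"):
        return "wait-for-callback"
    if resource.endswith(".sync"):
        return "run-a-job"
    return "request-response"


patterns = {name: integration_pattern(s["Resource"]) for name, s in states.items()}
print(patterns)
```

For the callback pattern, whichever component receives the message resumes the workflow by returning the token through the Step Functions `SendTaskSuccess` or `SendTaskFailure` API.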



Amazon Simple Workflow Service (SWF)


Amazon Simple Workflow Service (SWF) simplifies the coordination and state management of distributed application components. It enables the scheduling and integration of tasks across various infrastructures, such as EC2 and Lambda, facilitating the development of distributed applications. By reliably tracking the execution state and task statuses, SWF enhances application resilience, aiding in recovery from component failures. This web service supports the design, execution, and management of workflows, accommodating both automated processes and human-led tasks.


Key Aspects:


Task Coordination


SWF provides a framework for building applications that coordinate tasks across distributed systems, ensuring that tasks run in the intended order and that each task is assigned only once and never duplicated.



Worker and Decider Roles


The service distinguishes between "workers" (components that perform tasks) and "deciders" (components that control the flow of activity based on task outcomes).



Durable Execution State


SWF maintains the state of your workflow execution, including task completion status and decision points, ensuring reliability and fault tolerance.



Domain Scoping


Workflows and related activities are scoped within domains, which act as containers to isolate and manage related sets of workflows.





Scalability


As an AWS managed service, SWF scales to support the execution of large numbers of workflows and handles the orchestration of complex processes.



History Tracking


Maintains a complete history of workflow executions, which is critical for debugging and auditing purposes.
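The interplay between workers, deciders, and the execution history described above can be illustrated with a small in-memory simulation. This is not the SWF API: in a real application the decider and workers poll the service for tasks (for example through `PollForDecisionTask` and `PollForActivityTask`), and SWF stores the history. The activity names and the loop below are assumptions made purely for illustration, although the decision and event type names mirror those used by SWF.

```python
# Simplified, in-memory model of the SWF roles; not the SWF API.
ACTIVITY_ORDER = ["validate_order", "charge_payment", "ship_order"]  # hypothetical


def decide(history):
    """Decider: inspect the event history and return the next decision."""
    if any(event["type"] == "ActivityTaskFailed" for event in history):
        return {"decision": "FailWorkflowExecution"}
    completed = [e["activity"] for e in history if e["type"] == "ActivityTaskCompleted"]
    if len(completed) == len(ACTIVITY_ORDER):
        return {"decision": "CompleteWorkflowExecution"}
    # Schedule the first activity that has not completed yet.
    return {"decision": "ScheduleActivityTask", "activity": ACTIVITY_ORDER[len(completed)]}


def work(activity):
    """Worker: perform one activity and report the outcome as a history event."""
    return {"type": "ActivityTaskCompleted", "activity": activity}


# The history stands in for the durable execution state that SWF maintains.
history = []
while True:
    decision = decide(history)
    if decision["decision"] != "ScheduleActivityTask":
        break
    history.append(work(decision["activity"]))

print(decision["decision"], [e["activity"] for e in history])
```

Because the decider derives every decision from the recorded history alone, a restarted decider can pick up where the previous one stopped, which is the property the durable execution state is meant to provide.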



Use Cases:


Business Process Workflows


Automating business processes that involve a series of steps, some of which may require human intervention, such as order fulfillment or employee onboarding.



Data Processing Pipelines


Coordinating complex data processing tasks that involve multiple stages of processing, validation, and analysis.



Media Processing


Orchestrating workflows for processing media files, including transcoding, watermarking, and distribution.



E-commerce Applications


Managing complex e-commerce transactions, including inventory management, order processing, and customer notifications.



Scientific Simulations


Coordinating components of distributed scientific simulations that require orchestration of compute-intensive tasks.



Although SWF does not provide direct, built-in integrations with AWS services in the way that AWS Step Functions does, it can be integrated with other AWS services through custom application logic. Common integrations include:


Amazon EC2: Running workers and deciders on EC2 instances allows for powerful compute resources to perform tasks.


AWS Lambda: Invoking Lambda functions from within workflow tasks for serverless task execution.


Amazon S3: Storing inputs and outputs of tasks, such as files that need to be processed or are generated by the workflow.


Amazon DynamoDB: Maintaining application state or managing task-related data.


Amazon Simple Notification Service (SNS) and Amazon Simple Queue Service (SQS): Sending notifications or messages as part of the workflow process.


Integrating SWF with other AWS services generally requires the use of the AWS SDK within your application code. This allows you to leverage the full range of AWS capabilities within your SWF workflows, albeit with more development effort compared to the more declarative integrations available in AWS Step Functions.



AWS Step Functions vs Amazon SWF


AWS Step Functions and Amazon Simple Workflow Service (SWF) serve different orchestration needs within the AWS ecosystem, distinguished by their approach to workflow management, integration, and use case suitability. In essence, the choice between Step Functions and SWF hinges on the specific needs of your application, balancing between ease of use and integration capabilities of Step Functions against the customizability and control offered by SWF.


Workflow Design


Step Functions lets you define state machines in JSON (the Amazon States Language) or with a visual designer, which simplifies workflow creation. In contrast, SWF requires you to write custom decider programs for task coordination.





Infrastructure Management


Step Functions is a serverless service, eliminating the need for infrastructure management. With SWF, you must provision and manage the compute resources (such as EC2 instances) that run your workers and deciders.



Ease of Use


Step Functions offers a user-friendly, visual approach, making it more accessible for beginners and suitable for straightforward applications. SWF provides detailed control through custom logic, catering to complex workflows but with added complexity.





AWS Service Integration


Step Functions supports native integration with AWS services such as Lambda and DynamoDB. SWF, however, requires custom code to integrate with AWS services, offering flexibility at the expense of extra development work.



Use Cases


Step Functions is ideal for serverless applications requiring close integration with AWS services. SWF is preferred for scenarios needing advanced control over workflows, such as those requiring external interventions or complex, custom logic.


