Executors Configuration
Dagy uses **runtime tiers** as the primary user-facing concept for configuring execution environments. Internally, Dagy has three execution backends: **Lambda**, **Step Functions**, and **ECS Fargate**. The backend router selects one automatically based on your tier choice and flow requirements, so your flows run with the appropriate compute resources without you having to manage the infrastructure mapping.
Runtime Tiers
Dagy provides six runtime tiers designed for different workload characteristics. Each tier automatically selects the appropriate backend infrastructure:
| Tier | Description | Max Duration | Resources | Best For |
|---|---|---|---|---|
| nano | All tasks in one invocation | 10 minutes | Shared infrastructure | Development, testing, minimal workloads |
| micro | Each task runs independently | 10 minutes per task | Shared infrastructure | Simple DAGs, short-lived pipelines |
| small | Dedicated compute | Unlimited | 0.5 vCPU / 1 GB RAM | Small workloads, moderate complexity |
| medium | Dedicated compute | Unlimited | 1 vCPU / 2 GB RAM | General-purpose workflows |
| large | Dedicated compute | Unlimited | 2 vCPU / 4 GB RAM | Resource-intensive tasks, long pipelines |
| xlarge | Dedicated compute | Unlimited | 4 vCPU / 12 GB RAM | High-compute workloads, large datasets |
Execution Backends (Internal Implementation)
Dagy uses three execution backends internally to support the runtime tiers above. These backends are implementation details managed automatically by the framework and are documented here for infrastructure operators.
Lambda (Default)
Backend for: nano, micro tiers
Executes flows synchronously within the current Lambda invocation using the local executor with ThreadPoolExecutor-based timeouts.
| Property | Value |
|---|---|
| Max duration | 900 seconds (15 minutes) |
| Max memory | 10,240 MB |
| Parallel tasks | No (sequential execution) |
| Native retry | No (handled by SDK LocalExecutor) |
| Cancellation | Yes (checked between tasks) |
| Cost | ~$0.06/hr at 1 GB |
Best for: Short-lived pipelines, simple DAGs, development/testing.
Lambda is always available as the default backend. No additional configuration is required.
Step Functions
Backend for: Reserved for complex orchestration scenarios (not directly exposed via runtime tiers)
Translates FlowSpec DAGs into Amazon States Language (ASL) state machines with native parallel execution and retry support.
| Property | Value |
|---|---|
| Max duration | 1 year (Standard workflows) |
| Max states | 25,000 |
| Parallel tasks | Yes (native Parallel state) |
| Native retry | Yes (ASL Retry policies) |
| Cancellation | Yes (stop_execution()) |
| Cost | Per state transition |
Best for: Long-running pipelines (15 min – 1 hr), complex DAGs with many parallel tasks.
Configuration:
| Environment Variable | Description | Required |
|---|---|---|
| `DAGY_SFN_ROLE_ARN` | IAM role ARN for Step Functions execution | Yes (enables backend) |
| `DAGY_SFN_TASK_EXECUTOR_ARN` | Lambda ARN that executes individual tasks | Yes |
The Step Functions backend is registered automatically when DAGY_SFN_ROLE_ARN is set.
ASL translation details:
- Sequential dependencies become `Next` transitions
- Independent tasks at the same dependency level become `Parallel` branches
- `retries` maps to ASL `Retry` with `MaxAttempts`
- `retry_delay_seconds` maps to `IntervalSeconds` (integer only, fractional seconds are truncated)
- `retry_jitter_factor` sets `BackoffRate: 2.0` (exponential backoff); without jitter, `BackoffRate: 1.0`
- `timeout_seconds` maps to ASL `TimeoutSeconds`
- All errors are caught via `Catch` blocks that route to a `HandleTaskFailure` state

Note: List-based and callable delay strategies are not fully translated to ASL. Only the first interval value is used as a fixed `IntervalSeconds`.
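The mapping above can be sketched as a small function that builds one ASL `Task` state. This is an illustration under stated assumptions, not Dagy's actual translator: the function name `task_state`, the placeholder executor ARN, the `Fail` type chosen for `HandleTaskFailure`, and the clamp of `IntervalSeconds` to at least 1 are all hypothetical choices made for this example.

```python
import json

# Hypothetical placeholder; in practice this would come from DAGY_SFN_TASK_EXECUTOR_ARN.
EXECUTOR_ARN = "arn:aws:lambda:us-east-1:123456789012:function:example-task-executor"


def task_state(resource_arn, next_state=None, retries=0, retry_delay_seconds=0,
               retry_jitter_factor=None, timeout_seconds=None):
    """Build one ASL Task state from task-level settings (illustrative only)."""
    state = {"Type": "Task", "Resource": resource_arn}
    if timeout_seconds is not None:
        state["TimeoutSeconds"] = int(timeout_seconds)
    if retries > 0:
        state["Retry"] = [{
            "ErrorEquals": ["States.ALL"],
            # Fractional seconds are truncated. Clamping to 1 is an assumption
            # of this sketch, since ASL expects a positive integer here.
            "IntervalSeconds": max(1, int(retry_delay_seconds)),
            "MaxAttempts": retries,
            # Jitter configured -> exponential backoff; otherwise fixed delay.
            "BackoffRate": 2.0 if retry_jitter_factor else 1.0,
        }]
    # Every error is routed to the shared failure-handling state.
    state["Catch"] = [{"ErrorEquals": ["States.ALL"], "Next": "HandleTaskFailure"}]
    if next_state is not None:
        state["Next"] = next_state  # sequential dependency
    else:
        state["End"] = True
    return state


states = {
    "extract": task_state(EXECUTOR_ARN, next_state="load", retries=3,
                          retry_delay_seconds=2.7, retry_jitter_factor=0.1,
                          timeout_seconds=300),
    "load": task_state(EXECUTOR_ARN),
    # Assumed shape of the failure state; the real one may differ.
    "HandleTaskFailure": {"Type": "Fail", "Error": "TaskFailed"},
}
definition = {"StartAt": "extract", "States": states}
print(json.dumps(definition, indent=2))
```

In this example a `retry_delay_seconds` of 2.7 becomes `IntervalSeconds: 2`, which shows the truncation described above.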
ECS Fargate
Backend for: small, medium, large, xlarge tiers
Runs DAG flows as ECS Fargate tasks for resource-intensive or very long-running workloads.
| Property | Value |
|---|---|
| Max duration | Unlimited |
| Max memory | 120 GB |
| Max vCPU | 16 |
| Parallel tasks | No (managed by Dagy orchestrator) |
| Native retry | No (orchestrator handles) |
| Cancellation | Yes (stop_task()) |
| Streaming logs | Yes (CloudWatch Logs) |
| Cost | ~$0.045/hr at 0.25 vCPU, 0.5 GB |
Best for: Long-running workloads (> 1 hr), resource-intensive tasks needing > 10 GB memory.
Configuration:
| Environment Variable | Description | Required |
|---|---|---|
| `DAGY_ECS_CLUSTER_ARN` | ECS cluster ARN | Yes (enables backend) |
| `DAGY_ECS_EXECUTION_ROLE_ARN` | Task execution role (for pulling images, logs) | No |
| `DAGY_ECS_TASK_ROLE_ARN` | Task role (permissions for your code) | No |
| `DAGY_ECS_WORKER_IMAGE` | Docker image for the worker | No (default: dagy-worker:latest) |
| `DAGY_ECS_SUBNETS` | Comma-separated subnet IDs | No |
| `DAGY_ECS_SECURITY_GROUPS` | Comma-separated security group IDs | No |
| `DAGY_ECS_LOG_GROUP` | CloudWatch log group | No (default: /ecs/dagy-worker) |
The ECS backend is registered automatically when DAGY_ECS_CLUSTER_ARN is set.
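As a rough sketch of how environment-driven registration could work, the snippet below reads the variables documented on this page and applies the stated defaults. The function names `enabled_backends` and `ecs_settings` and the backend identifiers (`"lambda"`, `"step_functions"`, `"ecs_fargate"`) are hypothetical; Dagy's internal registration code may look different.

```python
import os


def enabled_backends(env=None):
    """Return the backends that would be registered for the given environment."""
    env = os.environ if env is None else env
    backends = ["lambda"]  # Lambda is always available
    if env.get("DAGY_SFN_ROLE_ARN"):
        backends.append("step_functions")
    if env.get("DAGY_ECS_CLUSTER_ARN"):
        backends.append("ecs_fargate")
    return backends


def ecs_settings(env=None):
    """Collect ECS settings, applying the documented defaults."""
    env = os.environ if env is None else env

    def split_csv(name):
        # Comma-separated values; blanks are ignored.
        return [part.strip() for part in env.get(name, "").split(",") if part.strip()]

    return {
        "cluster_arn": env["DAGY_ECS_CLUSTER_ARN"],  # required, enables the backend
        "execution_role_arn": env.get("DAGY_ECS_EXECUTION_ROLE_ARN"),
        "task_role_arn": env.get("DAGY_ECS_TASK_ROLE_ARN"),
        "worker_image": env.get("DAGY_ECS_WORKER_IMAGE", "dagy-worker:latest"),
        "subnets": split_csv("DAGY_ECS_SUBNETS"),
        "security_groups": split_csv("DAGY_ECS_SECURITY_GROUPS"),
        "log_group": env.get("DAGY_ECS_LOG_GROUP", "/ecs/dagy-worker"),
    }


# Example with an explicit mapping instead of the real process environment.
example_env = {
    "DAGY_ECS_CLUSTER_ARN": "arn:aws:ecs:us-east-1:123456789012:cluster/example",
    "DAGY_ECS_SUBNETS": "subnet-aaa, subnet-bbb",
}
print(enabled_backends(example_env))
print(ecs_settings(example_env))
```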
Runtime Tier to Resource Mapping:
| Runtime Tier | vCPU | Memory |
|---|---|---|
| small | 0.5 | 1 GB |
| medium | 1 | 2 GB |
| large | 2 | 4 GB |
| xlarge | 4 | 12 GB |
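ECS task definitions express CPU in units (1024 per vCPU) and memory in MiB, both as strings. The sketch below converts the tier table above into those values; `TIER_RESOURCES` and `fargate_task_size` are names made up for this example and are not part of Dagy's API.

```python
# (vCPU, memory in GB) per runtime tier, copied from the table above.
TIER_RESOURCES = {
    "small": (0.5, 1),
    "medium": (1, 2),
    "large": (2, 4),
    "xlarge": (4, 12),
}


def fargate_task_size(tier):
    """Return the cpu/memory strings an ECS task definition would use for a tier."""
    vcpu, memory_gb = TIER_RESOURCES[tier]
    return {
        "cpu": str(int(vcpu * 1024)),   # CPU units: 1 vCPU = 1024
        "memory": str(memory_gb * 1024),  # MiB
    }


for tier in TIER_RESOURCES:
    print(tier, fargate_task_size(tier))
```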
Valid Fargate CPU/memory combinations (for custom configurations):
| vCPU | Memory Options |
|---|---|
| 0.25 | 512 MB, 1 GB, 2 GB |
| 0.5 | 1–4 GB |
| 1 | 2–8 GB |
| 2 | 4–16 GB |
| 4 | 8–30 GB |
| 8 | 16–60 GB |
| 16 | 32–120 GB |
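For custom configurations, a quick pre-flight check against the table above can catch invalid pairings before a task is submitted. The helper below is a minimal sketch with a hypothetical name (`is_valid_fargate_size`); it only checks the ranges listed here and does not enforce any step-size restrictions that AWS may apply within a range.

```python
# Allowed memory range in GB (inclusive) for each vCPU size, from the table above.
FARGATE_MEMORY_RANGE_GB = {
    0.5: (1, 4),
    1: (2, 8),
    2: (4, 16),
    4: (8, 30),
    8: (16, 60),
    16: (32, 120),
}

# 0.25 vCPU has a fixed set of options rather than a range.
QUARTER_VCPU_MEMORY_GB = (0.5, 1, 2)


def is_valid_fargate_size(vcpu, memory_gb):
    """Return True if the vCPU/memory pair falls inside the documented ranges."""
    if vcpu == 0.25:
        return memory_gb in QUARTER_VCPU_MEMORY_GB
    bounds = FARGATE_MEMORY_RANGE_GB.get(vcpu)
    if bounds is None:
        return False  # not a supported vCPU size
    low, high = bounds
    return low <= memory_gb <= high


print(is_valid_fargate_size(4, 12))   # xlarge tier sizing
print(is_valid_fargate_size(1, 16))   # too much memory for 1 vCPU
```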
Runtime Tier Selection
The runtime tier is determined by a 6-level resolution strategy. The tier you select automatically maps to the appropriate execution backend:
1. Explicit request: per-run `runtime_tier` field in the API trigger
2. Deployment config: `default_runtime_tier` on the deployment
3. Environment config: `default_runtime_tier` on the environment
4. Flow config: `runtime_tier` on the `@task` decorator
5. Automatic selection rules: based on flow properties
6. Default: nano
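One way to picture this resolution strategy is a first-match lookup over the levels in the order listed. The sketch below assumes that the first level with a value wins; the function and parameter names are invented for illustration and do not reflect the router's real interface.

```python
def resolve_runtime_tier(request_tier=None, deployment_default=None,
                         environment_default=None, task_tier=None,
                         auto_selected=None):
    """Return (tier, source) for the first level that provides a value."""
    candidates = [
        ("explicit request", request_tier),
        ("deployment config", deployment_default),
        ("environment config", environment_default),
        ("flow config", task_tier),
        ("automatic selection", auto_selected),
    ]
    for source, tier in candidates:
        if tier is not None:
            return tier, source
    return "nano", "default"


# A per-run request overrides the deployment default.
print(resolve_runtime_tier(request_tier="large", deployment_default="small"))
# With nothing configured, the default applies.
print(resolve_runtime_tier())
```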
Automatic Tier Selection Rules
When no explicit runtime tier is specified, the router evaluates these rules in priority order:
| Priority | Rule | Condition | Selected Tier |
|---|---|---|---|
| 100 | Resource | Memory ≥ 3 GB | xlarge |
| 90 | Duration | Max timeout > 1 hr | large |
| 80 | Duration | Max timeout 15 min – 1 hr | medium |
| 70 | Complexity | Task count ≥ 50 | large |
| - | Default | Everything else | micro |
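The rule table can be read as a priority-ordered list of predicates. The following sketch evaluates the rules from highest to lowest priority and falls back to `micro`. `FlowProfile`, `RULES`, and `auto_select_tier` are hypothetical names, and treating the "15 min – 1 hr" band as inclusive at both ends is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass
class FlowProfile:
    """Flow properties the rules look at (field names are illustrative)."""
    memory_gb: float = 0.0
    max_timeout_seconds: float = 0.0
    task_count: int = 1


# (priority, rule name, condition, selected tier), mirroring the table above.
RULES = [
    (100, "resource", lambda f: f.memory_gb >= 3, "xlarge"),
    (90, "duration", lambda f: f.max_timeout_seconds > 3600, "large"),
    # Boundary handling (inclusive 900 s and 3600 s) is assumed, not documented.
    (80, "duration", lambda f: 900 <= f.max_timeout_seconds <= 3600, "medium"),
    (70, "complexity", lambda f: f.task_count >= 50, "large"),
]


def auto_select_tier(flow):
    """Return the tier of the highest-priority matching rule, else 'micro'."""
    for _priority, _name, condition, tier in sorted(RULES, key=lambda r: r[0], reverse=True):
        if condition(flow):
            return tier
    return "micro"


# A 30-minute timeout matches priority 80 before the task-count rule at 70.
print(auto_select_tier(FlowProfile(max_timeout_seconds=1800, task_count=60)))
```

Because rules are checked in priority order, a flow with a 30-minute timeout and 60 tasks resolves to `medium` in this sketch, not `large`.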
Per-Task Runtime Tier
You can specify a preferred runtime tier on individual tasks:
```python
from dagy import task


@task(runtime_tier="large")
def memory_intensive(data: list) -> dict:
    # This task runs on the large tier (2 vCPU / 4 GB RAM)
    ...
```
The runtime_tier field is a hint; the backend router considers it alongside deployment and environment configuration.
Backend Capabilities Reference
This reference is for infrastructure operators and developers integrating with Dagy's backend systems directly.
| Capability | Lambda | Step Functions | ECS Fargate |
|---|---|---|---|
| `max_duration_seconds` | 900 | 31,536,000 | Unlimited |
| `max_memory_mb` | 10,240 | Depends on compute | 122,880 |
| `max_tasks` | Unlimited | 25,000 | Unlimited |
| `supports_parallel` | No | Yes | No |
| `supports_native_retry` | No | Yes | No |
| `supports_cancel` | Yes | Yes | Yes |
| `supports_streaming_logs` | No | No | Yes |
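For operators who want to reason about these limits programmatically, the table can be encoded as plain data. The sketch below is one possible representation, not Dagy's actual types: `BackendCapabilities`, `CAPABILITIES`, and `backends_supporting` are hypothetical names, `None` stands for "Unlimited", and the Step Functions memory limit ("Depends on compute") is also modeled as `None`, which is a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BackendCapabilities:
    max_duration_seconds: Optional[int]  # None = unlimited
    max_memory_mb: Optional[int]         # None = unlimited or depends on compute
    max_tasks: Optional[int]             # None = unlimited
    supports_parallel: bool
    supports_native_retry: bool
    supports_cancel: bool
    supports_streaming_logs: bool


# Values copied from the reference table above.
CAPABILITIES = {
    "lambda": BackendCapabilities(900, 10_240, None, False, False, True, False),
    "step_functions": BackendCapabilities(31_536_000, None, 25_000, True, True, True, False),
    "ecs_fargate": BackendCapabilities(None, 122_880, None, False, False, True, True),
}


def backends_supporting(duration_seconds=0, memory_mb=0, needs_parallel=False):
    """List backends whose limits cover the given requirements."""
    def fits(limit, needed):
        return limit is None or needed <= limit

    return [
        name for name, caps in CAPABILITIES.items()
        if fits(caps.max_duration_seconds, duration_seconds)
        and fits(caps.max_memory_mb, memory_mb)
        and (caps.supports_parallel or not needs_parallel)
    ]


# A one-hour flow exceeds the Lambda duration limit.
print(backends_supporting(duration_seconds=3600))
```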