Configuration

Executors Configuration

Dagy uses **runtime tiers** as the primary user-facing concept for configuring execution environments. Each runtime tier is backed by one of three execution backends: **Lambda**, **Step Functions**, or **ECS Fargate**. The backend is selected automatically based on your tier choice and flow requirements. The backend router handles the internal infrastructure mapping, so your flows run with the compute resources their tier specifies.

Runtime Tiers

Dagy provides six runtime tiers designed for different workload characteristics. Each tier automatically selects the appropriate backend infrastructure:

Tier	Description	Max Duration	Resources	Best For
nano	All tasks in one invocation	10 minutes	Shared infrastructure	Development, testing, minimal workloads
micro	Each task runs independently	10 minutes per task	Shared infrastructure	Simple DAGs, short-lived pipelines
small	Dedicated compute	Unlimited	0.5 vCPU / 1 GB RAM	Small workloads, moderate complexity
medium	Dedicated compute	Unlimited	1 vCPU / 2 GB RAM	General-purpose workflows
large	Dedicated compute	Unlimited	2 vCPU / 4 GB RAM	Resource-intensive tasks, long pipelines
xlarge	Dedicated compute	Unlimited	4 vCPU / 12 GB RAM	High-compute workloads, large datasets
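The tier-to-backend relationship can be pictured as a simple lookup. The sketch below is illustrative only: `TIER_BACKEND` and `backend_for_tier` are hypothetical names, not part of Dagy's API, and the mapping is taken from the "Backend for" lines in the backend sections of this page.

```python
# Hypothetical lookup table; the names here are illustrative, not Dagy internals.
TIER_BACKEND = {
    "nano": "lambda",
    "micro": "lambda",
    "small": "ecs_fargate",
    "medium": "ecs_fargate",
    "large": "ecs_fargate",
    "xlarge": "ecs_fargate",
}


def backend_for_tier(tier: str) -> str:
    """Return the backend assumed to serve a runtime tier."""
    try:
        return TIER_BACKEND[tier]
    except KeyError:
        raise ValueError(f"unknown runtime tier: {tier!r}") from None
```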

Execution Backends (Internal Implementation)

Dagy uses three execution backends internally to support the runtime tiers above. These backends are implementation details managed automatically by the framework and are documented here for infrastructure operators.

Lambda (Default)

Backend for: nano, micro tiers

Executes flows synchronously within the current Lambda invocation using the SDK's `LocalExecutor`, which enforces task timeouts with `ThreadPoolExecutor`.

Property	Value
Max duration	900 seconds (15 minutes)
Max memory	10,240 MB
Parallel tasks	No (sequential execution)
Native retry	No (handled by SDK LocalExecutor)
Cancellation	Yes (checked between tasks)
Cost	~$0.06/hr at 1 GB

Best for: Short-lived pipelines, simple DAGs, development/testing.

Lambda is always available as the default backend. No additional configuration is required.
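To make the behavior in the table above concrete, here is a minimal sketch of sequential execution with `ThreadPoolExecutor`-based timeouts and a cancellation check between tasks. It is not the SDK's `LocalExecutor`; the function name `run_sequential`, the task tuple shape, and the `"timeout"` / `"cancelled"` markers are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Any, Callable, Dict, List, Tuple

# (task name, zero-argument callable, timeout in seconds): hypothetical shape
Task = Tuple[str, Callable[[], Any], float]


def run_sequential(
    tasks: List[Task],
    is_cancelled: Callable[[], bool] = lambda: False,
) -> Dict[str, Any]:
    """Run tasks one at a time, stopping at the first timeout or cancellation."""
    results: Dict[str, Any] = {}
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        for name, fn, timeout_seconds in tasks:
            # Cancellation is only observed between tasks, never mid-task.
            if is_cancelled():
                results[name] = "cancelled"
                break
            future = pool.submit(fn)
            try:
                results[name] = future.result(timeout=timeout_seconds)
            except FutureTimeout:
                # The worker thread cannot be killed; we only stop waiting for it.
                results[name] = "timeout"
                break
    finally:
        pool.shutdown(wait=False)
    return results
```

One consequence of this design is that a timed-out task may keep running in its thread until it returns, which is one reason timeouts in a single Lambda invocation are best kept well below the invocation limit.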

Step Functions

Backend for: no runtime tier directly (reserved for complex orchestration scenarios)

Translates FlowSpec DAGs into Amazon States Language (ASL) state machines with native parallel execution and retry support.

Property	Value
Max duration	1 year (Standard workflows)
Max states	25,000
Parallel tasks	Yes (native Parallel state)
Native retry	Yes (ASL Retry policies)
Cancellation	Yes (`stop_execution()`)
Cost	Per state transition

Best for: Long-running pipelines (15 min – 1 hr), complex DAGs with many parallel tasks.

Configuration:

Environment Variable	Description	Required
`DAGY_SFN_ROLE_ARN`	IAM role ARN for Step Functions execution	Yes (enables backend)
`DAGY_SFN_TASK_EXECUTOR_ARN`	Lambda ARN that executes individual tasks	Yes

The Step Functions backend is registered automatically when `DAGY_SFN_ROLE_ARN` is set.
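A minimal sketch of how these two variables might be read at startup follows. `StepFunctionsConfig` and `load_sfn_config` are hypothetical names, and raising an error when the task executor ARN is missing is an assumption; the source only states that both variables are required.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StepFunctionsConfig:
    role_arn: str
    task_executor_arn: str


def load_sfn_config(env: Optional[Mapping[str, str]] = None) -> Optional[StepFunctionsConfig]:
    """Return the config if the backend is enabled, else None."""
    env = os.environ if env is None else env
    role_arn = env.get("DAGY_SFN_ROLE_ARN")
    if not role_arn:
        return None  # variable unset: the backend stays unregistered
    executor_arn = env.get("DAGY_SFN_TASK_EXECUTOR_ARN")
    if not executor_arn:
        # Assumed behavior: fail fast rather than register a half-configured backend.
        raise RuntimeError(
            "DAGY_SFN_TASK_EXECUTOR_ARN is required when DAGY_SFN_ROLE_ARN is set"
        )
    return StepFunctionsConfig(role_arn=role_arn, task_executor_arn=executor_arn)
```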

ASL translation details:

Sequential dependencies become `Next` transitions
Independent tasks at the same dependency level become `Parallel` branches
`retries` maps to ASL `Retry` with `MaxAttempts`
`retry_delay_seconds` maps to `IntervalSeconds` (integer only; fractional seconds are truncated)
`retry_jitter_factor`, when set, produces `BackoffRate: 2.0` (exponential backoff); without jitter, `BackoffRate: 1.0` (fixed interval)
`timeout_seconds` maps to ASL `TimeoutSeconds`
All errors are caught via `Catch` blocks that route to a `HandleTaskFailure` state
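The first two rules depend on grouping tasks by dependency level. The sketch below shows one common way to compute such levels (a level-by-level topological sort); it is not Dagy's translator, and `dependency_levels` and its input shape are assumptions. Each inner list with more than one task would correspond to a Parallel state, and consecutive levels would be chained with Next.

```python
from typing import Dict, List, Set


def dependency_levels(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into levels; deps maps each task to its upstream tasks."""
    unknown = set().union(*deps.values()) - deps.keys()
    if unknown:
        raise ValueError(f"dependencies on undefined tasks: {sorted(unknown)}")

    remaining = {task: set(upstream) for task, upstream in deps.items()}
    levels: List[List[str]] = []
    while remaining:
        # Tasks whose upstream set is empty can run now, independently of each other.
        ready = sorted(task for task, upstream in remaining.items() if not upstream)
        if not ready:
            raise ValueError("cycle detected in task dependencies")
        levels.append(ready)
        for task in ready:
            del remaining[task]
        for upstream in remaining.values():
            upstream.difference_update(ready)
    return levels
```

For example, a flow where `clean` and `enrich` both depend on `extract`, and `load` depends on both, yields three levels with `clean` and `enrich` sharing the middle one.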

Note: List-based and callable delay strategies are not fully translated to ASL. Only the first interval value is used as a fixed IntervalSeconds.
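The retry mapping can be sketched as a small function that builds an ASL `Retry` array. This is an illustration under assumptions, not the framework's code: the function name `asl_retry` is hypothetical, matching on `States.ALL` is assumed, callable delay strategies are left out, and the interval is clamped to 1 because ASL requires `IntervalSeconds` to be a positive integer.

```python
from typing import Any, Dict, List, Optional, Sequence, Union

Delay = Union[int, float, Sequence[Union[int, float]]]


def asl_retry(
    retries: int,
    retry_delay_seconds: Delay = 1,
    retry_jitter_factor: Optional[float] = None,
) -> List[Dict[str, Any]]:
    """Build an ASL Retry array from task-level retry settings."""
    if retries <= 0:
        return []
    delay = retry_delay_seconds
    if isinstance(delay, (list, tuple)):
        # List-based strategies: only the first interval is used.
        delay = delay[0] if delay else 1
    # int() truncates fractional seconds; the clamp to 1 is an assumption of this sketch.
    interval = max(1, int(delay))
    return [
        {
            "ErrorEquals": ["States.ALL"],  # assumed: retry on any error
            "IntervalSeconds": interval,
            "MaxAttempts": retries,
            "BackoffRate": 2.0 if retry_jitter_factor else 1.0,
        }
    ]
```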

ECS Fargate

Backend for: small, medium, large, xlarge tiers

Runs DAG flows as ECS Fargate tasks for resource-intensive or very long-running workloads.

Property	Value
Max duration	Unlimited
Max memory	120 GB
Max vCPU	16
Parallel tasks	No (managed by Dagy orchestrator)
Native retry	No (orchestrator handles)
Cancellation	Yes (`stop_task()`)
Streaming logs	Yes (CloudWatch Logs)
Cost	~$0.045/hr at 0.25 vCPU, 0.5 GB

Best for: Long-running workloads (> 1 hr), resource-intensive tasks needing > 10 GB memory.

Configuration:

Environment Variable	Description	Required
`DAGY_ECS_CLUSTER_ARN`	ECS cluster ARN	Yes (enables backend)
`DAGY_ECS_EXECUTION_ROLE_ARN`	Task execution role (for pulling images, logs)	No
`DAGY_ECS_TASK_ROLE_ARN`	Task role (permissions for your code)	No
`DAGY_ECS_WORKER_IMAGE`	Docker image for the worker	No (default: `dagy-worker:latest`)
`DAGY_ECS_SUBNETS`	Comma-separated subnet IDs	No
`DAGY_ECS_SECURITY_GROUPS`	Comma-separated security group IDs	No
`DAGY_ECS_LOG_GROUP`	CloudWatch log group	No (default: `/ecs/dagy-worker`)

The ECS backend is registered automatically when `DAGY_ECS_CLUSTER_ARN` is set.
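The variables in the table above could be loaded along the following lines. `EcsConfig` and `load_ecs_config` are hypothetical names used for illustration; the defaults and the comma-separated parsing follow the table, while treating empty strings as unset is an assumption.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class EcsConfig:
    cluster_arn: str
    execution_role_arn: Optional[str]
    task_role_arn: Optional[str]
    worker_image: str
    subnets: List[str]
    security_groups: List[str]
    log_group: str


def _split_csv(value: Optional[str]) -> List[str]:
    """Split a comma-separated value, dropping blanks and surrounding spaces."""
    return [item.strip() for item in (value or "").split(",") if item.strip()]


def load_ecs_config(env: Optional[Mapping[str, str]] = None) -> Optional[EcsConfig]:
    """Return the ECS config, or None when the backend is not enabled."""
    env = os.environ if env is None else env
    cluster_arn = env.get("DAGY_ECS_CLUSTER_ARN")
    if not cluster_arn:
        return None  # variable unset: the backend stays unregistered
    return EcsConfig(
        cluster_arn=cluster_arn,
        execution_role_arn=env.get("DAGY_ECS_EXECUTION_ROLE_ARN") or None,
        task_role_arn=env.get("DAGY_ECS_TASK_ROLE_ARN") or None,
        worker_image=env.get("DAGY_ECS_WORKER_IMAGE") or "dagy-worker:latest",
        subnets=_split_csv(env.get("DAGY_ECS_SUBNETS")),
        security_groups=_split_csv(env.get("DAGY_ECS_SECURITY_GROUPS")),
        log_group=env.get("DAGY_ECS_LOG_GROUP") or "/ecs/dagy-worker",
    )
```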

Runtime Tier to Resource Mapping:

Runtime Tier	vCPU	Memory
small	0.5	1 GB
medium	1	2 GB
large	2	4 GB
xlarge	4	12 GB
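For operators who need the tier sizes in ECS task-definition units, the conversion is mechanical: ECS counts 1,024 CPU units per vCPU and expresses memory in MiB, both as strings at the task level. The sketch below assumes hypothetical names (`TIER_RESOURCES`, `fargate_size`) and copies the sizes from the table above.

```python
from typing import Dict, Tuple

# (vCPU, memory in GB) per tier, copied from the mapping table above.
TIER_RESOURCES: Dict[str, Tuple[float, int]] = {
    "small": (0.5, 1),
    "medium": (1, 2),
    "large": (2, 4),
    "xlarge": (4, 12),
}


def fargate_size(tier: str) -> Dict[str, str]:
    """Return task-level cpu/memory strings for a dedicated-compute tier."""
    vcpu, memory_gb = TIER_RESOURCES[tier]
    return {
        "cpu": str(int(vcpu * 1024)),   # CPU units: 1 vCPU = 1024
        "memory": str(memory_gb * 1024),  # MiB
    }
```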

Valid Fargate CPU/memory combinations (for custom configurations):

vCPU	Memory Options
0.25	512 MB, 1 GB, 2 GB
0.5	1–4 GB
1	2–8 GB
2	4–16 GB
4	8–30 GB
8	16–60 GB
16	32–120 GB
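A custom configuration can be checked against this table before a task is launched. The following sketch uses hypothetical names (`MEMORY_RANGE_GB`, `is_valid_fargate_size`) and validates only the ranges listed above; Fargate additionally restricts memory to fixed increments within each range, which this sketch does not check.

```python
from typing import Dict, Tuple

# Inclusive memory ranges in GB per vCPU size, from the table above.
MEMORY_RANGE_GB: Dict[float, Tuple[int, int]] = {
    0.5: (1, 4),
    1: (2, 8),
    2: (4, 16),
    4: (8, 30),
    8: (16, 60),
    16: (32, 120),
}


def is_valid_fargate_size(vcpu: float, memory_gb: float) -> bool:
    """Check a vCPU/memory pair against the listed Fargate combinations."""
    if vcpu == 0.25:
        # The smallest size only allows three discrete memory values.
        return memory_gb in (0.5, 1, 2)
    bounds = MEMORY_RANGE_GB.get(vcpu)
    if bounds is None:
        return False  # not a supported vCPU size
    low, high = bounds
    return low <= memory_gb <= high
```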

Runtime Tier Selection

The runtime tier is resolved through six levels, checked in order; the first level that yields a tier wins. The resolved tier automatically maps to the appropriate execution backend:

1. Explicit request: per-run `runtime_tier` field in the API trigger
2. Deployment config: `default_runtime_tier` on the deployment
3. Environment config: `default_runtime_tier` on the environment
4. Flow config: `runtime_tier` on the `@task` decorator
5. Automatic selection rules: based on flow properties
6. Default: `nano`
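The resolution order amounts to a first-match lookup. The sketch below is an assumption about its shape, not the router's implementation: `resolve_runtime_tier`, its parameter names, and the source labels it returns are all hypothetical.

```python
from typing import Callable, Optional, Tuple


def resolve_runtime_tier(
    run_tier: Optional[str] = None,
    deployment_default: Optional[str] = None,
    environment_default: Optional[str] = None,
    flow_tier: Optional[str] = None,
    auto_select: Optional[Callable[[], Optional[str]]] = None,
) -> Tuple[str, str]:
    """Return (tier, source) for the first level that provides a tier."""
    configured = [
        ("explicit request", run_tier),
        ("deployment config", deployment_default),
        ("environment config", environment_default),
        ("flow config", flow_tier),
    ]
    for source, tier in configured:
        if tier:
            return tier, source
    # Automatic rules run only when nothing was configured explicitly.
    if auto_select is not None:
        tier = auto_select()
        if tier:
            return tier, "automatic selection"
    return "nano", "default"
```

Returning the source alongside the tier is a design choice of this sketch; it makes it easy to log why a run landed on a given tier.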

Automatic Tier Selection Rules

When no explicit runtime tier is specified, the router evaluates these rules from highest to lowest priority and applies the first one whose condition matches:

Priority	Rule	Condition	Selected Tier
100	Resource	Memory ≥ 3 GB	xlarge
90	Duration	Max timeout > 1 hr	large
80	Duration	Max timeout 15 min – 1 hr	medium
70	Complexity	Task count ≥ 50	large
-	Default	Everything else	micro
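The rule table can be read as an ordered list of predicates. The sketch below encodes it that way under stated assumptions: `FlowProperties`, `RULES`, and `select_tier` are hypothetical names, and the "15 min – 1 hr" band is interpreted as more than 900 seconds and at most 3,600 seconds, since the table does not say which ends are inclusive.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class FlowProperties:
    memory_gb: float = 0.0
    max_timeout_seconds: float = 0.0
    task_count: int = 0


# (priority, rule name, condition, selected tier), mirroring the table above.
Rule = Tuple[int, str, Callable[[FlowProperties], bool], str]
RULES: List[Rule] = [
    (100, "resource", lambda f: f.memory_gb >= 3, "xlarge"),
    (90, "duration", lambda f: f.max_timeout_seconds > 3600, "large"),
    (80, "duration", lambda f: 900 < f.max_timeout_seconds <= 3600, "medium"),
    (70, "complexity", lambda f: f.task_count >= 50, "large"),
]


def select_tier(flow: FlowProperties) -> str:
    """Apply the first matching rule, highest priority first."""
    for _priority, _name, condition, tier in sorted(RULES, key=lambda r: r[0], reverse=True):
        if condition(flow):
            return tier
    return "micro"  # the table's catch-all row
```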

Per-Task Runtime Tier

You can specify a preferred runtime tier on individual tasks:

from dagy import task

@task(runtime_tier="large")
def memory_intensive(data: list) -> dict:
    # This task runs on the large tier (2 vCPU / 4 GB RAM)
    ...

The `runtime_tier` field is a hint, not a guarantee: per the resolution order under Runtime Tier Selection, a per-run request, a deployment default, or an environment default takes precedence over it.

Backend Capabilities Reference

This reference is for infrastructure operators and developers integrating with Dagy's backend systems directly.

Capability	Lambda	Step Functions	ECS Fargate
`max_duration_seconds`	900	31,536,000	Unlimited
`max_memory_mb`	10,240	Depends on compute	122,880
`max_tasks`	Unlimited	25,000	Unlimited
`supports_parallel`	No	Yes	No
`supports_native_retry`	No	Yes	No
`supports_cancel`	Yes	Yes	Yes
`supports_streaming_logs`	No	No	Yes
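For integrations that need to reason about these capabilities programmatically, the table can be mirrored in a small data structure. This is a sketch, not Dagy's API: `BackendCapabilities`, `BACKENDS`, and `candidate_backends` are hypothetical names, and `None` is used here to stand for "Unlimited" (or, for Step Functions memory, "depends on compute").

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class BackendCapabilities:
    max_duration_seconds: Optional[int]  # None = unlimited
    max_memory_mb: Optional[int]         # None = unlimited or not fixed
    max_tasks: Optional[int]             # None = unlimited
    supports_parallel: bool
    supports_native_retry: bool
    supports_cancel: bool
    supports_streaming_logs: bool


BACKENDS: Dict[str, BackendCapabilities] = {
    "lambda": BackendCapabilities(
        max_duration_seconds=900, max_memory_mb=10_240, max_tasks=None,
        supports_parallel=False, supports_native_retry=False,
        supports_cancel=True, supports_streaming_logs=False,
    ),
    "step_functions": BackendCapabilities(
        max_duration_seconds=31_536_000, max_memory_mb=None, max_tasks=25_000,
        supports_parallel=True, supports_native_retry=True,
        supports_cancel=True, supports_streaming_logs=False,
    ),
    "ecs_fargate": BackendCapabilities(
        max_duration_seconds=None, max_memory_mb=122_880, max_tasks=None,
        supports_parallel=False, supports_native_retry=False,
        supports_cancel=True, supports_streaming_logs=True,
    ),
}


def candidate_backends(
    duration_seconds: int,
    task_count: int = 1,
    needs_parallel: bool = False,
    needs_streaming_logs: bool = False,
) -> List[str]:
    """List backends whose capabilities satisfy the given requirements."""
    matches = []
    for name, caps in BACKENDS.items():
        if caps.max_duration_seconds is not None and duration_seconds > caps.max_duration_seconds:
            continue
        if caps.max_tasks is not None and task_count > caps.max_tasks:
            continue
        if needs_parallel and not caps.supports_parallel:
            continue
        if needs_streaming_logs and not caps.supports_streaming_logs:
            continue
        matches.append(name)
    return matches
```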