Deployment

This page describes the infrastructure deployment process for Dagy.

Control Plane API

The control plane API is deployed as container-based AWS Lambda functions behind an HTTP API Gateway. The CDK stack in infrastructure/ provisions:

  • Database tables for flows, deployments, schedules, runs, task runs, users, access tokens, and access logs.
  • S3 artifacts bucket for packaged flow artifacts and flow specs.
  • Shared SQS events queue (15-minute visibility timeout) for all Dagy event types.
  • ECR image repository, which infrastructure/publish_image.sh ensures exists before deploy.
  • Lambda function for API handling.
  • Lambda function for scheduler polling.
  • Lambda version + alias resources for both functions.
  • HTTP API routes ANY / and ANY /{proxy+}.
  • Event source mapping from the events queue to the API Lambda alias.
  • EventBridge rule rate(1 minute) targeting the scheduler Lambda alias with payload:
    • {"dagy_event_type":"scheduler_tick","max_due":100}

Lambda Versioning and Alias Policy

Deployment rules are strict:

  • Every deployment publishes a new Lambda version.
  • The alias is updated to the new version.
  • API Gateway and all event integrations invoke aliases only.
  • No integration invokes $LATEST or an unqualified function ARN.
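
The alias-only rule can be checked mechanically against any integration target. The following is a minimal sketch, not part of the Dagy codebase: the function name is_alias_qualified and the example function and alias names are hypothetical. It relies on the standard Lambda ARN layout and on the fact that alias names cannot be purely numeric, so a numeric qualifier is a version number.

```python
def is_alias_qualified(arn: str) -> bool:
    """Return True only if the Lambda ARN ends in an alias qualifier.

    Expected layout:
    arn:<partition>:lambda:<region>:<account>:function:<name>[:<qualifier>]
    """
    parts = arn.split(":")
    # An unqualified function ARN has 7 parts; a qualified one has 8.
    if len(parts) != 8 or parts[2] != "lambda" or parts[5] != "function":
        return False
    qualifier = parts[7]
    if not qualifier or qualifier == "$LATEST":
        return False
    # A purely numeric qualifier is a published version, not an alias.
    if qualifier.isdigit():
        return False
    return True


# Hypothetical ARNs for illustration only.
BASE = "arn:aws:lambda:us-east-1:123456789012:function:dagy-api"
print(is_alias_qualified(BASE + ":live"))
print(is_alias_qualified(BASE))
```

A check like this could run in a deployment test over the synthesized integration targets, failing the build if any target is unqualified or points at $LATEST.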

Async Event Model

  • POST /runs enqueues run_execute events to the shared events queue.
  • Queue and EventBridge consumers inspect dagy_event_type and route each event to the matching handler (flow_trigger, run_execute, run_retry, run_status_update, scheduler_tick).
  • scheduler_tick queries DAGY_SCHEDULES for due enabled schedules and triggers corresponding flow runs.
  • Flow execution writes structured logs to CloudWatch:
    • Log group: /dagy/<flow_name>
    • Log stream: <run_slug>/<version>/<run_id>
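
The routing and log naming described above can be sketched as follows. This is an illustration under stated assumptions rather than Dagy's actual handler code: the names route, iter_events, and log_destination are hypothetical, the handlers are placeholders, and the sketch assumes each SQS message body is a JSON object carrying dagy_event_type. SQS deliveries arrive wrapped in a Records list, while the EventBridge rule passes its configured payload directly.

```python
import json
from typing import Any, Callable, Dict, Iterator, List, Tuple

EVENT_TYPES = (
    "flow_trigger",
    "run_execute",
    "run_retry",
    "run_status_update",
    "scheduler_tick",
)


def _placeholder(name: str) -> Callable[[Dict[str, Any]], str]:
    # Stand-in for a real handler; it only reports which type it received.
    def handler(event: Dict[str, Any]) -> str:
        return f"handled {name}"
    return handler


HANDLERS = {name: _placeholder(name) for name in EVENT_TYPES}


def iter_events(payload: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield Dagy events from an SQS batch or a direct EventBridge payload."""
    if "Records" in payload:
        for record in payload["Records"]:
            # Assumption: the message body is a JSON-encoded Dagy event.
            yield json.loads(record["body"])
    else:
        yield payload


def route(payload: Dict[str, Any]) -> List[str]:
    """Dispatch every event in the payload by its dagy_event_type."""
    results = []
    for event in iter_events(payload):
        event_type = event.get("dagy_event_type")
        handler = HANDLERS.get(event_type)
        if handler is None:
            raise ValueError(f"unknown dagy_event_type: {event_type!r}")
        results.append(handler(event))
    return results


def log_destination(flow_name: str, run_slug: str, version: str, run_id: str) -> Tuple[str, str]:
    """Build the CloudWatch log group and log stream names for a run."""
    return f"/dagy/{flow_name}", f"{run_slug}/{version}/{run_id}"
```

Rejecting unknown event types loudly, as this sketch does, is one design choice; a real consumer might instead log and skip them so that one bad message does not fail a whole batch.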

CDK Configuration

Environment-specific settings live in infrastructure/<environment>.yml. The artifacts bucket name is enforced as <app-name>-<service-name>-<account_id>-<region>-<environment>.
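
The naming pattern can be reproduced to predict the bucket name for a given environment. This is a minimal sketch: the function name artifacts_bucket_name and the example values are hypothetical, and the length check reflects the general S3 limit of 63 characters rather than anything stated about the Dagy stack.

```python
def artifacts_bucket_name(
    app_name: str,
    service_name: str,
    account_id: str,
    region: str,
    environment: str,
) -> str:
    """Join the five components in the documented order with hyphens."""
    name = "-".join([app_name, service_name, account_id, region, environment])
    # S3 bucket names must be between 3 and 63 characters long.
    if not 3 <= len(name) <= 63:
        raise ValueError(f"bucket name has invalid length {len(name)}: {name}")
    return name


# Hypothetical values for illustration only.
print(artifacts_bucket_name("dagy", "service", "123456789012", "us-east-1", "develop"))
```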

Dagy-specific settings can be placed under a dagy key. Supported keys:

  • jwt_required
  • jwt_issuer
  • jwt_audience
  • jwks_url
  • api_cors_allowed_origins
  • vpc_id
  • subnet_ids
  • security_group_ids
  • lambda_image_uri
  • lambda_image_tag
  • ecr_repository_name
  • ecr_push_principals

Example:

dagy:
  jwt_required: true
  jwt_issuer: "https://issuer.example.com/"
  jwt_audience: "dagy-api"
  jwks_url: "https://issuer.example.com/.well-known/jwks.json"
  api_cors_allowed_origins:
    - "https://app.example.com"
  ecr_repository_name: "dagy-service-worker-develop"
  lambda_image_tag: "latest"

CDK Usage

From infrastructure/:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cdk synth -c environment=develop -c app_version=<build-version>
cdk deploy -c environment=develop -c app_version=<build-version>

Always pass a deployment-specific app_version (for example, a build number or commit SHA) so that each deployment creates new Lambda versions.
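
In a CI script, the command line can be assembled so that app_version is never omitted. This is a hedged sketch: the helpers cdk_command and current_commit_sha are hypothetical, and using the short git commit SHA is just one of the options mentioned above.

```python
import subprocess
from typing import List


def current_commit_sha() -> str:
    """Return the short SHA of HEAD; requires git and a repository checkout."""
    result = subprocess.run(
        ["git", "rev-parse", "--short", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def cdk_command(action: str, environment: str, app_version: str) -> List[str]:
    """Build the cdk argument list, refusing an empty app_version."""
    if action not in ("synth", "deploy"):
        raise ValueError(f"unsupported action: {action}")
    if not app_version.strip():
        raise ValueError("app_version must be set for every deployment")
    return [
        "cdk", action,
        "-c", f"environment={environment}",
        "-c", f"app_version={app_version}",
    ]


# Example with a fixed, made-up version string.
print(" ".join(cdk_command("deploy", "develop", "abc1234")))
```

Passing the resulting list to subprocess.run from infrastructure/ would then run the same command as shown above, with current_commit_sha() supplying the version.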

Image publishing:

  • infrastructure/publish_image.sh builds and pushes the Lambda image to ECR.
  • infrastructure/cdk-entrypoint.sh is configured in infrastructure/cdk.json and runs publish_image.sh before app.py, so cdk deploy publishes the image first.
  • To disable auto-build (for pre-published images), set -c auto_build_image=false or DAGY_AUTO_BUILD_IMAGE=false.
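
The opt-out described in the last item can be sketched as a small decision function. This is an illustration, not the logic in cdk-entrypoint.sh: the function name auto_build_enabled is hypothetical, and the case-insensitive comparison and the handling of values other than "false" are assumptions. It follows the rule stated above that either setting alone disables the build.

```python
import os
from typing import Mapping, Optional


def auto_build_enabled(
    context_value: Optional[str],
    environ: Mapping[str, str] = os.environ,
) -> bool:
    """Return False if either the context flag or the env var is 'false'.

    context_value stands for the value passed as -c auto_build_image=...,
    or None when the flag is not given.
    """
    env_value = environ.get("DAGY_AUTO_BUILD_IMAGE")
    for value in (context_value, env_value):
        # Assumption: comparison ignores case and surrounding whitespace.
        if value is not None and value.strip().lower() == "false":
            return False
    # Auto-build is on by default, since cdk deploy publishes the image first.
    return True


print(auto_build_enabled(None, {}))
print(auto_build_enabled("false", {}))
```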