Namespaces, Versioning & Tags

Three metadata dimensions organize flows across teams, environments, and release cycles.

| Dimension | Type | Mutable | Default | Where set |
| --- | --- | --- | --- | --- |
| Namespace | str | No (set at registration) | Source directory relative to CWD | flow.deploy(), CLI --namespace |
| Version | str | No (immutable per registration) | "latest" | @flow(version=...), CLI --flow-version |
| Tags | Dict[str, str] | Yes (via re-deploy / API) | {} | deploy(tags=...), API body |

Namespaces

A namespace is a hierarchical string that groups related flows. It maps naturally to directory structure, team boundaries, or domain areas.

How namespaces are derived

When you call flow.deploy() or dagy deploy, the namespace is resolved in this order:

  1. Explicit value: if you pass namespace="data/ingestion", that value is used as-is.
  2. Auto-derived: if omitted, Dagy computes the namespace from the source file's directory path relative to the current working directory.
# File: pipelines/data/ingestion/load_orders.py
# CWD:  pipelines/

result = my_flow.deploy()
# namespace → "data/ingestion"  (auto-derived from CWD-relative path)

Auto-derivation logic (src/dagy/core/flow.py):

src = Path(inspect.getfile(fn)).resolve()
rel = src.parent.relative_to(Path.cwd())
namespace = str(rel) if str(rel) != "." else ""

If the source file is at the CWD root, namespace is an empty string.
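
The resolution order above can be sketched as a small pure function. This is an illustration rather than Dagy's implementation: `resolve_namespace` is a hypothetical helper, the working directory is passed in explicitly so the behaviour is easy to test, and POSIX-style paths are assumed so the separator is always `/`.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_namespace(explicit: Optional[str], source_file: str, cwd: str) -> str:
    """Hypothetical helper mirroring the two-step resolution order."""
    # 1. An explicit value wins and is used as-is.
    if explicit is not None:
        return explicit
    # 2. Otherwise derive it from the source file's directory relative to cwd.
    #    relative_to() raises ValueError if source_file is not under cwd.
    rel = PurePosixPath(source_file).parent.relative_to(PurePosixPath(cwd))
    return "" if str(rel) == "." else rel.as_posix()
```

One consequence worth noting: `relative_to` raises `ValueError` when the source file lives outside the working directory, so passing an explicit namespace is the safer choice when deploying from an unrelated directory.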

Setting namespaces

Python SDK:

result = my_flow.deploy(
    name="load-orders",
    namespace="data/ingestion",
)

CLI:

dagy deploy ./dist/artifact.zip \
  --deployment load-orders \
  --namespace "data/ingestion"

API (POST /flows):

{
  "flow_spec": { "name": "load_orders", "version": "1.0.0", "..." : "..." },
  "namespace": "data/ingestion",
  "deployment_name": "load-orders"
}

Namespace conventions

| Pattern | Example | Use case |
| --- | --- | --- |
| team/project | data-eng/etl | Team-scoped ownership |
| domain/subdomain | billing/invoices | Domain-driven design |
| environment | staging | Environment separation |
| Flat | "" (empty) | Small projects, single team |

Namespaces are stored with the flow record and surfaced as filterable chips in the Flows UI. Clicking a namespace chip filters the grid to matching flows.
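
The same grouping and filtering can be reproduced client-side on a list of flow records. The sketch below assumes each record is a dict with `name` and `namespace` keys (a hypothetical shape, loosely modelled on FlowListItem) and that a filter matches the namespace exactly; whether the UI also matches on prefixes is not specified here.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_namespace(flows: List[dict]) -> Dict[str, List[str]]:
    """Group flow names by namespace; a missing namespace is treated as ""."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for item in flows:
        groups[item.get("namespace") or ""].append(item["name"])
    return dict(groups)


def filter_by_namespace(flows: List[dict], namespace: str) -> List[dict]:
    """Keep only flows whose namespace equals the selected value exactly."""
    return [f for f in flows if (f.get("namespace") or "") == namespace]


# Sample records (made up for illustration).
flows = [
    {"name": "load_orders", "namespace": "data/ingestion"},
    {"name": "load_users", "namespace": "data/ingestion"},
    {"name": "invoice_run", "namespace": "billing/invoices"},
    {"name": "scratch", "namespace": None},
]
```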

API fields

| Model | Field |
| --- | --- |
| FlowRegisterRequest | namespace: Optional[str] |
| FlowListItem | namespace: Optional[str] |
| FlowDetailResponse | namespace: Optional[str] |

Versioning

Every flow registration is identified by a composite key: flow_name + flow_version.

Version format

Versions are free-form strings. Dagy does not enforce semver, but these are the common patterns:

| Strategy | Example | When to use |
| --- | --- | --- |
| "latest" | my_flow:latest | Development, CI/CD auto-deploy (default) |
| Semantic | "1.0.0", "2.3.1" | Production releases with explicit rollback points |
| Build hash | "abc123" | Immutable builds tied to commit SHAs |
| Date-based | "2026-03-08" | Nightly builds |
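
Because versions are free-form, any convention has to be enforced on the caller's side. The helpers below are a hypothetical sketch of how a build script might produce or check such strings; none of them are part of Dagy, and the semver check is deliberately simplified (no pre-release or build metadata).

```python
import re
from datetime import date

# Simplified MAJOR.MINOR.PATCH pattern; not a full semver parser.
SEMVER_RE = re.compile(r"\d+\.\d+\.\d+")


def date_version(day: date) -> str:
    """Date-based version string in ISO format, e.g. for nightly builds."""
    return day.isoformat()


def build_hash_version(commit_sha: str, length: int = 7) -> str:
    """Shorten a commit SHA to use as a build-hash version."""
    return commit_sha[:length]


def looks_like_semver(version: str) -> bool:
    """True if the whole string is MAJOR.MINOR.PATCH with numeric parts."""
    return SEMVER_RE.fullmatch(version) is not None
```

The resulting string would then be passed to @flow(version=...) or --flow-version.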

How versions flow through the system

@flow(version="1.0.0")       ← declared in code
    │
    ▼
flow.build()                  ← baked into FlowSpec.version
    │
    ▼
flow.deploy()                 ← uploaded as flow_version="1.0.0"
    │                            (or "latest" if not specified in @flow)
    ▼
Flow(flow_name,               ← persisted
     flow_version)
    │
    ▼
Deployment                     ← deployment pins to a specific flow_version
    │
    ▼
Run.flow_version               ← each run records which version it executed

Setting the version

In the @flow decorator:

@flow(name="my_pipeline", version="1.0.0")
def my_pipeline():
    ...

The version field is embedded in FlowSpec at build time. If omitted, it defaults to "latest".

At deploy time (CLI):

dagy deploy ./dist/artifact.zip \
  --deployment my-pipeline \
  --flow-name my_pipeline \
  --flow-version 1.0.0

!!! note
    The CLI currently hardcodes flow_version="latest" when auto-detecting from the artifact. Use --flow-version to override with an explicit version.

Triggering a specific version:

# Python client
client.trigger_run({
    "flow_name": "my_pipeline",
    "flow_version": "1.0.0",
    "parameters": {"source": "s3://bucket/data.csv"},
})

# API
curl -X POST https://api.dagy.io/v1/runs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "flow_name": "my_pipeline",
    "flow_version": "1.0.0",
    "parameters": {}
  }'

Version immutability

The flow_name:flow_version key is fixed once a flow is registered: the version string cannot be changed afterwards. The content stored under that key is not frozen, however. Re-deploying the same pair upserts the record, overwriting its artifact and spec. To release a new version of a flow while keeping the earlier one intact, register it under a new flow_version.
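
A minimal in-memory model makes these semantics concrete. `FlowRegistry` and `FlowRecord` are hypothetical stand-ins, not Dagy classes; the sketch only illustrates that the composite key identifies the record and that registering the same key again replaces its contents.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class FlowRecord:
    flow_name: str
    flow_version: str
    spec: dict


class FlowRegistry:
    """In-memory stand-in for the flow store (illustrative only)."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str], FlowRecord] = {}

    def register(self, flow_name: str, flow_version: str, spec: dict) -> FlowRecord:
        # Upsert: the same (name, version) key overwrites the stored spec.
        record = FlowRecord(flow_name, flow_version, spec)
        self._records[(flow_name, flow_version)] = record
        return record

    def get(self, flow_name: str, flow_version: str) -> FlowRecord:
        return self._records[(flow_name, flow_version)]

    def __len__(self) -> int:
        return len(self._records)
```

Registering "order_etl" at "2.1.0" twice leaves one record holding the second spec, while registering "2.2.0" adds a second record and leaves "2.1.0" untouched.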

Deployment version counter

Each deployment also tracks a deployment_version (integer), which auto-increments on every re-deploy. This is separate from flow_version and represents how many times a deployment has been updated, useful for audit trails.
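
The relationship between the two counters can be sketched as follows. The `Deployment` dataclass and `redeploy` function are hypothetical, and the starting value of 1 is an assumption; the point is only that deployment_version advances on every re-deploy regardless of whether flow_version changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    name: str
    flow_name: str
    flow_version: str
    deployment_version: int = 1  # assumption: the first deploy starts at 1


def redeploy(current: Deployment, flow_version: str) -> Deployment:
    """Return the updated deployment with its counter incremented by one."""
    return Deployment(
        name=current.name,
        flow_name=current.flow_name,
        flow_version=flow_version,
        deployment_version=current.deployment_version + 1,
    )
```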

Code hash (change detection)

Every artifact built by dagy build or flow.deploy() includes a code_hash in its metadata.json. The hash is computed as:

SHA256( canonical_json(flow_spec, sort_keys=True) + "|" + source_file_hash )

This captures both DAG structure changes (tasks, edges, timeouts, retries) and source code changes (the Python file that defines the flow).
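
The formula can be reproduced with the standard library. The sketch below makes two assumptions that the formula itself does not spell out: that canonical JSON means `json.dumps` with sorted keys and compact separators, and that source_file_hash is the SHA-256 hex digest of the file's bytes. Hashes produced this way are therefore not guaranteed to match the ones Dagy writes to metadata.json.

```python
import hashlib
import json


def source_file_hash(source: bytes) -> str:
    """SHA-256 hex digest of the flow's source file contents (assumed)."""
    return hashlib.sha256(source).hexdigest()


def code_hash(flow_spec: dict, source: bytes) -> str:
    """SHA256(canonical_json(flow_spec) + "|" + source_file_hash)."""
    # Sorted keys make the digest independent of dict insertion order.
    canonical = json.dumps(flow_spec, sort_keys=True, separators=(",", ":"))
    payload = canonical + "|" + source_file_hash(source)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Sorting keys is what makes the hash stable: two specs with the same content but different key order produce the same digest, while any change to a retry count or to the source bytes produces a different one.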

When deploying, the CLI and SDK compare the local code_hash against the remote code_hash of the currently deployed version. If they match, the deployment is skipped with a message. Use --force (CLI) or force=True (SDK) to override.

| Scenario | Behavior |
| --- | --- |
| First deploy (no remote hash) | Deploys normally |
| Code or spec changed | Deploys normally |
| Nothing changed | Skipped (unless --force) |
| Remote flow has no code_hash (legacy) | Deploys normally |
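
The decision table reduces to a short predicate. `should_deploy` is a hypothetical name used for illustration; it treats a missing remote hash (first deploy or a legacy flow) as `None`.

```python
from typing import Optional


def should_deploy(local_hash: str, remote_hash: Optional[str], force: bool = False) -> bool:
    """Return True if the deploy should proceed, False if it can be skipped."""
    if force:
        return True  # --force / force=True always deploys
    if remote_hash is None:
        return True  # first deploy, or legacy flow without a code_hash
    return local_hash != remote_hash  # deploy only when something changed
```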

API fields

| Field | Type | Role |
| --- | --- | --- |
| FlowSpecModel.version | str | Embedded in spec JSON |
| FlowRegisterResponse.flow_version | str | Returned after registration |
| RunResponse.flow_version | str | Returned with run status |
| FlowModel.code_hash | str (optional) | SHA-256 of flow spec + source code |

Tags

Tags are arbitrary key-value pairs (Dict[str, str]) attached to flows, deployments, and DAG drafts. They provide a flexible, non-hierarchical labeling system for filtering, searching, and organizing resources.

Setting tags

Python SDK (deploy()):

result = my_flow.deploy(
    name="load-orders-prod",
    tags={
        "team": "data-eng",
        "env": "prod",
        "cost-center": "analytics",
        "criticality": "high",
    },
)

API (POST /flows):

{
  "flow_spec": { "..." : "..." },
  "deployment_name": "load-orders-prod",
  "tags": {
    "team": "data-eng",
    "env": "prod",
    "cost-center": "analytics"
  }
}

Deployments API (POST /deployments):

{
  "name": "load-orders-prod",
  "flow_name": "load_orders",
  "flow_version": "1.0.0",
  "tags": {
    "team": "data-eng",
    "env": "prod"
  }
}

DAG Drafts (POST /drafts):

{
  "name": "My Draft Pipeline",
  "canvas_json": "{ ... }",
  "tags": {
    "project": "migration",
    "status": "wip"
  }
}

!!! info "CLI limitation"
    The dagy deploy CLI does not currently expose a --tags flag. Use the Python SDK or API to set tags. Tags set via the SDK's deploy() method are passed through to the API.
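
Until the CLI accepts tags, a thin wrapper script around the SDK can take them as key=value strings. `parse_tags` below is a hypothetical helper, not part of Dagy; it only builds the Dict[str, str] that deploy(tags=...) expects.

```python
from typing import Dict, Iterable


def parse_tags(pairs: Iterable[str]) -> Dict[str, str]:
    """Turn ["team=data-eng", "env=prod"] into {"team": "data-eng", "env": "prod"}."""
    tags: Dict[str, str] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        key = key.strip()
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        tags[key] = value.strip()
    return tags
```

A wrapper could then call my_flow.deploy(name="load-orders-prod", tags=parse_tags(sys.argv[1:])).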

Tag conventions

| Key | Example values | Purpose |
| --- | --- | --- |
| team | data-eng, ml-ops, platform | Ownership |
| env | dev, staging, prod | Environment targeting |
| cost-center | analytics, billing | Cost allocation |
| criticality | high, medium, low | Operational priority |
| project | migration-v2, q1-launch | Project tracking |
| data-classification | pii, public, internal | Compliance |

Tags in the UI

On the Flows page, tags appear as clickable chips on each flow card. Clicking a tag chip filters the grid to flows that carry that tag. Multiple tag filters are combined with AND logic: for example, selecting team:data-eng and env:prod shows only production data-eng flows.
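
The AND semantics can be expressed in a few lines when filtering flow records client-side. The record shape (a dict with an optional `tags` mapping) is an assumption for illustration, and `filter_by_tags` is a hypothetical helper.

```python
from typing import Dict, List


def filter_by_tags(flows: List[dict], selected: Dict[str, str]) -> List[dict]:
    """Keep flows that carry every selected key with exactly the selected value."""
    return [
        f for f in flows
        if all((f.get("tags") or {}).get(k) == v for k, v in selected.items())
    ]


# Sample records (made up for illustration).
flows = [
    {"name": "order_etl", "tags": {"team": "data-eng", "env": "prod"}},
    {"name": "order_etl_dev", "tags": {"team": "data-eng", "env": "dev"}},
    {"name": "train_model", "tags": {"team": "ml-ops", "env": "prod"}},
    {"name": "untagged", "tags": None},
]
```

With an empty selection every flow passes, which corresponds to the unfiltered grid.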

API fields

API models that carry tags:

| Model | Field |
| --- | --- |
| FlowRegisterRequest | tags: Optional[Dict[str, str]] |
| FlowRegisterResponse | tags: Optional[Dict[str, str]] |
| FlowListItem | tags: Optional[Dict[str, str]] |
| FlowDetailResponse | tags: Optional[Dict[str, str]] |
| DeploymentRequest | tags: Optional[Dict[str, str]] |
| DagDraftCreateRequest | tags: Optional[Dict[str, str]] |
| DagDraftUpdateRequest | tags: Optional[Dict[str, str]] |
| DagDraftResponse | tags: Optional[Dict[str, str]] |

Putting it all together

A typical production setup combines all three dimensions:

from dagy import flow, task


@task
def extract(source: str) -> list:
    ...

@task
def transform(records: list) -> list:
    ...

@task
def load(records: list) -> dict:
    ...


@flow(name="order_etl", version="2.1.0")
def order_etl(source: str = "s3://data/orders/"):
    raw = extract(source)
    clean = transform(raw)
    return load(clean)


# Deploy with full metadata
result = order_etl.deploy(
    name="order-etl-prod",
    namespace="data/orders",
    tags={
        "team": "data-eng",
        "env": "prod",
        "criticality": "high",
    },
    schedule="0 6 * * *",
)

This creates:

| Field | Value |
| --- | --- |
| flow_name | order_etl |
| flow_version | 2.1.0 |
| deployment_name | order-etl-prod |
| namespace | data/orders |
| tags | {"team": "data-eng", "env": "prod", "criticality": "high"} |

In the Flows UI, this flow appears under the data/orders namespace group with three tag chips. Team members can filter by any combination to quickly find what they need.

API lookup patterns

| What you want | Endpoint | Key fields |
| --- | --- | --- |
| Get a specific flow version | GET /flows/{flow_name}/{flow_version} | flow_name + flow_version |
| List all flows (with namespace/tags in response) | GET /flows | Paginated, includes namespace and tags |
| Trigger a run for a specific version | POST /runs | flow_name + flow_version in body |
| Trigger a run via deployment | POST /runs | deployment in body (version resolved from deployment) |
| List runs filtered by flow | GET /runs?flow_name=order_etl | Indexed query |
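
The helpers below sketch how a client might build these URLs and request bodies. They are hypothetical and make a few assumptions: the base URL is taken from the curl example earlier and is assumed to apply to every endpoint, and the deployment-based run body is assumed to accept a parameters field alongside deployment.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.dagy.io/v1"  # assumption: same base as the curl example


def flow_url(flow_name: str, flow_version: str, base: str = BASE_URL) -> str:
    """URL for GET /flows/{flow_name}/{flow_version}, with path segments escaped."""
    return f"{base}/flows/{quote(flow_name, safe='')}/{quote(flow_version, safe='')}"


def runs_url(flow_name: str, base: str = BASE_URL) -> str:
    """URL for GET /runs filtered by flow name."""
    return f"{base}/runs?{urlencode({'flow_name': flow_name})}"


def run_body_for_version(flow_name: str, flow_version: str,
                         parameters: Optional[dict] = None) -> dict:
    """POST /runs body that pins an explicit flow version."""
    return {"flow_name": flow_name, "flow_version": flow_version,
            "parameters": parameters or {}}


def run_body_for_deployment(deployment: str, parameters: Optional[dict] = None) -> dict:
    """POST /runs body that lets the deployment resolve the version."""
    return {"deployment": deployment, "parameters": parameters or {}}
```

Escaping the path segments matters because versions are free-form strings and may contain characters such as `/` or spaces.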