Dagy CLI Refactoring Plan: Enterprise-Grade Standards

This document identifies implementation gaps, inconsistencies, and missing enterprise patterns in the Dagy CLI (`src/dagy/cli/`), and outlines a prioritized plan to bring it to production-ready quality.



1. Implementation Gaps Identified

1.1 Missing --version Flag

Current: No way to check the installed SDK/CLI version. Expected: dagy --version should print dagy 0.1.0 (read from pyproject.toml or __version__).

Fix: Add parser.add_argument("--version", action="version", version=...) to build_parser() and expose __version__ in dagy/__init__.py.
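
A minimal sketch of this fix, assuming the distribution is published under the name "dagy". The `resolve_version` helper and its fallback string are hypothetical and only keep the example runnable from a source checkout.

```python
import argparse
from importlib.metadata import PackageNotFoundError, version


def resolve_version(dist_name: str = "dagy") -> str:
    """Return the installed distribution version, or a placeholder if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Assumption: use a sentinel when the package is not installed.
        return "0.0.0+unknown"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dagy")
    # action="version" prints the version string and exits with status 0.
    parser.add_argument(
        "--version", action="version", version=f"%(prog)s {resolve_version()}"
    )
    return parser
```

The same value can back `__version__` in dagy/__init__.py so the version has a single source.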


1.2 No Global --verbose / --quiet Flags

Current: Verbose mode is controlled only via DAGY_LOCAL_VERBOSE env var. No --quiet option exists. Expected: dagy --verbose run ... and dagy --quiet run ... for session-level control.

Fix: Add --verbose and --quiet as mutually exclusive top-level arguments. Pass the resulting level through to LocalLogger and use it to suppress or expand output in all commands.
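
One way to wire this up with argparse is sketched below. The `verbosity` helper and its 0/1/2 levels are assumptions for illustration, not part of the current CLI, and the `run` subparser is reduced to a single positional argument.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dagy")
    # argparse rejects any invocation that passes both flags of the group.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--verbose", action="store_true", help="Show extra detail")
    group.add_argument("--quiet", action="store_true", help="Show errors only")

    subparsers = parser.add_subparsers(dest="command")
    run = subparsers.add_parser("run")
    run.add_argument("target")
    return parser


def verbosity(args: argparse.Namespace) -> int:
    """Map the flags to a level: 0 = quiet, 1 = normal, 2 = verbose."""
    if args.quiet:
        return 0
    return 2 if args.verbose else 1
```

Because the flags live on the top-level parser, they go before the subcommand, as in dagy --verbose run ....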


1.3 No Structured Exit Codes

Current: All failures exit via SystemExit(1) or unhandled exceptions. No differentiation between auth errors, config errors, network errors, or user errors. Expected: Documented exit codes (e.g., 0=success, 1=general error, 2=usage error, 3=auth error, 4=network error, 5=not found).

Fix: Define an ExitCode enum and use it consistently across all cmd_* handlers. Wrap main() in a top-level try/except that maps exception types to exit codes.
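
A sketch of the enum and the top-level mapping, using the example codes listed above. `AuthError`, `NotFoundError`, and `run_cli` are hypothetical names; the real exception hierarchy in the SDK may differ.

```python
import enum
import sys


class ExitCode(enum.IntEnum):
    SUCCESS = 0
    GENERAL_ERROR = 1
    USAGE_ERROR = 2
    AUTH_ERROR = 3
    NETWORK_ERROR = 4
    NOT_FOUND = 5


class AuthError(Exception):
    """Hypothetical: raised when credentials are missing or rejected."""


class NotFoundError(Exception):
    """Hypothetical: raised when a flow or run does not exist."""


# First match wins, so list subclasses before their base classes.
_EXIT_CODE_BY_EXCEPTION = (
    (AuthError, ExitCode.AUTH_ERROR),
    (NotFoundError, ExitCode.NOT_FOUND),
    (ConnectionError, ExitCode.NETWORK_ERROR),
    (TimeoutError, ExitCode.NETWORK_ERROR),
)


def exit_code_for(exc: BaseException) -> ExitCode:
    for exc_type, code in _EXIT_CODE_BY_EXCEPTION:
        if isinstance(exc, exc_type):
            return code
    return ExitCode.GENERAL_ERROR


def run_cli(handler) -> int:
    """Run a cmd_* style callable and translate failures into an exit code."""
    try:
        handler()
    except Exception as exc:
        print(f"error: {exc}", file=sys.stderr)
        return int(exit_code_for(exc))
    return int(ExitCode.SUCCESS)
```

argparse already exits with status 2 on a parse error, so USAGE_ERROR = 2 lines up with its built-in behavior.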


1.4 No --output / --format Flag for Machine-Readable Output

Current: All output is human-readable print() statements. No JSON/YAML/CSV output option. Expected: dagy flows list --format json or dagy runs show <id> --format json for scripting and CI/CD integration.

Fix: Add --format {table,json,yaml,csv} to list/show commands. Abstract output formatting into a shared render_output() utility.
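
A possible shape for render_output(), assuming commands hand it a list of flat dicts. The sketch covers table, json, and csv with the standard library only. YAML is left out because the standard library has no YAML writer, and the hand-rolled table layout is a stand-in for whatever table library the CLI settles on.

```python
import csv
import io
import json
from typing import Dict, List


def render_output(rows: List[Dict[str, object]], fmt: str = "table") -> str:
    """Render rows in the requested format and return the text to print."""
    if fmt == "json":
        return json.dumps(rows, indent=2, sort_keys=True)
    if fmt not in ("table", "csv"):
        raise ValueError(f"unsupported format: {fmt}")
    if not rows:
        return ""

    # Column order follows the first row.
    columns = list(rows[0])
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(
            buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        return buf.getvalue().rstrip("\n")

    # Plain-text table: pad each column to its widest cell.
    widths = {
        col: max(len(col), *(len(str(row.get(col, ""))) for row in rows))
        for col in columns
    }
    lines = ["  ".join(col.ljust(widths[col]) for col in columns)]
    for row in rows:
        lines.append("  ".join(str(row.get(col, "")).ljust(widths[col]) for col in columns))
    return "\n".join(line.rstrip() for line in lines)
```

Keeping the function pure (it returns a string rather than printing) makes the JSON output easy to snapshot-test.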


1.5 Incomplete Error Handling in cmd_run

Current: Remote runs print raw API response dict via print(response). No structured output, no error handling for failed triggers, no run-ID extraction. Expected: Parse the response, display the run ID, and handle API errors gracefully.

Fix:

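# Before (line 111):
print(response)

# After:
run_id = response.get("run_id")
if not run_id:
    # Trigger failed or returned an unexpected payload; report it instead of a fake ID
    raise SystemExit(f"Failed to trigger run: {response!r}")
print(f"Run triggered: {run_id}")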

1.6 Missing Subcommand Help for runs and flows

Current: Running dagy runs or dagy flows without a subcommand falls through to the top-level help, because no func attribute is set on the namespace. Expected: dagy runs should print the runs subcommand help, and dagy flows the flows subcommand help.

Fix: Add runs.set_defaults(func=lambda _: runs.print_help()) and flows.set_defaults(func=lambda _: flows.print_help()).
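
The sketch below shows the pattern on a reduced parser with only runs and runs list. The `list` handler is a placeholder. On current Python versions a nested subparser's own `func` default overrides the group default once a subcommand is given.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dagy")
    subparsers = parser.add_subparsers(dest="command")

    runs = subparsers.add_parser("runs", help="Inspect runs")
    runs_sub = runs.add_subparsers(dest="runs_command")

    runs_list = runs_sub.add_parser("list", help="List runs")
    # Placeholder handler standing in for cmd_runs_list.
    runs_list.set_defaults(func=lambda _args: print("listing runs"))

    # Default used when `dagy runs` is invoked without a subcommand.
    runs.set_defaults(func=lambda _args: runs.print_help())
    return parser
```

With this in place, main() can call args.func(args) unconditionally for the runs group.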


1.7 cmd_logout Does Not Revoke Server-Side Token

Current: cmd_logout only deletes the local credentials file. The DagyClient.logout() method exists but is never called. Expected: Attempt server-side token revocation, then delete local credentials regardless.

Fix:

def cmd_logout(args):
    api_url = configured_api_url(args.profile)
    if api_url:
        try:
            DagyClient(api_url).logout()
        except Exception:
            pass  # Best-effort; still delete local creds
    delete_credentials()
    print("Logged out")

1.8 cmd_login Stores expires_at=0 and user_email=""

Current: save_credentials(result.token, 0, ""). Expiry and email are discarded. Expected: Decode the JWT to extract exp and email claims, or request them from the callback.

Fix: Either decode the JWT payload (base64url-encoded JSON; signature verification is not needed when the claims are only displayed locally) or extend the OAuth callback to include expires_at and email query parameters. Display Logged in as <email> on success.
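
For the JWT option, decoding the payload needs only the standard library. This is a sketch for display purposes, assuming the token is a compact three-segment JWT. The function name is hypothetical, and the result must not be trusted for authorization because the signature is never checked.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the JWT payload claims WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT (expected three dot-separated segments)")
    payload = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

cmd_login could then pass claims.get("exp", 0) and claims.get("email", "") to save_credentials instead of the current 0 and "".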


1.9 No Token Expiry Check Before API Calls

Current: DagyClient._resolve_token() returns whatever is stored, even if expired. Expected: Check expires_at before using. If expired, prompt re-login with a clear message.

Fix: Add expiry validation in _resolve_token() and raise a TokenExpiredError that the CLI catches and converts to a user-friendly message.
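
The check itself is small. The sketch below assumes expires_at is a Unix timestamp in seconds and treats 0 as "expiry unknown", since that is the value cmd_login stores today. The `ensure_token_fresh` name and the 30-second leeway are illustrative.

```python
import time
from typing import Optional


class TokenExpiredError(Exception):
    """Raised when the stored token is past its expiry time."""


def ensure_token_fresh(
    token: str, expires_at: int, now: Optional[float] = None, leeway: int = 30
) -> str:
    """Return the token if still usable, otherwise raise TokenExpiredError."""
    # Assumption: 0 means the expiry was never recorded, so let the server decide.
    if expires_at == 0:
        return token
    current = time.time() if now is None else now
    # Treat tokens that expire within `leeway` seconds as already expired.
    if current >= expires_at - leeway:
        raise TokenExpiredError("Session expired. Run `dagy login` to sign in again.")
    return token
```

_resolve_token() would call this before returning, and the CLI's top-level handler would catch TokenExpiredError and print its message.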


1.10 LocalMetadataStore Is Never Used as a Context Manager

Current: Every command manually calls store.close(). Some error paths may skip it. Expected: Use with statements for guaranteed cleanup.

Fix: Add __enter__/__exit__ to LocalMetadataStore and update all call sites.
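
The two methods are short. The class below is a stand-in that models only the open/closed lifecycle; the real LocalMetadataStore constructor and its storage backend are not shown here.

```python
class LocalMetadataStore:
    """Stand-in for the real store; only lifecycle methods are sketched."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "LocalMetadataStore":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Always release resources; returning False lets exceptions propagate.
        self.close()
        return False
```

Call sites then become with LocalMetadataStore(...) as store:, and the explicit store.close() calls can be removed.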


1.11 Hardcoded User-Agent String

Current: user_agent: str = "dagy-sdk/0.1.0" in DagyClient. Expected: Auto-populated from package version.

Fix: Read from importlib.metadata.version("dagy") and include Python version: dagy-sdk/0.1.0 python/3.12.1.
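
A sketch of the dynamic value, assuming the distribution name is "dagy". The `default_user_agent` helper and the "unknown" fallback are hypothetical.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def default_user_agent(dist_name: str = "dagy") -> str:
    """Build a User-Agent such as 'dagy-sdk/<version> python/<version>'."""
    try:
        sdk_version = version(dist_name)
    except PackageNotFoundError:
        # Assumption: not installed as a package (e.g. running from source).
        sdk_version = "unknown"
    return f"dagy-sdk/{sdk_version} python/{platform.python_version()}"
```

If DagyClient is a dataclass, the field could use default_factory=default_user_agent so the value is computed when the client is created.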


1.12 No Confirmation or Dry-Run for deploy

Current: dagy deploy immediately uploads to S3 and registers with the API. Expected: --dry-run flag to show what would happen without executing. Optionally, a confirmation prompt for production deployments.

Fix: Add --dry-run and --yes flags to the deploy subparser.
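
The flag handling could look like the sketch below. The positional `artifact` argument, the step descriptions, and the `plan_deploy` helper are assumptions; the real deploy subparser has other required flags that are omitted here.

```python
import argparse
from typing import Callable, List


def build_deploy_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dagy deploy")
    parser.add_argument("artifact", help="Path to the artifact zip")
    parser.add_argument("--dry-run", action="store_true",
                        help="Show the planned steps without executing them")
    parser.add_argument("--yes", action="store_true",
                        help="Skip the confirmation prompt")
    return parser


def plan_deploy(args: argparse.Namespace,
                confirm: Callable[[str], str] = input) -> List[str]:
    """Return the steps to execute; an empty list means nothing should run."""
    steps = [f"upload {args.artifact} to S3", "register flow with the API"]
    if args.dry_run:
        for step in steps:
            print(f"[dry-run] would {step}")
        return []
    if not args.yes and confirm("Proceed with deploy? [y/N] ").strip().lower() != "y":
        print("Aborted")
        return []
    return steps
```

Passing `confirm` as a parameter keeps the prompt testable without patching input().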


1.13 No whoami Command

Current: No way to verify the current authenticated identity. Expected: dagy whoami shows the logged-in user's email and token expiry.

Fix: Add a whoami subcommand that reads credentials and displays identity info.


1.14 runs list Has No Filtering or Limit

Current: cmd_runs_list dumps all local runs with no filtering. Expected: --limit, --status, --flow flags for filtering.

Fix: Add arguments and pass them to the store query.


1.15 Missing --profile Flag on Several Commands

Current: build, runs list, runs show, logs, and config lack --profile. Expected: All commands that could benefit from profile context should accept it.

Fix: Add --profile to runs, logs, and build subparsers where API URL or local dir resolution applies.
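
To avoid repeating the argument on each subparser, argparse's parents= mechanism can share one definition. The sketch uses a flat, reduced set of subcommands and a hypothetical help string.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # add_help=False is required on a parent parser to avoid a duplicate -h.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--profile", default=None,
                        help="Named configuration profile to use")

    parser = argparse.ArgumentParser(prog="dagy")
    subparsers = parser.add_subparsers(dest="command")
    for name in ("build", "logs", "runs"):
        subparsers.add_parser(name, parents=[common])
    return parser
```

Each subcommand then accepts --profile after its own name, for example dagy logs --profile staging.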


2. Inconsistencies

| Issue | Location | Severity |
| --- | --- | --- |
| Import ordering mixes stdlib, third-party, and local without consistent grouping | cli/main.py lines 1-36 | Low |
| _select_option falls through without return when stdin is non-tty and no valid selection is made (infinite loop) | cli/main.py line 206 | Medium |
| _resolve_app_url can never return None (returns _DEFAULT_APP_URL as fallback) but cmd_login checks for None | cli/main.py lines 117-148 | Low |
| cmd_run decides local vs. remote based solely on api_url being set, not on the flow target format | cli/main.py line 98 | Medium |
| deploy requires --flow-name and --flow-version even though this info is inside the artifact zip | cli/main.py line 498-499 | Medium |
| runs list and runs show don't use tabulate while flows list does | cli/main.py lines 167-190 vs 292-324 | Low |
| No --profile on runs and logs commands | cli/main.py | Low |
| OAuth callback handler uses class-level mutable state (thread-unsafe for concurrent tests) | cli/oauth.py lines 27-28 | Medium |
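
For the _select_option entry, one way to guarantee termination is to refuse to prompt on a non-tty stdin and to bound the retries. The signature below is hypothetical and does not mirror the existing function.

```python
import sys
from typing import Callable, Optional, Sequence


def select_option(
    options: Sequence[str],
    read: Callable[[str], str] = input,
    is_tty: Optional[bool] = None,
    max_attempts: int = 3,
) -> str:
    """Prompt for a 1-based choice; raise instead of looping forever."""
    if is_tty is None:
        is_tty = sys.stdin.isatty()
    if not is_tty:
        # Nobody can answer a prompt, so fail fast with a clear error.
        raise RuntimeError("cannot prompt for a selection: stdin is not a terminal")

    for _ in range(max_attempts):
        raw = read(f"Select an option [1-{len(options)}]: ").strip()
        try:
            index = int(raw)
        except ValueError:
            continue
        if 1 <= index <= len(options):
            return options[index - 1]
    raise RuntimeError(f"no valid selection after {max_attempts} attempts")
```

The caller can map the RuntimeError to a usage-error exit status.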

3. Prioritized Refactoring Roadmap

Phase 1: Critical (Week 1)

  1. Add --version flag: 15 min, high visibility
  2. Fix exit codes: Define ExitCode enum, wrap main() in structured error handler
  3. Fix cmd_logout: Call DagyClient.logout() before deleting local creds
  4. Fix cmd_run remote output: Parse response, display run ID, handle errors
  5. Add subcommand help defaults: runs and flows show their own help

Phase 2: High Value (Week 2)

  1. Add --format json output: Enable CI/CD scripting
  2. Add --verbose / --quiet: Session-level log control
  3. Add whoami command: Identity verification
  4. Improve cmd_login: Decode JWT for email/expiry, display on success
  5. Add token expiry check: Prevent silent auth failures

Phase 3: Polish (Week 3)

  1. Add --dry-run to deploy: Safety for production deployments
  2. Add filtering to runs list: --limit, --status, --flow
  3. Make LocalMetadataStore a context manager: Guaranteed cleanup
  4. Auto-detect flow-name/flow-version from artifact: Reduce required flags
  5. Unify table formatting: Use tabulate consistently across all list/show commands

Phase 4: Hardening (Week 4)

  1. Fix _select_option infinite loop: Add max retry or graceful exit
  2. Fix import ordering: Apply isort rules consistently
  3. Dynamic user-agent: Read version from package metadata
  4. Add --profile to all applicable commands
  5. Thread-safe OAuth handler: Use instance state instead of class-level

4. Testing Requirements

Each refactoring item above should include:

  • Unit tests covering the happy path and at least one error case
  • Integration test updates for any changed CLI argument signatures
  • Snapshot tests for any new --format json output schemas

Target: maintain the existing 90% coverage floor defined in pyproject.toml.


5. Migration & Backward Compatibility

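The changes in this plan are intended to be additive: no existing flags, commands, or behaviors are removed, and existing scripts that call dagy build, dagy deploy, dagy run, etc. should continue to work. New flags (--version, --format, --verbose, --quiet, --dry-run) are optional with sensible defaults. Two items do change observable behavior: structured exit codes (1.3) replace the blanket exit status 1, and dagy run (1.5) prints the run ID instead of the raw response dict. Scripts that depend on either should be reviewed.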