Flow Builder User Guide
Welcome to the Dagy Flow Builder – a powerful, intuitive visual interface for creating data pipelines without writing code. This guide will walk you through everything you need to know to build, deploy, and manage your data workflows.
Table of Contents
- Overview
- Interface Layout
- The Task Node
- Connectors & Typed Data Flow
- Building Your First DAG
- Edge Creation & Dependencies
- Task Configuration Panel
- Flow Configuration Panel
- Validation
- Custom Nodes
- Saving & Loading
- Deploying from the Builder
- Editing Existing Flows
- Keyboard Shortcuts
- Best Practices
- Troubleshooting
Overview
What is the Flow Builder?
The Flow Builder is a visual, drag-and-drop interface for creating Directed Acyclic Graphs (DAGs) – data pipelines that process information through a series of interconnected tasks. Built with React Flow, it provides an intuitive canvas where you can design complex workflows without writing a single line of code.
Who is it for?
- Data Analysts who want to build pipelines without learning Python
- Business Users who need to orchestrate data workflows
- Data Engineers who want a faster way to prototype pipelines
- Teams that need to collaborate on pipeline design
When to Use the Visual Builder vs Code SDK
Use the Flow Builder when you:
- Want a quick visual overview of your pipeline
- Are building relatively simple to moderately complex workflows
- Need to collaborate with non-technical team members
- Want to iterate rapidly without coding
- Prefer drag-and-drop simplicity
Use the Code SDK when you:
- Need fine-grained control over task behavior
- Are building highly complex, conditional workflows
- Have custom logic that doesn't fit pre-built nodes
- Want version control and code review for your pipelines
- Are integrating with existing Python codebases
Accessing the Builder
You can access the Flow Builder in two ways:
- From the Dashboard: Click the "New Flow" button in the left sidebar (usually at the top or under a "Create" section)
- Direct URL: Navigate directly to /flow-builder in your Dagy instance
Interface Layout
The Flow Builder is organized into four main areas, designed to make pipeline creation intuitive and efficient.
Left Sidebar: Node Panel
The left sidebar is a compact 180px strip containing the node panel. Features include:
- Task Node: A single draggable Task node that serves as the universal node type
- Drag Interface: Click and drag the Task node onto the canvas to add it to your workflow
- Simplicity: With a single node type, you focus on configuration rather than selection
To add a task to your pipeline, drag the Task node from the node panel onto the canvas, then configure it with a Python import path that defines what it does.
Center: Canvas
The canvas is the main workspace where you build your DAG:
- Grid Background: Helps with alignment and visual organization
- Nodes: Appear as rectangular blocks with connector handles
- Connector Handles: Small colored circles on node edges. Hover over any handle to see its name, accepted data types, and whether it's required
- Edges: Lines connecting specific connectors between nodes that show typed data flow
- Minimap: Small preview in the corner showing your entire workflow
- Zoom Controls: Buttons (+/-) or scroll wheel to zoom in/out
- Pan: Click and drag on empty canvas to move around your workflow
Right Sidebar: Task Configuration Panel
When you click on a node, the right panel opens with two tabs:
Config Tab - Core task configuration:
- Task Name: Editable field to identify the task (must be unique within the DAG)
- Description: Optional explanation of what the node does
- Import Path: The Python module and function to execute (format: module:function)
- Retries: Number of retry attempts (0-10, default: 0)
- Retry Delay: Seconds to wait between retry attempts
- Timeout: Maximum execution time in seconds
- Concurrency Limit: How many instances can run in parallel
- Action Buttons: Duplicate or delete the node
Code Tab - Generated Python code:
- Shows the auto-generated @task decorator and function signature with proper arguments
- Displays Python code with syntax highlighting
- Read-only view of what will be executed
- Helps visualize the task in code form
The panel closes when you click elsewhere on the canvas or on another node.
Top Toolbar
The toolbar at the top provides quick access to workflow management:
- Flow Name Display: Shows the current flow name with an "Editing" badge when editing an existing flow
- Settings Button (Settings2 icon): Opens the Flow Configuration Panel
- Save Button: Manually save your current draft (or press Ctrl+S)
- Deploy Button: Prepare your DAG for production
- Undo/Redo Buttons: Navigate through your editing history
- Validation Button: Check for issues before saving/deploying
The Task Node
The Flow Builder uses a single, universal Task node type. Instead of choosing from 28+ specialized node types, you drag a Task node onto the canvas and configure it with a Python import path that defines what it does.
Why a Single Node Type?
- Simplicity: No need to search through dozens of node categories
- Flexibility: Any Python function can be executed, from data ingestion to transformations to notifications
- Consistency: All tasks are configured the same way
- Extensibility: Add new capabilities without modifying the builder
How Task Nodes Work
- Drag a Task node from the left sidebar onto the canvas
- Configure it with a name and import path (e.g., s3_operations:read_csv)
- Connect it to other tasks via edges
- View the generated Python code in the Code tab
The import path determines what your task does:
- s3_operations:read_csv – reads CSV from S3
- data_transforms:filter_active – filters active records
- notifications:send_slack_message – sends Slack notifications
- ml_models:predict – runs ML inference
- Any custom Python function you write
Task Node Anatomy
A Task node on the canvas displays:
- Task Name (center): The identifier for this task
- Top Connector (Inbound): Input data from upstream tasks
- Bottom Connector (Outbound): Output data to downstream tasks
- Configuration Indicator: Shows if the task is fully configured
Connectors & Typed Data Flow
Every node in the Flow Builder has typed connectors — small handles on the top (inputs) and bottom (outputs) of each node. Connectors carry type information that determines what connections are valid.
How Connectors Work
Each connector has a name, description, data types, cardinality, and required flag. When you hover over any connector handle, a tooltip shows this information. The Flow Builder only allows connections between compatible connectors.
Data Types
The system supports 15 data types. When you draw an edge, the source connector's data types must be compatible with the target connector's types:
- any — Universal wildcard, connects to everything
- string — Text data (also compatible with json and document targets)
- number — Numeric data
- boolean — True/false values
- json — Structured JSON objects
- dataframe — Tabular data (also compatible with json and list targets)
- list — Arrays/sequences (also compatible with json targets)
- embedding — Vector embeddings (also compatible with list targets)
- image, audio, document — Media types
- event — Event payloads (also compatible with json targets)
- trigger — Execution signals with no data payload
- error — Error information for error handling channels
Connection Validation
When you drag an edge from one node to another, the system validates the connection in real time:
- Direction check: You must connect an outbound connector (bottom) to an inbound connector (top)
- Type compatibility: At least one source data type must match or be coercible to a target data type
- Cardinality check: The connector hasn't exceeded its maximum number of connections
If a connection is invalid, the edge is silently rejected. Check the browser console for specific rejection reasons.
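The three checks above can be sketched in Python. The compatibility table mirrors the data type list in this section. The Connector class, its field names, the max_connections representation of cardinality, and the choice to treat any as a wildcard on either side are assumptions made for illustration, not the builder's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Extra target types each source type may connect to, per the list above.
COERCIBLE = {
    "string": {"json", "document"},
    "dataframe": {"json", "list"},
    "list": {"json"},
    "embedding": {"list"},
    "event": {"json"},
}


@dataclass
class Connector:
    """Hypothetical connector model; field names are illustrative."""
    name: str
    direction: str  # "inbound" or "outbound"
    data_types: frozenset
    max_connections: Optional[int] = None  # None means unlimited
    current_connections: int = 0


def types_compatible(source_type: str, target_type: str) -> bool:
    """True if the types match, one side is 'any', or a coercion exists."""
    if "any" in (source_type, target_type):
        return True
    if source_type == target_type:
        return True
    return target_type in COERCIBLE.get(source_type, set())


def check_connection(source: Connector, target: Connector) -> Optional[str]:
    """Return a rejection reason, or None if the connection is allowed."""
    if source.direction != "outbound" or target.direction != "inbound":
        return "direction: must connect outbound to inbound"
    if not any(types_compatible(s, t)
               for s in source.data_types for t in target.data_types):
        return "type: no compatible data types"
    for connector in (source, target):
        limit = connector.max_connections
        if limit is not None and connector.current_connections >= limit:
            return f"cardinality: {connector.name} is full"
    return None
```

For example, a dataframe output can feed a list input, while a number output cannot.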
Node Configuration Properties
Every task, regardless of its purpose, is configured with these properties:
| Property | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique identifier for the task within the DAG |
| description | string | No | Human-readable explanation of the task's purpose |
| import_path | string | Yes | Python module and function (format: module:function) |
| retries | integer | No | Number of retry attempts (0-10, default: 0) |
| retry_delay | integer | No | Seconds to wait between retry attempts |
| timeout | integer | No | Maximum execution time in seconds |
| concurrency_limit | integer | No | Maximum parallel instances of this task |
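As a rough illustration, the properties in the table could be modeled as a small Python dataclass. The class name TaskConfig and the problems() helper are hypothetical; only the field names, the required flags, and the 0-10 retry range come from the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaskConfig:
    """Illustrative container for the properties in the table above."""
    name: str
    import_path: str
    description: str = ""
    retries: int = 0
    retry_delay: int = 0
    timeout: Optional[int] = None
    concurrency_limit: Optional[int] = None

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means valid."""
        found = []
        if not self.name.strip():
            found.append("name is required")
        module, sep, function = self.import_path.partition(":")
        if not (module and sep and function):
            found.append("import_path must use the module:function format")
        if not 0 <= self.retries <= 10:
            found.append("retries must be between 0 and 10")
        return found


config = TaskConfig(name="Load Customer Data", import_path="s3_operations:read_csv")
print(config.problems())  # prints []
```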
Building Your First DAG
Let's walk through creating a simple data pipeline step-by-step.
Step 1: Access the Builder
Navigate to the Flow Builder by clicking "New Flow" in the sidebar or visiting /flow-builder.
Step 2: Add Your First Task Node
- Look at the left sidebar – you'll see the Task node in the node panel
- Drag the Task node onto the center canvas
- Release to drop the node
You'll see a rectangular node appear on the canvas with:
- A node title (default: "Task")
- A connector on the top (input)
- A connector on the bottom (output)
Step 3: Add More Task Nodes
Let's build a three-task pipeline:
- Drag another Task node onto the canvas to the right of the first one
- Drag a third Task node to the right of the second one
You now have three nodes ready to be configured and connected.
Step 4: Configure Each Task
Start with the first task:
- Click on the first Task node – the right panel will open showing its configuration
- Change the Name from "Task" to something descriptive like "Load Customer Data"
- Enter a Description: "Reads customer CSV from S3 bucket"
- Set Import Path: s3_operations:read_csv
- Click the Code tab to see the generated Python code
- Click elsewhere to close the configuration panel
Now configure the second task:
- Click on the second Task node
- Set Name: "Filter Active Customers"
- Set Description: "Keeps only customers with active status"
- Set Import Path: data_transforms:filter_active
- Click elsewhere to save
Finally, configure the third task:
- Click on the third Task node
- Set Name: "Notify Team"
- Set Description: "Send summary to #data-ops channel"
- Set Import Path: notifications:send_slack_message
- Click elsewhere to save
Step 5: Connect the Tasks
Now connect your nodes to show data flow:
- Hover over the bottom (output) of the "Load Customer Data" node – you'll see the connector highlight
- Click and drag from this connector to the top (input) of the "Filter Active Customers" node
- An edge (line) appears connecting the two nodes
- Repeat: Connect "Filter Active Customers" to "Notify Team"
Your pipeline is now complete: Load Customer Data → Filter Active Customers → Notify Team
Step 6: Review Your DAG
- Check the canvas – you should see three connected nodes
- Use the minimap in the corner to see the overall structure
- Use scroll or zoom controls to get a better view
Step 7: Save Your Work
Click the Save button in the top toolbar (or press Ctrl+S). Your draft is now stored and you can return to it later.
Edge Creation & Dependencies
Edges are the connections between nodes that represent both data flow and execution order.
Creating Edges
Method 1: Drag from Connector Handle
- Hover over a connector handle (bottom of the source node for outputs, top of the target node for inputs)
- A tooltip appears showing the connector name and accepted data types
- Click and drag from the source connector to the target connector
- Release to create the edge — the system validates type compatibility automatically
- If the types are incompatible, the connection is rejected silently
Method 2: Smart Connection
- If a node has only one output connector and the target has only one input connector, React Flow will snap to the correct handles automatically
- For nodes with multiple connectors, aim for the specific handle you want
Understanding Data Flow
- Direction: Edges flow from outbound connectors (bottom) to inbound connectors (top)
- Typed connections: Each edge carries data between specific typed connectors
- Execution Order: Target nodes don't execute until their source nodes complete
- Animated Edges: Edges animate to show direction of data flow
- Multiple Inputs: An inbound connector can accept edges from multiple sources (depending on cardinality)
- Multiple Outputs: An outbound connector can send edges to multiple targets
Deleting Edges
- Click on an edge (the line between nodes) to select it
- Press Delete or Backspace to remove it
- The nodes remain; only the connection is deleted
Creating Complex Pipelines
Edges enable sophisticated workflows:
- Fan-out: One node connects to multiple downstream nodes (parallel processing)
- Fan-in: Multiple nodes connect to one downstream node (data merging)
- Sequential: Nodes connect in a line (serial processing)
- Diamond Pattern: Node A → B and C, then B → D and C → D (split, then merge)
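To see how these patterns affect execution order, here is a small sketch of the diamond pattern using Python's standard graphlib module (Python 3.9+). It only illustrates the dependency rule described above (a node waits for all of its sources); it is not Dagy's scheduler, and the execution_waves helper is a hypothetical name.

```python
from graphlib import TopologicalSorter

# Diamond pattern from the list above: each key maps to its upstream nodes.
upstream = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}


def execution_waves(graph):
    """Group nodes into waves; nodes in one wave have no unmet dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(upstream)
print(waves)  # prints [['A'], ['B', 'C'], ['D']]
```

B and C land in the same wave, which is why a fan-out can run in parallel, while D waits for both.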
Task Configuration Panel
The right sidebar panel provides detailed control over each task's behavior. Understanding these options will help you build robust pipelines.
Config Tab: Basic Identification
Name (Required)
- Must be unique within your DAG
- Should be descriptive (e.g., "Load User Data" not "Task 1")
- Used in logs, alerts, and execution reports
- Changed by editing the text field in the configuration panel
Description (Optional)
- Free-form text explaining the task's purpose
- Useful for team collaboration
- Appears in tooltips when hovering over nodes
- Helps future maintainers understand your pipeline
Config Tab: Import Path (Required)
The Import Path tells Dagy which Python function to execute for your task.
Format: module_name:function_name
Examples:
- customer_data:load_from_s3 – calls load_from_s3() from the customer_data module
- transformations:clean_text – calls clean_text() from the transformations module
- ml_models:predict – calls predict() from the ml_models module
The function must:
- Be defined in a Python module your Dagy instance can access
- Accept parameters passed by the DAG
- Return data for downstream nodes
- Be decorated with @task if using the Python SDK
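The module:function format can be resolved with Python's standard importlib. The sketch below is one plausible way to do it, not necessarily how Dagy loads tasks; resolve_import_path is a hypothetical helper name, and the demo uses json:dumps only because it exists in every Python installation.

```python
import importlib
from typing import Callable


def resolve_import_path(import_path: str) -> Callable:
    """Load the callable named by a module:function string."""
    module_name, sep, function_name = import_path.partition(":")
    if not (module_name and sep and function_name):
        raise ValueError(f"expected module:function, got {import_path!r}")
    module = importlib.import_module(module_name)
    target = getattr(module, function_name)
    if not callable(target):
        raise TypeError(f"{import_path!r} is not callable")
    return target


# Demonstrated with a standard-library function so the sketch runs anywhere.
dumps = resolve_import_path("json:dumps")
print(dumps({"status": "active"}))  # prints {"status": "active"}
```

A missing module raises ModuleNotFoundError and a missing function raises AttributeError, which corresponds to the kind of failure an invalid import path would cause at run time.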
Config Tab: Retry Settings
Retries (Default: 0)
- Number of times to re-run a failed task
- Range: 0-10 attempts
- Useful for flaky network operations or transient errors
- Example: Set to 3 for API calls that occasionally time out
Retry Delay (Default: 0)
- Seconds to wait between retry attempts
- Example: Set to 5 for API calls with rate limiting
- Allows external systems time to recover
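The interaction between Retries and Retry Delay can be sketched as a simple loop. This assumes that "retries" counts additional attempts after the first failure, which the guide does not state explicitly; run_with_retries and flaky_api_call are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(func: Callable[[], T], retries: int = 0,
                     retry_delay: float = 0) -> T:
    """Call func once, then up to `retries` more times if it raises."""
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            time.sleep(retry_delay)


calls = {"count": 0}


def flaky_api_call() -> str:
    """Fails twice, then succeeds, to imitate a transient error."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = run_with_retries(flaky_api_call, retries=3, retry_delay=0.01)
print(result, calls["count"])  # prints ok 3
```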
Config Tab: Timeout Settings
Timeout (No default)
- Maximum execution time in seconds
- Task fails if it doesn't complete within this window
- Example: Set to 300 for a 5-minute maximum
- Prevents hanging tasks from blocking your pipeline
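A timeout can be approximated in plain Python by waiting on a future for a bounded time. This is only a sketch of the idea: how Dagy enforces the limit on each executor is not described here, and run_with_timeout is a hypothetical helper. Note that this version stops waiting for the task but cannot stop the worker thread itself.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_timeout(func, timeout):
    """Return func() or raise TimeoutError after `timeout` seconds.

    timeout=None waits indefinitely, matching the "no default" setting.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(func)
        try:
            return future.result(timeout=timeout)
        except FutureTimeout:
            raise TimeoutError(f"task exceeded {timeout}s") from None
    finally:
        # Do not block on the worker; a thread cannot be force-killed in Python.
        pool.shutdown(wait=False)


def slow_task():
    """Stand-in for real work that takes 0.2 seconds."""
    time.sleep(0.2)
    return "done"


print(run_with_timeout(slow_task, timeout=1))  # prints done
```

Calling run_with_timeout(slow_task, timeout=0.01) raises TimeoutError instead of returning.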
Config Tab: Concurrency Control
Concurrency Limit (No default)
- Maximum number of parallel instances of this task
- Example: Set to 1 to ensure sequential execution
- Set to 5 to allow up to 5 simultaneous runs
- Useful when tasks have resource constraints
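Conceptually, a concurrency limit behaves like a semaphore: only a fixed number of instances may hold a slot at once. The sketch below uses Python threads to show the effect; the limit of 2 and the names limited_task and state are illustrative, and Dagy's own enforcement mechanism may differ.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

CONCURRENCY_LIMIT = 2  # illustrative value
slots = threading.BoundedSemaphore(CONCURRENCY_LIMIT)
lock = threading.Lock()
state = {"running": 0, "peak": 0}


def limited_task(item: int) -> int:
    """Do some work while holding one of the limited slots."""
    with slots:
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.02)  # stand-in for real work
        with lock:
            state["running"] -= 1
    return item * 2


# Eight workers compete, but at most CONCURRENCY_LIMIT run the body at once.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(limited_task, range(8)))

print(results, state["peak"])
```

The recorded peak never exceeds the limit, even though eight workers are available.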
Config Tab: Action Buttons
Duplicate Button
- Creates an exact copy of the current node
- Copy appears on the canvas with "_copy" suffix
- Useful for similar tasks in parallel branches
- Also available via Ctrl+D
Delete Button
- Removes the node and all its connected edges
- Can be reversed only with the Undo button
- Useful for removing experimental nodes
Code Tab: Generated Python Code
The Code tab displays the auto-generated Python code for your task:
```python
@task(
    name="Load Customer Data",
    retries=0,
    timeout=None,
    concurrency_limit=None
)
def load_customer_data():
    """Reads customer CSV from S3 bucket"""
    # Import and call the function from your module
    from s3_operations import read_csv
    return read_csv()
```
This read-only view shows:
- The @task decorator with your configuration
- The function name (derived from your task name)
- Configuration parameters (retries, timeout, concurrency_limit)
- The import statement and function call
- Helpful for understanding what will be executed
The Code tab is automatically updated when you change your task configuration in the Config tab.
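The guide shows "Load Customer Data" becoming load_customer_data but does not spell out the exact naming rule. A plausible derivation, offered as an assumption rather than the builder's documented behavior, looks like this (task_name_to_function_name is a hypothetical helper):

```python
import keyword
import re


def task_name_to_function_name(task_name: str) -> str:
    """Turn a display name into a valid Python identifier."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^0-9a-zA-Z]+", "_", task_name).strip("_").lower()
    if not slug:
        slug = "task"
    # Identifiers cannot start with a digit or collide with a keyword.
    if slug[0].isdigit() or keyword.iskeyword(slug):
        slug = f"task_{slug}"
    return slug


print(task_name_to_function_name("Load Customer Data"))  # prints load_customer_data
```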
Flow Configuration Panel
The Flow Configuration Panel contains flow-level settings that apply to the entire pipeline. Access it by clicking the Settings2 icon (gear icon) in the toolbar.
Flow Name
- Required for deployment
- The identifier for your entire pipeline
- Used in logs, dashboards, and execution history
- Appears in the toolbar when editing
Flow Version
- Semantic versioning (e.g., 1.0.0, 1.1.0, 2.0.0)
- Auto-suggested when deploying edited flows
- Helps track pipeline evolution
- Appears in deployment history
Flow Description
- Optional explanation of what this flow does
- Useful for team collaboration
- Helps future maintainers understand the pipeline's purpose
Executor Selection
Choose where tasks in this flow execute:
- Lambda: Fast, serverless execution. Good for short tasks (< 5 min), low memory (< 512MB), stateless operations
- Step Functions: AWS-native orchestration. Good for AWS-integrated workflows and state machines
- ECS: Containerized execution. Best for long-running tasks (> 15 min), memory-intensive operations, or custom Docker images
This setting is the default for all tasks; individual tasks can override this if needed.
Environment Selector
- Select the execution environment (e.g., development, staging, production)
- Determines which database, API keys, and configuration your tasks access
- Helpful for testing flows before production deployment
Summary Badges
Quick visual indicators showing:
- Node Count: Number of tasks in the flow
- Status: Draft, Published, or Editing
- Last Modified: Timestamp of the last change
Validation
Before saving or deploying your DAG, the builder validates it for common issues. Understanding validation helps you catch problems early.
Validation Checks
The builder performs these checks:
1. Cycle Detection (Kahn's Algorithm)
- What it checks: Ensures your DAG has no circular dependencies
- Why it matters: A cycle would cause infinite loops
- Example error: "Task A → B → A" creates a cycle
- How to fix: Review your connections and remove the circular edge
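For readers curious about the algorithm named above, here is a minimal Python sketch of cycle detection with Kahn's algorithm. It repeatedly removes nodes that have no remaining incoming edges; anything left over is part of a cycle or sits downstream of one. The function name and the (source, target) edge format are assumptions for illustration.

```python
from collections import deque


def find_cycle_nodes(nodes, edges):
    """Return the nodes that Kahn's algorithm cannot order.

    An empty result means the graph is acyclic. Nodes downstream of a
    cycle are also returned, because they can never become ready.
    """
    indegree = {node: 0 for node in nodes}
    downstream = {node: [] for node in nodes}
    for source, target in edges:
        downstream[source].append(target)
        indegree[target] += 1

    # Start with every node that has no incoming edges.
    queue = deque(node for node in nodes if indegree[node] == 0)
    ordered = []
    while queue:
        node = queue.popleft()
        ordered.append(node)
        for nxt in downstream[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    processed = set(ordered)
    return [node for node in nodes if node not in processed]


print(find_cycle_nodes(["A", "B", "C"], [("A", "B"), ("B", "C")]))  # prints []
print(find_cycle_nodes(["A", "B"], [("A", "B"), ("B", "A")]))  # prints ['A', 'B']
```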
2. Required Fields
- What it checks: Every task must have a name and import_path
- Why it matters: These fields are essential for execution
- Example error: "Node 'Task 1': name is required"
- How to fix: Click the node and fill in the missing fields
3. Orphan Detection
- What it checks: Identifies nodes with no incoming or outgoing edges
- Why it matters: Orphaned nodes never execute
- Example warning: "Node 'Old Task' has no connections"
- How to fix: Either connect the node or delete it
4. Duplicate Labels
- What it checks: No two nodes can have the same name
- Why it matters: Names are unique identifiers for execution tracking
- Example error: "Duplicate node name: 'Process Data' appears 2 times"
- How to fix: Rename one of the duplicate nodes
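Checks 2 through 4 are simple enough to sketch in a few lines of Python. The node dictionary shape (id, name, import_path keys) and the basic_checks helper are assumptions for illustration; the message wording follows the examples above, and orphans are reported as warnings rather than errors.

```python
from collections import Counter


def basic_checks(nodes, edges):
    """Return (errors, warnings) for checks 2 through 4.

    nodes: list of dicts with "id", "name", and "import_path" keys.
    edges: list of (source_id, target_id) pairs.
    """
    errors, warnings = [], []

    # Check 2: required fields.
    for node in nodes:
        for field in ("name", "import_path"):
            if not node.get(field):
                errors.append(f"Node '{node['id']}': {field} is required")

    # Check 3: orphans are nodes that appear in no edge at all.
    connected = {node_id for edge in edges for node_id in edge}
    for node in nodes:
        if node["id"] not in connected:
            label = node.get("name") or node["id"]
            warnings.append(f"Node '{label}' has no connections")

    # Check 4: duplicate names.
    counts = Counter(node["name"] for node in nodes if node.get("name"))
    for name, count in counts.items():
        if count > 1:
            errors.append(f"Duplicate node name: '{name}' appears {count} times")

    return errors, warnings


nodes = [
    {"id": "n1", "name": "Process Data", "import_path": "etl:run"},
    {"id": "n2", "name": "Process Data", "import_path": "etl:run"},
    {"id": "n3", "name": "Old Task", "import_path": ""},
]
errors, warnings = basic_checks(nodes, [("n1", "n2")])
print(errors)
print(warnings)
```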
5. Import Path Validation
- What it checks: Attempts to verify the module:function exists
- Why it matters: Invalid paths cause runtime execution failures
- Example error: "Cannot resolve import_path 'nonexistent:function'"
- How to fix: Verify the module name and function exist in your codebase
6. Connection Type Compatibility
- What it checks: Source connector data types must be compatible with target connector types
- Why it matters: Type mismatches cause runtime data errors
- How it works: The builder prevents invalid connections during edge creation. Existing edges are re-validated when you run validation.
Running Validation
Automatic Validation
- Runs before every save and deploy
- Displays issues in a panel below the canvas
- Does not prevent saving a draft (issues are reported, but the save still proceeds)
Manual Validation
- Click the "Validation" button in the top toolbar
- Opens a detailed validation report
- Shows all errors, warnings, and successful checks
Reading Validation Output
Validation results appear in a panel with:
Errors (Red)
- Must be fixed before deployment
- Examples: cycles, missing required fields, duplicates
Warnings (Yellow)
- Do not prevent deployment but should be reviewed
- Examples: orphaned nodes, unusual configurations
Info (Blue)
- Informational messages about your DAG
- Examples: "DAG is valid", "Total nodes: 5"
Invalid Nodes Highlighting
When validation finds issues:
- Error nodes appear with a red border
- Error edges appear in red
- Affected nodes are highlighted on the canvas
- Click a highlighted node to see details in the configuration panel
Custom Nodes
Beyond the built-in Task node, your organization can create custom node types that appear in the Flow Builder. Custom nodes extend functionality for domain-specific use cases.
What Are Custom Nodes?
Custom nodes are specialized processing steps built by your team. They extend the FlowNode base class in Python, defining their own connectors, configuration schema, and execution logic. Once registered, they appear in the sidebar and can be used just like built-in nodes — drag them onto the canvas, configure them, and connect them to other nodes.
How to Create Custom Nodes
Creating a custom node involves writing a Python class that extends FlowNode and implements three methods: metadata() (identity), connectors() (typed ports), and execute() (logic). The framework auto-registers your class when Python imports the module.
For the full step-by-step tutorial, see the Creating Custom Nodes guide.
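To give a feel for the shape of such a class, here is a self-contained sketch. The FlowNode class below is a stand-in defined only so the example runs on its own; the real base class, its method signatures, and the structure of the returned metadata and connector values may differ, and auto-registration is not shown. DeduplicateRows is a hypothetical node.

```python
from abc import ABC, abstractmethod


class FlowNode(ABC):
    """Stand-in for the real FlowNode base class (signatures are assumed)."""

    @abstractmethod
    def metadata(self) -> dict: ...

    @abstractmethod
    def connectors(self) -> list: ...

    @abstractmethod
    def execute(self, inputs: dict, config: dict) -> dict: ...


class DeduplicateRows(FlowNode):
    """Hypothetical custom node that removes repeated items from a list."""

    def metadata(self) -> dict:
        # Identity: how the node is named in the sidebar (assumed keys).
        return {"node_type": "deduplicate_rows", "label": "Deduplicate Rows"}

    def connectors(self) -> list:
        # Typed ports: one required list input, one list output.
        return [
            {"name": "rows", "direction": "inbound",
             "data_types": ["list"], "required": True},
            {"name": "unique_rows", "direction": "outbound",
             "data_types": ["list"]},
        ]

    def execute(self, inputs: dict, config: dict) -> dict:
        # dict.fromkeys keeps the first occurrence of each item, in order.
        return {"unique_rows": list(dict.fromkeys(inputs["rows"]))}


node = DeduplicateRows()
print(node.execute({"rows": [3, 1, 3, 2, 1]}, config={}))  # prints {'unique_rows': [3, 1, 2]}
```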
How Custom Nodes Appear in the Builder
Custom nodes registered via the API or Python SDK are fetched automatically when you open the Flow Builder. They appear in the sidebar alongside the Task node. You can drag and configure them just like any other node.
Managing Custom Nodes via API
Your organization can register, update, and delete custom nodes through the Node Registry API:
- POST /nodes/registry: Register a new custom node
- PUT /nodes/registry/{node_type}: Update an existing custom node
- DELETE /nodes/registry/{node_type}: Remove a custom node
See the API Endpoints reference for full details.
Saving & Loading
The builder saves your work automatically and also lets you save manually, so you can iterate safely and return to previous drafts.
Auto-Save
How it works:
- Every 30 seconds, the builder saves your canvas state automatically
- Saves as a draft record in the database
- Happens silently in the background
- You'll see a brief "Saving..." indicator
What gets saved:
- Node positions and configurations
- Edge connections
- Canvas zoom level and pan position
- All parameter values
Scope:
- Drafts are organization-scoped
- Only users in your organization can access your drafts
- Drafts are separate from deployed flows
Manual Save
Ctrl+S Keyboard Shortcut
- Press Ctrl+S (or Cmd+S on Mac)
- Your draft saves immediately
- You'll see a confirmation message
Save Button
- Click the "Save" button in the top toolbar
- Your draft saves immediately
- You'll see a confirmation message
Loading Drafts
From the Dashboard:
- Navigate to the Flows or DAGs section
- Look for "Drafts" or "Recent Drafts"
- Click on a draft to open it in the builder
- The canvas loads with your previous configuration
Draft Information:
- Name and description
- Last modified timestamp
- Org-scoped access
- Status indicator (draft/deployed)
Draft Storage
Storage Location:
- Stored as draft records in the database
- Format: JSON
- Indexed by organization and user
Retention:
- Drafts are retained indefinitely
- You can have multiple drafts simultaneously
- Archive old drafts when no longer needed
Deploying from the Builder
Once your DAG is complete and validated, deploying makes it a live, executable flow.
Step 1: Click Deploy
- Click the Deploy button in the top toolbar
- A deployment dialog opens with a form
Step 2: Fill Deployment Metadata
The deployment form is pre-populated from the Flow Configuration Panel when editing existing flows. For new flows, fill in:
Flow Name (Required)
- Descriptive name for your pipeline (e.g., "Customer ETL v1.0")
- Can include version numbers
- Used in logs and dashboards
Version (Auto-generated)
- Semantic versioning (e.g., 1.0.0)
- When editing, the form suggests the next version (e.g., 1.0.1)
- Helps track pipeline evolution
Description (Optional)
- Explain what this flow does
- Note any recent changes
- Helps team members understand the pipeline
Executor (Pre-populated from Flow Config)
- Choose where tasks execute:
- Lambda: Fast, serverless, good for short tasks
- Step Functions: AWS-native orchestration
- ECS: Containerized workloads, better for long-running tasks
- Overridable per-task if needed
Schedule (Optional)
Choose how often to run this flow:
- Manual: Only run when triggered explicitly
- One-time: Run a single time at a specified date/time
- Cron Expression: Advanced scheduling (e.g., "0 2 * * *" for 2 AM daily)
- Interval: Repeat every N minutes/hours/days
Example schedules:
- 0 0 * * * – Every day at midnight
- 0 */6 * * * – Every 6 hours
- 0 9 * * 1-5 – Weekdays at 9 AM
- */15 * * * * – Every 15 minutes
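If the five-field cron format is unfamiliar, the following Python sketch shows how the example expressions select minutes. It is a deliberately simplified matcher written for this guide, not the scheduler Dagy uses: it handles only numbers, *, ranges, steps, and comma lists, it does not accept month or weekday names or 7 for Sunday, and it requires every field to match (standard cron treats day-of-month and day-of-week as alternatives when both are restricted).

```python
from datetime import datetime


def field_matches(field: str, value: int, minimum: int) -> bool:
    """Check one cron field; supports *, */n, a-b, a-b/n, and comma lists."""
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            if (value - minimum) % step == 0:
                return True
        elif "-" in base:
            low, high = (int(x) for x in base.split("-"))
            if low <= value <= high and (value - low) % step == 0:
                return True
        elif int(base) == value:
            return True
    return False


def cron_matches(expression: str, when: datetime) -> bool:
    """Return True if `when` falls on a minute the expression selects."""
    minute, hour, day, month, weekday = expression.split()
    cron_weekday = (when.weekday() + 1) % 7  # cron counts Sunday as 0
    return (
        field_matches(minute, when.minute, 0)
        and field_matches(hour, when.hour, 0)
        and field_matches(day, when.day, 1)
        and field_matches(month, when.month, 1)
        and field_matches(weekday, cron_weekday, 0)
    )


monday_9am = datetime(2024, 1, 1, 9, 0)  # 1 January 2024 was a Monday
print(cron_matches("0 9 * * 1-5", monday_9am))   # prints True
print(cron_matches("0 */6 * * *", monday_9am))   # prints False
print(cron_matches("*/15 * * * *", monday_9am))  # prints True
```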
Tags (Optional)
- Add labels like "production", "customer-data", "v2"
- Help organize and filter flows
- Useful for cost allocation and team assignment
Step 3: Review and Confirm
- Review all metadata for accuracy
- Check that executor choice matches your workload
- Verify the schedule if setting one up
- Click "Deploy" button to proceed
Step 4: Deployment Process
When you click Deploy, the builder:
- Serializes your canvas into FlowSpec JSON format
- Validates the entire specification
- Posts to the Dagy API endpoint /flows
- Creates a Flow record in your organization
- Returns a flow ID and status
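The serialize-and-post steps can be pictured with a short Python sketch that builds, but does not send, the request. Only the /flows path comes from this guide. The base URL, the payload field names, and the absence of authentication headers are placeholders, not the documented FlowSpec schema or API contract.

```python
import json
from urllib.request import Request

# Hypothetical values: the base URL and payload field names are placeholders.
BASE_URL = "https://dagy.example.com"

flow_spec = {
    "name": "Customer ETL",
    "version": "1.0.0",
    "nodes": [
        {"id": "load", "import_path": "s3_operations:read_csv"},
        {"id": "filter", "import_path": "data_transforms:filter_active"},
    ],
    "edges": [{"source": "load", "target": "filter"}],
}


def build_deploy_request(spec: dict) -> Request:
    """Serialize the spec and prepare (but do not send) a POST to /flows."""
    body = json.dumps(spec).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/flows",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_deploy_request(flow_spec)
print(request.get_method(), request.full_url)  # prints POST https://dagy.example.com/flows
```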
Step 5: Post-Deployment
After successful deployment:
- You'll see a success message with the flow ID
- The flow appears in your Flows dashboard
- Execution begins according to your schedule
- You can monitor execution in the Flow Details page
Deployment Failures
If deployment fails:
- Error message explains the issue
- Common causes: Missing required fields, invalid executor, validation errors
- Solution: Review the error, correct the issue, try again
- Drafts are not affected by failed deployments
Editing Existing Flows
The builder allows you to import and modify previously deployed flows, maintaining version control and deployment history.
Edit Flow Workflow
Step 1: Load the Flow
- Navigate to the Flows dashboard
- Click on a deployed flow
- Click "Edit in Builder"
- The builder opens with that flow loaded
What gets pre-populated:
- All task configurations and connections
- Flow name, version, description
- Executor, environment, and other flow-level settings
- Canvas layout and zoom level
- The toolbar shows the flow name with an "Editing" badge
Step 2: Make Changes
- Modify task configurations
- Add or remove tasks
- Adjust connections
- Update flow-level settings via the Flow Config Panel
- Test via validation
Step 3: Version Management
- The deployment form automatically suggests the next version
- Example: 1.0.0 → 1.0.1 (patch) or 1.1.0 (minor)
- Document changes in the description field
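The version suggestion follows ordinary semantic versioning arithmetic, which can be sketched as follows. bump_version is a hypothetical helper, and whether the builder suggests a patch or a minor bump by default is not specified here.

```python
def bump_version(version: str, part: str = "patch") -> str:
    """Return the next semantic version for a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("1.0.0"))           # prints 1.0.1
print(bump_version("1.0.0", "minor"))  # prints 1.1.0
```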
Step 4: Re-Deploy
- Click "Deploy" button
- The deployment form is pre-populated with current flow settings
- Update the version and description
- Choose whether to create a new version or update existing
- Execute the updated flow
Auto-Layout When Importing
When importing a FlowSpec JSON, the builder:
- Analyzes the DAG structure
- Calculates optimal node positions
- Applies hierarchical layout algorithm
- Preserves original node positions if included in JSON
- Spacing ensures readability with no node overlaps
Preserving Node Positions
When you export and re-import a FlowSpec JSON with position data:
```json
{
  "nodes": [
    {
      "id": "task1",
      "position": {"x": 100, "y": 200}
    }
  ]
}
```
The builder:
- Uses stored positions instead of auto-layout
- Respects your manual arrangement
- Maintains organization across imports/exports
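The combination of auto-layout and preserved positions can be sketched in Python. This is a simplified layered layout written for illustration, not the builder's actual algorithm: each node without a stored position is placed in a row equal to its longest distance from a root, and the spacing values are arbitrary. It does not check whether auto-placed nodes collide with manually placed ones.

```python
def layout(nodes, edges, x_gap=250, y_gap=120):
    """Return {node_id: {"x": ..., "y": ...}} for every node.

    nodes: list of dicts with "id" and an optional "position".
    edges: list of (source_id, target_id) pairs forming a DAG.
    """
    upstream = {node["id"]: [] for node in nodes}
    for source, target in edges:
        upstream[target].append(source)

    depth_cache = {}

    def depth(node_id):
        # Depth = longest path from any root, so a node sits below all inputs.
        if node_id not in depth_cache:
            parents = upstream[node_id]
            depth_cache[node_id] = 1 + max(map(depth, parents)) if parents else 0
        return depth_cache[node_id]

    positions, used_per_layer = {}, {}
    for node in nodes:
        if "position" in node:
            positions[node["id"]] = dict(node["position"])  # keep manual placement
            continue
        layer = depth(node["id"])
        column = used_per_layer.get(layer, 0)
        used_per_layer[layer] = column + 1
        positions[node["id"]] = {"x": column * x_gap, "y": layer * y_gap}
    return positions


nodes = [
    {"id": "task1", "position": {"x": 100, "y": 200}},
    {"id": "task2"},
    {"id": "task3"},
]
positions = layout(nodes, [("task1", "task2"), ("task1", "task3")])
print(positions)
```

Here task1 keeps its stored position, while task2 and task3 are placed side by side one row below the roots.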
Editing Drafts
You can also edit drafts that have not been deployed:
- Navigate to Drafts section
- Click on a draft
- Make modifications
- Save changes
- Deploy when ready
Keyboard Shortcuts
Keyboard shortcuts accelerate your workflow in the builder. Commit these to memory for faster development.
Navigation & View
| Shortcut | Action |
|---|---|
| Scroll Wheel | Zoom in/out on canvas |
| Ctrl + A / Cmd + A | Select all nodes and edges |
| Click + Drag (empty) | Pan canvas (move around) |
| Home | Fit entire DAG in view |
Editing
| Shortcut | Action |
|---|---|
| Ctrl + Z / Cmd + Z | Undo last action |
| Ctrl + Shift + Z / Cmd + Shift + Z | Redo last action |
| Delete / Backspace | Delete selected node(s) or edge(s) |
| Ctrl + D / Cmd + D | Duplicate selected node |
| Ctrl + X / Cmd + X | Cut selected node(s) |
| Ctrl + C / Cmd + C | Copy selected node(s) |
| Ctrl + V / Cmd + V | Paste copied node(s) |
Saving & Validation
| Shortcut | Action |
|---|---|
| Ctrl + S / Cmd + S | Save draft |
| Ctrl + Shift + S / Cmd + Shift + S | Save with name dialog |
| Ctrl + Enter / Cmd + Enter | Validate and show report |
Node Selection
| Shortcut | Action |
|---|---|
| Click | Select single node/edge |
| Shift + Click | Add to selection |
| Click + Drag | Select multiple nodes (box select) |
| Escape | Deselect all |
Best Practices
Following these best practices will make your DAGs more maintainable, reliable, and efficient.
Naming Conventions
Descriptive Task Names
- Use clear, action-oriented names
- Good: "Load Customer Data", "Filter Active Users", "Send Slack Alert"
- Avoid: "Task 1", "Process", "Step A"
Naming Consistency
- Use consistent patterns across your org
- Example: verb + object: "Load_Data", "Transform_Data", "Export_Data"
- Example: action + system: "FetchFrom_API", "WriteTo_Database"
Unique Names
- Every task must have a unique name within its DAG
- Names become identifiers in logs and monitoring
DAG Complexity
Keep DAGs Simple
- Recommended: fewer than 20 nodes per DAG
- Benefits: easier to understand, faster to execute, simpler debugging
- If exceeding 20 nodes, consider splitting into multiple flows
Logical Grouping
- Group related tasks together visually
- Use descriptive names to show relationships
- Organize left-to-right: ingestion → transform → export
Avoid Deep Nesting
- Deep chains (100+ sequential tasks) are hard to debug
- Break into separate flows with intermediate storage
- Use parallel branches instead of sequential when possible
Validation & Testing
Always Validate Before Deploy
- Click "Validation" button
- Fix all errors and review warnings
- Ensure no orphaned nodes or cycles
Test Locally First
- Export as FlowSpec JSON
- Review in the Code tab to see generated @task decorators
- Verify your import paths are correct
- Test with sample data similar to production
Version Semantically
- Use semantic versioning: MAJOR.MINOR.PATCH
- MAJOR: breaking changes
- MINOR: new features
- PATCH: bug fixes
- Example: 1.2.3 → 1.2.4 (patch), 1.3.0 (minor), 2.0.0 (major)
Configuration
Timeout Settings
- Always set timeouts for external API calls
- Default: None (unlimited)
- Recommended: API calls 30-60s, file operations 300s
- Prevents hung tasks from blocking pipeline
Retry Logic
- Use retries for flaky operations (APIs, network)
- Avoid retries for deterministic failures
- Set retry_delay for rate-limited APIs
- Recommended: 2-3 retries for transient errors
Concurrency Limits
- Set to 1 for serial, state-dependent tasks
- Set to 5-10 for parallel, independent tasks
- Consider downstream system capacity
- Monitor resource usage during execution
Execution Planning
Choose the Right Executor
- Lambda: < 5 min, < 512MB, stateless tasks
- Step Functions: AWS-native workflows, state machines
- ECS: Long-running (> 15 min), memory-intensive, containerized
Schedule Appropriately
- Off-peak hours for heavy operations
- Check downstream system availability
- Consider timezone implications
- Monitor execution history
Monitor and Alert
- Set up notifications for failures
- Monitor execution time trends
- Alert on resource anomalies
- Review logs regularly
Documentation
Describe Your Flows
- Add meaningful descriptions to each task
- Explain complex transformations
- Document custom parameters
- Help future maintainers understand intent
Document Changes
- Use version descriptions to document updates
- Note what changed and why
- Reference related tickets or PRs
- Maintain changelog
Troubleshooting
Common Issues and Solutions
Canvas Not Loading
Problem: The Flow Builder page stays blank or shows a loading spinner indefinitely.
Causes:
- Browser JavaScript disabled
- React Flow CSS not loaded
- Network connection issue
- Browser console errors
Solutions:
- Check browser console (F12 → Console tab)
- Look for JavaScript errors
- Verify React Flow CSS is loading (Network tab)
- Try refreshing the page (Ctrl+R / Cmd+R)
- Clear browser cache and try again
- Try a different browser to isolate issues
Debug steps:
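// In the browser console, check whether React Flow has rendered its canvas:
console.log(document.querySelector('.react-flow') !== null)
// Should print true once the canvas has mounted
// The check below only helps if the app exposes ReactFlow as a global;
// bundled builds usually print "undefined" even when the library is loaded:
console.log(typeof ReactFlow)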
Nodes Not Connecting
Problem: Can't create edges between nodes; drag operation doesn't work.
Causes:
- Data type incompatibility: The source connector's data types don't match the target connector's accepted types
- Cardinality exceeded: The connector already has the maximum number of connections
- Dragging from an inbound connector to another inbound connector (must be outbound → inbound)
- Trying to connect a node to itself
- Handles not properly rendered
Solutions:
- Hover over both connectors to check their data types — they must overlap
- Verify you're dragging from a BOTTOM handle (outbound) to a TOP handle (inbound)
- Check the browser console (F12) for specific rejection messages (e.g., "Connection rejected: No compatible data types")
- If a connector has a cardinality of "one", disconnect the existing edge first
- Try zooming in to see individual connector handles
- Refresh the page if handles don't appear
Visual indicators:
- Outbound handles (bottom): Connect these as sources
- Inbound handles (top): Connect these as targets
- Hovering shows a tooltip with connector name, data types, and required status
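The rejection rules listed under Causes can be summarized as a small check. The sketch below is an assumption-laden illustration in Python: the Connector fields, the rejection_reason helper, and the message strings are hypothetical and do not reflect the builder's internal data model.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Connector:
    node_id: str
    name: str
    direction: str             # "outbound" (bottom handle) or "inbound" (top handle)
    data_types: Set[str]
    cardinality: str = "many"  # "one" or "many"
    connections: int = 0       # edges already attached to this connector


def rejection_reason(source: Connector, target: Connector) -> Optional[str]:
    """Return why an edge would be rejected, or None if it would be accepted."""
    if source.direction != "outbound" or target.direction != "inbound":
        return "edges must run outbound -> inbound"
    if source.node_id == target.node_id:
        return "a node cannot connect to itself"
    if not (source.data_types & target.data_types):
        return "no compatible data types"
    for connector in (source, target):
        if connector.cardinality == "one" and connector.connections >= 1:
            return f"cardinality exceeded on {connector.node_id}.{connector.name}"
    return None


out = Connector("extract", "rows", "outbound", {"dataframe"})
inp = Connector("load", "input", "inbound", {"dataframe", "json"},
                cardinality="one", connections=1)
print(rejection_reason(out, inp))  # cardinality exceeded on load.input
```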
Validation Errors
Problem: Validation shows errors you don't know how to fix.
Causes:
- Missing required fields (name, import_path)
- Import path doesn't exist in your codebase
- Duplicate task names
- Circular dependencies
Solutions by error type:
"Name is required"
- Click the affected node
- Enter a name in the "Name" field
- Name must be unique and descriptive
"Cannot resolve import_path"
- Verify the module exists in your Python path
- Check spelling: module_name:function_name
- Ensure function is exported and accessible
- Test import locally: from module_name import function_name
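For a slightly more thorough local check, the following sketch resolves a module_name:function_name string using the standard importlib module. resolve_import_path is a hypothetical helper written for illustration; the validator's own resolution logic may differ.

```python
import importlib


def resolve_import_path(import_path: str):
    """Resolve 'module_name:function_name' to a callable or raise a clear error."""
    module_name, sep, function_name = import_path.partition(":")
    if not sep or not module_name or not function_name:
        raise ValueError(
            f"expected 'module_name:function_name', got {import_path!r}"
        )
    # Raises ModuleNotFoundError if the module is not on the Python path.
    module = importlib.import_module(module_name)
    func = getattr(module, function_name, None)
    if not callable(func):
        raise AttributeError(
            f"{function_name!r} is not a callable in module {module_name!r}"
        )
    return func


# Standard-library example so the sketch runs anywhere:
dumps = resolve_import_path("json:dumps")
print(dumps([1, 2]))  # [1, 2]
```

Run it from the same environment and working directory as your tasks so that the Python path matches.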
"Duplicate task name"
- Click each node with that name
- Rename to unique values
- Use suffixes if needed: "Process Data v1", "Process Data v2"
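If you have an exported list of task names, a quick way to spot the clashes is to count them. This is a minimal sketch; find_duplicate_names is a hypothetical helper, and it assumes names are compared exactly (case-sensitive).

```python
from collections import Counter


def find_duplicate_names(names):
    """Return the task names that occur more than once, sorted alphabetically."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


names = ["Load CSV", "Process Data", "Process Data", "Export"]
print(find_duplicate_names(names))  # ['Process Data']
```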
"Circular dependency detected"
- Review your edge connections
- Look for A → B → ... → A paths
- Remove the edge that completes the circle
- Use the canvas to visualize the cycle
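When a cycle is hard to see on the canvas, a depth-first search over the exported edges can name the nodes involved. The sketch below assumes edges are available as (source, target) pairs; find_cycle is a hypothetical helper and is not necessarily how the validator detects cycles.

```python
def find_cycle(edges):
    """Return one cycle as [A, B, ..., A], or None if the graph is acyclic."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNSEEN for node in graph}
    path = []  # nodes on the current DFS path, in visit order

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for nxt in graph[node]:
            if state[nxt] == IN_PROGRESS:
                # Back edge: the cycle is the path from nxt to here, closed on nxt.
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == UNSEEN:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in graph:
        if state[node] == UNSEEN:
            found = visit(node)
            if found:
                return found
    return None


edges = [("ingest", "transform"), ("transform", "export"), ("export", "ingest")]
print(find_cycle(edges))  # ['ingest', 'transform', 'export', 'ingest']
```

The last edge in the returned path is the one that completes the circle. The search is recursive, so very deep chains may need an iterative variant.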
Deploy Failures
Problem: Deploy button returns an error.
Causes:
- Validation errors exist
- Required metadata missing
- Invalid executor choice
- API connectivity issues
- Permission/authentication issues
Solutions:
- Click the "Validation" button first
- Fix any reported errors
- Fill in all required deployment fields
- Check network connectivity
- Verify your organization has permission to create flows
- Check API status page
- Try again in a few moments
Performance Issues
Problem: Canvas is slow, zooming is laggy, selection is delayed.
Causes:
- Very large DAG (100+ nodes)
- Browser has low memory
- Too many browser tabs open
- GPU acceleration disabled
Solutions:
- Split large DAG into smaller flows
- Close other browser tabs
- Try a different browser
- Check browser performance in DevTools
- Zoom out to see entire DAG
- Restart browser if persistent
Lost Work
Problem: Changes not saved or draft disappeared.
Causes:
- Didn't click Save button
- Auto-save didn't complete
- Browser crashed before save
- Cleared browser cache/storage
- Network disconnection during save
Prevention:
- Manually save frequently (Ctrl+S)
- Check for "Saving..." indicator
- Monitor auto-save status
- Avoid closing tab without saving
- Use version control for exports
Recovery:
- Check browser history for draft URL
- Contact support for database recovery
- Check backup/snapshot if available
- Export from deployed flow if one exists
Import Issues
Problem: Can't import FlowSpec JSON or import fails.
Causes:
- Invalid JSON format
- Missing required fields in JSON
- Version mismatch
- File too large
- Corrupt file
Solutions:
- Validate the JSON format (use an online JSON validator)
- Verify all required fields are present: name, nodes, edges
- Check file size (should be < 10MB)
- Re-export from original source
- Try manual import instead of drag-drop
- Recreate in builder manually if needed
Valid FlowSpec example:
{
"name": "My Flow",
"version": "1.0.0",
"nodes": [],
"edges": []
}
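Several of the checks above can be run locally before importing. The following sketch uses only Python's standard json module and the three required fields named in this section; check_flowspec is a hypothetical helper, and the assumption that nodes and edges must be lists is inferred from the example rather than from a published schema.

```python
import json

REQUIRED_FIELDS = ("name", "nodes", "edges")
MAX_BYTES = 10 * 1024 * 1024  # the 10MB guideline from this section


def check_flowspec(text: str) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if len(text.encode("utf-8")) >= MAX_BYTES:
        problems.append("file is 10MB or larger")
    try:
        spec = json.loads(text)
    except json.JSONDecodeError as exc:
        return problems + [f"invalid JSON: {exc.msg} at line {exc.lineno}"]
    if not isinstance(spec, dict):
        return problems + ["top level must be a JSON object"]
    for field in REQUIRED_FIELDS:
        if field not in spec:
            problems.append(f"missing required field: {field}")
    for field in ("nodes", "edges"):
        # Assumption: both fields are arrays, as in the example above.
        if field in spec and not isinstance(spec[field], list):
            problems.append(f"{field} must be a list")
    return problems


sample = '{"name": "My Flow", "version": "1.0.0", "nodes": [], "edges": []}'
print(check_flowspec(sample))  # []
```

To check a file, read it first, for example with open(path, encoding="utf-8"), and pass the contents to check_flowspec.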
Getting Help
Where to find help:
- Documentation: Browse the Dagy docs site
- API Reference: Detailed endpoint documentation
- Community Forum: Ask questions and share solutions
- Support: Contact the Dagy support team
Happy building! The Flow Builder makes it easy to create powerful data pipelines without writing code. Start simple, iterate frequently, and leverage the validator to catch issues early.