From Task Orchestration to DAG Thinking: What Data Engineers Need to Unlearn When Adopting dbt™

Learn why data engineers from Airflow and other orchestrators struggle with dbt™. Discover the 3 key principles for leveraging dbt's™ DAG engine, project structure, and parallelism instead of manual task orchestration.

Fabio Di Leta · Nov 4, 2025 · 5 min read

TL;DR: If you're treating dbt™ like Airflow with individual dbt run --select commands for each model, you're missing the point. This guide shows you how to leverage dbt's™ DAG engine, project structure, and tags to enable parallelism and eliminate manual orchestration.

The Anti-Pattern: Treating dbt™ Like a Task Orchestrator

If you're coming from tools like Airflow, Prefect, or other task-based orchestrators, you might be tempted to treat dbt™ the same way. I've seen this anti-pattern countless times:

# ❌ The Airflow mindset in dbt™ - DON'T DO THIS
dbt run --select order_model
dbt run --select finance_model
dbt run --select customer_model
dbt run --select product_model
dbt run --select

Why is this wrong? This approach completely misses the point of dbt™. You're essentially fighting against one of dbt's™ most powerful features: its DAG (Directed Acyclic Graph) engine.

Understanding the Mental Model Shift

What You're Used To: Task Orchestration (Airflow)

In Airflow, you explicitly define:

  • What runs

  • When it runs

  • What order it runs in

  • How tasks depend on each other

You're the conductor, micromanaging every instrument.

What dbt™ Does: Define the DAG

When you write dbt™ models, you're not writing tasks to be orchestrated—you're defining transformations and their dependencies.

In dbt™, you simply define:

  • What transformations exist

  • What data they depend on (via ref() and source())

Then dbt™ automatically:

  • Determines execution order

  • Runs independent models in parallel

  • Handles failures gracefully by skipping models downstream of a failed one

  • Builds the entire dependency graph

You're the composer, not the conductor.
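
To make the composer idea concrete, here is a minimal Python sketch of how an execution order can be derived from declared dependencies alone. It is a toy model, not dbt's actual implementation: the DEPENDENCIES mapping stands in for the ref() calls in your models, and the edge from fct_orders to stg_customers is an assumption added for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical project: each model maps to the set of models it ref()s.
# Model names come from this article; the stg_customers edge is assumed.
DEPENDENCIES = {
    "stg_orders": set(),
    "stg_customers": set(),
    "int_order_items_joined": {"stg_orders"},
    "fct_orders": {"int_order_items_joined", "stg_customers"},
}


def execution_order(dependencies):
    """Return one valid run order in which every model follows its parents."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(DEPENDENCIES)
print(order)
```

Notice that nothing in the mapping says "run this first". The order falls out of the declared parents, which is the same idea behind dbt's DAG.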

Key Principle #1: Invest Early in Project Structure

The single most important thing you can do when starting with dbt™ is invest time upfront in your project structure. This pays dividends immediately and prevents the need to manually orchestrate models.

Use a Layered Architecture

Organize your dbt™ project using a proven layered approach:


Tag Strategically from Day One

Tags are the secret to eliminating manual model orchestration. They let you run entire categories of models without naming each one individually.

Here's how to configure tags in your dbt™ project:

# models/marts/finance/_finance.yml
models:
  - name: fct_orders
    config:
      tags: ['finance', 'daily']

  - name: dim_products
    config:
      tags: ['finance', 'daily']

# models/marts/marketing/_marketing.yml
models:
  - name: fct_campaigns
    config:
      tags: ['marketing', 'hourly']

Before (manual orchestration - 47 commands):

dbt run --select fct_orders
dbt run --select dim_products
dbt run --select fct_campaigns
# ... and 44 more models

After (tag-based - 1 command per group):

dbt run --select tag:finance    # Runs all finance models
dbt run --select tag:marketing  # Runs all marketing models
dbt run --select tag:daily      # Runs all daily refresh models

Enable parallelism automatically:

dbt run --select tag:finance --threads 8

dbt™ will automatically run independent models in parallel. No manual dependency management required. The number of models dbt™ runs concurrently is defined by the threads configuration. This is normally set in your profiles.yml file, but you can also override it in the dbt™ command itself with the --threads flag.
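
One way to picture what the threads setting buys you is the Python sketch below. It groups models into batches whose parents have all finished, capped at a given number of threads. This is a simplification under stated assumptions: plan_batches and the DEPENDENCIES mapping are hypothetical names for illustration, and a real scheduler can start a model as soon as a thread frees up rather than waiting for a whole batch.

```python
from graphlib import TopologicalSorter

# Hypothetical project graph: model -> models it depends on (assumed edges).
DEPENDENCIES = {
    "stg_orders": set(),
    "stg_customers": set(),
    "int_order_items_joined": {"stg_orders"},
    "fct_orders": {"int_order_items_joined", "stg_customers"},
}


def plan_batches(dependencies, threads):
    """Group models into batches of at most `threads` models that are ready to run."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    pending = []  # models whose parents are done but that still wait for a free thread
    batches = []
    while sorter.is_active():
        pending.extend(sorted(sorter.get_ready()))
        batch, pending = pending[:threads], pending[threads:]
        batches.append(batch)
        sorter.done(*batch)  # finishing a batch unblocks its children
    return batches


print(plan_batches(DEPENDENCIES, threads=8))
print(plan_batches(DEPENDENCIES, threads=1))
```

With 8 threads the two staging models share the first batch. With 1 thread every model gets its own batch, which is what running models one command at a time forces on you.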

Key Principle #2: Let dbt™ Manage Dependencies

This is the hardest mental shift for Airflow veterans: stop manually defining execution order.

❌ Don't Do This (Manual Orchestration)

# Manually orchestrating in the wrong layer
dbt run --select stg_orders
dbt run --select stg_customers
dbt run --select int_order_items_joined  # depends on stg_orders
dbt run --select fct_orders              # depends on int_order_items_joined

✅ Do This Instead (Let dbt™ Handle It)

# Let dbt™ figure it out
dbt run --select +fct_orders

# Or run everything in finance
dbt run --select

How does this work? The + prefix tells dbt™: "Run this model and all its upstream dependencies." dbt™ crawls the ref() functions in your models and builds the DAG automatically.
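
The effect of the + prefix can be sketched in a few lines of Python: start at one model and walk its parent edges until nothing new turns up. This is an illustrative model only. PARENTS and with_upstream are hypothetical names, stg_campaigns is an invented model, and the stg_customers edge is assumed.

```python
# Hypothetical project graph: model -> models it ref()s.
PARENTS = {
    "stg_orders": set(),
    "stg_customers": set(),
    "stg_campaigns": set(),
    "int_order_items_joined": {"stg_orders"},
    "fct_orders": {"int_order_items_joined", "stg_customers"},
    "fct_campaigns": {"stg_campaigns"},
}


def with_upstream(model, parents):
    """Return `model` plus every model reachable by following parent edges."""
    selected, stack = set(), [model]
    while stack:
        current = stack.pop()
        if current not in selected:
            selected.add(current)
            stack.extend(parents.get(current, ()))
    return selected


print(sorted(with_upstream("fct_orders", PARENTS)))
```

The campaign models are left out because nothing on the path to fct_orders refers to them. You name one model and the graph supplies the rest.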

Key Principle #3: Think in Layers, Not Tasks

Structure your dbt™ project in layers that represent your transformation logic:

Staging Layer (tag:staging)

  • One model per source table

  • Light transformations (renaming, casting, basic filtering)

  • No joins, no business logic

Intermediate Layer (tag:intermediate)

  • Reusable components

  • Complex joins

  • Business logic that's used in multiple places

Marts Layer (tag:marts)

  • Final tables for analytics

  • Optimized for query performance

  • Business-friendly naming

Running Layers

dbt run --select tag:staging    # Fast, runs in parallel
dbt run --select tag:intermediate
dbt run --select

Or even simpler:

dbt build  # Runs and tests everything in dependency order

Common Pitfalls When Transitioning to dbt™

1. Over-orchestrating

Problem: Creating Airflow tasks for every dbt™ model

Solution: Execute dbt™ models in logical groups (tags, folders, layers)

2. Not using ref()

Problem: Hardcoding table names instead of using {{ ref('model_name') }}

Solution: Always use ref() and source() so dbt™ can build the DAG
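
To see why a hardcoded table name is invisible to the DAG, consider this small Python sketch. It uses a regular expression as a stand-in for dependency discovery. dbt itself renders the Jinja in your models rather than running a regex, and find_refs and the analytics schema name are hypothetical, so treat this purely as an illustration.

```python
import re

# Matches {{ ref('model_name') }} with single or double quotes (illustrative only).
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def find_refs(sql):
    """Return the set of model names referenced through ref() in a SQL string."""
    return set(REF_PATTERN.findall(sql))


with_ref = "select * from {{ ref('stg_orders') }}"
hardcoded = "select * from analytics.stg_orders"  # assumed schema name

print(find_refs(with_ref))   # the dependency on stg_orders is discoverable
print(find_refs(hardcoded))  # nothing found, so no edge exists in the graph
```

Both queries read the same table, but only the first one declares the dependency, so only the first one can be ordered and parallelized correctly.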

3. Running models serially

Problem: dbt run --select model1; dbt run --select model2; dbt run --select model3

Solution: dbt run --select model1 model2 model3, or better still, use tags or folders

4. Ignoring project structure

Problem: All models in one folder with no organization

Solution: Invest in folder structure and naming conventions early

5. Tags as an afterthought

Problem: Adding tags after you have 200 models

Solution: Define your tagging strategy on day one

Practical Tagging Strategy for dbt™

Here's a starter tagging strategy that works well for most dbt™ projects:

# Layer tags (mutually exclusive)
- staging
- intermediate
- marts

# Domain tags (business area)
- finance
- marketing
- operations
- product

# Refresh frequency tags
- real_time
- hourly
- daily
- weekly

# Special purpose tags

Powerful Tag Combinations

# All daily finance models
dbt run --select tag:daily,tag:finance

# Everything except deprecated models
dbt run --exclude tag:deprecated

# All staging and intermediate layers (for quick iteration)
dbt run --select
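
The rules behind these combinations are simple: selectors separated by spaces are unioned, selectors joined by commas are intersected, and --exclude removes matches from the result. The Python sketch below models that behavior for tags only. MODEL_TAGS, matches, and select are hypothetical names, and the layer tags on each model are assumptions, so read it as an illustration rather than dbt's implementation.

```python
# Hypothetical tag assignments; finance/marketing/daily/hourly follow the
# YAML earlier in this article, the layer tags are assumed.
MODEL_TAGS = {
    "stg_orders": {"staging", "daily"},
    "int_order_items_joined": {"intermediate", "daily"},
    "fct_orders": {"marts", "finance", "daily"},
    "dim_products": {"marts", "finance", "daily"},
    "fct_campaigns": {"marts", "marketing", "hourly"},
}


def matches(tags, selector):
    """A selector like 'tag:daily,tag:finance' matches only if every tag is present."""
    required = {part.removeprefix("tag:") for part in selector.split(",")}
    return required <= tags


def select(model_tags, selectors, exclude=()):
    """Union the selectors, then drop anything matched by an exclude selector."""
    return {
        model
        for model, tags in model_tags.items()
        if any(matches(tags, s) for s in selectors)
        and not any(matches(tags, e) for e in exclude)
    }


print(sorted(select(MODEL_TAGS, ["tag:daily,tag:finance"])))          # intersection
print(sorted(select(MODEL_TAGS, ["tag:staging", "tag:intermediate"])))  # union
print(sorted(select(MODEL_TAGS, ["tag:marts"], exclude=["tag:marketing"])))
```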

Summary: 5 Principles for Transitioning to dbt™

When transitioning from task-based orchestrators to dbt™ (whether using dbt Core™, dbt Cloud™, or Paradime.io):

  1. Stop thinking in tasks, start thinking in transformations - Declare what you want, not how to get there

  2. Invest time in project structure early - It's 80% of your success with dbt™

  3. Use tags liberally - They're your gateway to parallelism and clean orchestration

  4. Let dbt™ manage the DAG - That's literally what it's built for

  5. Use selection syntax - Stop naming individual models

FAQ: dbt™ for Airflow Users

Q: Should I create an Airflow task for each dbt™ model?

A: No. Create Airflow tasks for logical groups using tags, folders, or layers. Let dbt™ handle the model-level dependencies.

Q: How do I control execution order in dbt™?

A: You don't need to. Use ref() in your models, and dbt™ automatically builds the DAG and determines execution order.

Q: How does dbt™ handle parallelism?

A: dbt™ automatically runs independent models in parallel. Use the --threads flag to control the degree of parallelism.

Q: When should I use tags vs folders in dbt™?

A: Use folders for structural organization (staging, marts, etc.) and tags for cross-cutting concerns (refresh frequency, domains, PII, etc.).

Q: Can I use both dbt™ and Airflow?

A: Yes! Use Airflow for job scheduling and cross-tool orchestration (even better, use Paradime.io 😀), but let dbt™ handle the data transformation DAG.

Go Deeper: Master dbt™ Selection Syntax

Look, if you're still running models one by one, you're doing it wrong. Here's what you need to read:

  • dbt™ CLI Commands - Everything about run, build, test, and more. Stop guessing which command to use.

  • Selection Methods - The complete guide to selection syntax. Tags, folders, sources - all the ways to target your models.

  • Selector Methods Deep Dive - Advanced patterns for combining selectors. This is where things get powerful.

  • Graph Operators - Master the +, @, and other operators. Control your DAG like a pro.

These guides will fix your dbt™ workflow. Read them.

Experience Analytics for the AI-Era

Start your 14-day trial today - it's free and no credit card needed

Copyright © 2025 Paradime Labs, Inc.

Made with ❤️ in San Francisco ・ London

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.
