Mastering dbt Incremental Models: A Practical Guide

Feb 26, 2026

Table of Contents

Mastering dbt Incremental Models: A Practical Guide

Every analytics engineer eventually hits the same wall: a dbt™ model that ran in two minutes suddenly takes forty-five. Source tables grow, transformations compound, and warehouse bills climb. Rebuilding millions of rows each run when only a few thousand changed is wasteful—and avoidable.

dbt™ incremental models solve this by processing only new or changed data instead of rebuilding entire tables. They are the single most impactful optimization you can make to a dbt™ pipeline once table sizes cross the threshold where full rebuilds become costly or slow.

This guide walks you through everything you need to configure, operate, and troubleshoot dbt™ incremental models—from choosing the right incremental strategy to handling schema drift with on_schema_change, applying incremental_predicates on billion-row tables, and knowing when to reach for --full-refresh.

What Are dbt Incremental Models

An incremental model is a dbt™ materialization type that appends or updates only new and changed rows in a target table, rather than dropping and recreating the entire table on every run. It is one of three core dbt™ materializations, each designed for different use cases:

Incremental materialization: Only processes new or changed rows since the last successful run. The target table persists between runs and grows over time.
Table materialization: Drops and recreates the entire table from scratch every run. Simple and accurate, but expensive for large datasets.
View materialization: Stores only the SQL query logic—no data is physically persisted. Lightweight but slow to query at scale.

The incremental materialization sits between view (no compute cost, slow queries) and table (high compute cost, fast queries). It gives you the query performance of a table with a fraction of the per-run compute.

Think of it this way: a table materialization rebuilds your entire house every morning. An incremental model only installs the new furniture that arrived overnight.

Why Use Incremental Materialization in dbt

Incremental models exist for one reason: efficiency at scale. When your source data grows into millions or billions of rows, rebuilding everything on every run becomes unsustainable.

Here are the practical benefits:

Reduced compute costs: Processing 10,000 new rows instead of 10 million means dramatically lower warehouse spend on Snowflake, BigQuery, Databricks, or Redshift.
Faster pipeline execution: Smaller data volumes mean quicker builds—turning a 45-minute model run into a 2-minute incremental update.
Lower warehouse load: Less strain on your cluster, freeing up resources for analysts running queries and dashboards.
Reduced strain on source systems: Pulling only recent data from upstream sources instead of performing full extracts.

When Incremental Models Make Sense

Incremental models pay off when:

Source tables have millions or billions of rows
Your pipeline runs frequently (hourly, every 15 minutes)
Transformations involve expensive computations (regex, UDFs, complex joins)
Warehouse costs are material and growing

When They Add Unnecessary Complexity

For small tables (under a few hundred thousand rows), table materialization rebuilds quickly and eliminates the risk of incremental edge cases. The added configuration—unique keys, filter logic, schema change handling—often isn't worth it when a full rebuild takes under a minute.

How dbt Incremental Models Work

The mechanics of an incremental model follow a straightforward two-phase pattern:

flowchart TD
    A[dbt run triggered] --> B{Does target table exist?}
    B -- No --> C[First Run: Build full table from all source data]
    B -- Yes --> D{--full-refresh flag?}
    D -- Yes --> C
    D -- No --> E[Incremental Run: Query source with is_incremental filter]
    E --> F[Identify new/changed rows only]
    F --> G[Apply incremental strategy: merge / append / delete+insert]
    G --> H[Target table updated]

flowchart TD
    A[dbt run triggered] --> B{Does target table exist?}
    B -- No --> C[First Run: Build full table from all source data]
    B -- Yes --> D{--full-refresh flag?}
    D -- Yes --> C
    D -- No --> E[Incremental Run: Query source with is_incremental filter]
    E --> F[Identify new/changed rows only]
    F --> G[Apply incremental strategy: merge / append / delete+insert]
    G --> H[Target table updated]

flowchart TD
    A[dbt run triggered] --> B{Does target table exist?}
    B -- No --> C[First Run: Build full table from all source data]
    B -- Yes --> D{--full-refresh flag?}
    D -- Yes --> C
    D -- No --> E[Incremental Run: Query source with is_incremental filter]
    E --> F[Identify new/changed rows only]
    F --> G[Apply incremental strategy: merge / append / delete+insert]
    G --> H[Target table updated]

How dbt™ decides between a full build and an incremental update on each run.

Here is the step-by-step flow:

First run: dbt™ creates the target table with all source data—identical to a table materialization. No filtering is applied because there is no existing data to compare against.
Subsequent runs: dbt™ checks that the target table exists and the -full-refresh flag was not passed. It then evaluates your filter logic (wrapped in is_incremental()) to identify only new or changed rows.
Strategy application: dbt™ applies your chosen incremental strategy—merge, append, delete+insert, or insert_overwrite—to combine the new rows with the existing target table.

The filter condition that determines which rows are "new" is sometimes called a watermark. The most common watermark is a timestamp column like updated_at or created_at, where you select rows newer than the maximum value already in your target table.

dbt Incremental Strategies

An incremental strategy defines how dbt™ combines new data with existing data in the target table. The strategy you choose affects performance, correctness, and which data warehouses support it.

Strategy	How It Works	Best For
`append`	Inserts new rows without checking for duplicates	Event logs, immutable data
`merge`	Upserts rows based on a unique key	Dimension tables, mutable records
`delete+insert`	Deletes matching rows, then inserts replacements	Large batch updates
`insert_overwrite`	Replaces entire partitions with new data	Partitioned fact tables
`microbatch`	Processes data in small time-based batches	High-volume streaming data

Strategy availability varies by adapter. Here is the support matrix:

Adapter	`append`	`merge`	`delete+insert`	`insert_overwrite`	`microbatch`
dbt-snowflake	✅	✅	✅	✅	✅
dbt-bigquery	✅	✅	—	✅	—
dbt-databricks	✅	✅	✅	✅	✅
dbt-postgres	✅	✅	✅	—	✅
dbt-redshift	✅	✅	✅	—	✅

Source: dbt™ incremental strategy docs

Append

The append strategy inserts new rows directly into the target table without checking for duplicates or updating existing records. It is the simplest strategy and requires no unique_key.

{{
  config(
    materialized='incremental',
    incremental_strategy='append'
  )
}}

select *
from {{ ref('stg_events') }}
{% if is_incremental() %}
  where event_timestamp > (select max(event_timestamp) from {{ this }})
{% endif %}

{{
  config(
    materialized='incremental',
    incremental_strategy='append'
  )
}}

select *
from {{ ref('stg_events') }}
{% if is_incremental() %}
  where event_timestamp > (select max(event_timestamp) from {{ this }})
{% endif %}

{{
  config(
    materialized='incremental',
    incremental_strategy='append'
  )
}}

select *
from {{ ref('stg_events') }}
{% if is_incremental() %}
  where event_timestamp > (select max(event_timestamp) from {{ this }})
{% endif %}

Best for: Immutable event streams, click logs, page views—any data where records are never updated after creation. Since there is no deduplication, ensure your source data and filter logic prevent the same event from being inserted twice.

Merge

The merge strategy performs an upsert: it inserts rows with new unique keys and updates rows with existing unique keys. This is the dbt™ merge strategy most teams reach for first.

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='merge'
  )
}}

select
  order_id,
  status,
  updated_at
from {{ ref('stg_orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='merge'
  )
}}

select
  order_id,
  status,
  updated_at
from {{ ref('stg_orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='merge'
  )
}}

select
  order_id,
  status,
  updated_at
from {{ ref('stg_orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

Best for: Dimension tables, order tables, user profiles—any dataset where records can change over time and you need the target to reflect the latest state. The merge strategy implements SCD Type 1 logic (overwrite with latest values).

Performance note: Merge performs well on tables up to ~50–100M rows. Beyond that, consider switching to delete+insert with partition pruning for better performance.

Delete+Insert

The delete+insert strategy first deletes all rows in the target table that match the unique_key of incoming rows, then inserts the new rows. It separates the update into two distinct SQL operations.

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='delete+insert'
  )
}}

select
  order_id,
  status,
  updated_at
from {{ ref('stg_orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='delete+insert'
  )
}}

select
  order_id,
  status,
  updated_at
from {{ ref('stg_orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='delete+insert'
  )
}}

select
  order_id,
  status,
  updated_at
from {{ ref('stg_orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

Best for: Large batch updates and reprocessing scenarios where you want to fully replace matching records. Benchmarks show delete+insert can run 60–80% faster than merge on tables with 100M+ rows, especially when combined with partition pruning.

Trade-off: Unlike merge, delete+insert is not fully atomic—there is a brief intermediate state between the delete and insert operations. Most warehouses handle this within a transaction, but it is worth understanding.

Insert_Overwrite

The insert_overwrite strategy replaces entire partitions in the target table with new data. It requires the table to be partitioned (typically by date).

{{
  config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by={
      "field": "event_date",
      "data_type": "date",
      "granularity": "day"
    }
  )
}}

select *
from {{ ref('stg_page_views') }}
{% if is_incremental() %}
  where event_date >= date_sub(current_date, interval 3 day)
{% endif %}

{{
  config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by={
      "field": "event_date",
      "data_type": "date",
      "granularity": "day"
    }
  )
}}

select *
from {{ ref('stg_page_views') }}
{% if is_incremental() %}
  where event_date >= date_sub(current_date, interval 3 day)
{% endif %}

{{
  config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by={
      "field": "event_date",
      "data_type": "date",
      "granularity": "day"
    }
  )
}}

select *
from {{ ref('stg_page_views') }}
{% if is_incremental() %}
  where event_date >= date_sub(current_date, interval 3 day)
{% endif %}

Best for: Date-partitioned fact tables on BigQuery, Spark, or Databricks where you want to overwrite entire days of data. Highly efficient because only affected partitions are touched—unrelated partitions remain untouched.

Microbatch

The dbt™ microbatch strategy is a newer approach that processes data in small, time-based batches. Instead of running a single query for all new data, dbt™ splits the work into multiple independent queries—one per time period (e.g., one per day or hour).

{{
  config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='session_start',
    begin='2020-01-01',
    batch_size='day'
  )
}}

select *
from {{ ref('stg_sessions') }}

{{
  config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='session_start',
    begin='2020-01-01',
    batch_size='day'
  )
}}

select *
from {{ ref('stg_sessions') }}

{{
  config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='session_start',
    begin='2020-01-01',
    batch_size='day'
  )
}}

select *
from {{ ref('stg_sessions') }}

Key differences from other strategies:

You do not need to write is_incremental() filter logic—dbt™ automatically filters based on event_time and batch_size.
Each batch is independent and idempotent—failed batches can be retried without affecting others.
Supports parallel batch execution for faster processing.
Use -event-time-start and -event-time-end flags for targeted historical backfills.

Best for: High-volume time-series data, near-real-time pipelines, and scenarios where reliability is critical (failed batches don't block the entire run).

How to Configure a dbt Incremental Model

Setting up an incremental model involves five steps. Each builds on the previous one.

flowchart LR
    A["1. Set materialized=\\n'incremental'"] --> B["2. Define\\nunique_key"]
    B --> C["3. Add is_incremental()\\nfilter logic"]
    C --> D["4. Choose incremental\\nstrategy"]
    D --> E["5. Test and\\nvalidate"]

flowchart LR
    A["1. Set materialized=\\n'incremental'"] --> B["2. Define\\nunique_key"]
    B --> C["3. Add is_incremental()\\nfilter logic"]
    C --> D["4. Choose incremental\\nstrategy"]
    D --> E["5. Test and\\nvalidate"]

flowchart LR
    A["1. Set materialized=\\n'incremental'"] --> B["2. Define\\nunique_key"]
    B --> C["3. Add is_incremental()\\nfilter logic"]
    C --> D["4. Choose incremental\\nstrategy"]
    D --> E["5. Test and\\nvalidate"]

The five configuration steps for a dbt™ incremental model.

1. Set the Materialization to Incremental

Tell dbt™ this model should be materialized incrementally by adding the config block at the top of your model file:

{{
  config(
    materialized='incremental'
  )
}}

select *
from {{ ref('stg_orders') }}

{{
  config(
    materialized='incremental'
  )
}}

select *
from {{ ref('stg_orders') }}

{{
  config(
    materialized='incremental'
  )
}}

select *
from {{ ref('stg_orders') }}

You can also set this in dbt_project.yml to apply it to an entire directory of models:

models:
  my_project:
    marts:
      +materialized

models:
  my_project:
    marts:
      +materialized

models:
  my_project:
    marts:
      +materialized

2. Define a Unique Key

The unique_key tells dbt™ which column(s) identify a unique record. This enables merge/upsert behavior—without it, dbt™ can only append rows.

Single column key:

{{
  config(
    materialized='incremental',
    unique_key='order_id'
  )
}}

{{
  config(
    materialized='incremental',
    unique_key='order_id'
  )
}}

{{
  config(
    materialized='incremental',
    unique_key='order_id'
  )
}}

Composite key (multiple columns):

{{
  config(
    materialized='incremental',
    unique_key=['user_id', 'session_number']
  )
}}

{{
  config(
    materialized='incremental',
    unique_key=['user_id', 'session_number']
  )
}}

{{
  config(
    materialized='incremental',
    unique_key=['user_id', 'session_number']
  )
}}

Important: Columns used as unique keys should not contain NULL values. When any unique key column is NULL, dbt™ will insert a new row instead of updating the existing one, leading to duplicates.

3. Add Incremental Logic with is_incremental()

The is_incremental() macro wraps your filter logic so it only applies during incremental runs—not on the first run or when --full-refresh is used.

{{
  config(
    materialized='incremental',
    unique_key='order_id'
  )
}}

select
  order_id,
  status,
  updated_at
from {{ ref('stg_orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

{{
  config(
    materialized='incremental',
    unique_key='order_id'
  )
}}

select
  order_id,
  status,
  updated_at
from {{ ref('stg_orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

{{
  config(
    materialized='incremental',
    unique_key='order_id'
  )
}}

select
  order_id,
  status,
  updated_at
from {{ ref('stg_orders') }}
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

The {{ this }} Jinja function refers to the current model's existing target table in the warehouse. It lets you query the already-materialized data to determine the watermark.

4. Choose an Incremental Strategy

Set the incremental_strategy in your config block. If you don't specify one, dbt™ uses the default for your adapter (typically merge for most warehouses).

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='delete+insert'
  )
}}

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='delete+insert'
  )
}}

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='delete+insert'
  )
}}

Refer to the strategies comparison table above to select the right strategy for your data characteristics and warehouse.

5. Test and Validate Your Model

After configuration, validate that your incremental model works correctly:

# First run — builds the full table
dbt run --select my_incremental_model

# Second run — should process only new rows
dbt run --select my_incremental_model

# Verify no duplicates
dbt test --select

# First run — builds the full table
dbt run --select my_incremental_model

# Second run — should process only new rows
dbt run --select my_incremental_model

# Verify no duplicates
dbt test --select

# First run — builds the full table
dbt run --select my_incremental_model

# Second run — should process only new rows
dbt run --select my_incremental_model

# Verify no duplicates
dbt test --select

Check that:

Row counts increase as expected (not doubling due to duplicates)
The unique_key constraint holds—run a dbt_utils.unique_combination_of_columns test if using composite keys
Run times decrease significantly compared to a full table rebuild

How to Use dbt is_incremental()

The is_incremental() Jinja macro is the gatekeeper of your incremental logic. It returns True only when all three conditions are met:

The model is configured with materialized='incremental'
The target table already exists in the database
The -full-refresh flag was not passed

On the first run, is_incremental() returns False because the target table doesn't exist yet—so dbt™ builds the full table. On subsequent runs, it returns True, and your filter logic kicks in.

Common is_incremental() Patterns

Timestamp filter (most common):

{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

ID-based filter (for auto-incrementing keys):

{% if is_incremental() %}
  where id > (select max(id) from {{ this }})
{% endif %}

{% if is_incremental() %}
  where id > (select max(id) from {{ this }})
{% endif %}

{% if is_incremental() %}
  where id > (select max(id) from {{ this }})
{% endif %}

Lookback window (handles late-arriving data):

{% if is_incremental() %}
  where event_date >= dateadd(day, -3, current_date)
{% endif %}

{% if is_incremental() %}
  where event_date >= dateadd(day, -3, current_date)
{% endif %}

{% if is_incremental() %}
  where event_date >= dateadd(day, -3, current_date)
{% endif %}

Timestamp with lookback buffer (recommended best practice):

{% if is_incremental() %}
  where updated_at >= (select dateadd(day, -3, max(updated_at)) from {{ this }})
{% endif %}

{% if is_incremental() %}
  where updated_at >= (select dateadd(day, -3, max(updated_at)) from {{ this }})
{% endif %}

{% if is_incremental() %}
  where updated_at >= (select dateadd(day, -3, max(updated_at)) from {{ this }})
{% endif %}

The lookback buffer is strongly recommended by the dbt™ best practices guide. It re-processes the last few days of data to catch records that arrived late. Combined with a unique_key, existing rows get updated instead of duplicated—you process slightly more data but avoid silent data gaps.

Handling dbt on_schema_change

Source schemas evolve. Columns get added, renamed, or removed. The on_schema_change config option controls how dbt™ handles schema drift between your model's current SQL output and the existing target table.

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    on_schema_change='sync_all_columns'
  )
}}

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    on_schema_change='sync_all_columns'
  )
}}

{{
  config(
    materialized='incremental',
    unique_key='order_id',
    on_schema_change='sync_all_columns'
  )
}}

Or set it in dbt_project.yml:

models:
  my_project:
    +on_schema_change: "append_new_columns"

models:
  my_project:
    +on_schema_change: "append_new_columns"

models:
  my_project:
    +on_schema_change: "append_new_columns"

on_schema_change Options

Option	Behavior
`ignore`	Ignores schema changes (default)
`fail`	Fails the run if schema changes are detected
`append_new_columns`	Adds new columns, ignores removed columns
`sync_all_columns`	Adds new columns and removes missing columns

ignore

The default behavior. If your model SQL now selects a new column, it silently won't appear in the target table. Removed columns won't cause an error. This is safe but can lead to silent data loss—you think a column is being captured, but it is not.

fail

Triggers an error when the source and target schemas diverge. This forces a manual review—you must explicitly decide how to handle the change (usually by running --full-refresh). Good for teams that want strict control over schema evolution.

append_new_columns

Automatically adds new columns to the target table. Existing rows will have NULL for the new columns (dbt™ does not backfill). Columns that no longer exist in the model SQL remain in the target table. This is a good middle ground for most teams.

sync_all_columns

The most aggressive option—adds new columns and removes columns that no longer exist in the model SQL. Use this carefully, as dropping columns from your target table is irreversible without a full refresh.

Note: None of these options backfill values in existing rows for newly added columns. To populate historical data for a new column, run --full-refresh.

When to Run dbt --full-refresh

Incremental models inevitably drift from source data over time. Late-arriving facts, logic changes, and schema updates all create scenarios where you need to rebuild the entire table from scratch.

# Full refresh a specific model and its downstream dependencies
dbt build --full-refresh --select my_incremental_model+

# Full refresh just the model
dbt run --full-refresh --select

# Full refresh a specific model and its downstream dependencies
dbt build --full-refresh --select my_incremental_model+

# Full refresh just the model
dbt run --full-refresh --select

# Full refresh a specific model and its downstream dependencies
dbt build --full-refresh --select my_incremental_model+

# Full refresh just the model
dbt run --full-refresh --select

Scenarios Requiring Full Refresh

Model logic changed: You updated the transformation SQL—existing rows in the target table were built with the old logic and need to be reprocessed.
Backfill needed: Historical data requires reprocessing, either due to source corrections or a new column that needs historical values.
Data quality fix: Erroneous records are already in the target table and can't be corrected through incremental updates alone.
Schema sync: After adding or removing columns when using on_schema_change: fail.
Drift reset: Incremental models naturally drift from source over time. A periodic full refresh (e.g., weekly during low-activity periods) resets this drift.

flowchart TD
    A[Need full refresh?] --> B{Logic changed?}
    B -- Yes --> G[Run --full-refresh]
    B -- No --> C{Schema changed with fail mode?}
    C -- Yes --> G
    C -- No --> D{Backfill or data quality fix?}
    D -- Yes --> G
    D -- No --> E{Scheduled drift reset?}
    E -- Yes --> G
    E -- No --> F[Continue incremental runs]

flowchart TD
    A[Need full refresh?] --> B{Logic changed?}
    B -- Yes --> G[Run --full-refresh]
    B -- No --> C{Schema changed with fail mode?}
    C -- Yes --> G
    C -- No --> D{Backfill or data quality fix?}
    D -- Yes --> G
    D -- No --> E{Scheduled drift reset?}
    E -- Yes --> G
    E -- No --> F[Continue incremental runs]

flowchart TD
    A[Need full refresh?] --> B{Logic changed?}
    B -- Yes --> G[Run --full-refresh]
    B -- No --> C{Schema changed with fail mode?}
    C -- Yes --> G
    C -- No --> D{Backfill or data quality fix?}
    D -- Yes --> G
    D -- No --> E{Scheduled drift reset?}
    E -- Yes --> G
    E -- No --> F[Continue incremental runs]

Decision tree for when to use --full-refresh on a dbt™ incremental model.

The full_refresh config can also be set at the model or project level to permanently override the flag behavior:

models:
  my_project:
    some_model:
      +full_refresh: false  # Never full-refresh this model, even with --full-refresh flag

models:
  my_project:
    some_model:
      +full_refresh: false  # Never full-refresh this model, even with --full-refresh flag

models:
  my_project:
    some_model:
      +full_refresh: false  # Never full-refresh this model, even with --full-refresh flag

Using dbt Incremental Predicates for Large Tables

For tables with billions of rows, even the is_incremental() filter may not prevent expensive full-table scans during the merge or delete step. The incremental_predicates config pushes additional WHERE conditions directly into the merge/delete SQL statement, enabling partition pruning on the target table.

What it does: Adds WHERE conditions to the MERGE or DELETE statement that dbt™ generates, limiting the scan on the existing target table.
Why it matters: Without predicates, a merge statement must scan the entire target table to find matching rows. With predicates, the warehouse can prune partitions and scan only relevant data—potentially reducing costs by 90%+ on large partitioned tables.
When to use: Tables with hundreds of millions to billions of rows, partitioned by date or another key column.

Configuration

{{
  config(
    materialized='incremental',
    unique_key='id',
    incremental_strategy='merge',
    cluster_by=['session_start'],
    incremental_predicates=[
      "DBT_INTERNAL_DEST.session_start > dateadd(day, -7, current_date)"
    ]
  )
}}

select *
from {{ ref('stg_sessions') }}
{% if is_incremental() %}
  where session_start > (select max(session_start) from {{ this }})
{% endif %}

{{
  config(
    materialized='incremental',
    unique_key='id',
    incremental_strategy='merge',
    cluster_by=['session_start'],
    incremental_predicates=[
      "DBT_INTERNAL_DEST.session_start > dateadd(day, -7, current_date)"
    ]
  )
}}

select *
from {{ ref('stg_sessions') }}
{% if is_incremental() %}
  where session_start > (select max(session_start) from {{ this }})
{% endif %}

{{
  config(
    materialized='incremental',
    unique_key='id',
    incremental_strategy='merge',
    cluster_by=['session_start'],
    incremental_predicates=[
      "DBT_INTERNAL_DEST.session_start > dateadd(day, -7, current_date)"
    ]
  )
}}

select *
from {{ ref('stg_sessions') }}
{% if is_incremental() %}
  where session_start > (select max(session_start) from {{ this }})
{% endif %}

The DBT_INTERNAL_DEST alias refers to the existing target table in the generated merge SQL. You can also use DBT_INTERNAL_SOURCE to filter the incoming data:

models:
  - name: my_large_model
    config:
      materialized: incremental
      unique_key: id
      incremental_strategy: merge
      incremental_predicates:
        - "DBT_INTERNAL_DEST.created_at > dateadd(day, -7, current_date)"
        - "DBT_INTERNAL_SOURCE.created_at > dateadd(day, -7, current_date)"

models:
  - name: my_large_model
    config:
      materialized: incremental
      unique_key: id
      incremental_strategy: merge
      incremental_predicates:
        - "DBT_INTERNAL_DEST.created_at > dateadd(day, -7, current_date)"
        - "DBT_INTERNAL_SOURCE.created_at > dateadd(day, -7, current_date)"

models:
  - name: my_large_model
    config:
      materialized: incremental
      unique_key: id
      incremental_strategy: merge
      incremental_predicates:
        - "DBT_INTERNAL_DEST.created_at > dateadd(day, -7, current_date)"
        - "DBT_INTERNAL_SOURCE.created_at > dateadd(day, -7, current_date)"

Tip: Make sure your predicate columns align with your table's partition or cluster columns. A predicate on a non-partitioned column won't enable partition pruning.

Best Practices for dbt Incremental Models

Use Partitioning and Clustering

Warehouse partitioning and clustering are force multipliers for incremental models. Partitioning divides your table into segments (typically by date), and clustering sorts data within partitions by frequently-filtered columns.

On BigQuery, define partitioning in your model config:

{{
  config(
    materialized='incremental',
    partition_by={
      "field": "event_date",
      "data_type": "date",
      "granularity": "day"
    },
    cluster_by=['user_id']
  )
}}

{{
  config(
    materialized='incremental',
    partition_by={
      "field": "event_date",
      "data_type": "date",
      "granularity": "day"
    },
    cluster_by=['user_id']
  )
}}

{{
  config(
    materialized='incremental',
    partition_by={
      "field": "event_date",
      "data_type": "date",
      "granularity": "day"
    },
    cluster_by=['user_id']
  )
}}

On Snowflake, use clustering:

{{
  config(
    materialized='incremental',
    cluster_by=['event_date']
  )
}}

{{
  config(
    materialized='incremental',
    cluster_by=['event_date']
  )
}}

{{
  config(
    materialized='incremental',
    cluster_by=['event_date']
  )
}}

When combined with incremental_predicates, partitioning ensures that merge and delete operations only scan relevant partitions—not the entire table.

Minimize Data Scanned with Targeted Filters

Write narrow, specific filters in your is_incremental() block. Avoid patterns that scan the entire source table:

-- ❌ Scans all source data, then filters
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

-- ✅ Narrow window reduces source scan
{% if is_incremental() %}
  where updated_at >= dateadd(day, -3, current_date)
{% endif %}

-- ❌ Scans all source data, then filters
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

-- ✅ Narrow window reduces source scan
{% if is_incremental() %}
  where updated_at >= dateadd(day, -3, current_date)
{% endif %}

-- ❌ Scans all source data, then filters
{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

-- ✅ Narrow window reduces source scan
{% if is_incremental() %}
  where updated_at >= dateadd(day, -3, current_date)
{% endif %}

The second pattern is especially useful when your source table is also large. It limits the source scan to a fixed window, regardless of what is in the target table.

Set Appropriate Unique Keys

Choose keys that genuinely identify unique records. A poorly chosen unique key is the number one cause of duplicate records in incremental models.

Use a single natural key when one exists (e.g., order_id, transaction_id)
Use composite keys when no single column is unique: unique_key=['user_id', 'event_date', 'event_type']
Ensure key columns are never NULL—null keys cause inserts instead of updates
Test uniqueness with dbt™ tests:

models:
  - name: fct_orders
    columns:
      - name: order_id
        tests

models:
  - name: fct_orders
    columns:
      - name: order_id
        tests

models:
  - name: fct_orders
    columns:
      - name: order_id
        tests

Monitor Incremental Model Performance

Track run times and row counts over time. A sudden spike in run time may indicate:

Your filter logic is selecting too many rows (watermark drift)
Upstream data volumes changed unexpectedly
Partition pruning stopped working due to a config change

Paradime's Code IDE provides column-level lineage visibility, letting you trace how changes in upstream models propagate through your pipeline. This is invaluable for identifying why an incremental model's performance suddenly degraded.

Document Your Incremental Logic

Add descriptions explaining your incremental logic, lookback windows, and unique key choices. Future teammates (and future you) will need to understand why you chose a 3-day lookback versus a 7-day lookback, or why the unique key is a composite of three columns.

models:
  - name: fct_orders
    description: >
      Incremental model for order facts. Uses merge strategy with order_id
      as unique key. Lookback window is 3 days to capture late-arriving
      records from the payment processor. Full refresh recommended weekly.
    config:
      materialized: incremental
      unique_key: order_id
      incremental_strategy

models:
  - name: fct_orders
    description: >
      Incremental model for order facts. Uses merge strategy with order_id
      as unique key. Lookback window is 3 days to capture late-arriving
      records from the payment processor. Full refresh recommended weekly.
    config:
      materialized: incremental
      unique_key: order_id
      incremental_strategy

models:
  - name: fct_orders
    description: >
      Incremental model for order facts. Uses merge strategy with order_id
      as unique key. Lookback window is 3 days to capture late-arriving
      records from the payment processor. Full refresh recommended weekly.
    config:
      materialized: incremental
      unique_key: order_id
      incremental_strategy

Troubleshooting Common dbt Incremental Model Issues

Duplicate Records After Incremental Runs

Symptom: Row counts keep growing, and you find multiple rows with the same business key.

Root causes:

Missing unique_key—without it, dbt™ defaults to append-only behavior
unique_key column contains NULL values (NULLs don't match in SQL, so dbt™ inserts instead of updates)
unique_key doesn't actually identify unique records (e.g., using user_id when multiple orders per user exist)
Using append strategy when merge is needed

Fix: Verify your unique key truly identifies unique records. Use composite keys when needed. Add not_null and unique tests. If duplicates already exist in the target table, run --full-refresh to reset.

Missing Data from Late-Arriving Records

Symptom: Some source records never appear in the target table, especially records from a few days ago.

Root cause: Records arrived in the source after your max(updated_at) watermark had already moved past their timestamp.

Fix: Add a lookback window to your filter:

{% if is_incremental() %}
  where updated_at >= (select dateadd(day, -3, max(updated_at)) from {{ this }})
{% endif %}

{% if is_incremental() %}
  where updated_at >= (select dateadd(day, -3, max(updated_at)) from {{ this }})
{% endif %}

{% if is_incremental() %}
  where updated_at >= (select dateadd(day, -3, max(updated_at)) from {{ this }})
{% endif %}

This re-processes the last 3 days of data on every run. Combined with a unique_key, existing rows get updated (not duplicated), and late-arriving records get captured.

Schema Mismatch Errors

Symptom: dbt™ run fails with an error about mismatched columns between source and target.

Root cause: A column was added to or removed from your source data or model SQL, but on_schema_change is set to fail (or defaults to ignore, which can cause silent issues).

Fix: Either:

Update on_schema_change to append_new_columns or sync_all_columns
Run dbt run --full-refresh --select my_model to rebuild the target table with the new schema

Unexpected Full Table Scans

Symptom: Incremental runs are as slow and expensive as full table builds, even though only a small amount of new data exists.

Root causes:

incremental_predicates not configured—merge statement scans the entire target table
Predicate columns don't align with partition/cluster columns, so pruning doesn't occur
is_incremental() filter returns too many rows (overly wide lookback window)

Fix: Add incremental_predicates targeting your partition column. Verify that the predicate column matches your table's partitioning scheme. Narrow your lookback window.

Build Faster dbt Pipelines with Incremental Models

Incremental models are foundational to efficient, cost-effective dbt™ pipelines. They transform what would be expensive, full-table rebuilds into lightweight, targeted updates—reducing warehouse spend, accelerating pipeline execution, and enabling higher-frequency data refreshes.

The key is matching the right strategy to your data: merge for mutable dimension tables, append for immutable event streams, delete+insert for large batch updates, insert_overwrite for partitioned fact tables, and microbatch for high-volume time-series data.

Paradime's Bolt schedules incremental dbt™ runs with TurboCI for slim CI, while Paradime's Code IDE surfaces column-level lineage so teams can confidently iterate on incremental logic without breaking downstream dependencies. When you change a filter condition or update a unique key, you see exactly which downstream models and columns are affected—before you merge the PR.

Start for free to build and orchestrate incremental dbt™ models with AI-powered development tools.

FAQs about dbt Incremental Models

Can I use dbt incremental models with all data warehouses?

Most major warehouses support incremental models, but available dbt™ incremental strategies vary by adapter. Snowflake and Databricks support all five strategies (append, merge, delete+insert, insert_overwrite, microbatch). BigQuery supports append, merge, and insert_overwrite but not delete+insert. Postgres and Redshift support append, merge, delete+insert, and microbatch but not insert_overwrite. Always check the adapter-specific docs before choosing a strategy.

How do I migrate an existing dbt table model to incremental?

Follow these steps: change the materialization to incremental, add a unique_key that identifies unique records, wrap your filter logic in an {% if is_incremental() %} block, then run dbt run --full-refresh --select my_model to establish the baseline table. Subsequent runs will process only new or changed data.

What happens if my dbt incremental model fails mid-run?

Most warehouses wrap incremental operations in transactions, so a mid-run failure typically rolls back changes and leaves your target table unchanged. You can safely retry the run. The dbt™ microbatch strategy takes this further—each batch is independent, so a failure in one batch doesn't affect others, and you can use dbt retry to reprocess only the failed batches.

How do dbt incremental models affect CI/CD pipelines?

In CI environments, incremental models often run against empty or cloned schemas where the target table doesn't exist. Since is_incremental() returns False when the target table is missing, the model builds as a full table in CI. Plan your CI strategy accordingly—consider using the --empty flag in dbt™ 1.8+ to validate SQL compilation without processing data, or accept that CI runs will perform full builds.

Should I use incremental models for small dbt tables?

For small tables (under a few hundred thousand rows), the added complexity of incremental logic—unique keys, filter conditions, schema change handling, lookback windows—often isn't worth it. Table materialization rebuilds quickly, is simpler to maintain, is always accurate, and eliminates the risk of incremental edge cases like late-arriving data and watermark drift. Reserve incremental models for tables where full rebuilds are genuinely too slow or too expensive.

Interested to Learn More?
Try Out the Free 14-Days Trial

Start free trial

Stop Managing Pipelines. Start Shipping Them.

Join the teams that replaced manual dbt™ workflows with agentic AI. Free to start, no credit card required.

Start for free

Stop Managing Pipelines. Start Shipping Them.

Join the teams that replaced manual dbt™ workflows with agentic AI. Free to start, no credit card required.

Start for free

Stop Managing Pipelines. Start Shipping Them.

Join the teams that replaced manual dbt™ workflows with agentic AI. Free to start, no credit card required.

Platform

ADD-ONs

DINOAI

NEW

Programmable Agents

Self-Healing Pipelines

Resources

Industries

About

Legal

Responsible Disclosure Policy

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.

Platform

ADD-ONs

DINOAI

NEW