dbt™ Source Freshness Best Practices for Analytics Engineers
Feb 26, 2026
Your dashboard says revenue is up 12% this week. The executive team is celebrating. There's just one problem — the data powering that dashboard is 36 hours old, and the real number is actually down 3%.
This scenario plays out more often than most data teams care to admit, and it's exactly the kind of failure that dbt™ source freshness is designed to prevent. By checking whether your source tables have been updated within an expected time window, dbt™ source freshness acts as an early-warning system that catches stale data before it poisons every model, dashboard, and decision downstream.
In this guide, you'll learn exactly how to configure, run, and operationalize dbt™ source freshness checks so that stale data never silently breaks your analytics pipeline again.
What Is dbt™ Source Freshness?
dbt™ source freshness is a built-in command that checks whether your source tables have been updated within an expected time window. It works by comparing a timestamp column in your source data against the current time, then evaluating the result against thresholds you define. The outcome is one of three states:
Pass (Fresh): Data was updated within the acceptable threshold. Everything is running on schedule.
Warn: Data is approaching staleness. The source hasn't been updated as recently as expected, but it hasn't crossed the critical boundary yet.
Error (Stale): Data exceeds the acceptable threshold. The source is considered stale, and action is required.
This distinction between warn and error is what makes dbt™ source freshness checks so practical. Instead of a binary pass/fail, you get a graduated signal that gives your team time to investigate before a soft alert becomes a hard failure.
Source freshness runs as its own dedicated command — dbt source freshness — rather than as part of dbt run or dbt test. This separation is intentional: it lets you check the health of your raw data before you commit to building models on top of it.
Why Data Freshness Matters for Analytics Pipelines
The freshness of data is directly tied to data quality and organizational trust. If your source data is stale, every model, metric, and dashboard built on top of it becomes unreliable — and most of the time, nobody knows until someone makes a bad decision.
Stale data has a compounding business impact:
Dashboards displaying yesterday's numbers as "real-time." Stakeholders make decisions based on metrics that no longer reflect reality. A marketing team might double down on a campaign that's already underperforming, or a finance team might approve a budget based on outdated revenue figures.
ML models training on outdated features. Feature freshness directly impacts model accuracy. A recommendation engine trained on last week's user behavior will serve irrelevant suggestions, and a fraud detection model running on stale transaction data will miss emerging patterns.
Stakeholders losing trust in the data platform. This is the most expensive consequence. Once business users experience inaccurate data — even once — they start building their own spreadsheets, bypassing the data platform entirely. Rebuilding that trust takes months.
Data freshness isn't just a technical metric. It's a measure of whether your data platform is delivering on its promise to the business. Monitoring it proactively with tools like dbt™ source freshness is the first step toward maintaining that promise.
How dbt™ Source Freshness Works
Under the hood, the dbt source freshness command performs a straightforward but powerful operation. For each source table that has freshness configured, dbt™ runs a query against the loaded_at_field column to find the maximum (most recent) timestamp value. It then compares that value to the current timestamp and evaluates the difference against your warn_after and error_after thresholds.
Here's how each result maps:
| Freshness State | What It Means |
|---|---|
| Pass | The time since the most recent record's timestamp is within the warn_after threshold. |
| Warn | The time since the most recent record exceeds warn_after but is still within error_after. |
| Error | The time since the most recent record exceeds error_after. |
A few important details to keep in mind:
Freshness checks run independently. The dbt source freshness command is separate from dbt run, dbt test, and dbt build. Running dbt build will not execute freshness checks.
Each source is evaluated individually. If you have 20 configured sources, you'll get 20 individual pass/warn/error results.
Results are written to sources.json. This artifact file in your target directory contains the detailed freshness results for every evaluated source, which is useful for downstream integrations and reporting.
This architectural separation means you can run freshness checks at any point in your workflow — before a dbt run to fail fast on stale data, after a run for post-execution monitoring, or on its own schedule entirely.
How to Configure dbt™ Source Freshness in Your YAML Files
Configuring dbt™ source freshness requires three things in your sources.yml file: a source definition, a loaded_at_field, and freshness thresholds. The dbt™ source freshness documentation provides the full reference, and the following steps walk through it practically.
1. Define Your Sources
Start by declaring your sources in your dbt™ project's YAML file. This tells dbt™ where your raw data lives:
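A minimal source declaration might look like the following sketch; the project, database, schema, and table names here are illustrative placeholders, not values from your warehouse:

```yaml
# models/sources.yml (illustrative names throughout)
version: 2

sources:
  - name: jaffle_shop        # how you'll reference the source in models
    database: raw            # the database where the raw data lives
    schema: jaffle_shop      # the schema containing the raw tables
    tables:
      - name: orders         # maps to raw.jaffle_shop.orders
      - name: customers
```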
Each source points to a specific database and schema, and each table within that source maps to a raw table in your warehouse.
2. Add the loaded_at_field Property
The loaded_at_field is the timestamp column that dbt™ uses to determine when data was last loaded. Choosing the right column is critical:
_etl_loaded_at or _loaded_at: Preferred. These are typically set by your ELT tool (Fivetran, Airbyte, etc.) and represent when the row was actually loaded into your warehouse.
updated_at: Acceptable when an ETL timestamp isn't available, but be aware that this reflects when the record was last modified in the source system, not when it arrived in your warehouse.
created_at: Use with caution. This works for append-only tables, but for tables with updates, it will only reflect the most recently created record, not the most recently updated one.
You can set loaded_at_field at the source level (applies to all tables) or override it at the individual table level.
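As a sketch of both placements (source and table names are illustrative):

```yaml
sources:
  - name: jaffle_shop
    database: raw
    loaded_at_field: _etl_loaded_at    # source-level default for all tables
    tables:
      - name: orders                   # inherits _etl_loaded_at
      - name: events
        loaded_at_field: _ingested_at  # table-level override
```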
3. Set warn_after and error_after Thresholds
Define your freshness thresholds using the warn_after and error_after properties. Each takes a count and a period (minute, hour, or day):
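A hedged example of both thresholds plus an opt-out, with illustrative names:

```yaml
sources:
  - name: jaffle_shop
    loaded_at_field: _etl_loaded_at
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders          # inherits the source-level thresholds
      - name: audit_log
        freshness: null       # explicitly excluded from freshness checks
```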
A few things to note:
You can provide one or both of warn_after and error_after. If only warn_after is set, the check will never error — it will only warn.
To explicitly exclude a table from freshness checks, set freshness: null.
Source-level freshness thresholds apply to all tables in that source unless overridden at the table level.
Note: In dbt™ v1.9+, freshness and loaded_at_field are configured under the config property. Check the dbt™ source freshness docs for your specific version.
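Under that newer convention, the same settings might nest roughly like this; this is a sketch only, so verify the exact shape against the docs for your dbt™ version:

```yaml
sources:
  - name: jaffle_shop
    config:
      loaded_at_field: _etl_loaded_at
      freshness:
        warn_after: {count: 12, period: hour}
        error_after: {count: 24, period: hour}
    tables:
      - name: orders
```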
How to Run the dbt™ Source Freshness Command
Once your sources are configured, running freshness checks is straightforward. The dbt™ source freshness command docs cover the full syntax, but here's what you need to know in practice.
Run Freshness for All Sources
To check every configured source in your project:
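```shell
dbt source freshness
```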
This evaluates every source table that has a loaded_at_field and at least one freshness threshold defined. Tables without freshness configuration are silently skipped.
Run Freshness for Specific Sources
For large projects with many sources, use the --select flag to target specific sources or tables:
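For example (the source and table names are placeholders):

```shell
# Check a single source (all of its tables)
dbt source freshness --select source:jaffle_shop

# Check one specific table within a source
dbt source freshness --select source:jaffle_shop.orders
```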
This is especially useful in CI/CD pipelines where you only want to validate freshness for the sources relevant to the models being changed.
Interpret Freshness Output and Exit Codes
The command output lists each evaluated source with its status:
For CI/CD integration, the exit codes are what matter:
Exit code 0: All sources passed.
Non-zero exit code: At least one source returned a warn or error state.
This means you can use the command's exit code to gate pipeline execution — if any source is stale, the pipeline halts before building models on bad data.
How to Set Freshness Thresholds Based on Data SLAs
The most common mistake with dbt™ freshness configuration is setting arbitrary thresholds without connecting them to the actual update frequency of the source. Your warn_after and error_after values should be derived from your source's expected update cadence, with a buffer for normal variation.
Real-Time and Streaming Data
For sources updated continuously — event streams, clickstream data, real-time transaction feeds — freshness thresholds should be tight:
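A threshold sketch for a streaming source:

```yaml
freshness:
  warn_after: {count: 15, period: minute}
  error_after: {count: 1, period: hour}
```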
If your streaming pipeline is healthy, data should be arriving continuously. A 15-minute gap is a yellow flag; a 60-minute gap means something is broken.
Hourly Batch Loads
For sources refreshed by hourly ETL jobs, build in enough buffer to account for job delays, retries, and occasional slow runs:
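A threshold sketch for an hourly batch source:

```yaml
freshness:
  warn_after: {count: 2, period: hour}
  error_after: {count: 4, period: hour}
```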
Setting warn_after at 2 hours gives you a one-hour buffer beyond the expected cadence. The error_after at 4 hours means two consecutive loads would need to fail before you hit a hard error.
Daily Batch Loads
For daily loads — common for ERP, CRM, and financial system syncs — you need wider windows to account for the naturally longer gaps between updates:
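A threshold sketch for a daily batch source:

```yaml
freshness:
  warn_after: {count: 26, period: hour}
  error_after: {count: 48, period: hour}
```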
A warn_after of 26 hours gives a 2-hour buffer beyond a 24-hour cycle, while a 48-hour error_after means the data would need to miss an entire daily load cycle before erroring.
Here's a quick reference table:
| Source Type | Suggested warn_after | Suggested error_after |
|---|---|---|
| Real-time / Streaming | 15–30 minutes | 1 hour |
| Hourly batch | 2 hours | 4 hours |
| Daily batch | 26 hours | 48 hours |
Pro tip from the dbt™ source freshness documentation: Run your freshness checks at least twice as often as your lowest SLA. If your tightest SLA is 1 hour, run freshness checks every 30 minutes.
How to Filter dbt™ Freshness Checks to Specific Sources
Sometimes you need to narrow what data dbt™ evaluates when checking freshness. The filter property adds a WHERE clause to the freshness query, which is useful for several scenarios as detailed in the dbt™ source freshness tests documentation:
Excluding historical or backfilled data that would skew the freshness calculation.
Checking only recent partitions to avoid expensive full table scans.
Filtering out soft-deleted records that are no longer actively updated.
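A filter restricting the freshness query to the last three days of events might look like the following sketch; the source, table, and column names are illustrative, and the dateadd expression is Snowflake-style SQL that you would adapt to your warehouse:

```yaml
sources:
  - name: product_analytics
    tables:
      - name: events
        loaded_at_field: _etl_loaded_at
        freshness:
          warn_after: {count: 1, period: hour}
          error_after: {count: 6, period: hour}
        filter: event_date >= dateadd(day, -3, current_date)
```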
In this example, dbt™ only looks at events from the last 3 days when calculating freshness, preventing a massive historical backfill from making the source appear fresh when recent data has actually stopped flowing.
Another common use case is filtering out deleted records:
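As a sketch (the soft-delete column name depends on your ELT tool; _fivetran_deleted is just one common example):

```yaml
tables:
  - name: customers
    loaded_at_field: _etl_loaded_at
    freshness:
      warn_after: {count: 26, period: hour}
    filter: _fivetran_deleted = false
```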
This ensures that the freshness calculation only considers active, relevant records.
Common dbt™ Source Freshness Problems and How to Fix Them
Even with a solid configuration, dbt™ freshness checks can produce unexpected results. Here are the most common issues analytics engineers encounter and how to resolve them.
Missing or Incorrect loaded_at_field
The problem: The freshness command fails or returns unexpected results because the specified loaded_at_field doesn't exist, contains null values, or isn't actually a timestamp column.
The fix: Verify that the column exists and contains valid, non-null timestamps:
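A quick validation query along these lines (table and column names are placeholders) surfaces all three failure modes at once:

```sql
-- Sanity-check the candidate loaded_at_field
select
    count(*)              as total_rows,
    count(_etl_loaded_at) as non_null_timestamps,  -- compare against total_rows
    min(_etl_loaded_at)   as oldest_load,
    max(_etl_loaded_at)   as newest_load           -- the value dbt will evaluate
from raw.jaffle_shop.orders;
```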
If the column has significant null values, consider using a different column or adding a filter to exclude nulls. If the column doesn't exist at all, work with your data engineering team to add an ETL load timestamp during ingestion.
Inconsistent Source Update Schedules
The problem: A source is supposed to update daily, but the actual update time varies — sometimes at 2 AM, sometimes at 6 AM. This causes freshness warnings that aren't true issues.
The fix: Build a larger buffer into your thresholds to accommodate the natural variation. Alternatively, query the source's historical load times to understand the actual distribution:
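A query along these lines (placeholder names; adjust the date functions to your warehouse dialect) shows when each daily load actually landed:

```sql
-- Daily load completion times over the last month
select
    date_trunc('day', _etl_loaded_at) as load_date,
    max(_etl_loaded_at)               as load_completed_at
from raw.erp.invoices
group by 1
order by 1 desc
limit 30;
```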
Use the results to set thresholds that account for the worst-case normal delay, not the average.
False Positives from Timezone Mismatches
The problem: Your data warehouse runs in UTC, but the source system writes timestamps in US/Eastern. The freshness calculation is off by 4–5 hours, causing either false alarms or missed stale data.
The fix: Ensure consistent timezone handling. The best approach is to convert everything to UTC:
If the loaded_at_field is timezone-naive, use a SQL expression to cast it: CONVERT_TIMEZONE('US/Eastern', 'UTC', _loaded_at) (Snowflake syntax).
You can use a SQL expression directly as your loaded_at_field value in the YAML configuration.
As a general rule, always store and compare timestamps in UTC across your entire data stack.
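Putting that together in YAML (Snowflake syntax, illustrative table name):

```yaml
tables:
  - name: orders
    # A SQL expression used directly as the loaded_at_field
    loaded_at_field: "convert_timezone('US/Eastern', 'UTC', _loaded_at)"
    freshness:
      warn_after: {count: 2, period: hour}
```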
Alternative Ways to Monitor Data Freshness in dbt™
Native dbt™ source freshness covers source-level checks, but dbt™ packages and external tools extend freshness monitoring for teams that need more flexibility — especially for model-level freshness or anomaly-based detection.
Using dbt_expectations
The dbt_expectations package includes the expect_row_values_to_have_recent_data test, which checks that a table has at least one row with a timestamp within a specified window. This is useful for model-level freshness checks:
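Applied to a model's timestamp column, the test looks roughly like this (the model and column names are illustrative):

```yaml
models:
  - name: fct_orders
    columns:
      - name: created_at
        tests:
          - dbt_expectations.expect_row_values_to_have_recent_data:
              datepart: day
              interval: 1   # require at least one row from the last day
```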
This approach works on any model or source and gives you row-level freshness validation — something native source freshness doesn't cover.
Using dbt_utils
The recency test from the dbt_utils package provides a simpler alternative:
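A sketch of the recency test (illustrative model and field names):

```yaml
models:
  - name: fct_orders
    tests:
      - dbt_utils.recency:
          datepart: hour
          field: updated_at
          interval: 6   # fail if the newest updated_at is older than 6 hours
```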
This fails if the most recent value in the specified field is older than the threshold. It's lightweight and easy to add to any model or source.
Using External Observability Tools
For teams that need more sophisticated monitoring beyond what dbt™ packages provide, external data observability tools offer additional capabilities:
Monte Carlo: Automated anomaly detection for freshness, volume, and schema changes.
Elementary: Open-source dbt™ package that provides freshness monitoring with Slack and email alerts built in.
Datafold: Data diffing and freshness monitoring integrated into CI/CD workflows.
These tools connect to the broader practice of data observability and are especially valuable for large data platforms with hundreds of sources.
How to Add Freshness Checks to Your CI/CD Pipeline
Automating freshness checks ensures that stale data never silently reaches production. By integrating the dbt source freshness command into your CI/CD pipeline, you make data freshness a gate rather than an afterthought.
Add Freshness to Pre-Merge Checks
Run dbt source freshness as part of your pull request checks to catch configuration issues before merging code. This validates that:
New sources have freshness configured correctly.
Modified thresholds are reasonable.
The loaded_at_field exists and is queryable.
Run Freshness in Scheduled Production Jobs
Include freshness checks at the very start of your scheduled dbt™ jobs. This "fail fast" approach prevents building models on stale data:
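In its simplest form, this is a two-step chain in your job script — the second command only runs if the first exits cleanly:

```shell
# Fail fast: only build models if every source freshness check passes
dbt source freshness && dbt build
```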
If the freshness check fails (non-zero exit code), the dbt build command never executes. This is the single most impactful pattern for preventing stale data from reaching production.
Fail Pipelines on Freshness Errors
Use the command's exit codes to halt pipeline execution when a freshness check fails:
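A sketch of an explicit exit-code check in a bash pipeline script:

```shell
#!/usr/bin/env bash

dbt source freshness
status=$?

if [ "$status" -ne 0 ]; then
  echo "Source freshness check failed (exit code $status); halting pipeline."
  exit "$status"
fi

dbt build
```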
For teams that want to allow warnings but fail on errors, parse the sources.json artifact to differentiate between warn and error states before deciding whether to proceed.
How to Automate Freshness Alerts and Reduce MTTR
Catching stale data is only half the battle. The other half is ensuring the right people are notified immediately so they can fix the problem before it impacts the business. Connecting freshness failures to your alerting and incident management systems reduces mean-time-to-repair (MTTR) significantly.
Configure Slack and Email Notifications
Set up alerts that fire automatically when freshness checks fail. In dbt Cloud™, you can configure notifications directly in the job settings to send alerts via Slack or email when a job encounters warnings or errors.
For dbt Core™ users, you can parse the sources.json artifact and send alerts using a script:
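A minimal Python sketch of that pattern is below. It assumes the artifact's top-level "results" list, where each entry carries a "unique_id" and a "status" of "pass", "warn", or "error"; the function names are my own, and the Slack delivery step is left as a comment:

```python
import json
from pathlib import Path
from typing import Optional


def stale_sources(artifact: dict) -> list:
    """Return the warn/error entries from a dbt sources.json freshness artifact."""
    return [
        {"unique_id": r.get("unique_id"), "status": r.get("status")}
        for r in artifact.get("results", [])
        if r.get("status") in ("warn", "error")
    ]


def build_alert(artifact: dict) -> Optional[str]:
    """Format an alert message, or return None if every source passed."""
    stale = stale_sources(artifact)
    if not stale:
        return None
    lines = ["- {}: {}".format(s["unique_id"], s["status"].upper()) for s in stale]
    return "Stale sources detected:\n" + "\n".join(lines)


# Usage after a `dbt source freshness` run:
#   artifact = json.loads(Path("target/sources.json").read_text())
#   message = build_alert(artifact)
#   if message:
#       ...post `message` to a Slack webhook or email it...
```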
The key is routing alerts to the people who can actually act on them — the data engineers who manage the pipelines, not every analyst in the organization.
Create JIRA or Linear Tickets for Freshness Failures
For persistent freshness issues, automatically create tickets in JIRA or Linear to track investigation and resolution. This integrates freshness monitoring into your existing engineering workflows and ensures issues aren't lost in a Slack channel:
Track patterns: Repeated tickets for the same source indicate a systemic problem that needs a permanent fix, not another threshold adjustment.
Assign ownership: Route tickets to the team responsible for the upstream data source.
Measure MTTR: Use ticket timestamps to track how quickly freshness issues are resolved.
Use Webhooks for Custom Alerting Workflows
For advanced setups, trigger custom workflows via webhooks when freshness failures occur. This gives you the flexibility to:
Route alerts to PagerDuty for critical sources.
Trigger automated remediation scripts (e.g., re-running a failed ingestion job).
Update a centralized data health dashboard.
Webhooks give you maximum flexibility to integrate freshness monitoring into whatever incident management system your organization uses.
Run dbt™ Source Freshness Automatically with Paradime Bolt
Paradime Bolt is an orchestration layer that automates freshness checks alongside your dbt™ runs, removing the manual overhead of managing freshness monitoring. Instead of stitching together scripts and cron jobs, Bolt provides a single pane of glass for scheduling, monitoring, and alerting on dbt™ source freshness across your entire project. Start for free.
Automated scheduling: Run freshness checks on any cadence — every 15 minutes for critical streaming sources, hourly for batch loads, or on custom schedules that match your SLAs.
Real-time alerts: Get notified instantly via Slack, email, or webhooks when freshness checks fail, so your team can act before stale data impacts downstream consumers.
Native integrations: Connect to JIRA, Linear, DataDog, and Monte Carlo for end-to-end observability. Freshness failures can automatically create tickets, trigger alerts in your monitoring tools, and update your incident management systems.
Single pane of glass: Monitor all executions, freshness status, and failures in one place. No more jumping between terminal output, sources.json files, and Slack channels to understand the health of your sources.
FAQs About dbt™ Source Freshness
What happens if my source table does not have a timestamp column?
You cannot use native dbt™ source freshness without a timestamp column. The best long-term fix is to add an ETL load timestamp (like _etl_loaded_at or _fivetran_synced) during ingestion. If that's not possible, consider using row-count-based freshness checks via dbt™ packages like dbt_utils or dbt_expectations, or leverage warehouse metadata freshness (available in dbt™ v1.7+ for Snowflake, Redshift, BigQuery, and Databricks) which uses information schema data instead of a table column.
Can I run dbt™ source freshness on incremental models?
No. Source freshness only applies to sources defined in your sources.yml, not to dbt™ models. For checking freshness on incremental models or any other model, use the dbt_expectations.expect_row_values_to_have_recent_data test or the dbt_utils.recency test, or use external observability tools that can monitor model-level freshness.
How do I check freshness for partitioned tables in BigQuery or Snowflake?
Use the filter property to limit freshness checks to recent partitions. This avoids expensive full table scans and significantly reduces query costs:
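For an ingestion-time partitioned BigQuery table, a sketch might look like this (table and column names other than _PARTITIONDATE are illustrative; Snowflake would use its own date functions):

```yaml
tables:
  - name: page_views
    loaded_at_field: _etl_loaded_at
    freshness:
      warn_after: {count: 2, period: hour}
    # Scan only the two most recent day-partitions
    filter: _PARTITIONDATE >= date_sub(current_date(), interval 2 day)
```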
This ensures dbt™ only scans the most recent partitions rather than the entire table history.
Does dbt™ source freshness work with external tables?
Yes, as long as the external table has a queryable timestamp column that dbt™ can access via your warehouse connection. The external table must be registered in your warehouse's catalog so that dbt™ can run SQL against it. If the external table points to files in cloud storage (S3, GCS, etc.), make sure the timestamp column is surfaced in the table's schema definition.
How often should I run freshness checks in production?
Run freshness checks at the start of each scheduled dbt™ job, and additionally on a cadence that matches your source update frequency. The dbt™ source freshness documentation recommends running checks at least twice as frequently as your lowest SLA:
1-hour SLA: Run freshness every 30 minutes.
24-hour SLA: Run freshness every 12 hours.
For critical sources: Run freshness every 15–30 minutes, regardless of the SLA, to minimize detection latency.
The goal is to detect stale data as quickly as possible, not just at scheduled job times.