Mastering dbt™ Source Freshness for Data Quality

Feb 26, 2026


If your analytics dashboards silently served yesterday's numbers as today's truth, would anyone notice before a bad decision was made? This is the exact problem dbt™ source freshness is designed to solve. It acts as your first line of defense, catching stale data before it cascades into reports, models, and business decisions.

In this guide, you'll learn everything about dbt™ source freshness—from the core concepts and YAML configuration, to running the dbt source freshness command, integrating checks into CI/CD pipelines, troubleshooting common failures, and applying advanced configurations for production-grade data quality monitoring.

What is dbt™ Source Freshness?

dbt™ source freshness is a built-in feature that checks whether your source data has been updated within an acceptable time window. It works by comparing the most recent timestamp in your source table against the current time to determine if data is "fresh" or "stale." Freshness is configured on the sources you declare in your dbt™ project, and it helps ensure downstream models always use current data.

Here are the core concepts you need to understand:

Source freshness: A dbt™ feature that monitors how recently your raw data was updated. It answers the question, "When was the last time this source table received new data?"
Staleness: When source data hasn't been refreshed within your defined thresholds. Stale data means your pipeline is operating on outdated information.
SLA compliance: Using freshness checks to validate that upstream systems deliver data on schedule. This is critical for maintaining trust between data teams and business stakeholders.

Figure: How dbt™ source freshness evaluates data staleness by comparing the latest record timestamp against defined thresholds.

Why Data Freshness Matters for Pipeline Reliability

Stale source data cascades through the entire pipeline, causing dashboards and reports to show outdated information. This has real-world consequences:

Business decisions made on old data: A marketing team optimizing spend based on yesterday's conversion data could waste thousands of dollars.
Downstream models breaking: Incremental models that expect daily data may produce incorrect aggregations or duplicate records when source data is late.
Wasted compute on transforming stale records: Running expensive transformations on data that hasn't changed burns cloud credits for zero value.

Monitoring data freshness is critical for meeting Service Level Agreement (SLA) commitments with stakeholders and building trust in your data products. Source freshness is a proactive safeguard rather than a reactive fix: it flags stale data at the source, before it propagates into downstream models and reports.

Figure: The cascading impact of stale vs. fresh source data on downstream pipeline reliability and business outcomes.

How dbt™ Source Freshness Works

Conceptually, the freshness mechanism is straightforward: dbt™ queries your source table, finds the maximum value of a specified timestamp column, and compares that value to the current time to determine the data's age.

The loaded_at_field Column

The loaded_at_field is the timestamp column dbt™ uses to determine when data was last loaded. This is typically a column like updated_at, created_at, or a dedicated ETL timestamp like _etl_loaded_at. This column must exist in the source table for freshness checks to work.

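Behind the scenes, dbt™ builds a SQL query that selects the maximum value of loaded_at_field from the source table, together with the current timestamp at the time of the check.

If your timestamp column isn't a native timestamp type, you can cast it directly in the configuration by setting loaded_at_field to a SQL expression, for example completed_date::timestamp.

For sources with non-UTC timestamps, apply a timezone conversion in the same way, for example convert_timezone('Australia/Sydney', 'UTC', created_at_local) on Snowflake.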
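As a minimal sketch of both cases, the fragment below sets loaded_at_field to a SQL expression instead of a bare column name. The source, table, and column names (completed_date, created_at_local) are hypothetical, the cast and convert_timezone syntax assumes a Snowflake-style warehouse, and the config placement assumes dbt™ v1.10 or later.

```yaml
sources:
  - name: jaffle_shop          # hypothetical source name
    database: raw              # hypothetical database
    tables:
      - name: orders
        config:
          # Cast a date or string column so it can be compared to the current time
          loaded_at_field: "completed_date::timestamp"
          freshness:
            warn_after: {count: 12, period: hour}
      - name: customers
        config:
          # Convert a local-time column to UTC before the comparison
          loaded_at_field: "convert_timezone('Australia/Sydney', 'UTC', created_at_local)"
          freshness:
            warn_after: {count: 12, period: hour}
```

Other warehouses use different cast and timezone functions, so the expressions need to be adapted to your SQL dialect.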

Freshness Thresholds with warn_after and error_after

dbt™ uses a two-tier threshold system to classify data freshness:

warn_after: The time limit that triggers a warning status. This indicates the data is getting stale and may require investigation.
error_after: The time limit that triggers an error status. This means the data is unacceptably stale and should likely block the pipeline.

You can specify one or both. If neither is provided, dbt™ will not calculate freshness for that source.

Threshold	Status	Typical Use
Within `warn_after`	Pass	Data is fresh—no action needed
Between `warn_after` and `error_after`	Warn	Data is aging—investigate upstream
Beyond `error_after`	Error	Data is stale—block the pipeline

How dbt™ Calculates Data Staleness

dbt™ runs a query to find the maximum value of the loaded_at_field (e.g., MAX(loaded_at_field)). It then subtracts this timestamp from the current timestamp. The resulting duration is compared against the warn_after and error_after thresholds to determine the final status: pass, warn, or error.
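To make the arithmetic concrete, here is a worked example with made-up timestamps, written as comments next to an illustrative pair of thresholds. The values are assumptions chosen for the illustration, not output from a real run.

```yaml
# Worked example (all timestamps hypothetical, in UTC):
#   max(loaded_at_field) = 2026-02-26 02:00
#   current timestamp    = 2026-02-26 15:30
#   age of data          = 13 hours 30 minutes
freshness:
  warn_after: {count: 12, period: hour}   # 13h30m is more than 12h, so at least "warn"
  error_after: {count: 24, period: hour}  # 13h30m is less than 24h, so not "error"
# Resulting status: warn
```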

Figure: Sequence of how dbt™ calculates freshness by querying the warehouse and comparing the result against configured thresholds.

How to Configure dbt™ Source Freshness in YAML

Source freshness is configured in your dbt™ project's YAML files, following the patterns in the official dbt™ source freshness documentation.

1. Define Your Sources

Sources are defined in a .yml file, typically located in your models/ directory (e.g., models/sources.yml). A basic source definition includes the database, schema, and a list of tables. This is a foundational concept in dbt™ sources documentation.
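A basic source definition might look like the sketch below. The database, schema, and table names are placeholders borrowed from the jaffle_shop example used later in this guide.

```yaml
# models/sources.yml
sources:
  - name: jaffle_shop        # name used in {{ source('jaffle_shop', ...) }}
    database: raw            # placeholder database
    schema: jaffle_shop      # placeholder schema
    tables:
      - name: orders
      - name: customers
```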

2. Add the loaded_at_field Property

To enable freshness checks, add the loaded_at_field property to a source table definition. You'll need to identify the correct timestamp column in your source data that reflects when it was last updated.

Note: As of dbt™ v1.10, loaded_at_field is configured under the config block. For earlier versions, it's specified directly at the table level.
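The following sketch shows the v1.10+ placement described in the note, with the earlier placement left as a comment. The column name _etl_loaded_at is an assumption; use whichever column your loader maintains.

```yaml
sources:
  - name: jaffle_shop
    database: raw
    schema: jaffle_shop
    tables:
      - name: orders
        config:
          loaded_at_field: _etl_loaded_at   # dbt v1.10+ placement
        # On earlier versions, the property sits directly on the table instead:
        # loaded_at_field: _etl_loaded_at
```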

3. Set warn_after and error_after Thresholds

Define your freshness thresholds using the freshness block. Specify a count and a period (e.g., minute, hour, day). Choose these values based on how frequently your source data is expected to update.
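As an illustration, the fragment below warns after 12 hours and errors after 24 hours. The numbers are arbitrary examples, and the freshness block is shown under config on the assumption that you are on a recent dbt™ version; older projects place it directly on the source or table.

```yaml
sources:
  - name: jaffle_shop
    database: raw
    schema: jaffle_shop
    tables:
      - name: orders
        config:
          loaded_at_field: _etl_loaded_at
          freshness:
            warn_after: {count: 12, period: hour}    # warn once data is older than 12 hours
            error_after: {count: 24, period: hour}   # error once data is older than 24 hours
```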

4. Apply Freshness at Source or Table Level

Freshness configurations can be inherited. A freshness block defined at the source level will apply to all tables within that source, but you can override it at the individual table level for more granular control.

Here's a complete example showing both levels:
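A hedged sketch of such a configuration, modeled on the pattern in the dbt™ documentation, is shown below. Table names and thresholds are illustrative, and the config placement assumes dbt™ v1.10 or later.

```yaml
sources:
  - name: jaffle_shop
    database: raw
    config:
      # Source-level defaults, inherited by every table below
      freshness:
        warn_after: {count: 12, period: hour}
        error_after: {count: 24, period: hour}
      loaded_at_field: _etl_loaded_at
    tables:
      - name: customers        # inherits the source-level thresholds
      - name: orders
        config:
          # Table-level override for a table with a tighter SLA
          freshness:
            warn_after: {count: 6, period: hour}
            error_after: {count: 12, period: hour}
      - name: product_skus
        config:
          freshness: null      # explicitly opt this table out of freshness checks
```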

Source-level freshness: Good when all tables in a source have similar update patterns.
Table-level freshness: Good when tables have different SLAs or update frequencies.
freshness: null: Explicitly disables freshness checks for a specific table.

How to Run the dbt™ Source Freshness Command

This section covers the dbt source freshness command and how to use it in practice.

Basic Command Syntax

The core command is dbt source freshness. When you run it, dbt™ executes the freshness queries against your sources and reports the status (Pass, Warn, or Error) for each configured table.

Important: dbt build does not include source freshness checks. You must run dbt source freshness as a separate command.

Selecting Specific Sources

You can use selector syntax with the --select flag to run freshness checks on specific sources or tables:

Command	Description
`dbt source freshness`	Checks freshness for all configured sources
`dbt source freshness --select "source:jaffle_shop"`	Checks freshness for all tables in the `jaffle_shop` source
`dbt source freshness --select "source:jaffle_shop.orders"`	Checks freshness for only the `orders` table

Output Formats and Artifacts

After running, dbt™ generates a sources.json artifact in the target/ directory. This file contains detailed results of the freshness checks, including the max loaded timestamp, the freshness status, and the age of the data.

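You can specify a different output location using the -o or --output flag, for example dbt source freshness --output target/source_freshness.json. This is particularly useful for CI/CD integration, where a later step reads the artifact from a known path.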

Integrate Source Freshness Checks into CI/CD Pipelines

Connecting freshness checks to automated workflows is key to maintaining data quality at scale. This is where an orchestration solution like Paradime Bolt naturally fits.

Running Freshness Checks on Pull Requests

Add freshness checks to your Pull Request (PR) workflows to catch configuration issues or flag stale sources before code is merged. Platforms like Paradime Bolt support this natively, allowing you to validate source status as part of your code review process.

Figure: How source freshness checks integrate into a PR workflow to prevent merging code that depends on stale sources.

Blocking Deployments on Stale Data

Use the results of freshness checks to gate production runs. If a source is stale (returns an error status), the dbt source freshness command will exit with a non-zero exit code, which can be used to automatically halt the deployment and prevent the transformation of bad data:

Exit code 0: The invocation completed without error. No source exceeded its error_after threshold; warnings alone do not change the exit code.
Exit code 1: The invocation completed with at least one error—one or more sources exceeded the error_after threshold.

When dbt source freshness is added as a step in your pipeline (rather than as a pre-run checkbox), a failure will block all subsequent steps.
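The gating behavior can be sketched in any CI system that stops a job on a non-zero exit code. The example below assumes GitHub Actions, which this guide does not otherwise cover; the workflow file name, schedule, Python version, and dbt-snowflake adapter are all assumptions, and warehouse credentials and profile setup are omitted.

```yaml
# .github/workflows/dbt_production.yml (hypothetical file name)
name: dbt production run
on:
  schedule:
    - cron: "0 6 * * *"   # daily at 06:00 UTC; adjust to your load schedule
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install dbt
        run: pip install dbt-core dbt-snowflake   # adapter choice is an assumption
      - name: Check source freshness
        # Exits non-zero if any source exceeds error_after, which fails the job here
        run: dbt source freshness
      - name: Build models
        # Skipped automatically when the previous step fails
        run: dbt build
```

Because the freshness check is its own step ahead of the build, a stale source stops the job before any transformation runs.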

Automating Freshness Alerts with Slack and JIRA

Set up automated notifications for when freshness checks fail. This can be done with webhooks or native alerting features in orchestration platforms. Paradime Bolt includes built-in Slack, MS Teams, and JIRA integrations, making it easy to:

Send real-time freshness failure alerts to Slack or MS Teams channels.
Automatically create JIRA tickets for failed runs so issues are tracked and resolved.
Trigger webhooks to update observability tools like DataDog or Monte Carlo.

Troubleshooting Common dbt™ Freshness Failures

This section provides practical debugging guidance for real-world problems practitioners encounter with dbt™ source freshness.

Missing or Unreliable loaded_at_field

If the timestamp column specified in loaded_at_field doesn't exist, contains NULL values, or isn't updated reliably by the upstream process, freshness checks will fail or produce misleading results. The freshness query relies on MAX(loaded_at_field), which ignores NULLs: rows loaded without a timestamp are invisible to the check, and if every value is NULL the query has no timestamp to compare.

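Solution: Work with upstream data owners to ensure a reliable update timestamp is available, or use an alternative column. If the column is a date rather than a timestamp, cast it explicitly in loaded_at_field, for example with an expression such as cast(load_date as timestamp), where load_date is a placeholder for your column.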

Timezone Mismatches Between Source and Warehouse

Inconsistencies between the timezone of your loaded_at_field and your data warehouse's default timezone can cause false positives (data appears stale when it isn't) or false negatives (stale data appears fresh).

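dbt™ evaluates freshness in UTC. If your source stores local timestamps (e.g., CET, EST), convert the value to UTC inside loaded_at_field using your warehouse's timezone function, such as convert_timezone on Snowflake.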

Best practice: Ensure all timestamps are stored in UTC at the source level to avoid ambiguity entirely.

Sources Without Timestamp Columns

A common issue is that some source tables lack a reliable timestamp column. In this case, you have several options:

Warehouse metadata: Starting with dbt™ v1.7, some adapters (Snowflake, Redshift, BigQuery, Databricks) can leverage warehouse metadata to calculate freshness without a loaded_at_field.
loaded_at_query (dbt™ v1.10+): Write a custom SQL query to determine freshness from alternative sources like audit tables.
Accept the limitation: For truly static reference tables, set freshness: null to explicitly skip freshness checks.
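Two of these options can be sketched in YAML as follows. The table names are hypothetical, the config placement assumes dbt™ v1.10 or later, and the metadata fallback only applies on adapters that support it, so treat this as an outline rather than a guaranteed configuration.

```yaml
sources:
  - name: jaffle_shop
    database: raw
    tables:
      - name: payments           # hypothetical table without a timestamp column
        config:
          # No loaded_at_field: on supported adapters, freshness is derived
          # from warehouse metadata instead of a column
          freshness:
            warn_after: {count: 12, period: hour}
            error_after: {count: 24, period: hour}
      - name: country_codes      # hypothetical static reference table
        config:
          freshness: null        # skip freshness checks entirely
```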

Intermittent Freshness Failures in Production

Flaky freshness checks can occur due to tight timing windows, late-arriving data, or delays in upstream batch jobs. For example, if your source updates at 6:00 AM but occasionally arrives at 6:15 AM, a freshness check at 6:05 AM will intermittently fail.

Solutions:

Tune your warn_after and error_after thresholds to be more lenient, accounting for realistic delivery windows.
Schedule freshness checks with a buffer after expected load times.
Implement monitoring to track freshness trends and identify systemic delays.

Best Practices for dbt™ Source Freshness Tests

The practices below help you build production-grade freshness monitoring.

Set Thresholds Based on Source SLAs

Align your warn_after and error_after thresholds with the data delivery expectations and SLAs set by upstream data providers. Don't guess—collaborate with source owners to define realistic expectations.

A useful rule of thumb from the dbt™ source freshness documentation: run freshness snapshots with at least double the frequency of your lowest SLA:

Source SLA	Recommended Snapshot Frequency
1 hour	Every 30 minutes
1 day	Every 12 hours
1 week	Daily

Monitor Freshness Trends Over Time

Track freshness results historically to identify degrading sources before they cause major incidents. A source that consistently passes at 11 hours against a 12-hour warn_after threshold is one upstream delay away from failure.

Observability features in platforms like Paradime provide this visibility out of the box, allowing you to discover sources that are frequently stale and track schedule and model performance over time.

Combine Freshness Checks with Data Tests

Freshness alone isn't enough for complete data quality coverage. A source can be "fresh" (recently updated) but still contain bad data. Pair freshness checks with other dbt™ data tests to create a comprehensive validation suite:
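A minimal sketch of pairing the two on one source table is shown below. The column name order_id is an assumption, and the data_tests key assumes dbt™ v1.8 or later (earlier versions use tests).

```yaml
sources:
  - name: jaffle_shop
    database: raw
    tables:
      - name: orders
        config:
          # Freshness: was the table loaded recently?
          loaded_at_field: _etl_loaded_at
          freshness:
            warn_after: {count: 12, period: hour}
            error_after: {count: 24, period: hour}
        columns:
          - name: order_id
            # Data tests: is what was loaded structurally sound?
            data_tests:
              - unique
              - not_null
```

Freshness is evaluated by dbt source freshness, while the data tests run with dbt test or dbt build, so both commands need to be part of the job.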

Document Freshness Requirements for Stakeholders

Document the SLA expectations for each source directly in your dbt™ project using description fields in your YAML files. This makes freshness requirements clear to all stakeholders and serves as living documentation:
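For instance, descriptions along these lines record the expectation next to the configuration that enforces it. The load cadence, SLA wording, and owning team are invented for illustration.

```yaml
sources:
  - name: jaffle_shop
    description: >
      Replica of the application database (hypothetical). Loaded by the
      ingestion job every 6 hours; data older than 24 hours breaches the SLA.
    tables:
      - name: orders
        description: >
          One row per order. Expected to refresh every 6 hours.
          Freshness owner: ingestion team.
```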

Advanced dbt™ Source Freshness Configurations

For complex scenarios, dbt™ provides power-user configurations to handle edge cases.

Filtering with the filter Clause

The filter property allows you to check freshness on a specific subset of rows. This is critical for partitioned tables where a full table scan would be prohibitively expensive, or where you only want to check the most recent partition.
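As a sketch, the filter sits inside the freshness block and contains the body of a SQL predicate. The expression below assumes Snowflake-style datediff syntax and a hypothetical _etl_loaded_at column, with config placement for dbt™ v1.10 or later.

```yaml
sources:
  - name: jaffle_shop
    database: raw
    tables:
      - name: orders
        config:
          loaded_at_field: _etl_loaded_at
          freshness:
            warn_after: {count: 6, period: hour}
            error_after: {count: 12, period: hour}
            # Only consider rows loaded within roughly the last two days
            filter: datediff('day', _etl_loaded_at, current_timestamp) < 2
```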

This generates a freshness query with a WHERE clause, so the warehouse scans only the rows that match the filter instead of the whole table.

Note: The filter property only applies to the freshness query. It does not affect how the source is referenced in your dbt™ models via the {{ source() }} function.

Custom Freshness Queries with loaded_at_query

In scenarios where the default MAX(loaded_at_field) logic is insufficient—for example, complex partitioning schemes, timestamps stored as strings, or streaming data—you can use loaded_at_query (available in dbt™ v1.10+) to define a fully custom SQL query:
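A hedged sketch of a custom query is shown below. It assumes dbt™ v1.10 or later, a hypothetical _etl_loaded_at column, and Snowflake-style dateadd syntax; the query should return a single timestamp value.

```yaml
sources:
  - name: jaffle_shop
    database: raw
    tables:
      - name: orders
        config:
          freshness:
            warn_after: {count: 12, period: hour}
            error_after: {count: 24, period: hour}
          # Custom query used in place of loaded_at_field
          loaded_at_query: |
            select max(_etl_loaded_at)
            from {{ this }}
            where _etl_loaded_at >= dateadd('day', -3, current_timestamp)
```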

Key rules for loaded_at_query:

You cannot specify both loaded_at_field and loaded_at_query for the same table.
The filter property does not work with loaded_at_query since you write the full query yourself.
Supports Jinja templating, including {{ this }} to reference the source table.

Freshness for Partitioned Tables

When working with partitioned tables in BigQuery, Snowflake, or Databricks, you often only want to check the freshness of the latest partition. The filter clause is the standard way to achieve this by restricting the freshness query to the most recent partition key.

For BigQuery specifically, if your table has a mandatory partition filter, the default freshness check will fail without a filter clause. In such cases, always include a filter:
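The sketch below assumes a table partitioned on a hypothetical event_date column with a required partition filter; the project, dataset, and column names are placeholders, and the config placement assumes dbt™ v1.10 or later.

```yaml
sources:
  - name: analytics_events       # hypothetical source
    database: my-gcp-project     # placeholder GCP project
    schema: raw_events           # placeholder dataset
    tables:
      - name: events
        config:
          loaded_at_field: event_timestamp
          freshness:
            warn_after: {count: 6, period: hour}
            error_after: {count: 12, period: hour}
            # Satisfies the required partition filter and limits the scan
            filter: "event_date >= date_sub(current_date(), interval 2 day)"
```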

Additionally, for BigQuery, the bigquery_use_batch_source_freshness flag can enable dbt™ to compute source freshness across multiple tables in a single query, improving efficiency.
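Assuming the flag is set like other dbt™ behavior flags, it would go under flags in dbt_project.yml. Check the dbt-bigquery documentation for your adapter version before relying on this placement.

```yaml
# dbt_project.yml
flags:
  # Assumed placement: opt in to batched source freshness on BigQuery
  bigquery_use_batch_source_freshness: true
```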

Automate dbt™ Source Freshness Monitoring for Production Pipelines

Manual freshness checks don't scale. A mature data practice requires automated, scheduled freshness monitoring with integrated alerting.

Figure: An automated freshness monitoring workflow using Paradime Bolt with Slack and JIRA integrations for real-time alerting.

Paradime Bolt provides lightning-fast orchestration with built-in freshness monitoring capabilities:

Scheduled freshness checks with cron-based scheduling (e.g., 0 */2 * * * for every 2 hours).
Real-time alerts across email, Slack, and MS Teams for success, failures, and SLA breaches.
JIRA and Linear integration to automatically create tickets for failed freshness runs.
Webhook support to update observability tools like DataDog and Monte Carlo on schedule completion.
Source freshness dashboards to discover sources that are frequently stale and track freshness trends over time.

Start for free to automate your dbt™ source freshness monitoring with Paradime.

FAQs about dbt™ Source Freshness

What happens when dbt™ source freshness fails?

When a source exceeds the error_after threshold, dbt™ returns a non-zero exit code (exit code 1), which signals failure. This can be configured to block downstream pipeline steps or trigger alerts, depending on your orchestration setup. If freshness is run as a step in the job (not as a pre-run checkbox), subsequent steps will not execute.

Can I run dbt™ source freshness checks without running models?

Yes, the dbt source freshness command runs independently of dbt run or dbt build. It only queries your source tables to check timestamp values and does not trigger any model builds. This makes it lightweight and safe to run frequently.

How do I check freshness for only one source table?

Use the --select flag with the source selector syntax, for example dbt source freshness --select "source:jaffle_shop.orders".

This targets only the orders table within the jaffle_shop source.

Does dbt™ source freshness work with incremental models?

Source freshness checks operate on source tables, not models. They work independently and are compatible with any downstream model configuration, whether incremental or full-refresh. In fact, combining source freshness checks with incremental models is a best practice: you can verify that source data is fresh before running an expensive incremental build.

How often should I schedule dbt™ source freshness checks?

You should run freshness checks at least as frequently as your fastest-updating source. The dbt™ source freshness docs recommend running snapshots at double the frequency of your lowest SLA. For example, if a source updates hourly, schedule your freshness checks every 30 minutes to catch issues quickly.


*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.
