expect_column_distinct_count_to_be_less_than
Feb 24, 2026 · 5 min read
Uniqueness · dbt-expectations · Column
How it Works
The expect_column_distinct_count_to_be_less_than test from the dbt-expectations package validates that the number of distinct values in a column is strictly less than a specified threshold. This is useful for low-cardinality columns such as status codes, categories, or flags, where an unexpectedly high number of distinct values usually indicates data corruption or unintended variation.
Steps and Conditions
Column Selection: Identify the column to evaluate.
Set Threshold: Define the upper bound for the distinct count using value. The bound is exclusive.
Execution: The distinct count is computed and compared to the threshold.
Outcome: Pass if the distinct count is less than the threshold; fail if it equals or exceeds it.
Example Usage: Categorical Data
A logistics platform's shipment_carrier column should contain fewer than 20 distinct carrier codes. A much higher count would suggest malformed or free-text carrier entries.
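The scenario above could be configured roughly as follows. This is a minimal sketch of a schema file, assuming a model named shipments (a hypothetical name, not given in the text) and a dbt-expectations installation that exposes the test under the dbt_expectations namespace. Only the value argument is set here.

```yaml
# models/schema.yml (file path and model name are illustrative assumptions)
version: 2

models:
  - name: shipments  # hypothetical model name
    columns:
      - name: shipment_carrier
        tests:  # newer dbt versions also accept the key data_tests
          - dbt_expectations.expect_column_distinct_count_to_be_less_than:
              value: 20  # exclusive bound: passes only with 19 or fewer distinct values
```

Because the comparison is strict, a column with exactly 20 distinct carrier codes would fail under this configuration.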
Twenty or more distinct carriers would fail the test, alerting the team to investigate carrier code standardisation in the upstream system.





