expect_column_distinct_count_to_be_greater_than
dbt-expectations · Column · Uniqueness
How it Works
The expect_column_distinct_count_to_be_greater_than test from the dbt-expectations package validates that the number of distinct values in a column is strictly greater than a specified value. This is useful for detecting data collapse, where a column that should have high cardinality unexpectedly contains very few unique values, for example because of a faulty transformation.
Steps and Conditions
Column Selection: Choose the column to evaluate.
Set Threshold: Define the minimum distinct count using the value parameter.
Execution: The distinct count is computed and compared against the threshold.
Outcome: Pass if the distinct count exceeds the threshold; fail if it does not.
Example Usage: Product Catalogue
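A retail company expects its product_id column in the product_catalogue model to contain at least 1,000 distinct products at all times. Because the test passes only when the distinct count is strictly greater than the threshold, "at least 1,000" corresponds to a value of 999.
If a transformation error causes product IDs to collapse below that level, the test fails the next time it runs and flags the issue.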
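A minimal sketch of how this scenario might be configured in a dbt schema file. The model and column names come from the example above; the file layout and the use of the tests key are assumptions (recent dbt versions also accept data_tests), and the threshold of 999 is chosen so that a strict greater-than comparison enforces at least 1,000 distinct values.

```yaml
version: 2

models:
  - name: product_catalogue
    columns:
      - name: product_id
        tests:
          # Passes only when COUNT(DISTINCT product_id) > 999,
          # i.e. when there are at least 1,000 distinct product IDs.
          - dbt_expectations.expect_column_distinct_count_to_be_greater_than:
              value: 999
```

With this in place, the check runs alongside the project's other tests, and a failure reports the product_id column on product_catalogue as the source.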

