The DinoAI Prompting Cookbook for Analytics Engineers (dbt™ Focused)
Feb 26, 2026
Using AI to Accelerate dbt Development: A Practical Guide with DinoAI
Introduction
Modern analytics engineering teams face a persistent tension: the need to move fast while maintaining high-quality, well-documented, thoroughly-tested data models. AI-powered tools purpose-built for dbt development—like Paradime's DinoAI—are fundamentally changing how teams approach this challenge. Unlike generic LLMs such as ChatGPT or GitHub Copilot, these specialized tools understand your warehouse schema, your project's conventions, and the nuances of dbt's Jinja-SQL ecosystem.
This guide walks through practical, real-world use cases for leveraging AI in dbt development, from auto-generating documentation and tests to refactoring legacy SQL into modular models, writing reusable macros, debugging failing models, and building safety guardrails for AI-generated code.
1. Auto-Generating Documentation and Tests
The Problem
Maintaining accurate schema.yml files with model-level and column-level descriptions is one of the most tedious tasks in analytics engineering. Teams often skip it, creating a documentation debt that compounds over time.
How AI Solves It
dbt Documentation Basics
In dbt, documentation lives in YAML files using the description: key:
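For example, a minimal `schema.yml` with model-level and column-level descriptions plus generic tests (model and column names here are illustrative):

```yaml
version: 2

models:
  - name: fct_customer_orders
    description: One row per customer with lifetime order metrics.
    columns:
      - name: customer_id
        description: Primary key; unique identifier for the customer.
        tests:
          - unique
          - not_null
      - name: lifetime_order_count
        description: Total number of completed orders placed by the customer.
```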
For long-form descriptions, dbt supports docs blocks in .md files:
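A docs block lives in any `.md` file under your models directory and is referenced from YAML with the `doc()` function (the block name below is illustrative):

```markdown
{% docs customer_status %}
The customer's current lifecycle stage. One of:

- **active** — placed an order in the last 90 days
- **dormant** — no orders in 90–365 days
- **churned** — no orders in over a year
{% enddocs %}
```

Then in `schema.yml`: `description: '{{ doc("customer_status") }}'`.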
Using DinoAI for Automated Documentation & Tests
DinoAI's Agent Mode can analyze your SQL model and automatically generate the corresponding YAML file with descriptions and tests. Here's the workflow:
1. Open DinoAI via the 🪄 icon in Paradime's right panel
2. Add context by clicking "@" and selecting your model file, or use Directory Context to add an entire folder (e.g., `/models/marts/finance/`)
3. Prompt: "Document and test this dbt model with model-level and column-level descriptions, and add appropriate generic tests"
4. Review the generated YAML, accept changes, and commit

DinoAI's context-aware prompting with variables like {{editor.currentFile.path}} means you don't need to copy-paste code—it automatically knows which model you're editing and analyzes its structure.
Scaling with .dinoprompts
Teams can create reusable prompt templates in a .dinoprompts YAML file at the project root:
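A sketch of what such a template might look like — the exact key names in `.dinoprompts` are assumptions here, based on the context-variable support described below:

```yaml
prompts:
  - name: document-and-test
    description: Generate schema.yml docs and generic tests for the current model
    prompt: |
      Document and test {{editor.currentFile.path}} with model-level and
      column-level descriptions, and add appropriate generic tests
      (unique/not_null on primary keys, accepted_values on status columns).
```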
This prompt is version-controlled, shareable, and accessible via the [ shortcut in DinoAI's chat.
2. Refactoring Legacy SQL into Modular dbt Models
The Problem
Many teams inherit monolithic stored procedures or 500+ line SQL scripts that need to be broken into modular, testable dbt models following the staging → intermediate → marts pattern.
The dbt Refactoring Approach
dbt Labs recommends a structured process:
1. Migrate 1:1: Copy legacy SQL into a `.sql` model file. Run `dbt run` to verify it works without changing logic.
2. Create sources: Replace raw table references with `{{ source('schema', 'table') }}`.
3. Implement CTE groupings using a 4-part layout: import CTEs, logical CTEs, a final CTE, and a simple select statement.
4. Extract staging models: Pull import CTEs into standalone staging models (`stg_*`) in `/models/staging/`.
5. Create intermediate models (`int_*`) for complex, reusable transformation logic.
6. Build mart models (`fct_*`, `dim_*`) as the final consumer-facing output.
7. Audit with `audit_helper`: Compare refactored output to the original.
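The 4-part CTE layout can be sketched as follows (model and column names are illustrative):

```sql
with

-- 1. Import CTEs: one per upstream model, nothing but a select *
customers as (
    select * from {{ ref('stg_customers') }}
),

-- 2. Logical CTEs: one transformation step each
active_customers as (
    select customer_id, created_at
    from customers
    where is_active
),

-- 3. Final CTE: assemble the output shape
final as (
    select * from active_customers
)

-- 4. Simple select statement
select * from final
```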
How AI Accelerates This
With DinoAI, you can:
1. Add the legacy SQL as file context using "@" → select the file
2. Prompt: "Refactor this legacy SQL into modular dbt models following staging → intermediate → marts layers. Create separate files for each model with proper `ref()` calls, CTE structure, and source definitions."
3. DinoAI generates the complete set of files: `stg_customers.sql`, `stg_orders.sql`, `int_order_summary.sql`, `fct_customer_orders.sql`, and a corresponding `sources.yml`.
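For instance, the generated source definition and one staging model might look like this (source name, schema, and columns are assumptions for illustration):

```yaml
# models/staging/sources.yml
version: 2

sources:
  - name: raw_shop
    tables:
      - name: customers
      - name: orders
```

```sql
-- models/staging/stg_customers.sql
select
    id as customer_id,
    created_at,
    email
from {{ source('raw_shop', 'customers') }}
```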
Validating Refactored Models with audit_helper
After AI generates refactored models, validate them:
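A typical validation uses `audit_helper.compare_relations` in an analysis file (the model names below are placeholders for your original and refactored outputs):

```sql
-- analyses/validate_refactor.sql
{{ audit_helper.compare_relations(
    a_relation=ref('legacy_customer_orders'),
    b_relation=ref('fct_customer_orders'),
    primary_key='customer_id'
) }}
```

Run `dbt compile` and execute the compiled query in your warehouse (or preview it in your IDE) to see the comparison.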
This produces a summary showing matched rows, rows only in the original, and rows only in the refactored version—giving you confidence the refactoring preserved data integrity.
3. Writing Reusable Jinja Macros
The Problem
Repetitive SQL patterns (currency conversions, surrogate key generation, date spine creation) appear across dozens of models, creating maintenance nightmares when business logic changes.
dbt Macro Fundamentals
Macros are reusable SQL/Jinja templates defined in the /macros directory:
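A classic example is a cents-to-dollars conversion macro (the `::numeric` cast shown is Snowflake/Postgres-style and may need adapting for your warehouse):

```sql
-- macros/cents_to_dollars.sql
{% macro cents_to_dollars(column_name, scale=2) %}
    ({{ column_name }} / 100)::numeric(16, {{ scale }})
{% endmacro %}
```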
Used in models:
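For example, assuming the hypothetical `cents_to_dollars` macro above and a staging model with an `amount_cents` column:

```sql
select
    order_id,
    {{ cents_to_dollars('amount_cents') }} as amount_usd
from {{ ref('stg_orders') }}
```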
Using AI to Generate Macros
DinoAI can identify repeated patterns and suggest macro extraction:
1. Add multiple models as context using Directory Context on `/models/marts/`
2. Prompt: "Identify repeated SQL patterns across these models and create reusable macros for them. Follow dbt best practices: favor readability over DRY-ness, add macro documentation in properties.yml."
DinoAI generates the macro files plus the YAML documentation:
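The YAML documentation for a macro uses the `arguments` key — for a hypothetical `cents_to_dollars` macro it might look like:

```yaml
# macros/properties.yml
version: 2

macros:
  - name: cents_to_dollars
    description: Converts an integer cents column to a numeric dollar amount.
    arguments:
      - name: column_name
        type: string
        description: Column containing the amount in cents.
      - name: scale
        type: integer
        description: Decimal precision of the output (defaults to 2).
```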
Best Practices for Macros
- Favor readability: Don't abstract everything—only standardize "sovereign logic" like financial calculations or surrogate keys
- Check dbt-utils first: Many common patterns already exist in `dbt-utils`
- Set variables at the top: Use `{% set ... %}` for lists and configurations
- Test compiled SQL: Always run `dbt compile` to verify macro output
- Document macro arguments: Use `macros/properties.yml`
4. Debugging Failing Models
The Problem
dbt errors span four categories—runtime, compilation, dependency, and database errors—each requiring different debugging approaches. Reading cryptic warehouse error messages is time-consuming.
dbt Error Types and Debugging
Error Type | Cause | Debugging Approach |
|---|---|---|
Runtime | Missing `dbt_project.yml` or profile configuration | Check your current directory, verify `dbt_project.yml` exists |
Compilation | Invalid Jinja, bad YAML indentation, missing `ref()`/`source()` targets | Navigate to the file/line named in the error, use YAML validators |
Dependency | Circular references in DAG | Update `ref()` calls to break the cycle |
Database | SQL syntax errors, missing columns, schema issues | Examine the compiled SQL in `target/run/` against the warehouse error |
Using AI for Debugging
DinoAI's Terminal Context is purpose-built for this:
Run your dbt command and observe the failure
Highlight the error output in the terminal
Right-click → "Add to Chat" to send it to DinoAI
Prompt: "Explain this error and suggest a fix"
DinoAI analyzes the error message in the context of your project files, compiled SQL, and warehouse type. Because it has access to your .sql models, schema.yml, and dbt_project.yml through file and directory context, it can identify root causes that a generic LLM cannot—like a missing column that was renamed in an upstream staging model.
Key Debugging Files
- `target/compiled/` — The pure SQL after Jinja rendering (for reviewing logic)
- `target/run/` — The actual executed SQL (for reproducing errors)
- `logs/dbt.log` — Full execution log including behind-the-scenes queries
5. Building Incremental Models with AI Assistance
The Pattern
Incremental models process only new/changed data, dramatically reducing warehouse costs:
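A representative incremental model combining the pieces discussed in this section — `merge` strategy, `is_incremental()` gating, `{{ this }}`, and `coalesce()` for the first run. Model, column names, and the `dateadd` predicate syntax (Snowflake-style) are assumptions:

```sql
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',
        unique_key='event_id',
        incremental_predicates=[
            "DBT_INTERNAL_DEST.event_timestamp > dateadd('day', -7, current_timestamp)"
        ]
    )
}}

select
    event_id,
    user_id,
    event_timestamp,
    payload
from {{ ref('stg_events') }}

{% if is_incremental() %}
-- Only scan rows newer than what's already in the target table;
-- coalesce handles the empty-table case on the first run.
where event_timestamp >= (
    select coalesce(max(event_timestamp), '1900-01-01') from {{ this }}
)
{% endif %}
```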
AI-Assisted Incremental Model Creation
1. Add your staging model as context
2. Prompt: "Convert this to an incremental model using merge strategy. The unique key is event_id. Filter on event_timestamp for incremental runs. Add incremental_predicates to limit scans to the last 7 days."
3. DinoAI generates the complete model with proper `is_incremental()` logic, `{{ this }}` references, and `coalesce()` null handling.
Best Practices
- Always use `coalesce(max(timestamp), '1900-01-01')` to handle empty tables
- Use `>=` (not `>`) to catch records at the boundary
- Configure `on_schema_change` for production safety
- Use `--full-refresh` when changing model logic
- Set `unique_key` as a list for composite keys: `['user_id', 'session_number']`
6. Enforcing Team Standards with .dinorules
The Problem
When multiple engineers use AI tools independently, the generated code diverges in style, naming conventions, and architectural patterns.
The Solution
The .dinorules file at your project root defines coding standards that DinoAI follows automatically:
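A sketch of what a team's `.dinorules` might contain — the rules themselves are illustrative, and the file is free-form natural language:

```text
# .dinorules — team standards for AI-generated dbt code

- Name staging models stg_<source>__<entity>; marts use fct_/dim_ prefixes.
- Reference tables only via {{ ref() }} or {{ source() }}; never hardcode names.
- Use snake_case for all columns; prefix booleans with is_ or has_.
- Materialize staging models as views; large marts as incremental.
- Every model needs a schema.yml entry with unique/not_null tests on its key.
```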
Because .dinorules is git-tracked, it becomes a living, version-controlled contract for how AI should generate code for your team.
7. Safety Guardrails for AI-Generated Code
PR Review Checklist for AI-Generated dbt Code
AI accelerates development but requires human oversight. Use this checklist:
- Data Integrity: Run `dbt test`—do all tests pass? Use `audit_helper.compare_queries()` to validate against known-good results.
- Compiled SQL Review: Check `target/compiled/` to verify the generated Jinja renders correctly.
- Naming Conventions: Do model names, column names, and file paths follow `.dinorules` standards?
- Materialization: Is the materialization strategy appropriate (view for staging, incremental for large marts)?
- Source References: Are all table references using `{{ source() }}` or `{{ ref() }}`—no hardcoded table names?
- Incremental Logic: If incremental, is `is_incremental()` used correctly? Is there a `--full-refresh` escape hatch?
- Documentation: Does the `schema.yml` accurately describe the model's business purpose and every column?
- Test Coverage: Are primary keys tested for uniqueness/not-null? Are foreign keys tested with `relationships`?
- Performance: Are CTEs filtering early? Are incremental predicates limiting scan ranges?
- DAG Impact: Check `dbt ls --select +model_name+` to understand upstream/downstream effects.
Evaluating AI Output Quality with dbt_llm_evals
For teams using AI/LLM-generated content within their data pipelines, the dbt_llm_evals package (by Paradime) provides warehouse-native evaluation.
It evaluates LLM outputs on criteria like accuracy, relevance, tone, completeness, and consistency—all running inside your warehouse (Snowflake Cortex, BigQuery Vertex AI, or Databricks AI Functions) with no external API calls.
8. Why Purpose-Built AI Beats Generic LLMs for dbt
Capability | Generic LLM (ChatGPT/Copilot) | DinoAI (Purpose-Built) |
|---|---|---|
Warehouse schema awareness | ❌ No access | ✅ Full schema context across Snowflake, BigQuery, Redshift, Databricks |
Project file context | ❌ Manual copy-paste | ✅ @ references, directory context, inline context |
Team standards enforcement | ❌ Prompt engineering each time | ✅ .dinorules applied automatically |
Reusable prompts | ❌ Lost after session | ✅ .dinoprompts templates, version-controlled |
Error debugging | ❌ Requires pasting logs | ✅ Terminal context with right-click → Add to Chat |
Code integration | ❌ Copy-paste into editor | ✅ Direct file creation and editing in IDE |
Git workflow | ❌ Separate tool | ✅ Auto-commit messages, branch management, PR descriptions |
Cost management | ❌ Full context every message | ✅ Credit Saver Mode with intelligent context summarization |
Conclusion
AI is not replacing analytics engineers—it's amplifying them. The key to successfully leveraging AI in dbt development is combining three elements:
1. Context: Give AI access to your warehouse schema, project files, and team standards (via `.dinorules`, file/directory context, and terminal context)
2. Guardrails: Enforce quality through data tests, `audit_helper` validation, compiled SQL review, and structured PR checklists
3. Reusability: Codify your best prompts in `.dinoprompts`, your standards in `.dinorules`, and your validation logic in generic tests and macros
Teams using this approach consistently report 3-10x productivity gains while maintaining—or even improving—code quality. The future of analytics engineering isn't AI or humans; it's AI-assisted humans shipping better data models, faster.


