SQL Keywords
AI_COUNT_TOKENS
Feb 23, 2026 · 5 min read
Overview
Counts the number of tokens in an input text for a specified model or function. This is essential for estimating costs and for ensuring inputs don't exceed model context limits before you spend compute on them.
Syntax
AI_COUNT_TOKENS( model_or_function, text )
Parameters
model_or_function (VARCHAR): A model name (e.g., 'claude-4-sonnet') or a function name (e.g., 'AI_EXTRACT')
text (VARCHAR): The text whose tokens you want to count
Use Cases
Cost estimation before processing
Validate input size against model limits
Optimize prompt length
Monitor token usage
Split large documents appropriately
Budget planning for AI operations
Code Examples
Example 1: Count Tokens for Text
SELECT AI_COUNT_TOKENS(
    'claude-4-sonnet',
    'This is a sample text for token counting'
) AS token_count;
Output:
token_count
-----------
          9
Example 2: Check Before Processing
WITH token_check AS (
    SELECT
        product_id,
        description,
        AI_COUNT_TOKENS('llama3.1-70b', description) AS token_count
    FROM products
)
SELECT
    product_id,
    CASE
        WHEN token_count <= 8000 THEN AI_COMPLETE('llama3.1-70b', description)
        ELSE 'Text too long for model'
    END AS result
FROM token_check;
Example 3: Estimate Cost for Batch Processing
SELECT
    COUNT(*) AS record_count,
    SUM(AI_COUNT_TOKENS('mistral-large2', review_text)) AS total_input_tokens,
    SUM(AI_COUNT_TOKENS('mistral-large2', review_text)) * 0.0001 AS estimated_cost_usd
FROM customer_reviews
WHERE process_date >= CURRENT_DATE - 30;
Example 4: Monitor Token Distribution
SELECT
    CASE
        WHEN AI_COUNT_TOKENS('claude-4-sonnet', content) < 1000  THEN '< 1K tokens'
        WHEN AI_COUNT_TOKENS('claude-4-sonnet', content) < 10000 THEN '1K-10K tokens'
        WHEN AI_COUNT_TOKENS('claude-4-sonnet', content) < 50000 THEN '10K-50K tokens'
        ELSE '> 50K tokens'
    END AS token_range,
    COUNT(*) AS document_count
FROM documents
GROUP BY token_range;
Example 5: Validate Against Model Limits
SELECT
    doc_id,
    title,
    AI_COUNT_TOKENS('AI_SENTIMENT', content) AS token_count,
    CASE
        WHEN AI_COUNT_TOKENS('AI_SENTIMENT', content) > 2048 THEN 'Exceeds limit (2048)'
        ELSE 'OK'
    END AS validation
FROM documents
WHERE validation != 'OK';  -- Snowflake allows select-list aliases in WHERE
Data Output Examples
Token Counts by Model
Text: "Snowflake is a cloud data platform"
claude-4-sonnet: 7 tokens
llama3.1-70b: 8 tokens
mistral-large2: 7 tokens
Cost Estimation
1,000 customer reviews
Average tokens per review: 150
Total input tokens: 150,000
Estimated cost at $0.10/1M tokens: $0.015
Model Context Limits
| Model / Function | Max Tokens |
|---|---|
| claude-4-sonnet | 200,000 |
| llama3.1-70b | 128,000 |
| llama3.1-8b | 128,000 |
| mistral-large2 | 128,000 |
| AI_SENTIMENT | 2,048 |
| AI_EXTRACT | 128,000 |
| AI_REDACT | 4,096 |
| AI_TRANSLATE | 4,096 |
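If you validate against these limits often, one option is to keep them in a small reference table and join against it, rather than hard-coding thresholds in every query. A minimal sketch, assuming the limits above are current (the `model_token_limits` table and the `documents` source are illustrative names, not part of any Snowflake API):

```sql
-- Hypothetical lookup table populated from the limits above
CREATE OR REPLACE TABLE model_token_limits (
    model_name VARCHAR,
    max_tokens INTEGER
);

INSERT INTO model_token_limits VALUES
    ('claude-4-sonnet', 200000),
    ('llama3.1-70b',    128000),
    ('AI_SENTIMENT',      2048),
    ('AI_TRANSLATE',      4096);

-- Flag documents that exceed a chosen model's context limit
SELECT
    d.doc_id,
    AI_COUNT_TOKENS(l.model_name, d.content) AS token_count,
    l.max_tokens
FROM documents d
JOIN model_token_limits l
  ON l.model_name = 'claude-4-sonnet'
WHERE AI_COUNT_TOKENS(l.model_name, d.content) > l.max_tokens;
```

This keeps limit changes in one place: when a model's context window grows, you update one row instead of every validation query.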
Cost Considerations
No Token Charges
AI_COUNT_TOKENS itself incurs only compute cost
The function has no token-based charges of its own
Use it liberally for planning and validation
Approximate Conversion
1 token ≈ 4 characters
1 token ≈ 0.75 words
1000 tokens ≈ 750 words
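Because exact counting still costs compute, one pattern is to use the ~4 characters/token heuristic above as a cheap pre-filter and call AI_COUNT_TOKENS only where it matters. A sketch, with illustrative table/column names:

```sql
-- Cheap character-based estimate first; exact count only when the
-- estimate is near the limit (threshold and names are illustrative)
SELECT
    doc_id,
    LENGTH(content) / 4 AS approx_tokens,
    CASE
        WHEN LENGTH(content) / 4 < 100000 THEN 'safely under limit'
        ELSE TO_VARCHAR(AI_COUNT_TOKENS('llama3.1-70b', content))
    END AS exact_or_estimate
FROM documents;
```

The heuristic skews for code, non-English text, and heavily punctuated content, so keep a generous margin between your estimate threshold and the real model limit.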
Regional Availability
Available in all regions
Works for any model, regardless of where that model itself is hosted
No cross-region restrictions
Best Practices
1. Validate Before Processing
-- Validate inputs, process those within the limit, and flag the rest
WITH validated_inputs AS (
    SELECT
        input_id,
        input_text,
        AI_COUNT_TOKENS('llama3.1-70b', input_text) AS tokens
    FROM input_queue
)
SELECT
    input_id,
    CASE
        WHEN tokens <= 128000 THEN AI_COMPLETE('llama3.1-70b', input_text)
        ELSE 'Input exceeds token limit'
    END AS result
FROM validated_inputs;
2. Estimate Costs Upfront
-- Cost estimation view
CREATE VIEW ai_processing_estimates AS
SELECT
    'Sentiment Analysis' AS task,
    COUNT(*) AS record_count,
    AVG(AI_COUNT_TOKENS('AI_SENTIMENT', feedback)) AS avg_tokens,
    SUM(AI_COUNT_TOKENS('AI_SENTIMENT', feedback)) AS total_tokens,
    SUM(AI_COUNT_TOKENS('AI_SENTIMENT', feedback)) * 0.0001 AS est_cost_usd
FROM customer_feedback
UNION ALL
SELECT
    'Summarization',
    COUNT(*),
    AVG(AI_COUNT_TOKENS('claude-4-sonnet', content)),
    SUM(AI_COUNT_TOKENS('claude-4-sonnet', content)),
    SUM(AI_COUNT_TOKENS('claude-4-sonnet', content)) * 0.0003
FROM documents;  -- example source table
3. Optimize Prompts
-- Compare token counts for different prompt versions
WITH prompt_options AS (
    SELECT 'v1' AS version, 'Please analyze this text and provide detailed insights: ' AS prompt
    UNION ALL SELECT 'v2', 'Analyze: '
    UNION ALL SELECT 'v3', 'Analysis: '
)
SELECT
    version,
    prompt,
    AI_COUNT_TOKENS('claude-4-sonnet', prompt) AS prompt_tokens,
    AI_COUNT_TOKENS('claude-4-sonnet', prompt || sample_text) AS total_tokens
FROM prompt_options,
     (SELECT 'Sample data text' AS sample_text);
4. Monitor Token Usage
-- Track token consumption over time
CREATE TABLE token_usage_log AS
SELECT
    CURRENT_DATE() AS usage_date,
    'AI_COMPLETE' AS function_name,
    COUNT(*) AS calls,
    SUM(AI_COUNT_TOKENS('claude-4-sonnet', input_text)) AS total_tokens
FROM ai_processing_queue
GROUP BY usage_date, function_name;
Common Use Cases
Budget Planning
-- Estimate monthly AI costs
WITH monthly_volume AS (
    SELECT
        DATE_TRUNC('month', created_date) AS month,
        COUNT(*) AS record_count,
        AVG(AI_COUNT_TOKENS('claude-4-sonnet', content)) AS avg_tokens_per_record
    FROM content_queue
    GROUP BY month
)
SELECT
    month,
    record_count,
    avg_tokens_per_record,
    record_count * avg_tokens_per_record AS total_tokens,
    (record_count * avg_tokens_per_record / 1000000.0) * 0.30 AS estimated_cost_usd
FROM monthly_volume
ORDER BY month DESC;
Smart Document Splitting
-- Split documents that exceed token limits
WITH doc_analysis AS (
    SELECT
        doc_id,
        content,
        AI_COUNT_TOKENS('llama3.1-70b', content) AS token_count,
        CEIL(AI_COUNT_TOKENS('llama3.1-70b', content) / 120000.0) AS chunks_needed
    FROM large_documents
),
chunk_numbers AS (
    -- GENERATOR requires a constant ROWCOUNT; 100 is an upper bound on chunks per document
    SELECT SEQ4() + 1 AS chunk_num
    FROM TABLE(GENERATOR(ROWCOUNT => 100))
)
SELECT
    d.doc_id,
    c.chunk_num,
    SUBSTRING(d.content, (c.chunk_num - 1) * 120000 + 1, 120000) AS chunk_content
FROM doc_analysis d
JOIN chunk_numbers c
  ON c.chunk_num <= d.chunks_needed;
Model Selection
-- Choose model based on input size
SELECT
    request_id,
    input_text,
    CASE
        WHEN AI_COUNT_TOKENS('llama3.1-8b', input_text) <= 8000 THEN 'llama3.1-8b'
        WHEN AI_COUNT_TOKENS('llama3.1-70b', input_text) <= 128000 THEN 'llama3.1-70b'
        ELSE 'claude-4-sonnet'
    END AS recommended_model,
    AI_COUNT_TOKENS('llama3.1-70b', input_text) AS token_count
FROM ai_requests;  -- example source table
Optimization Tips
Reduce Token Count
-- Remove unnecessary whitespace (note the doubled backslash in Snowflake string literals)
SELECT
    AI_COUNT_TOKENS('claude-4-sonnet', original_text) AS original_tokens,
    AI_COUNT_TOKENS('claude-4-sonnet', REGEXP_REPLACE(original_text, '\\s+', ' ')) AS optimized_tokens,
    AI_COUNT_TOKENS('claude-4-sonnet', original_text)
      - AI_COUNT_TOKENS('claude-4-sonnet', REGEXP_REPLACE(original_text, '\\s+', ' ')) AS tokens_saved
FROM text_inputs;  -- example source table
Batch Size Optimization
-- Find optimal batch size
SELECT
    batch_size,
    COUNT(*) AS batches,
    SUM(batch_tokens) AS total_tokens,
    AVG(batch_tokens) AS avg_batch_tokens
FROM (
    SELECT
        batch_id,
        100 AS batch_size,
        SUM(AI_COUNT_TOKENS('mistral-large2', text)) AS batch_tokens
    FROM (
        -- Assign rows to batches of 100 before aggregating
        SELECT
            id,
            text,
            FLOOR((ROW_NUMBER() OVER (ORDER BY id) - 1) / 100) AS batch_id
        FROM content
    )
    GROUP BY batch_id
)
GROUP BY batch_size;
Related Functions
AI_COMPLETE - Generate completions (consumes input and output tokens)
AI_EXTRACT - Extract information (check token limits before calling)
AI_SENTIMENT - Analyze sentiment (2,048-token limit)
All AI functions - token counting applies to every one of them
Platform
ADD-ONs
Industries
Copyright © 2026 Paradime Labs, Inc.
Made with ❤️ in San Francisco ・ London
*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.