Learn

Learn

BigQuery AI Functions Guide: AI.CLASSIFY, AI.SCORE, and AI.IF for Data Analysis

Complete guide to BigQuery's AI.CLASSIFY, AI.SCORE, and AI.IF functions. Learn SQL-based classification, ranking, and filtering using Vertex AI Gemini with practical examples.

Fabio Di Leta

·

Nov 14, 2025

·

10

min read

TL;DR: BigQuery's AI.CLASSIFY categorizes data into predefined groups, AI.SCORE ranks items by quality or relevance (returns FLOAT64), and AI.IF filters results using natural language conditions (returns BOOL). All three use Vertex AI Gemini, work on text/images/audio, and require a Vertex AI connection. Costs per API call.

Google dropped three AI functions into BigQuery that replace entire ML pipelines.

AI.CLASSIFY buckets things.

AI.SCORE ranks them.

AI.IF filters them.

All three hit Vertex AI Gemini. All three work on text, images, and audio. No model training required.

They're in preview (Pre-GA). Work in all Gemini regions plus US/EU multi-regions.

What Are BigQuery AI Functions?

  • AI.CLASSIFY - Bucket things into categories

  • AI.SCORE - Rank things by quality/relevance

  • AI.IF - Filter by conditions you describe in English

All three hit Vertex AI Gemini behind the scenes. All three work on text, images, and audio.

AI.CLASSIFY: Stop Writing Classification Logic

You give it data and categories. It tells you which category fits.

AI.CLASSIFY returns a STRING value. BigQuery structures your input behind the scenes to improve results.

SELECT
  review,
  AI.CLASSIFY(
    review,
    categories => ['positive', 'neutral', 'negative'],
    connection_id => 'us.example_connection'
  ) AS sentiment
FROM

That's sentiment analysis. One query.

When categories need context:

AI.CLASSIFY(
  support_ticket,
  categories => [
    ('urgent', 'requires response within 1 hour'),
    ('high', 'needs response same day'),
    ('normal', 'can wait 2-3 days')
  ],
  connection_id => 'us.example_connection'
)

Works on structured data too:

AI.CLASSIFY(
  (product_name, description, specs),  -- combines fields
  categories => ['electronics', 'clothing', 'home', 'sports'],
  connection_id => 'us.example_connection'
)

BigQuery concatenates struct fields. Multiple columns become one input.

AI.SCORE: Ranking Without the Math

Give it scoring criteria. Get back a number. Sort by it.

SELECT
  review,
  AI.SCORE(
    ('Rate negativity from 1-10: ', review),
    connection_id => 'us.example_connection'
  ) AS negativity_score
FROM product_reviews
ORDER BY negativity_score DESC
LIMIT 5

Now you know which reviews to fix first.

Resume screening:

SELECT
  candidate_name,
  AI.SCORE(
    ('Rate qualification for senior backend role from 1-10: ', resume_text),
    connection_id => 'us.example_connection'
  ) AS score
FROM applications
ORDER BY score DESC
LIMIT 10

Works on images:

SELECT
  STRING(OBJ.GET_ACCESS_URL(ref,'r').access_urls.read_url) AS url,
  AI.SCORE(
    ('Rate product appeal from 1-10: ', OBJ.GET_ACCESS_URL(ref, 'r')),
    connection_id => 'us.example_connection'
  ) AS appeal
FROM product_images
ORDER BY appeal DESC

Gemini looks at your product photos and tells you which ones work.

AI.IF: WHERE Clauses That Read

Evaluates conditions in plain English. Returns TRUE or FALSE.

SELECT *
FROM reviews
WHERE AI.IF(
  ('Is this review negative?', review_text),
  connection_id => 'us.example_connection'
)

No regex patterns. No keyword lists. Just ask.

Topic filtering:

SELECT title, body
FROM news_articles
WHERE AI.IF(
  ('This article discusses climate change.', body),
  connection_id => 'us.example_connection'
)

Security use cases:

SELECT email_id, subject, body
FROM emails
WHERE AI.IF(
  ('This email is a phishing attempt.', CONCAT(subject, ' ', body)),
  connection_id => 'us.example_connection'
)

Optimize with regular filters:

WHERE category = 'tech'  -- Runs first (cheap)
  AND AI.IF(              -- Runs second (expensive)
    ('Mentions Google.', body),
    connection_id => 'us.example_connection'
  )

BigQuery's optimizer runs cheap filters first, AI filters on the subset.

Works on Images and Audio

All three functions handle multimodal input:

-- Filter images by content
WHERE AI.IF(
  ('Image contains a person wearing red.', OBJ.GET_ACCESS_URL(ref, 'r')),
  connection_id => 'us.example_connection'
)

-- Score image quality
AI.SCORE(
  ('Rate image quality from 1-10: ', OBJ.GET_ACCESS_URL(ref, 'r')),
  connection_id => 'us.example_connection'
)

-- Classify audio by topic
AI.CLASSIFY(
  OBJ.GET_ACCESS_URL(audio_ref, 'r'),
  categories => ['complaint', 'question', 'feedback'],
  connection_id => 'us.example_connection'
)

Vision and audio analysis without leaving SQL.

The Architecture

BigQuery handles the heavy lifting - optimizes query execution, structures your input for better results, manages API calls to Gemini, returns clean SQL types.

Setup (Do This Once)

Create a Vertex AI connection in BigQuery. Two ways to do this:

  • Cloud Console: BigQuery → Admin → Connections

  • CLI: bq mk --connection --location=us --connection_type=CLOUD_RESOURCE your_connection

Grant the Vertex AI User role (roles/aiplatform.user) to the connection's service account. Find the service account in connection details.

Reference it in queries:

connection_id => 'us.your_connection'
-- or full path
connection_id => 'projects/myproject/locations/us/connections/myconnection'

Required IAM permission: bigquery.connections.use on the connection.

Setup docs: https://cloud.google.com/bigquery/docs/create-cloud-resource-connection

IAM roles: https://cloud.google.com/iam/docs/grant-role-console

Costs and Optimization

Every function call hits Vertex AI. Track costs here: https://console.cloud.google.com/vertex-ai

Optimize by:

  • Filtering data before AI functions

  • Caching results in materialized views

  • Combining with cheap WHERE clauses first

  • Running on schedules instead of ad-hoc

Failed calls return NULL. Handle it:

WHERE AI.IF(...) IS NOT FALSE  -- Keeps TRUE, filters NULL
-- OR
WHERE AI.IF(...) = TRUE  -- Strict TRUE only

When to Use What

AI.CLASSIFY - Categories are predefined, you need bucketing not ranking, results need to be discrete

AI.SCORE - You need top/bottom N items, ranking quality matters, ORDER BY makes sense

AI.IF - Filtering by subjective criteria, complex conditions in WHERE/JOIN, traditional filters are getting messy

When NOT to Use These

Skip AI functions when:

  • Simple WHERE clauses work

  • You need deterministic results

  • Sub-100ms latency required

  • Running high-frequency queries

  • Traditional SQL is cheaper and faster

These functions are powerful, not free. Use them where they add value.

The Pattern

Before:

  • 200 lines of CASE statements

  • Regex patterns maintained in spreadsheets

  • Keyword lists that miss edge cases

  • Custom models that need retraining

After:

AI.CLASSIFY(text, ['cat1', 'cat2'], connection_id => '...')
AI.SCORE(('rate this 1-10: ', text), connection_id => '...')
AI.IF(('is this negative?', text), connection_id => '...')

Same outcome. Less infrastructure.

Real-World Combinations

Filter, then score:

SELECT
  review,
  AI.SCORE(('Rate urgency 1-10: ', review), ...) AS urgency
FROM reviews
WHERE AI.IF(('Contains complaint?', review), ...)
ORDER BY urgency DESC

Classify, filter, score:

SELECT
  ticket_id,
  AI.CLASSIFY(description, ['bug', 'feature', 'question'], ...) AS type,
  AI.SCORE(('Rate severity 1-10: ', description), ...) AS severity
FROM support_tickets
WHERE AI.IF(('Mentions payment issues?', description), ...)

Join using AI logic:

SELECT p.name, i.url
FROM products p
JOIN images i
ON AI.IF(('Image shows: ', p.name, i.ref), ...)

Mix and match. They're just SQL functions.

Limitations

Preview status - things might change. Features in preview are "as is" with limited support.

Not deterministic - same input might get different outputs. LLMs aren't databases.

Latency - API calls to Vertex AI. Not for real-time, high-throughput queries.

Cost - every row, every call. Adds up on large tables.

Regional - works in Gemini regions (https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations) + US/EU multi-regions.

Bottom Line

Classification, scoring, and filtering used to mean: build pipelines, train models, maintain infrastructure, version everything.

Now it means: write SQL.

Results aren't perfect. They're good enough, available immediately, and improve as Gemini improves.

Your data warehouse just got smarter. Try it.

Frequently Asked Questions

What is AI.CLASSIFY in BigQuery?

AI.CLASSIFY is a BigQuery SQL function that uses Vertex AI Gemini to automatically categorize data into predefined categories. It returns a STRING value matching one of the categories you provide. Common uses include sentiment analysis, support ticket routing, and product categorization.

How much does using BigQuery AI functions cost?

BigQuery AI functions incur charges in Vertex AI for each API call. Costs depend on the Gemini model pricing and number of rows processed. Track costs in the Vertex AI console at https://console.cloud.google.com/vertex-ai. Optimize costs by filtering data before applying AI functions and using materialized views.

What's the difference between AI.SCORE and AI.IF?

AI.SCORE returns a FLOAT64 numeric score for ranking items (use with ORDER BY), while AI.IF returns a BOOL (TRUE/FALSE) for filtering rows (use in WHERE clauses). Use AI.SCORE when you need top N items ranked by quality. Use AI.IF when filtering by conditions described in natural language.

Can BigQuery AI functions analyze images and audio?

Yes. All three functions (AI.CLASSIFY, AI.SCORE, AI.IF) work on images and audio files stored in Cloud Storage. Use the OBJ.GET_ACCESS_URL function with ObjectRef values to pass multimodal content. Supported formats include PNG, JPG, MP3, and others listed in the Gemini API documentation.

Do I need to train models to use BigQuery AI functions?

No. BigQuery AI functions use pre-trained Vertex AI Gemini models. No model training, fine-tuning, or ML pipeline setup is required. You only need a Vertex AI connection with proper IAM permissions (Vertex AI User role).

Which regions support BigQuery AI functions?

BigQuery AI functions work in all regions that support Gemini models, plus US and EU multi-regions. Check current regional availability at https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations

How do I set up BigQuery AI functions?

Create a Vertex AI connection in BigQuery, grant the Vertex AI User IAM role to the connection's service account, then reference the connection in your queries using connection_id => 'us.your_connection'. Full setup documentation: https://cloud.google.com/bigquery/docs/create-cloud-resource-connection

Are BigQuery AI functions deterministic?

No. Since they use large language models, the same input may produce different outputs across runs. Failed API calls return NULL. For deterministic results, use traditional SQL WHERE clauses and CASE statements.

*All three functions are in preview. Docs: https://cloud.google.com/bigquery/docs/reference/standard-sql/ai-functions*

Interested to Learn More?
Try Out the Free 14-Days Trial

More Articles

Experience Analytics for the AI-Era

Start your 14-day trial today - it's free and no credit card needed

Experience Analytics for the AI-Era

Start your 14-day trial today - it's free and no credit card needed

Experience Analytics for the AI-Era

Start your 14-day trial today - it's free and no credit card needed

Copyright © 2025 Paradime Labs, Inc.

Made with ❤️ in San Francisco ・ London

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.

Copyright © 2025 Paradime Labs, Inc.

Made with ❤️ in San Francisco ・ London

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.

Copyright © 2025 Paradime Labs, Inc.

Made with ❤️ in San Francisco ・ London

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.