SQL Keywords

AI.GENERATE

Feb 23, 2026

Category: General-Purpose AI Function (Preview)

Description

The most flexible AI inference function in BigQuery. Analyzes any combination of text and unstructured data (images, audio, video, PDFs) and can output structured data matching your custom schema. Provides full control over prompts, models, and parameters with detailed response metadata including safety ratings and citations.

Use Cases

Data Enrichment: Extract structured information from unstructured text

Image Analysis: Generate descriptions or metadata from product photos

Content Moderation: Evaluate content safety

Question Answering: Query your data with natural language

Data Validation: Check if data meets criteria (returns boolean)

Scoring: Rate items on custom scales (returns numeric)

Syntax

AI.GENERATE(
  prompt => 'PROMPT',
  [, data => INPUT_DATA ]
  [, output_schema => SCHEMA ]
  [, model => 'MODEL_ENDPOINT' ]
  [, parameters => STRUCT(...) ]
  [, connection_id => 'CONNECTION' ]
)

AI.GENERATE(
  prompt => 'PROMPT',
  [, data => INPUT_DATA ]
  [, output_schema => SCHEMA ]
  [, model => 'MODEL_ENDPOINT' ]
  [, parameters => STRUCT(...) ]
  [, connection_id => 'CONNECTION' ]
)

AI.GENERATE(
  prompt => 'PROMPT',
  [, data => INPUT_DATA ]
  [, output_schema => SCHEMA ]
  [, model => 'MODEL_ENDPOINT' ]
  [, parameters => STRUCT(...) ]
  [, connection_id => 'CONNECTION' ]
)

Parameters

prompt: Natural language instruction/question (required)

data: Text, ObjectRefRuntime values, or mixed data to analyze

output_schema: STRUCT definition for structured output (makes response parseable)

model: Gemini model endpoint (e.g., 'gemini-2.0-flash-exp')

parameters: Generation config (temperature, max_tokens, top_p, etc.)

connection_id: Vertex AI connection (optional, uses end-user credentials if omitted)

Code Examples

Example 1: Simple Text Analysis (Boolean Output)

SELECT 
  product_name,
  description,
  AI.GENERATE(
    prompt => 'Is this product suitable as a gift for children?',
    data => CONCAT('Product: ', product_name, '. ', description),
    output_schema => STRUCT(is_child_friendly AS BOOL),
    connection_id => 'us.my_vertex_connection'
  ).is_child_friendly AS suitable_for_kids
FROM products
WHERE category = 'toys'
LIMIT 10

SELECT 
  product_name,
  description,
  AI.GENERATE(
    prompt => 'Is this product suitable as a gift for children?',
    data => CONCAT('Product: ', product_name, '. ', description),
    output_schema => STRUCT(is_child_friendly AS BOOL),
    connection_id => 'us.my_vertex_connection'
  ).is_child_friendly AS suitable_for_kids
FROM products
WHERE category = 'toys'
LIMIT 10

SELECT 
  product_name,
  description,
  AI.GENERATE(
    prompt => 'Is this product suitable as a gift for children?',
    data => CONCAT('Product: ', product_name, '. ', description),
    output_schema => STRUCT(is_child_friendly AS BOOL),
    connection_id => 'us.my_vertex_connection'
  ).is_child_friendly AS suitable_for_kids
FROM products
WHERE category = 'toys'
LIMIT 10

Example 2: Extract Structured Data from Text

SELECT 
  email_id,
  AI.GENERATE(
    prompt => 'Extract customer information from this support email',
    data => email_body,
    output_schema => STRUCT(
      customer_name AS STRING,
      email_address AS STRING,
      issue_type AS STRING,
      urgency_level AS STRING,
      requested_action AS STRING
    ),
    connection_id => 'us.my_vertex_connection'
  ) AS extracted_data
FROM support_emails
WHERE processed = FALSE

SELECT 
  email_id,
  AI.GENERATE(
    prompt => 'Extract customer information from this support email',
    data => email_body,
    output_schema => STRUCT(
      customer_name AS STRING,
      email_address AS STRING,
      issue_type AS STRING,
      urgency_level AS STRING,
      requested_action AS STRING
    ),
    connection_id => 'us.my_vertex_connection'
  ) AS extracted_data
FROM support_emails
WHERE processed = FALSE

SELECT 
  email_id,
  AI.GENERATE(
    prompt => 'Extract customer information from this support email',
    data => email_body,
    output_schema => STRUCT(
      customer_name AS STRING,
      email_address AS STRING,
      issue_type AS STRING,
      urgency_level AS STRING,
      requested_action AS STRING
    ),
    connection_id => 'us.my_vertex_connection'
  ) AS extracted_data
FROM support_emails
WHERE processed = FALSE

Example 3: Image Analysis with Scoring

SELECT 
  STRING(OBJ.GET_ACCESS_URL(ref,'r').access_urls.read_url) AS image_url,
  AI.GENERATE(
    prompt => 'Rate this product image on a scale of 1-10 for professional quality. Also provide a brief description.',
    data => OBJ.GET_ACCESS_URL(ref, 'r'),
    output_schema => STRUCT(
      quality_score AS INT64,
      description AS STRING,
      suggested_improvements AS STRING
    ),
    model => 'gemini-2.0-flash-exp',
    connection_id => 'us.my_vertex_connection'
  ) AS image_analysis
FROM product_images
LIMIT 5

SELECT 
  STRING(OBJ.GET_ACCESS_URL(ref,'r').access_urls.read_url) AS image_url,
  AI.GENERATE(
    prompt => 'Rate this product image on a scale of 1-10 for professional quality. Also provide a brief description.',
    data => OBJ.GET_ACCESS_URL(ref, 'r'),
    output_schema => STRUCT(
      quality_score AS INT64,
      description AS STRING,
      suggested_improvements AS STRING
    ),
    model => 'gemini-2.0-flash-exp',
    connection_id => 'us.my_vertex_connection'
  ) AS image_analysis
FROM product_images
LIMIT 5

SELECT 
  STRING(OBJ.GET_ACCESS_URL(ref,'r').access_urls.read_url) AS image_url,
  AI.GENERATE(
    prompt => 'Rate this product image on a scale of 1-10 for professional quality. Also provide a brief description.',
    data => OBJ.GET_ACCESS_URL(ref, 'r'),
    output_schema => STRUCT(
      quality_score AS INT64,
      description AS STRING,
      suggested_improvements AS STRING
    ),
    model => 'gemini-2.0-flash-exp',
    connection_id => 'us.my_vertex_connection'
  ) AS image_analysis
FROM product_images
LIMIT 5

Example 4: Advanced - Content Generation with Parameters

SELECT 
  AI.GENERATE(
    prompt => 'Write a product description for this item that highlights its key features',
    data => STRUCT(
      product_name,
      category,
      features,
      price
    ),
    output_schema => STRUCT(
      short_description AS STRING,
      long_description AS STRING,
      key_selling_points AS ARRAY<string>
    ),
    parameters => STRUCT(
      0.7 AS temperature,
      500 AS max_output_tokens,
      0.95 AS top_p
    ),
    connection_id => 'us.my_vertex_connection'
  ) AS generated_content
FROM product_catalog
WHERE description IS NULL

SELECT 
  AI.GENERATE(
    prompt => 'Write a product description for this item that highlights its key features',
    data => STRUCT(
      product_name,
      category,
      features,
      price
    ),
    output_schema => STRUCT(
      short_description AS STRING,
      long_description AS STRING,
      key_selling_points AS ARRAY<string>
    ),
    parameters => STRUCT(
      0.7 AS temperature,
      500 AS max_output_tokens,
      0.95 AS top_p
    ),
    connection_id => 'us.my_vertex_connection'
  ) AS generated_content
FROM product_catalog
WHERE description IS NULL

SELECT 
  AI.GENERATE(
    prompt => 'Write a product description for this item that highlights its key features',
    data => STRUCT(
      product_name,
      category,
      features,
      price
    ),
    output_schema => STRUCT(
      short_description AS STRING,
      long_description AS STRING,
      key_selling_points AS ARRAY<string>
    ),
    parameters => STRUCT(
      0.7 AS temperature,
      500 AS max_output_tokens,
      0.95 AS top_p
    ),
    connection_id => 'us.my_vertex_connection'
  ) AS generated_content
FROM product_catalog
WHERE description IS NULL

Example 5: Multimodal Analysis

-- Analyze both product image and text description
SELECT 
  product_id,
  AI.GENERATE(
    prompt => 'Compare the product image with the text description. Are they consistent? Is anything missing from the description?',
    data => STRUCT(
      product_description AS text,
      OBJ.GET_ACCESS_URL(image_ref, 'r') AS image
    ),
    output_schema => STRUCT(
      is_consistent AS BOOL,
      consistency_score AS FLOAT64,
      discrepancies AS ARRAY<string>,
      missing_from_description AS ARRAY<string>
    ),
    connection_id => 'us.my_vertex_connection'
  ) AS analysis
FROM

-- Analyze both product image and text description
SELECT 
  product_id,
  AI.GENERATE(
    prompt => 'Compare the product image with the text description. Are they consistent? Is anything missing from the description?',
    data => STRUCT(
      product_description AS text,
      OBJ.GET_ACCESS_URL(image_ref, 'r') AS image
    ),
    output_schema => STRUCT(
      is_consistent AS BOOL,
      consistency_score AS FLOAT64,
      discrepancies AS ARRAY<string>,
      missing_from_description AS ARRAY<string>
    ),
    connection_id => 'us.my_vertex_connection'
  ) AS analysis
FROM

-- Analyze both product image and text description
SELECT 
  product_id,
  AI.GENERATE(
    prompt => 'Compare the product image with the text description. Are they consistent? Is anything missing from the description?',
    data => STRUCT(
      product_description AS text,
      OBJ.GET_ACCESS_URL(image_ref, 'r') AS image
    ),
    output_schema => STRUCT(
      is_consistent AS BOOL,
      consistency_score AS FLOAT64,
      discrepancies AS ARRAY<string>,
      missing_from_description AS ARRAY<string>
    ),
    connection_id => 'us.my_vertex_connection'
  ) AS analysis
FROM

Data Output Examples

Simple Boolean Output

product_name	suitable_for_kids
"Wooden Building Blocks Set"	true
"Professional Chef Knife Set"	false
"Art Supplies Kit"	true

Structured Extraction

email_id	customer_name	email_address	issue_type	urgency_level
E001	"John Smith"	"john@email.com"	"billing"	"high"
E002	"Sarah Jones"	"sarah@email.com"	"technical"	"medium"

Image Analysis

image_url	quality_score	description	suggested_improvements
"gs://prod-img-001.jpg"	8	"Professional product photo with good lighting"	"Add white background, adjust shadows"
"gs://prod-img-002.jpg"	4	"Blurry image with poor composition"	"Retake with better camera, improve focus"

Best Practices

Always define output_schema: For structured, parseable results

Use specific, detailed prompts: Better prompts yield better accuracy

Test temperature values: Lower (0.0-0.3) for factual, higher (0.7-1.0) for creative tasks

Monitor token usage: Use max_output_tokens parameter to control costs

Use Provisioned Throughput: For high-volume production workloads

Include examples in prompts: Few-shot examples improve results

When to Use

✅ Use AI.GENERATE when you need custom prompts

✅ Use when analyzing multimodal data (images + text)

✅ Use when you need structured output matching your schema

✅ Use when you need full control over model parameters

Alternatives

AI.CLASSIFY: For simple categorization with managed optimization

AI.SCORE: For ranking tasks with auto-generated scoring rubrics

AI.IF: For semantic filtering in WHERE clauses

AI.GENERATE_BOOL/INT/DOUBLE: For specific scalar output types

AI.GENERATE_TEXT (TVF): For remote model flexibility with table-valued output

Platform Support

Regions: All Gemini-supported regions + US/EU multi-regions

Models: Gemini 2.0 Flash, Gemini 1.5 Pro, Gemini 1.5 Flash

Preview Status: Currently in Preview (Pre-GA)

Cost: Charged per Vertex AI API call

Provisioned Throughput: Supported for consistent high-throughput workloads

Returns

Returns a STRUCT matching your output_schema. If output_schema is not specified, returns the raw model response with metadata including safety ratings and citations.

Interested to Learn More?
Try Out the Free 14-Days Trial

Start free trial

Stop Managing Pipelines. Start Shipping Them.

Join the teams that replaced manual dbt™ workflows with agentic AI. Free to start, no credit card required.

Start for free

Stop Managing Pipelines. Start Shipping Them.

Join the teams that replaced manual dbt™ workflows with agentic AI. Free to start, no credit card required.

Start for free

Stop Managing Pipelines. Start Shipping Them.

Join the teams that replaced manual dbt™ workflows with agentic AI. Free to start, no credit card required.

Platform

ADD-ONs

DINOAI

NEW

Programmable Agents

Self-Healing Pipelines

Resources

Industries

About

Legal

Responsible Disclosure Policy

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.

Platform

ADD-ONs

DINOAI

NEW