Category: Embedding & Vector Function (Preview)
Description
Generates embeddings (vector representations) for text, images, or multimodal data. Returns an ARRAY<FLOAT64> that captures the semantic meaning of the input. Used for semantic search, similarity comparison, clustering, and retrieval-augmented generation (RAG).
Use Cases
Semantic Search: Find similar documents or products
Recommendation Systems: Recommend similar items
Clustering: Group similar content together
RAG (Retrieval-Augmented Generation): Retrieve relevant context for LLM queries
Deduplication: Identify duplicate or near-duplicate records
Multimodal Search: Search images with text queries or vice versa
Syntax
AI.EMBED(
  model => 'MODEL_ENDPOINT',
  content => INPUT_DATA
  [, task_type => 'TASK_TYPE']
  [, output_dimensionality => DIMENSIONS]
  [, connection_id => 'CONNECTION']
)
Parameters
model: Embedding model endpoint (e.g., 'text-embedding-004', 'multimodal-embedding-002')
content: Text, image ObjectRefRuntime, or multimodal data
task_type (optional): 'RETRIEVAL_QUERY', 'RETRIEVAL_DOCUMENT', 'SEMANTIC_SIMILARITY', 'CLASSIFICATION', 'CLUSTERING'
output_dimensionality (optional): Vector dimensions (e.g., 256, 768, 1024)
connection_id (optional): Vertex AI connection (e.g., 'us.my_vertex_connection')
Code Examples
Example 1: Generate Text Embeddings
SELECT
  product_id,
  product_name,
  AI.EMBED(
    model => 'text-embedding-004',
    content => CONCAT(product_name, ' ', description),
    task_type => 'RETRIEVAL_DOCUMENT',
    connection_id => 'us.my_vertex_connection'
  ) AS product_embedding
FROM
Example 2: Semantic Search with Query Embedding
DECLARE query_embedding ARRAY<FLOAT64>;
SET query_embedding = (
  SELECT AI.EMBED(
    model => 'text-embedding-004',
    content => 'wireless bluetooth headphones with noise cancellation',
    task_type => 'RETRIEVAL_QUERY',
    connection_id => 'us.my_vertex_connection'
  )
);
SELECT
  product_id,
  product_name,
  AI.SIMILARITY(
    query_embedding,
    product_embedding
  ) AS similarity_score
FROM products_with_embeddings
ORDER BY similarity_score DESC
LIMIT 10;
Example 3: Image Embeddings
SELECT
  image_id,
  AI.EMBED(
    model => 'multimodal-embedding-002',
    content => OBJ.GET_ACCESS_URL(image_ref, 'r'),
    connection_id => 'us.my_vertex_connection'
  ) AS image_embedding
FROM
Example 4: Multimodal Embeddings (Text + Image)
SELECT
  listing_id,
  AI.EMBED(
    model => 'multimodal-embedding-002',
    content => STRUCT(
      listing_description AS text,
      OBJ.GET_ACCESS_URL(listing_image, 'r') AS image
    ),
    connection_id => 'us.my_vertex_connection'
  ) AS multimodal_embedding
FROM
Example 5: Create Vector Index for Fast Search
CREATE OR REPLACE TABLE products_embedded AS
SELECT
  product_id,
  product_name,
  description,
  AI.EMBED(
    model => 'text-embedding-004',
    content => CONCAT(product_name, ' ', description),
    task_type => 'RETRIEVAL_DOCUMENT',
    output_dimensionality => 768,
    connection_id => 'us.my_vertex_connection'
  ) AS embedding
FROM products;
CREATE VECTOR INDEX product_embedding_index
ON products_embedded(embedding)
OPTIONS(
  index_type = 'IVF',
  distance_type = 'COSINE',
  ivf_options = '{"num_lists": 100}'
);
Data Output Examples
Text Embeddings
product_name             embedding_preview
"Wireless Headphones"    [0.023, -0.145, 0.089, ..., 0.234] (768 dimensions)
"Bluetooth Speaker"      [0.051, -0.112, 0.076, ..., 0.198] (768 dimensions)
Similarity Search Results
product_name                              similarity_score
"Noise-Cancelling Wireless Headphones"    0.94
"Bluetooth Over-Ear Headphones"           0.89
"Premium Wireless Earbuds"                0.85
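To make the similarity scores concrete, here is a minimal Python sketch of what a top-k similarity ranking computes. It assumes the score is cosine similarity, which the source does not state for AI.SIMILARITY. The helper names, product names, and 3-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=10):
    """Return the k (name, score) pairs most similar to the query."""
    scored = [(name, cosine_similarity(query, emb)) for name, emb in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
query_embedding = [1.0, 0.0, 0.0]
products = [
    ("Headphones A", [0.9, 0.1, 0.0]),
    ("Speaker B", [0.0, 1.0, 0.0]),
    ("Earbuds C", [0.5, 0.5, 0.0]),
]
ranked = top_k(query_embedding, products, k=2)
```

This mirrors the ORDER BY similarity_score DESC LIMIT pattern in Example 2, applied to an in-memory list instead of a table.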
Task Types
RETRIEVAL_QUERY: Optimize for search queries
RETRIEVAL_DOCUMENT: Optimize for documents to be searched
SEMANTIC_SIMILARITY: General similarity comparison
CLASSIFICATION: Optimize for classification tasks
CLUSTERING: Optimize for grouping similar items
Best Practices
Use an appropriate task_type: Match the task type to your use case (for example, RETRIEVAL_DOCUMENT for stored content and RETRIEVAL_QUERY for search queries)
Consistent dimensionality: Use the same output_dimensionality for query and document embeddings
Create vector indexes: For large-scale similarity search
Batch generation: Generate embeddings in batch for efficiency
Store embeddings: Persist embeddings to avoid regeneration
Choose the right model: text-embedding-004 for text, multimodal-embedding-002 for images
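The dimensionality practice can be enforced with a small guard before any comparison runs. The Python sketch below is an illustration only: `check_dimensions` and the sample vectors are hypothetical and are not part of any BigQuery API.

```python
def check_dimensions(query_embedding, document_embeddings):
    """Return the indexes of documents whose length differs from the query's."""
    expected = len(query_embedding)
    return [
        index
        for index, embedding in enumerate(document_embeddings)
        if len(embedding) != expected
    ]


# Hypothetical data: the second document was embedded with fewer dimensions.
query = [0.1, 0.2, 0.3]
docs = [[0.0, 0.1, 0.2], [0.5, 0.5], [0.3, 0.3, 0.3]]
mismatched = check_dimensions(query, docs)
```

Rows flagged this way would need to be re-embedded with the same settings as the query before they can be compared.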
When to Use
✅ Use for semantic search and similarity
✅ Use for RAG applications
✅ Use for clustering and classification
✅ Use for multimodal search (text ↔ image)
Alternatives
AI.GENERATE_EMBEDDING: Alternative embedding function
Pre-computed embeddings: Import embeddings from external systems
TensorFlow models: Import custom embedding models
Supported Models
Text Models:
text-embedding-004 (latest, 768 dimensions)
text-embedding-003
text-multilingual-embedding-002
Multimodal Models:
multimodal-embedding-002 (text, image, video)
Legacy Models:
Platform Support
Regions: All Gemini-supported regions + US/EU multi-regions
Preview Status: Currently in Preview (Pre-GA)
Cost: Charged per Vertex AI API call
Vector Search: Requires vector index for large-scale search
Returns
ARRAY<FLOAT64> representing the embedding vector. Dimensions vary by model (typically 256, 768, or 1024). Returns NULL if the Vertex AI call fails.
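Because a failed call yields NULL rather than an error, downstream code may want to separate usable rows from rows that need a retry. The Python sketch below assumes the query results have already been loaded as dictionaries in which a SQL NULL appears as None; the function name and the `embedding` key are hypothetical.

```python
def split_by_embedding(rows):
    """Separate rows that have an embedding from rows where it is None."""
    embedded, failed = [], []
    for row in rows:
        if row["embedding"] is not None:
            embedded.append(row)
        else:
            failed.append(row)
    return embedded, failed


# Hypothetical result rows; product 2 stands in for a failed embedding call.
rows = [
    {"product_id": 1, "embedding": [0.1, 0.2]},
    {"product_id": 2, "embedding": None},
    {"product_id": 3, "embedding": [0.3, 0.4]},
]
embedded, failed = split_by_embedding(rows)
retry_ids = [row["product_id"] for row in failed]
```

Keeping the failed IDs makes it possible to regenerate only the missing embeddings instead of re-running the whole batch.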