SUBSTRING / SPLIT / CONCAT
Feb 23, 2026
·
5
min read
Category: String Functions
Platform Support:
✅ Snowflake | ✅ BigQuery | ✅ Databricks
Description
String manipulation functions for extracting, splitting, and combining text. SUBSTRING extracts portions of strings, SPLIT divides strings into arrays, and CONCAT combines strings. Essential for text processing, parsing, and data transformation.
Syntax by Function
SUBSTRING:
SPLIT:
CONCAT:
Platform-Specific Notes
Snowflake:
SUBSTRING and SUBSTR are aliases
SPLIT returns ARRAY
SPLIT_PART for specific element
CONCAT works with multiple arguments
BigQuery:
SUBSTR is primary function
SPLIT returns ARRAY
Use ARRAY subscript [OFFSET(n)] or [ORDINAL(n)]
CONCAT has max 5 arguments (use || for more)
Databricks:
Standard Spark SQL functions
SPLIT returns ARRAY
Use bracket notation for array access
CONCAT supports multiple arguments
Example 1: SUBSTRING - Extract Parts of String
All Platforms:
Sample Data (customers table):
full_name | phone | |
|---|---|---|
John Smith | (555) 123-4567 | |
Maria Garcia | (212) 555-0100 | |
Sarah Johnson | (415) 555-7890 |
Result:
full_name | name_prefix | email_suffix | email_domain | area_code | |
|---|---|---|---|---|---|
John Smith | Joh | 555 | |||
Maria Garcia | Mar | 212 | |||
Sarah Johnson | Sar | 415 |
Example 2: SPLIT - Parse Delimited Strings
Snowflake:
BigQuery:
Databricks:
Sample Data (products table):
product_id | product_name | tags |
|---|---|---|
1 | Laptop | electronics,computer,portable |
2 | Desk | furniture,office,wood |
3 | Mouse | electronics,accessory |
Result:
tags | tags_array | first_tag | second_tag | tag_count |
|---|---|---|---|---|
electronics,computer,portable | ["electronics","computer","portable"] | electronics | computer | 3 |
furniture,office,wood | ["furniture","office","wood"] | furniture | office | 3 |
electronics,accessory | ["electronics","accessory"] | electronics | accessory | 2 |
Example 3: CONCAT - Combine Strings
All Platforms:
Sample Data (employees table):
first_name | last_name | street | city | state | zip_code |
|---|---|---|---|---|---|
John | Smith | 123 Main St | New York | NY | 10001 |
Maria | Garcia | 456 Oak Ave | Los Angeles | CA | 90001 |
Sarah | Johnson | 789 Elm Street | Chicago | IL | 60601 |
Result:
first_name | last_name | full_name | generated_email | full_address |
|---|---|---|---|---|
John | Smith | John Smith | 123 Main St, New York, NY 10001 | |
Maria | Garcia | Maria Garcia | 456 Oak Ave, Los Angeles, CA 90001 | |
Sarah | Johnson | Sarah Johnson | 789 Elm Street, Chicago, IL 60601 |
Example 4: Combining All Three Functions
All Platforms:
Sample Data (system_logs table):
log_id | log_entry |
|---|---|
1 | 2024-01-15 10:30:45\ |
2 | 2024-01-15 10:31:22\ |
3 | 2024-01-15 10:32:10\ |
Result:
log_entry | timestamp | action | user_id | formatted_log |
|---|---|---|---|---|
... | 2024-01-15 10:30:45 | LOGIN | 1001 | [2024-01-15 10:30:45] LOGIN by user 1001 |
... | 2024-01-15 10:31:22 | PURCHASE | 2045 | [2024-01-15 10:31:22] PURCHASE by user 2045 |
... | 2024-01-15 10:32:10 | LOGOUT | 1001 | [2024-01-15 10:32:10] LOGOUT by user 1001 |
Example 5: Parsing CSV Data
All Platforms:
Sample Data (csv_import table):
row_id | row_data |
|---|---|
1 | 101, John Smith , john@email.com ,250.00 |
2 | 102, Maria Garcia , maria@email.com ,180.50 |
3 | 103, Sarah Johnson, sarah@email.com ,320.75 |
Result:
customer_id | name | amount | |
|---|---|---|---|
101 | John Smith | 250.00 | |
102 | Maria Garcia | 180.50 | |
103 | Sarah Johnson | 320.75 |
Example 6: Building Dynamic URLs
All Platforms:
Sample Data:
product_id | product_name | category |
|---|---|---|
1001 | Wireless Mouse | Computer Accessories |
2045 | Standing Desk | Office Furniture |
3099 | USB-C Cable | Computer Accessories |
Result:
product_id | product_name | product_url | thumbnail_url |
|---|---|---|---|
1001 | Wireless Mouse | https://store.com/products/computer-accessories/wireless-mouse?id=1001 | |
2045 | Standing Desk | https://store.com/products/office-furniture/standing-desk?id=2045 | |
3099 | USB-C Cable | https://store.com/products/computer-accessories/usb-c-cable?id=3099 |
Related String Functions
Function | Purpose | Example |
|---|---|---|
TRIM | Remove whitespace |
|
UPPER | Convert to uppercase |
|
LOWER | Convert to lowercase |
|
REPLACE | Replace substring |
|
LENGTH | String length |
|
POSITION | Find substring |
|
LEFT | Left n characters |
|
RIGHT | Right n characters |
|
Common Use Cases
SUBSTRING: Extract codes, parse fixed-width data, truncate text
SPLIT: Parse CSVs, process tags, split names
CONCAT: Build full names, create URLs, format addresses
Combined: Log parsing, data transformation, ETL processes
Performance Tips
Use SPLIT_PART: Faster than SPLIT + array access in Snowflake
Avoid nested functions: Simplify complex string operations
Pre-process data: Clean strings during ingestion when possible
Use appropriate function: SUBSTRING faster than REGEXP for simple extraction
Index frequently searched strings: Improves WHERE performance
Best Practices
Handle NULLs: Use COALESCE for NULL strings
Trim inputs: Remove whitespace before processing
Validate lengths: Check string bounds before SUBSTRING
Test edge cases: Empty strings, missing delimiters
Document format: Explain expected input patterns
Use SAFE functions: BigQuery SAFE_OFFSET prevents errors





