MERGE (UPSERT)

Feb 23, 2026

·

5

min read

Category: Data Manipulation Language (DML)

Platform Support:

✅ Snowflake | ✅ BigQuery | ✅ Databricks

Description

MERGE performs "upsert" operations - INSERT new rows or UPDATE existing rows based on a matching condition. It's essential for synchronizing tables, incremental loads, and slowly changing dimensions. MERGE is more efficient than separate INSERT and UPDATE statements.

Syntax

MERGE INTO target_table
USING source_table
ON match_condition
WHEN MATCHED THEN
    UPDATE SET column1 = value1, column2 = value2
WHEN NOT MATCHED THEN
    INSERT (column1, column2) VALUES (value1, value2)
[WHEN NOT MATCHED BY SOURCE THEN
    DELETE];  -- Snowflake only
MERGE INTO target_table
USING source_table
ON match_condition
WHEN MATCHED THEN
    UPDATE SET column1 = value1, column2 = value2
WHEN NOT MATCHED THEN
    INSERT (column1, column2) VALUES (value1, value2)
[WHEN NOT MATCHED BY SOURCE THEN
    DELETE];  -- Snowflake only
MERGE INTO target_table
USING source_table
ON match_condition
WHEN MATCHED THEN
    UPDATE SET column1 = value1, column2 = value2
WHEN NOT MATCHED THEN
    INSERT (column1, column2) VALUES (value1, value2)
[WHEN NOT MATCHED BY SOURCE THEN
    DELETE];  -- Snowflake only

Platform-Specific Notes

Snowflake:

  • Supports all three clauses: MATCHED, NOT MATCHED, NOT MATCHED BY SOURCE

  • Can have multiple WHEN MATCHED/NOT MATCHED clauses with AND conditions

  • Optimized for large-scale upserts

  • Supports DELETE in WHEN NOT MATCHED BY SOURCE

BigQuery:

  • Supports MATCHED and NOT MATCHED

  • Does NOT support NOT MATCHED BY SOURCE natively

  • Requires MERGE quota (DML operations)

  • Can use UPDATE, INSERT, DELETE in clauses

Databricks:

  • Requires Delta Lake tables

  • Supports all MERGE operations

  • Optimized for large-scale data

  • Supports schema evolution

Example 1: Basic MERGE (Upsert)

All Platforms:

-- Update existing customers and insert new ones
MERGE INTO customers target
USING customer_updates source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN
    UPDATE SET
        target.name = source.name,
        target.email = source.email,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (customer_id, name, email, created_at)
    VALUES (source.customer_id, source.name, source.email, CURRENT_TIMESTAMP)

-- Update existing customers and insert new ones
MERGE INTO customers target
USING customer_updates source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN
    UPDATE SET
        target.name = source.name,
        target.email = source.email,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (customer_id, name, email, created_at)
    VALUES (source.customer_id, source.name, source.email, CURRENT_TIMESTAMP)

-- Update existing customers and insert new ones
MERGE INTO customers target
USING customer_updates source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN
    UPDATE SET
        target.name = source.name,
        target.email = source.email,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (customer_id, name, email, created_at)
    VALUES (source.customer_id, source.name, source.email, CURRENT_TIMESTAMP)

Sample Data:

customers (target - BEFORE):

customer_id

name

email

updated_at

1

John Smith

john@email.com

2024-01-10

2

Maria Garcia

maria@email.com

2024-01-10

customer_updates (source):

customer_id

name

email

1

John Smith

john.new@email.com

2

Maria Garcia

maria@email.com

3

Sarah Johnson

sarah@email.com

customers (target - AFTER):

customer_id

name

email

updated_at

1

John Smith

john.new@email.com

2024-01-20

2

Maria Garcia

maria@email.com

2024-01-10

3

Sarah Johnson

sarah@email.com

2024-01-20

Row 1: Updated (email changed), Row 2: No change (matched but same), Row 3: Inserted (new)

Example 2: Conditional MERGE

All Platforms:

-- Update only if source is newer, insert new records
MERGE INTO product_inventory target
USING product_updates source
ON target.product_id = source.product_id
WHEN MATCHED AND source.last_updated > target.last_updated THEN
    UPDATE SET
        target.quantity = source.quantity,
        target.price = source.price,
        target.last_updated = source.last_updated
WHEN NOT MATCHED THEN
    INSERT (product_id, product_name, quantity, price, last_updated)
    VALUES (source.product_id, source.product_name, source.quantity, source.price, source.last_updated)

-- Update only if source is newer, insert new records
MERGE INTO product_inventory target
USING product_updates source
ON target.product_id = source.product_id
WHEN MATCHED AND source.last_updated > target.last_updated THEN
    UPDATE SET
        target.quantity = source.quantity,
        target.price = source.price,
        target.last_updated = source.last_updated
WHEN NOT MATCHED THEN
    INSERT (product_id, product_name, quantity, price, last_updated)
    VALUES (source.product_id, source.product_name, source.quantity, source.price, source.last_updated)

-- Update only if source is newer, insert new records
MERGE INTO product_inventory target
USING product_updates source
ON target.product_id = source.product_id
WHEN MATCHED AND source.last_updated > target.last_updated THEN
    UPDATE SET
        target.quantity = source.quantity,
        target.price = source.price,
        target.last_updated = source.last_updated
WHEN NOT MATCHED THEN
    INSERT (product_id, product_name, quantity, price, last_updated)
    VALUES (source.product_id, source.product_name, source.quantity, source.price, source.last_updated)

Sample Data:

product_inventory (target - BEFORE):

product_id

product_name

quantity

price

last_updated

101

Laptop

50

999.99

2024-01-15

102

Mouse

200

29.99

2024-01-10

product_updates (source):

product_id

product_name

quantity

price

last_updated

101

Laptop

45

999.99

2024-01-12

102

Mouse

180

24.99

2024-01-18

103

Keyboard

150

89.99

2024-01-20

product_inventory (target - AFTER):

product_id

product_name

quantity

price

last_updated

101

Laptop

50

999.99

2024-01-15

102

Mouse

180

24.99

2024-01-18

103

Keyboard

150

89.99

2024-01-20

Product 101: Not updated (source is older), Product 102: Updated (source is newer), Product 103: Inserted

Example 3: MERGE with Aggregation Source

All Platforms:

-- Update customer summary from orders
MERGE INTO customer_summary target
USING (
    SELECT 
        customer_id,
        COUNT(*) as total_orders,
        SUM(total) as lifetime_value,
        AVG(total) as avg_order_value,
        MAX(order_date) as last_order_date
    FROM orders
    WHERE order_date >= '2024-01-01'
    GROUP BY customer_id
) source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN
    UPDATE SET
        target.total_orders = target.total_orders + source.total_orders,
        target.lifetime_value = target.lifetime_value + source.lifetime_value,
        target.last_order_date = source.last_order_date,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (customer_id, total_orders, lifetime_value, last_order_date, created_at)
    VALUES (source.customer_id, source.total_orders, source.lifetime_value, source.last_order_date, CURRENT_TIMESTAMP)

-- Update customer summary from orders
MERGE INTO customer_summary target
USING (
    SELECT 
        customer_id,
        COUNT(*) as total_orders,
        SUM(total) as lifetime_value,
        AVG(total) as avg_order_value,
        MAX(order_date) as last_order_date
    FROM orders
    WHERE order_date >= '2024-01-01'
    GROUP BY customer_id
) source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN
    UPDATE SET
        target.total_orders = target.total_orders + source.total_orders,
        target.lifetime_value = target.lifetime_value + source.lifetime_value,
        target.last_order_date = source.last_order_date,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (customer_id, total_orders, lifetime_value, last_order_date, created_at)
    VALUES (source.customer_id, source.total_orders, source.lifetime_value, source.last_order_date, CURRENT_TIMESTAMP)

-- Update customer summary from orders
MERGE INTO customer_summary target
USING (
    SELECT 
        customer_id,
        COUNT(*) as total_orders,
        SUM(total) as lifetime_value,
        AVG(total) as avg_order_value,
        MAX(order_date) as last_order_date
    FROM orders
    WHERE order_date >= '2024-01-01'
    GROUP BY customer_id
) source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN
    UPDATE SET
        target.total_orders = target.total_orders + source.total_orders,
        target.lifetime_value = target.lifetime_value + source.lifetime_value,
        target.last_order_date = source.last_order_date,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (customer_id, total_orders, lifetime_value, last_order_date, created_at)
    VALUES (source.customer_id, source.total_orders, source.lifetime_value, source.last_order_date, CURRENT_TIMESTAMP)

Result:

  • Existing customers: Updated with new order statistics

  • New customers: Inserted with initial statistics

Example 4: MERGE with DELETE (Snowflake)

Snowflake Only:

-- Sync product catalog - update, insert, and delete discontinued products
MERGE INTO product_catalog target
USING external_product_feed source
ON target.product_id = source.product_id
WHEN MATCHED THEN
    UPDATE SET
        target.product_name = source.product_name,
        target.price = source.price,
        target.status = 'Active',
        target.last_updated = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (product_id, product_name, price, status, created_at)
    VALUES (source.product_id, source.product_name, source.price, 'Active', CURRENT_TIMESTAMP)
WHEN NOT MATCHED BY SOURCE THEN
    UPDATE SET
        target.status = 'Discontinued',
        target.discontinued_at = CURRENT_TIMESTAMP

-- Sync product catalog - update, insert, and delete discontinued products
MERGE INTO product_catalog target
USING external_product_feed source
ON target.product_id = source.product_id
WHEN MATCHED THEN
    UPDATE SET
        target.product_name = source.product_name,
        target.price = source.price,
        target.status = 'Active',
        target.last_updated = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (product_id, product_name, price, status, created_at)
    VALUES (source.product_id, source.product_name, source.price, 'Active', CURRENT_TIMESTAMP)
WHEN NOT MATCHED BY SOURCE THEN
    UPDATE SET
        target.status = 'Discontinued',
        target.discontinued_at = CURRENT_TIMESTAMP

-- Sync product catalog - update, insert, and delete discontinued products
MERGE INTO product_catalog target
USING external_product_feed source
ON target.product_id = source.product_id
WHEN MATCHED THEN
    UPDATE SET
        target.product_name = source.product_name,
        target.price = source.price,
        target.status = 'Active',
        target.last_updated = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (product_id, product_name, price, status, created_at)
    VALUES (source.product_id, source.product_name, source.price, 'Active', CURRENT_TIMESTAMP)
WHEN NOT MATCHED BY SOURCE THEN
    UPDATE SET
        target.status = 'Discontinued',
        target.discontinued_at = CURRENT_TIMESTAMP

BigQuery/Databricks Alternative:

-- Two-step process for BigQuery/Databricks
-- Step 1: Merge
MERGE INTO product_catalog target
USING external_product_feed source
ON target.product_id = source.product_id
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT ...;

-- Step 2: Mark discontinued products
UPDATE product_catalog
SET status = 'Discontinued',
    discontinued_at = CURRENT_TIMESTAMP
WHERE product_id NOT IN (SELECT product_id FROM external_product_feed)
  AND status = 'Active'

-- Two-step process for BigQuery/Databricks
-- Step 1: Merge
MERGE INTO product_catalog target
USING external_product_feed source
ON target.product_id = source.product_id
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT ...;

-- Step 2: Mark discontinued products
UPDATE product_catalog
SET status = 'Discontinued',
    discontinued_at = CURRENT_TIMESTAMP
WHERE product_id NOT IN (SELECT product_id FROM external_product_feed)
  AND status = 'Active'

-- Two-step process for BigQuery/Databricks
-- Step 1: Merge
MERGE INTO product_catalog target
USING external_product_feed source
ON target.product_id = source.product_id
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT ...;

-- Step 2: Mark discontinued products
UPDATE product_catalog
SET status = 'Discontinued',
    discontinued_at = CURRENT_TIMESTAMP
WHERE product_id NOT IN (SELECT product_id FROM external_product_feed)
  AND status = 'Active'

Example 5: Slowly Changing Dimension Type 2

All Platforms:

-- SCD Type 2: Keep history of changes
MERGE INTO customer_dimension target
USING customer_updates source
ON target.customer_id = source.customer_id AND target.is_current = TRUE
WHEN MATCHED AND (
    target.name != source.name OR
    target.email != source.email OR
    target.address != source.address
) THEN
    UPDATE SET
        target.is_current = FALSE,
        target.end_date = CURRENT_DATE,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (customer_id, name, email, address, is_current, start_date, end_date, created_at)
    VALUES (source.customer_id, source.name, source.email, source.address, TRUE, CURRENT_DATE, NULL, CURRENT_TIMESTAMP);

-- Insert new versions of changed records
INSERT INTO customer_dimension (customer_id, name, email, address, is_current, start_date, end_date, created_at)
SELECT 
    source.customer_id,
    source.name,
    source.email,
    source.address,
    TRUE as is_current,
    CURRENT_DATE as start_date,
    NULL as end_date,
    CURRENT_TIMESTAMP as created_at
FROM customer_updates source
INNER JOIN customer_dimension target
    ON target.customer_id = source.customer_id 
    AND target.is_current = FALSE
    AND target.end_date = CURRENT_DATE

-- SCD Type 2: Keep history of changes
MERGE INTO customer_dimension target
USING customer_updates source
ON target.customer_id = source.customer_id AND target.is_current = TRUE
WHEN MATCHED AND (
    target.name != source.name OR
    target.email != source.email OR
    target.address != source.address
) THEN
    UPDATE SET
        target.is_current = FALSE,
        target.end_date = CURRENT_DATE,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (customer_id, name, email, address, is_current, start_date, end_date, created_at)
    VALUES (source.customer_id, source.name, source.email, source.address, TRUE, CURRENT_DATE, NULL, CURRENT_TIMESTAMP);

-- Insert new versions of changed records
INSERT INTO customer_dimension (customer_id, name, email, address, is_current, start_date, end_date, created_at)
SELECT 
    source.customer_id,
    source.name,
    source.email,
    source.address,
    TRUE as is_current,
    CURRENT_DATE as start_date,
    NULL as end_date,
    CURRENT_TIMESTAMP as created_at
FROM customer_updates source
INNER JOIN customer_dimension target
    ON target.customer_id = source.customer_id 
    AND target.is_current = FALSE
    AND target.end_date = CURRENT_DATE

-- SCD Type 2: Keep history of changes
MERGE INTO customer_dimension target
USING customer_updates source
ON target.customer_id = source.customer_id AND target.is_current = TRUE
WHEN MATCHED AND (
    target.name != source.name OR
    target.email != source.email OR
    target.address != source.address
) THEN
    UPDATE SET
        target.is_current = FALSE,
        target.end_date = CURRENT_DATE,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (customer_id, name, email, address, is_current, start_date, end_date, created_at)
    VALUES (source.customer_id, source.name, source.email, source.address, TRUE, CURRENT_DATE, NULL, CURRENT_TIMESTAMP);

-- Insert new versions of changed records
INSERT INTO customer_dimension (customer_id, name, email, address, is_current, start_date, end_date, created_at)
SELECT 
    source.customer_id,
    source.name,
    source.email,
    source.address,
    TRUE as is_current,
    CURRENT_DATE as start_date,
    NULL as end_date,
    CURRENT_TIMESTAMP as created_at
FROM customer_updates source
INNER JOIN customer_dimension target
    ON target.customer_id = source.customer_id 
    AND target.is_current = FALSE
    AND target.end_date = CURRENT_DATE

Result:

  • Changed records: Old version marked as not current with end_date, new version inserted

  • Unchanged records: No action

  • New customers: Inserted as current

Example 6: MERGE with Complex Logic

All Platforms:

MERGE INTO sales_targets target
USING (
    SELECT 
        employee_id,
        EXTRACT(YEAR FROM sale_date) as year,
        EXTRACT(QUARTER FROM sale_date) as quarter,
        SUM(amount) as actual_sales
    FROM sales
    GROUP BY employee_id, EXTRACT(YEAR FROM sale_date), EXTRACT(QUARTER FROM sale_date)
) actual
ON target.employee_id = actual.employee_id
   AND target.year = actual.year
   AND target.quarter = actual.quarter
WHEN MATCHED THEN
    UPDATE SET
        target.actual_sales = actual.actual_sales,
        target.variance = actual.actual_sales - target.target_sales,
        target.variance_pct = (actual.actual_sales - target.target_sales) * 100.0 / target.target_sales,
        target.status = CASE
            WHEN actual.actual_sales >= target.target_sales THEN 'Met'
            WHEN actual.actual_sales >= target.target_sales * 0.9 THEN 'Near'
            ELSE 'Below'
        END,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (employee_id, year, quarter, actual_sales, target_sales, variance, status, created_at)
    VALUES (actual.employee_id, actual.year, actual.quarter, actual.actual_sales, 0, actual.actual_sales, 'No Target', CURRENT_TIMESTAMP)

MERGE INTO sales_targets target
USING (
    SELECT 
        employee_id,
        EXTRACT(YEAR FROM sale_date) as year,
        EXTRACT(QUARTER FROM sale_date) as quarter,
        SUM(amount) as actual_sales
    FROM sales
    GROUP BY employee_id, EXTRACT(YEAR FROM sale_date), EXTRACT(QUARTER FROM sale_date)
) actual
ON target.employee_id = actual.employee_id
   AND target.year = actual.year
   AND target.quarter = actual.quarter
WHEN MATCHED THEN
    UPDATE SET
        target.actual_sales = actual.actual_sales,
        target.variance = actual.actual_sales - target.target_sales,
        target.variance_pct = (actual.actual_sales - target.target_sales) * 100.0 / target.target_sales,
        target.status = CASE
            WHEN actual.actual_sales >= target.target_sales THEN 'Met'
            WHEN actual.actual_sales >= target.target_sales * 0.9 THEN 'Near'
            ELSE 'Below'
        END,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (employee_id, year, quarter, actual_sales, target_sales, variance, status, created_at)
    VALUES (actual.employee_id, actual.year, actual.quarter, actual.actual_sales, 0, actual.actual_sales, 'No Target', CURRENT_TIMESTAMP)

MERGE INTO sales_targets target
USING (
    SELECT 
        employee_id,
        EXTRACT(YEAR FROM sale_date) as year,
        EXTRACT(QUARTER FROM sale_date) as quarter,
        SUM(amount) as actual_sales
    FROM sales
    GROUP BY employee_id, EXTRACT(YEAR FROM sale_date), EXTRACT(QUARTER FROM sale_date)
) actual
ON target.employee_id = actual.employee_id
   AND target.year = actual.year
   AND target.quarter = actual.quarter
WHEN MATCHED THEN
    UPDATE SET
        target.actual_sales = actual.actual_sales,
        target.variance = actual.actual_sales - target.target_sales,
        target.variance_pct = (actual.actual_sales - target.target_sales) * 100.0 / target.target_sales,
        target.status = CASE
            WHEN actual.actual_sales >= target.target_sales THEN 'Met'
            WHEN actual.actual_sales >= target.target_sales * 0.9 THEN 'Near'
            ELSE 'Below'
        END,
        target.updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
    INSERT (employee_id, year, quarter, actual_sales, target_sales, variance, status, created_at)
    VALUES (actual.employee_id, actual.year, actual.quarter, actual.actual_sales, 0, actual.actual_sales, 'No Target', CURRENT_TIMESTAMP)

MERGE Performance Comparison

Approach

Operations

Efficiency

Use Case

MERGE

1 statement

High

Standard upserts

INSERT + UPDATE

2 statements

Medium

Simple scenarios

DELETE + INSERT

2 statements

Low

Full refresh

Performance Example:

  • MERGE: ~10 seconds for 1M rows

  • Separate INSERT/UPDATE: ~25 seconds for 1M rows

Common Use Cases

  1. Data synchronization: Keep tables in sync across systems

  2. Incremental loads: Load only new/changed data

  3. Slowly changing dimensions: Track historical changes

  4. Customer 360: Consolidate customer data from multiple sources

  5. Inventory updates: Sync stock levels from warehouse systems

  6. Master data management: Maintain golden records

MERGE vs Other Operations

Operation

When to Use

Pros

Cons

MERGE

Upserts, sync

Single operation, efficient

Complex syntax

INSERT + UPDATE

Simple loads

Easy to understand

Two operations

INSERT ... ON CONFLICT

PostgreSQL-style

Simple syntax

Not standard SQL

DELETE + INSERT

Full refresh

Simple

Loses data temporarily

Performance Tips

  1. Index join columns: Critical for ON condition performance

  2. Filter source data: Only include changed records when possible

  3. Use staging tables: Prepare and validate data first

  4. Partition tables: Match on partition keys when possible

  5. Batch operations: Process in chunks for very large datasets

  6. Analyze query plans: Use EXPLAIN to optimize

Best Practices

  1. Test with small datasets: Verify logic before production

  2. Use transactions: Ensure atomicity

  3. Validate match conditions: Ensure uniqueness in ON clause

  4. Monitor performance: Track execution time and rows affected

  5. Handle NULLs: Be explicit about NULL handling in comparisons

  6. Document business logic: Explain when updates vs inserts occur

  7. Add audit columns: Track created_at, updated_at timestamps

  8. Consider idempotency: Ensure MERGE can run multiple times safely

Interested to Learn More?
Try Out the Free 14-Days Trial

More Articles

decorative icon

Experience Analytics for the AI-Era

Start your 14-day trial today - it's free and no credit card needed

decorative icon

Experience Analytics for the AI-Era

Start your 14-day trial today - it's free and no credit card needed

decorative icon

Experience Analytics for the AI-Era

Start your 14-day trial today - it's free and no credit card needed

Copyright © 2026 Paradime Labs, Inc.

Made with ❤️ in San Francisco ・ London

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.

Copyright © 2026 Paradime Labs, Inc.

Made with ❤️ in San Francisco ・ London

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.

Copyright © 2026 Paradime Labs, Inc.

Made with ❤️ in San Francisco ・ London

*dbt® and dbt Core® are federally registered trademarks of dbt Labs, Inc. in the United States and various jurisdictions around the world. Paradime is not a partner of dbt Labs. All rights therein are reserved to dbt Labs. Paradime is not a product or service of or endorsed by dbt Labs, Inc.