How to Solve a Complex Data Engineering Merge Problem with #Snowflake “MERGE ALL BY NAME”
Snowflake recently enhanced the MERGE statement with an ALL BY NAME option. So, I decided to try it out and see how it really enables data engineers to leverage the new feature.
Maintaining MERGE logic in ELT pipelines has always been a slow, error-prone process, especially when schemas evolve frequently. With constant upstream changes, downstream analytics teams like ours are often left catching up, ensuring every transformation still aligns with the latest source structure.
How It Used to Be – The Old Way
Previously, each MERGE meant listing every column explicitly:
MERGE INTO target t
USING source s
ON t.id = s.id
WHEN MATCHED THEN
UPDATE SET
t.col1 = s.col1,
t.col2 = s.col2
WHEN NOT MATCHED THEN
INSERT (col1, col2)
VALUES (s.col1, s.col2);
Any new column upstream meant editing MERGE scripts by hand, slowing down agile data operations.
Snowflake's New Way – MERGE ALL BY NAME
Snowflake’s most recent release introduced MERGE ALL BY NAME, which maps columns automatically based on their names:
MERGE INTO target AS t
USING source AS s
ON t.id = s.id
WHEN MATCHED THEN
UPDATE
ALL BY NAME
WHEN NOT MATCHED THEN
INSERT
ALL BY NAME;
That's it! No mapping lists. No worries about column order. As long as the source and target share the same column names, the merge just works.
Cheers, happy ending... wait, not quite yet! 😄
Real-World Challenge – Duplicate Source Rows
Well, that's not the whole story in real-world scenarios. Every data engineer knows you can't fully trust source data that claims to have no duplicates; that's a nightmare waiting to break your pipelines if you are not prepared!
Most of the time, upstream sources send multiple rows for the same ID - especially when data is aggregated from various systems and combined together.
You end up with multiple source rows matching the same target row, and Snowflake rejects the statement as a nondeterministic merge (unless the ERROR_ON_NONDETERMINISTIC_MERGE parameter has been disabled).
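Before fixing anything, it helps to confirm the problem. Here is a minimal sketch, reusing the `source` table and `id` key from the examples above, that surfaces the offending keys:

```sql
-- List every ID that appears more than once in the source,
-- i.e. the rows that would make the MERGE nondeterministic
SELECT id, COUNT(*) AS row_count
FROM source
GROUP BY id
HAVING COUNT(*) > 1
ORDER BY row_count DESC;
```

If this returns any rows, the plain MERGE will fail (or silently pick an arbitrary row, depending on session settings), so deduplication is required first.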
The solution was to deduplicate before merging:
WITH deduped AS (
SELECT *
FROM source
QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY last_updated DESC) = 1
)
MERGE INTO target
USING deduped
ON target.id = deduped.id
WHEN MATCHED THEN UPDATE ALL BY NAME
WHEN NOT MATCHED THEN INSERT ALL BY NAME;
Additional Scenarios to Watch
Schema Drift
Matching by name won't work if the source has extra columns or the target is missing some. Always validate schemas before merging.
Deeper schema-drift handling is a whole different topic to discuss. Maybe another blog post on this specific topic later!
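As a lightweight pre-merge validation, you can diff column names via INFORMATION_SCHEMA. This is a sketch assuming both tables live in the current database and schema (table names are stored uppercase by default in Snowflake):

```sql
-- Columns present in SOURCE but missing from TARGET;
-- swap the two halves for the mirror check
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'SOURCE'
MINUS
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'TARGET';
```

An empty result from both directions means the name-based mapping is safe to run.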
CDC Pipelines
For change-data-capture tasks, handle deletes explicitly in the MERGE statement, based on each source record's delete status:
WHEN MATCHED AND s.operation = 'DELETE' THEN DELETE
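Putting it together, a CDC-style merge might look like the sketch below. Note that `operation` is a hypothetical flag column assumed to exist in the source; per the name-matching rule discussed earlier, ALL BY NAME would expect it in the target as well, otherwise you would fall back to explicit column lists for this branch:

```sql
MERGE INTO target AS t
USING source AS s
ON t.id = s.id
-- Deletes first, so a matched delete record is not updated instead
WHEN MATCHED AND s.operation = 'DELETE' THEN DELETE
WHEN MATCHED THEN UPDATE ALL BY NAME
-- Don't resurrect rows that arrive already flagged as deleted
WHEN NOT MATCHED AND s.operation <> 'DELETE' THEN INSERT ALL BY NAME;
```

Clause order matters here: Snowflake evaluates the WHEN MATCHED branches in order, so the DELETE condition must come before the unconditional UPDATE.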
🔑 Key Lessons
Snowflake’s evolution is making data engineering more efficient. Still, the responsibility lies with us to design pipelines that handle edge cases and keep production reliable.
#Snowflake #DataEngineering #ELT #SQL #CloudDataWarehouse #SnowflakeRelease #DataPipeline #DataArchitecture #DataOps #SchemaDrift #ChangeDataCapture #CDC #MergeByName #ETL #AzureData #RetailAnalytics #dbt #DataEngineerLife