How to Leverage Snowflake ML for E-Commerce Personalization: A Developer’s Guide
In today’s competitive e-commerce landscape, providing personalized experiences for customers is no longer a luxury; it’s a fundamental driver of success. Companies that leverage AI-driven analytics, data science, and machine learning are increasingly outpacing their competitors, and consumers now expect tailored recommendations and dynamic shopping experiences. Enter Snowflake ML.
With Snowflake ML, developers and analysts can use plain SQL inside Snowflake to:
- Segment customers based on their purchase behavior
- Train and tune classification models
- Score customers and persist predictions for downstream personalization
In this article, we’ll explore how Snowflake ML enables simple, SQL-driven personalization for modern e-commerce experiences. You’ll see how to bring customer data into Snowflake, segment shoppers based on behavior, and use Snowflake ML to build an intelligent model that predicts high-value customers. Whether you’re a developer building personalization workflows or an analyst improving marketing effectiveness, these hands-on steps will help you get started quickly.
Start by logging into your Snowflake account through the Snowflake web interface. If you don’t have an account, or just need a test account to follow along, you can sign up for a free trial.
Step 1: Load and Prepare Your Data
We’ll start by creating a small dummy dataset of customer orders.
Run this entire block in a Snowflake SQL worksheet:
-- Step 1.0: Create a database and schema
CREATE OR REPLACE DATABASE DATACLOUDDISPATCHSI;
USE DATABASE DATACLOUDDISPATCHSI;
CREATE OR REPLACE SCHEMA ECOMMERCE;
USE SCHEMA ECOMMERCE;
-- Step 1.1: Create a customer orders table
CREATE OR REPLACE TABLE CUSTOMER_ORDERS (
CUSTOMER_ID NUMBER,
ORDER_ID NUMBER,
ORDER_DATE DATE,
ORDER_VALUE NUMBER(10,2),
PRODUCT_ID NUMBER
);
-- Step 1.2: Insert sample order data
INSERT INTO CUSTOMER_ORDERS (CUSTOMER_ID, ORDER_ID, ORDER_DATE, ORDER_VALUE, PRODUCT_ID) VALUES
(1001,50001,'2023-01-15', 89.99,201),
(1001,50022,'2023-03-02',120.49,305),
(1002,50110,'2023-05-11', 45.00,110),
(1003,50155,'2023-02-19',239.00,402),
(1003,50190,'2023-05-22',130.00,233),
(1003,50201,'2023-06-01', 99.99,110),
(1004,50333,'2023-01-05', 19.99,502),
(1001,50390,'2023-11-11',301.00,900),
(1005,50400,'2023-12-12', 67.50,702);
-- Step 1.3: Verify data
SELECT * FROM CUSTOMER_ORDERS ORDER BY ORDER_DATE;
This small dataset is enough to build customer segments and train a simple model: it includes repeat purchases, varying order values, and the fields we’ll need for segmentation and ML modeling.
Use a Snowflake Worksheet
All of the SQL in Steps 1–3 of this guide is pasted and executed in a standard SQL worksheet. Snowflake also offers Workspaces, Notebooks, and other project tools; if you prefer a visual interface for loading data, Workspaces make it easy to drag and drop files (including Excel and CSV) directly into your environment.
Step 2: Build Customer Segments Using SQL
Snowflake lets you integrate machine learning models that predict customer behavior, suggest products, and recommend promotions. If you’re a developer, you can also use Python with Snowflake’s data science tooling to deploy a model that takes in customer data and outputs personalized recommendations; in this guide, though, we stay entirely in SQL.
A simple personalization strategy is segmenting customers based on their historical behavior. For each customer, we’ll calculate:
- Purchase frequency (PURCHASE_COUNT): how many orders they placed
- Average order value (AVG_ORDER_VALUE): how much they spend per order
USE DATABASE DATACLOUDDISPATCHSI;
USE SCHEMA ECOMMERCE;
-- Step 2.1: Create customer segments
CREATE OR REPLACE TABLE CUSTOMER_SEGMENTS AS
SELECT
CUSTOMER_ID,
COUNT(ORDER_ID) AS PURCHASE_COUNT,
AVG(ORDER_VALUE) AS AVG_ORDER_VALUE
FROM CUSTOMER_ORDERS
WHERE ORDER_DATE BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY CUSTOMER_ID;
-- Step 2.2: Inspect customer segments
SELECT * FROM CUSTOMER_SEGMENTS ORDER BY PURCHASE_COUNT DESC;
This CUSTOMER_SEGMENTS table becomes the foundation for your ML model.
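For intuition, here is a minimal Python sketch (not part of the Snowflake workflow) that reproduces the same COUNT/AVG aggregation over the sample orders from Step 1:

```python
from collections import defaultdict

# Sample orders from Step 1: (customer_id, order_value)
orders = [
    (1001, 89.99), (1001, 120.49), (1002, 45.00),
    (1003, 239.00), (1003, 130.00), (1003, 99.99),
    (1004, 19.99), (1001, 301.00), (1005, 67.50),
]

# Group order values per customer (the GROUP BY CUSTOMER_ID step)
by_customer = defaultdict(list)
for customer_id, value in orders:
    by_customer[customer_id].append(value)

# COUNT(ORDER_ID) and AVG(ORDER_VALUE) per customer
segments = {
    cid: {
        "PURCHASE_COUNT": len(values),
        "AVG_ORDER_VALUE": round(sum(values) / len(values), 2),
    }
    for cid, values in by_customer.items()
}

print(segments[1003])  # {'PURCHASE_COUNT': 3, 'AVG_ORDER_VALUE': 156.33}
```

Customers 1001 and 1003 each have three orders, which is exactly the repeat-purchase signal the model will learn from.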
Step 3: Train & Deploy a Machine Learning Model (SQL-Only with Snowflake ML)
Snowflake ML lets you train models directly using SQL—no Python or external tools required.
We will:
- Create a labeled training table from the customer segments
- Train a classification model with Snowflake ML
- Score customers with the model and extract the predicted class
- Optionally persist the scores for downstream use
Step 3.1: Create a Training Table
Before we can train a machine learning model in Snowflake, we need to give the model examples of what we want it to learn. This means creating a table that includes:
- The input features the model learns from (here, PURCHASE_COUNT and AVG_ORDER_VALUE)
- A target label that gives the correct answer for each row
In our case, we want the model to identify high-value customers, so we must create a column that tells Snowflake which customers are considered high-value in our historical data. That is exactly what the training table does: it takes the customer segments we created in Step 2 and adds a label column. Snowflake ML then uses this labeled table to learn the patterns that characterize high-value customers.
-- Step 3.1: Add a target label for modeling
CREATE OR REPLACE TABLE CUSTOMER_SEGMENTS_TRAIN AS
SELECT
CUSTOMER_ID,
PURCHASE_COUNT,
AVG_ORDER_VALUE,
IFF(PURCHASE_COUNT >= 3, 1, 0) AS TARGET_HIGH_VALUE
FROM CUSTOMER_SEGMENTS;
SELECT * FROM CUSTOMER_SEGMENTS_TRAIN ORDER BY PURCHASE_COUNT DESC;
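The labeling rule itself is just a threshold. As a quick sanity check, here is the same IFF(PURCHASE_COUNT >= 3, 1, 0) logic in Python, applied to the purchase counts from Step 2:

```python
# PURCHASE_COUNT per customer, as computed in Step 2
purchase_counts = {1001: 3, 1002: 1, 1003: 3, 1004: 1, 1005: 1}

# IFF(PURCHASE_COUNT >= 3, 1, 0) -> TARGET_HIGH_VALUE
labels = {cid: int(count >= 3) for cid, count in purchase_counts.items()}

print(labels)  # {1001: 1, 1002: 0, 1003: 1, 1004: 0, 1005: 0}
```

Only customers 1001 and 1003 cross the three-purchase threshold, so they are the positive examples the model trains on.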
Step 3.2: Train a Classification Model Using Snowflake ML
Now that we have a labeled training table, we can teach Snowflake ML to recognize patterns that indicate a high-value customer. Training a classification model tells Snowflake:
- Which table contains the training examples (INPUT_DATA)
- Which column holds the label to predict (TARGET_COLNAME)
-- Step 3.2: Train the classification model
CREATE OR REPLACE SNOWFLAKE.ML.CLASSIFICATION HIGH_VALUE_MODEL (
INPUT_DATA => SYSTEM$REFERENCE('TABLE', 'ECOMMERCE.CUSTOMER_SEGMENTS_TRAIN'),
TARGET_COLNAME => 'TARGET_HIGH_VALUE'
);
Snowflake automatically trains and tunes the model based on your training table.
(Optional) View metrics:
CALL HIGH_VALUE_MODEL!SHOW_EVALUATION_METRICS();
Step 3.3: Score Customers with the Model (SQL)
Once the model is trained, the next step is to use it to make predictions. In this case, we want the model to look at each customer’s purchase behavior and decide whether they are likely to be a high-value customer.
This SQL command sends each customer’s information (purchase count and average order value) into the model and returns a prediction:
-- Step 3.3: Score customers
SELECT
s.CUSTOMER_ID,
s.PURCHASE_COUNT,
s.AVG_ORDER_VALUE,
HIGH_VALUE_MODEL!PREDICT(
INPUT_DATA => OBJECT_CONSTRUCT(
'PURCHASE_COUNT', s.PURCHASE_COUNT,
'AVG_ORDER_VALUE', s.AVG_ORDER_VALUE
)
) AS MODEL_OUTPUT
FROM CUSTOMER_SEGMENTS AS s;
What is MODEL_OUTPUT?
Snowflake returns the model’s prediction as a VARIANT value: a structured object containing the predicted class and its probabilities. You do not run or execute it; it’s simply the result Snowflake displays.
To make the prediction easier to use, extract just the predicted class (0 or 1):
SELECT
CUSTOMER_ID,
PURCHASE_COUNT,
AVG_ORDER_VALUE,
HIGH_VALUE_MODEL!PREDICT(
INPUT_DATA => OBJECT_CONSTRUCT(
'PURCHASE_COUNT', PURCHASE_COUNT,
'AVG_ORDER_VALUE', AVG_ORDER_VALUE
)
):class::NUMBER AS PREDICTED_HIGH_VALUE
FROM CUSTOMER_SEGMENTS;
This gives you a clean 0/1 indicator of whether a customer is considered “high-value.”
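To see what that extraction does, here is a small Python sketch that parses a prediction object of the same general shape (a predicted class plus per-class probabilities); the sample values below are made up:

```python
import json

# An illustrative prediction VARIANT, as a JSON document:
# a predicted class plus per-class probabilities
model_output = json.loads("""
{
  "class": "1",
  "probability": {"0": 0.12, "1": 0.88}
}
""")

# Equivalent of extracting the predicted class and casting
# it to a number in SQL
predicted_high_value = int(model_output["class"])

print(predicted_high_value)  # 1
```

The cast matters because the class may come back as a string; casting normalizes it to a clean 0/1 numeric flag.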
Step 3.4: Persist Personalized Scores (Optional)
Up to this point, you’ve used the model to generate predictions directly in a query, which is great for exploration. In the real world, though, you’ll want to store those predictions in a table so they can be reused by dashboards, applications, and marketing tools.
The SQL below creates a new table called CUSTOMER_VALUE_SCORES that contains each customer, their purchase behavior, and the model’s prediction.
CREATE OR REPLACE TABLE CUSTOMER_VALUE_SCORES AS
SELECT
CUSTOMER_ID,
PURCHASE_COUNT,
AVG_ORDER_VALUE,
HIGH_VALUE_MODEL!PREDICT(
INPUT_DATA => OBJECT_CONSTRUCT(
'PURCHASE_COUNT', PURCHASE_COUNT,
'AVG_ORDER_VALUE', AVG_ORDER_VALUE
)
):class::NUMBER AS PREDICTED_HIGH_VALUE
FROM CUSTOMER_SEGMENTS;
SELECT * FROM CUSTOMER_VALUE_SCORES ORDER BY PREDICTED_HIGH_VALUE DESC;
Now you have a table ready for downstream personalization. You can consistently reference these scores to target high-value customers, trigger personalized offers, and drive product recommendations.
Step 4: Real-Time Personalization
Now that you have prediction scores for each customer, you can combine them with real-time activity—behaviors such as recent clicks, cart additions, and session history—to deliver even smarter personalization.
For more advanced use cases, Snowflake also supports Online Feature Stores, which allow applications (such as websites or recommendation engines) to fetch the latest customer features—like recent clicks, session history, or model-driven scores—in milliseconds. This is ideal when you need real-time personalization directly inside your application’s user experience rather than on a batch schedule.
Snowflake can ingest this type of streaming data using tools like Kafka, Kinesis, or Event Hubs, allowing you to continuously update recommendations as customer behavior changes.
To keep your personalization data fresh, you can also schedule a Snowflake task to regularly update a recommendations table. For example, here’s a simplified task that runs every hour and refreshes your top product recommendations:
-- Example: Update recommendations periodically
-- (assumes LATEST_RECOMMENDATIONS and RECOMMENDATIONS_STREAM already exist)
CREATE OR REPLACE TASK PERSONALIZE_RECOMMENDATIONS
WAREHOUSE = COMPUTE_WH
SCHEDULE = 'USING CRON 0 * * * * UTC'
AS
MERGE INTO LATEST_RECOMMENDATIONS tgt
USING (
SELECT CUSTOMER_ID, PRODUCT_ID, SCORE
FROM ECOMMERCE.RECOMMENDATIONS_STREAM
WHERE SCORE > 0.8
) src
ON tgt.CUSTOMER_ID = src.CUSTOMER_ID AND tgt.PRODUCT_ID = src.PRODUCT_ID
WHEN MATCHED THEN UPDATE SET SCORE = src.SCORE
WHEN NOT MATCHED THEN INSERT VALUES (src.CUSTOMER_ID, src.PRODUCT_ID, src.SCORE);
This approach allows your applications to always query the latest, most relevant recommendations, enabling a fully dynamic, personalized shopping experience.
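The MERGE semantics above (update on match, insert otherwise) amount to an upsert. Here is a minimal Python sketch of the same logic using a dict keyed by (customer_id, product_id); the scores are made up:

```python
# Existing LATEST_RECOMMENDATIONS rows, keyed by (customer_id, product_id)
latest = {(1001, 201): 0.85, (1003, 402): 0.90}

# Incoming high-confidence rows from the stream (SCORE > 0.8)
incoming = [(1001, 201, 0.95), (1005, 702, 0.82)]

for customer_id, product_id, score in incoming:
    # WHEN MATCHED THEN UPDATE SET SCORE = src.SCORE
    # WHEN NOT MATCHED THEN INSERT (customer, product, score)
    latest[(customer_id, product_id)] = score

print(latest)
# {(1001, 201): 0.95, (1003, 402): 0.9, (1005, 702): 0.82}
```

An existing recommendation (1001, 201) gets its score refreshed, while a brand-new pair (1005, 702) is inserted—exactly what the MERGE does each time the task fires.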
Conclusion
Personalization is no longer limited to manual rules or external ML pipelines. With Snowflake ML, you can power end-to-end e-commerce personalization directly inside Snowflake. In this walkthrough, you saw how to:
- Load sample order data into Snowflake
- Build customer segments with plain SQL
- Create a labeled training table and train a classification model
- Score customers and persist the predictions for downstream personalization
Most importantly, everything happens in Snowflake—no data movement, no Python setup, and no external services required. This makes it easier than ever for developers, analysts, and data teams to deliver highly personalized shopping experiences.
NOTE: This walkthrough uses SQL and Snowflake ML, but Snowflake also offers additional AI and intelligence capabilities that can enhance personalization use cases at scale.
Want a One-Click Copy of the code to follow along?
Here is the step-by-step, minimal reproducible workflow you can paste into a SQL worksheet:
-- ============================================================
-- E-COMMERCE PERSONALIZATION QUICKSTART (SQL-ONLY)
-- End-to-end example:
-- 1. Create database & schema
-- 2. Load sample orders data
-- 3. Build customer segments
-- 4. Prepare training data for ML
-- 5. Train Snowflake ML classification model
-- 6. Score customers & optionally persist scores
-- ============================================================
----------------------------------------------------------------
-- (Optional) Step 0: Choose a warehouse
----------------------------------------------------------------
-- Uncomment and replace <YOUR_WAREHOUSE> if needed:
-- USE WAREHOUSE <YOUR_WAREHOUSE>;
----------------------------------------------------------------
-- Step 1: Create database, schema, and sample CUSTOMER_ORDERS
----------------------------------------------------------------
CREATE OR REPLACE DATABASE DATACLOUDDISPATCHSI;
USE DATABASE DATACLOUDDISPATCHSI;
CREATE OR REPLACE SCHEMA ECOMMERCE;
USE SCHEMA ECOMMERCE;
-- Create the orders table
CREATE OR REPLACE TABLE CUSTOMER_ORDERS (
CUSTOMER_ID NUMBER,
ORDER_ID NUMBER,
ORDER_DATE DATE,
ORDER_VALUE NUMBER(10,2),
PRODUCT_ID NUMBER
);
-- Insert sample e-commerce data
INSERT INTO CUSTOMER_ORDERS (CUSTOMER_ID, ORDER_ID, ORDER_DATE, ORDER_VALUE, PRODUCT_ID) VALUES
(1001,50001,'2023-01-15', 89.99,201),
(1001,50022,'2023-03-02',120.49,305),
(1002,50110,'2023-05-11', 45.00,110),
(1003,50155,'2023-02-19',239.00,402),
(1003,50190,'2023-05-22',130.00,233),
(1003,50201,'2023-06-01', 99.99,110),
(1004,50333,'2023-01-05', 19.99,502),
(1001,50390,'2023-11-11',301.00,900),
(1005,50400,'2023-12-12', 67.50,702);
-- Quick preview of raw orders
SELECT * FROM CUSTOMER_ORDERS ORDER BY ORDER_DATE;
----------------------------------------------------------------
-- Step 2: Build customer segments (frequency & average order value)
----------------------------------------------------------------
-- Aggregate behavior to create one row per customer
CREATE OR REPLACE TABLE CUSTOMER_SEGMENTS AS
SELECT
CUSTOMER_ID,
COUNT(ORDER_ID) AS PURCHASE_COUNT,
AVG(ORDER_VALUE) AS AVG_ORDER_VALUE
FROM CUSTOMER_ORDERS
WHERE ORDER_DATE BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY CUSTOMER_ID;
-- Inspect segments
SELECT * FROM CUSTOMER_SEGMENTS ORDER BY PURCHASE_COUNT DESC;
----------------------------------------------------------------
-- Step 3: Prepare training data for Snowflake ML
-- Add a label indicating whether a customer is “high-value”
-- (in this example: 3 or more purchases)
----------------------------------------------------------------
CREATE OR REPLACE TABLE CUSTOMER_SEGMENTS_TRAIN AS
SELECT
CUSTOMER_ID,
PURCHASE_COUNT,
AVG_ORDER_VALUE,
IFF(PURCHASE_COUNT >= 3, 1, 0) AS TARGET_HIGH_VALUE
FROM CUSTOMER_SEGMENTS;
-- View training data with target
SELECT * FROM CUSTOMER_SEGMENTS_TRAIN ORDER BY PURCHASE_COUNT DESC;
----------------------------------------------------------------
-- Step 4: Train a classification model with Snowflake ML
-- This learns to predict TARGET_HIGH_VALUE from the features
-- PURCHASE_COUNT and AVG_ORDER_VALUE.
----------------------------------------------------------------
CREATE OR REPLACE SNOWFLAKE.ML.CLASSIFICATION HIGH_VALUE_MODEL (
INPUT_DATA => SYSTEM$REFERENCE('TABLE','ECOMMERCE.CUSTOMER_SEGMENTS_TRAIN'),
TARGET_COLNAME => 'TARGET_HIGH_VALUE'
);
-- (Optional) Inspect training metrics
CALL HIGH_VALUE_MODEL!SHOW_EVALUATION_METRICS();
----------------------------------------------------------------
-- Step 5: Score customers with the trained model
-- This returns the predicted class (0 = not high-value, 1 = high-value).
----------------------------------------------------------------
SELECT
CUSTOMER_ID,
PURCHASE_COUNT,
AVG_ORDER_VALUE,
HIGH_VALUE_MODEL!PREDICT(
INPUT_DATA => OBJECT_CONSTRUCT(
'PURCHASE_COUNT', PURCHASE_COUNT,
'AVG_ORDER_VALUE', AVG_ORDER_VALUE
)
):class::NUMBER AS PREDICTED_HIGH_VALUE
FROM CUSTOMER_SEGMENTS
ORDER BY PREDICTED_HIGH_VALUE DESC, PURCHASE_COUNT DESC;
----------------------------------------------------------------
-- Step 6 (Optional): Persist personalized scores for downstream use
-- This creates a reusable table that other teams, dashboards,
-- and applications can query.
----------------------------------------------------------------
CREATE OR REPLACE TABLE CUSTOMER_VALUE_SCORES AS
SELECT
CUSTOMER_ID,
PURCHASE_COUNT,
AVG_ORDER_VALUE,
HIGH_VALUE_MODEL!PREDICT(
INPUT_DATA => OBJECT_CONSTRUCT(
'PURCHASE_COUNT', PURCHASE_COUNT,
'AVG_ORDER_VALUE', AVG_ORDER_VALUE
)
):class::NUMBER AS PREDICTED_HIGH_VALUE
FROM CUSTOMER_SEGMENTS;
-- Final scored output
SELECT * FROM CUSTOMER_VALUE_SCORES
ORDER BY PREDICTED_HIGH_VALUE DESC, PURCHASE_COUNT DESC;
📖 Continue Learning through the Holidays:
⁉️ Are you a bold voice, creative thinker, and a collaborative connector who wants to shape the future of data?
Become a 2026 Data Superhero - deadline to apply is Friday, December 26, 2025
🚨Missed our sessions from Build 2025? 🚨
Watch keynotes, presentations, and technical sessions from BUILD 2025.