Automatically Assigning Random Product IDs in Snowflake Using ARRAY_CONSTRUCT() + UNIFORM() + RANDOM()

Karthikeyan Shanthakumar

Published Dec 8, 2025

In many data engineering workflows, especially data simulation, testing, or backfilling incomplete records, you may need to assign a default product identifier (or a similar identifier) when the source system leaves it blank.

Snowflake makes this seamless by combining arrays with built-in random number functions.

In this post, we break down a simple, powerful SQL pattern for filling missing product_id values using a predefined list of product codes.

Here’s the query:
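The statement itself is not reproduced in this text, so the following is a minimal sketch reconstructed from the breakdown below rather than the exact original. The table name `orders` is a hypothetical placeholder, and `product_id` is assumed to be a VARCHAR column. The upper bound is 4 because both bounds of UNIFORM() are inclusive.

```sql
-- Sketch only: `orders` is a hypothetical table name.
-- product_id is assumed to be a VARCHAR column.
UPDATE orders
SET product_id = TO_VARCHAR(
        -- Pick one element of the inline list at a random index 0..4.
        ARRAY_CONSTRUCT(9000, 9001, 9002, 9003, 9004)[UNIFORM(0, 4, RANDOM())]
    )
WHERE product_id IS NULL;  -- leave existing product IDs untouched
```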

🔍 What This Query Does

This update statement:

✔ randomly selects one product ID from a list

✔ fills it into rows where product_id is NULL

✔ preserves existing non-null product IDs

This is perfect when you need quick synthetic product mapping or want product variety in a demo or test dataset.

🧩 Breaking Down the Logic

1. Define the set of possible product IDs

ARRAY_CONSTRUCT(9000,9001,9002,9003,9004)

This creates an in-line array:

[9000, 9001, 9002, 9003, 9004]

No lookup table, no CTEs: the list lives directly in the statement.

2. Select a random index using UNIFORM()

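UNIFORM(0, 4, RANDOM())

This returns a random integer between:

0 (inclusive)
4 (inclusive)
So the possible index values are: 0, 1, 2, 3, 4

which match the array’s positions. Both bounds of UNIFORM() are inclusive, so an upper bound of 5 would occasionally produce index 5, which is past the end of the array and yields NULL.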
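To sanity-check the index range before running an update, a query along these lines can tally the generated indexes. It is a sketch that relies only on Snowflake's GENERATOR table function; the row count of 10000 is an arbitrary choice.

```sql
-- Generate 10,000 random indexes and count how often each one appears.
-- With inclusive bounds 0 and 4, only the values 0 through 4 should show up.
SELECT UNIFORM(0, 4, RANDOM()) AS idx,
       COUNT(*)                AS occurrences
FROM TABLE(GENERATOR(ROWCOUNT => 10000))
GROUP BY idx
ORDER BY idx;
```

The counts should be roughly equal across the five indexes, since UNIFORM() draws from a uniform distribution.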

3. Use the index to pick a product ID

ARRAY_CONSTRUCT(...)[index]

Examples:

index 0 → 9000

index 1 → 9001

index 4 → 9004

RANDOM() is evaluated separately for each row, so every updated row gets its own randomly chosen product ID.
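The lookup can be tried with fixed indexes to see the mapping, including what happens past the end of the list. This is an illustrative sketch using the same inline array.

```sql
-- Array positions are zero-based; an index past the end returns NULL.
SELECT
    ARRAY_CONSTRUCT(9000, 9001, 9002, 9003, 9004)[0] AS first_id,      -- 9000
    ARRAY_CONSTRUCT(9000, 9001, 9002, 9003, 9004)[4] AS last_id,       -- 9004
    ARRAY_CONSTRUCT(9000, 9001, 9002, 9003, 9004)[5] AS out_of_range;  -- NULL
```

The NULL in the last column is why the random index must never exceed the last valid position.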

4. Convert numeric to string

Elements read from an array come back as VARIANT. If your product_id column stores text values, convert the selected element explicitly with:

TO_VARCHAR(...)
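To see the effect of the conversion, the element can be inspected before and after the cast. This sketch uses SYSTEM$TYPEOF to report the data type of each expression; the exact type strings it prints are not shown here.

```sql
-- Compare the type of the raw array element with the type after conversion.
SELECT
    SYSTEM$TYPEOF(ARRAY_CONSTRUCT(9000, 9001)[0])             AS raw_type,
    SYSTEM$TYPEOF(TO_VARCHAR(ARRAY_CONSTRUCT(9000, 9001)[0])) AS converted_type,
    TO_VARCHAR(ARRAY_CONSTRUCT(9000, 9001)[0])                AS converted_value;
```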

5. Only update missing product IDs

WHERE product_id IS NULL

This safely prevents overwriting existing values.
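A quick before-and-after check can confirm that only the missing rows were touched. The sketch below again assumes the hypothetical table name `orders`.

```sql
-- Before the update: how many rows are missing a product ID?
SELECT COUNT(*) AS missing_before
FROM orders
WHERE product_id IS NULL;

-- After the update: the NULL group should be gone, and the backfilled
-- rows should be spread across the five codes in the list.
SELECT product_id, COUNT(*) AS row_count
FROM orders
GROUP BY product_id
ORDER BY product_id;
```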

🚀 When to Use This Pattern

This technique is great for:

Synthetic product data creation
Backfilling missing product info
Demo datasets for downstream teams
A/B test distributions
Mock catalog assignments
Quick randomization in Snowflake exercises

🧠 Why This Pattern Is Clean & Powerful

No staging tables
No temporary functions
No stored procedures
One simple, expressive SQL query
Random values are restricted to a list you control

Snowflake’s functional SQL style lets you keep the logic inline, readable, and performant, with a controlled list of values.
