Automatically Assigning Random Product IDs in Snowflake Using ARRAY_CONSTRUCT + UNIFORM() + Random()
In many data engineering workflows, especially during data simulation, testing, or backfilling incomplete records, you may need to assign a default product identifier or any such identifiers when the source system leaves it blank.
Snowflake makes this seamless by combining arrays with built-in random number functions.
In this post, we break down a simple, powerful SQL pattern for filling missing product_id values using a predefined list of product codes.
Here’s the query:
🔍 What This Query Does
This update statement:
✔ randomly selects one product ID from a list
✔ fills it into rows where product_id is NULL
✔ preserves existing non-null product IDs
This is perfect when you need quick synthetic product mapping or want product variety in a demo or test dataset.
🧩 Breaking Down the Logic
1. Define the set of possible product IDs
ARRAY_CONSTRUCT(9000,9001,9002,9003,9004)
This creates an in-line array:
[9000, 9001, 9002, 9003, 9004]
No lookup table, No CTEs - the list lives directly in the statement.
2. Select a random index using UNIFORM()
UNIFORM(0, 5, RANDOM())
This returns a random integer between:
which match the array’s positions.
3. Use the index to pick a product ID
ARRAY_CONSTRUCT(...)[index]
Examples:
index 0 → 9000
index 1 → 9001
index 4 → 9004
Every row gets one random product ID.
4. Convert numeric to string
If your product_id column stores text values, Snowflake ensures correct type via:
TO_VARCHAR(...)
5. Only update missing product IDs
WHERE product_id IS NULL
This safely prevents overwriting existing values.
🚀 When to Use This Pattern
This technique is great for:
🧠 Why This Pattern Is Clean & Powerful
Snowflake’s functional SQL style lets you keep logic inline, readable, controlled list of values and performant.