How to use Snowflake's cloning feature with Trifacta for zero-downtime EDW data refreshes
Snowflake calls it “Magic with MetaData,” and rightly so. We have found this feature extremely useful and have incorporated it into our transformation phase, where we cleanse, unify, and wrangle data from Staging and send it to our EDW with Trifacta.
Wanna know how to have zero downtime for your Snowflake EDW data refresh, even though your data pipelines bring in new data all day long and your transformation flows run multiple times per day? Use Snowflake's CLONING feature. Here is how we at InhabitIQ implemented it:
- Create a clone of the table you want to insert data into
This way you end up with two identical tables pointing to the same underlying metadata (a zero-copy clone, so no data is physically duplicated)
- Have your Transformation process (Trifacta flows in our case) APPEND into your clone table
- Run post-load scripts on the clone table (retiring old rows for SCD Type 2, setting metadata, ...). We use Snowflake's scheduled tasks to execute our scripts. In the example query below, we look for rows that have multiple current versions (IS_CURRENT = 'TRUE') and then set the flag on the old row to FALSE.
- Write back the clone table into your main table
This way, we are able to wrangle data all day long with no downtime.
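The steps above can be sketched in Snowflake SQL roughly as follows. This is a minimal illustration, not our production scripts: the table name DIM_CUSTOMER, the business key CUSTOMER_KEY, and the LOAD_TS column are hypothetical placeholders, and `ALTER TABLE ... SWAP WITH` is one way to implement the final write-back atomically.

```sql
-- Step 1: zero-copy clone of the main table (metadata-only, no data copied)
CREATE OR REPLACE TABLE EDW.DIM_CUSTOMER_CLONE CLONE EDW.DIM_CUSTOMER;

-- Step 2: the transformation process (Trifacta flows) APPENDs new rows
-- into EDW.DIM_CUSTOMER_CLONE here.

-- Step 3: post-load script, e.g. run by a scheduled task. Retire old
-- SCD Type 2 rows: where a key now has multiple current versions,
-- flip IS_CURRENT to 'FALSE' on the superseded row.
-- CUSTOMER_KEY and LOAD_TS are hypothetical column names.
UPDATE EDW.DIM_CUSTOMER_CLONE t
SET t.IS_CURRENT = 'FALSE'
WHERE t.IS_CURRENT = 'TRUE'
  AND EXISTS (
    SELECT 1
    FROM EDW.DIM_CUSTOMER_CLONE n
    WHERE n.CUSTOMER_KEY = t.CUSTOMER_KEY
      AND n.IS_CURRENT = 'TRUE'
      AND n.LOAD_TS > t.LOAD_TS
  );

-- Step 4: write back. SWAP WITH exchanges the two tables in a single
-- atomic operation, so readers of EDW.DIM_CUSTOMER see either the old
-- or the new version, never a partial load.
ALTER TABLE EDW.DIM_CUSTOMER_CLONE SWAP WITH EDW.DIM_CUSTOMER;
```

Because the clone is metadata-only, creating it is nearly instant regardless of table size, and queries against the main table keep running untouched until the final atomic swap.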
--
Misagh Jebeli is a BI Architect at InhabitIQ. InhabitIQ’s suite of products covers many of the needs in the affordable, single-family, and multi-family residential spaces, student housing, and the commercial and short-term rental markets in the United States, Canada, and the UK.