Transient Database in Snowflake
Transient tables in #snowflake are a weird specialty: They are normal persisting tables (unlike temporary tables), but without fail-safe and limited #timetravel capabilities.
However, because Snowflake doesn't apply fail-safe and time travel capabilities to them, they can be cheaper than normal permanent tables. You probably won't recognize the difference for small tables, but for a table larger than a TB, this can make a difference.
When to use transient tables?
Let me quote Serge Gershkovich here:
Transient tables are a good choice when backup and recovery are not a high priority - as in the case of staging tables. The same is true for development and test environments [...]
(Gershkovich, Serge: Data Modeling with Snowflake, 1st edition, Packt Publishing Ltd., 2023, p.51)
Transient database and schema
You probably don't want to remember adding the transient keyword to each create-table statement if you are working on a staging or dev database/schema. And you don't have to, because you can default new tables in a database/schema to be transient and just create the tables the same way you would create permanent tables:
Recommended by LinkedIn
But my staging/dev database/schema already exists!
Unfortunately you can't alter a database/schema/table to become transient. But there is a neat little workaround: zero-copy-cloning
Cloning will effectively re-create all the objects in a database/schema, not just tables, but also views, procedures, etc. - but the latter are not affected by the change at all. While re-creating the tables, though, the database/schema defaults to transient, and hence all permanent tables effectively are modified to become transient:
create transient database sandbox_tmp
clone sandbox;
alter database sandbox_tmp
swap with sandbox;
drop database sandbox_tmp;
If you then check the DDL of any table in the database, you'll recognize them all to be transient now 🥳
Caution!
Snowflake has several such small things which literally solved several usecases which I thought Oracle will fix being in this area from decades. Snowflake users should use all its features. Developers will also enjoy implementing it instead of lift and shift.
Bravo Martin. Very well explained and practical!
I can recommend to use transient tables/schemas/databases if possible. In our mds setup we actually only need time travel/recovery in the ingestion level. If you have scalable full-refresh models you can save many many TB of unnecessary cloud storage.