Databricks’ Post

View organization page for Databricks

1,200,388 followers

Database branching in Databricks Lakebase uses copy-on-write storage to create fully isolated environments in seconds without duplicating data. Most teams rely on shared staging databases or slow pg_dump copies that drift from production and make testing unreliable. Lakebase branches give every developer, pull request, and CI test run its own isolated environment, with instant point-in-time recovery and programmable ephemeral databases for AI agents through the same API. https://lnkd.in/gR_nVdzp

  • No alternative text description for this image

This is a big shift, bringing git-like branching to data with copy-on-write removes the usual trade-off between speed and isolation which makes testing, ci, and experimentation actually reliable without data drift

The ephemeral databases for AI agents piece is what stands out to me. Every client we work with is trying to figure out how to let agents interact with data safely without risking production, and this solves that at the infrastructure level instead of duct-taping guardrails on top.

Database branching with copy-on-write semantics is a transformative capability for modern data engineering workflows. The ability to create fully isolated environments instantly without data duplication addresses a fundamental challenge in collaborative analytics and AI development. Databricks continues to push the boundaries of what is possible on the Lakehouse architecture.

Nice to see branching applied at the storage layer. Copy-on-write changes the cost model, but the real win is starting every environment from the same point in time. That’s usually where shared staging and dump-based workflows fall apart

Copy-on-write data branching removes the speed-vs-isolation bottleneck, delivering reliable testing, CI, and experimentation while preventing data drift

This is like hearing a utensils cleaner thjnk how kitchen should be run.

Copy-on-write branches for every PR is a practical unlock—realistic database state without shared staging drift makes testing much more trustworthy.

This removes a big bottleneck. Faster, clean environments change dev speed a lot. egtos is seeing similar gains when teams access the right expertise instantly. How’s adoption going so far?

See more comments

To view or add a comment, sign in

Explore content categories