Database Management in CI/CD
Summary
Database management in CI/CD (Continuous Integration/Continuous Deployment) automates how database changes are applied and maintained alongside application code, ensuring reliable and secure data workflows during development and production releases. This approach helps data teams catch issues early, streamline deployments, and keep environments consistent for apps and analytics.
- Automate database workflows: Set up dedicated pipelines to manage schema changes, data models, and access permissions, so updates happen automatically with each code deployment.
- Isolate and test safely: Create separate, temporary database branches for each feature or pull request to test changes with real data, then merge or discard without affecting others.
- Track and monitor changes: Use version control and dashboards to review what’s changing in your database, identify issues early, and maintain a clear record of who made adjustments and when.
💻 Coming from app development, I took CI/CD for granted: branch, make a change, review it, deploy it, roll it back if something breaks. But in the world of data engineering, deploying database changes across environments can still be surprisingly manual. I've seen teams stitch together Terraform providers, migration frameworks, and custom scripts just to get something close; I hear battle-scar stories like this almost daily. Those workarounds existed because data platforms didn't have CI/CD built in. Now, modern platforms like Snowflake do.

Snowflake DCM Projects let you define the desired state of your Snowflake objects in code, and Snowflake figures out what needs to change to get there. You can declaratively manage:

• Databases, schemas, and tables
• Dynamic tables and views
• Warehouses
• Roles and grants
• Tasks
• Data quality expectations using Data Metric Functions

𝗬𝗼𝘂 𝗰𝗮𝗻 𝗽𝗿𝗲𝘃𝗶𝗲𝘄 𝗰𝗵𝗮𝗻𝗴𝗲𝘀 𝗯𝗲𝗳𝗼𝗿𝗲 𝘆𝗼𝘂 𝗿𝘂𝗻 𝗮𝗻𝘆𝘁𝗵𝗶𝗻𝗴 𝗮𝗻𝗱 𝗱𝗲𝗽𝗹𝗼𝘆 𝘄𝗵𝗲𝗻 𝗿𝗲𝗮𝗱𝘆.

✅ Definitions also support Jinja templating, so the same source files can 𝗱𝗲𝗽𝗹𝗼𝘆 𝘁𝗼 𝗱𝗲𝘃, 𝘀𝘁𝗮𝗴𝗶𝗻𝗴, 𝗮𝗻𝗱 𝗽𝗿𝗼𝗱 with different parameters per environment.

✅ Deployments are version-controlled and auditable, so you always know exactly what changed and who deployed it. 𝗧𝗿𝗮𝗰𝗸 𝘆𝗼𝘂𝗿 𝗱𝗲𝗳𝗶𝗻𝗶𝘁𝗶𝗼𝗻𝘀 𝘄𝗶𝘁𝗵 𝗚𝗶𝘁 𝗮𝗻𝗱 𝘆𝗼𝘂 𝗵𝗮𝘃𝗲 𝗮 𝗳𝘂𝗹𝗹 𝗖𝗜/𝗖𝗗 𝗽𝗶𝗽𝗲𝗹𝗶𝗻𝗲 for your Snowflake account.

Data engineers have been asking for proper SDLC workflows for years (plan, review, deploy, monitor) without bolting on external tools. It's here! Try it out with this guide and let me know what you think 👇: https://lnkd.in/gfbDzQHM
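To make the flow concrete, here is a minimal sketch of what a templated definition file plus the preview/deploy steps could look like. The file name, project name, the {{ environment }} parameter, and the exact EXECUTE syntax are illustrative assumptions, not taken from the guide; the linked quickstart has the authoritative commands.

```sql
-- definitions/orders.sql (illustrative file name)
-- Desired state, not a migration: CREATE OR ALTER converges the live
-- object to this definition. {{ environment }} is a Jinja parameter
-- supplied per target (dev / staging / prod).
CREATE OR ALTER TABLE analytics_{{ environment }}.public.orders (
    order_id   NUMBER        NOT NULL,
    amount     NUMBER(10, 2),
    created_at TIMESTAMP_NTZ
);

-- Preview the diff against the live account, then apply it when ready.
-- Assumed command shape; check the guide for the exact DCM syntax.
EXECUTE DCM PROJECT my_dcm_project PLAN USING CONFIGURATION dev;
EXECUTE DCM PROJECT my_dcm_project DEPLOY USING CONFIGURATION dev;
```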
Save a lot of costs with Slim CI + dbt! Here's how to implement it in dbt Core 💸

My open-source repository on implementing CI/CD with dbt (now with over 100 stars ⭐) just got a major update based on community feedback, making it even easier to adopt. What's new?

🔹 Defaults to Snowflake: the template is now configured for Snowflake out of the box.
🔹 No cloud storage needed: we've moved from GCS to GitHub Artifacts for storing the production manifest.json, simplifying setup and removing the need for a separate cloud bucket.

This repository provides a production-ready CI/CD framework using GitHub Actions to run dbt Core intelligently. It only builds what has changed, saving you massive amounts of time and warehouse compute costs. Here's how the three included GitHub Actions work (a sketch of the first follows below):

1️⃣ ci.yml: when a PR is opened against main, it builds only the modified models and their downstream dependencies in a dedicated, temporary schema for safe testing (using dbt defer).
2️⃣ cd.yml: when the PR is merged, it deploys only those same modified models to your production environment. No more full project builds for a one-line change!
3️⃣ cd_teardown.yml: when the PR is closed, it automatically drops the temporary schema to keep your environment clean.

It's a completely free resource. Fork it, adapt it, and supercharge your dbt workflow! Check it out on GitHub: https://lnkd.in/dxY8_AM6

Follow me for daily dbt content 🦾 #dbt #DataEngineering #AnalyticsEngineering
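The core of the pattern is dbt's state selection plus deferral. As a rough idea of what the ci.yml job boils down to (the artifact name, target name, and credentials below are placeholders, not the repository's actual configuration):

```yaml
# Abridged sketch of a Slim CI job; see the linked repo for the real file.
name: dbt Slim CI
on:
  pull_request:
    branches: [main]

jobs:
  slim_ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install dbt-snowflake

      # Fetch the production manifest.json published by the CD workflow.
      # (With download-artifact@v4, pulling an artifact from another
      # workflow run also needs run-id and github-token; omitted here.)
      - uses: actions/download-artifact@v4
        with:
          name: prod-manifest        # placeholder artifact name
          path: prod-artifacts

      # Build only models changed in this PR plus downstream dependents;
      # unchanged upstream models are deferred to production relations.
      - name: Build modified models only
        run: dbt build --select state:modified+ --defer --state prod-artifacts --target pr
        env:
          SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
```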
Not all problems can be prevented. But problems that can be prevented, should be prevented. That's why we're launching GitLab support for Data CI/CD, one of Metaplane's most popular features, which helps prevent data incidents before they occur.

For data teams using dbt Labs and GitLab, here are the 5 key features you'll want to know:

1. 𝗜𝗺𝗽𝗮𝗰𝘁 𝗽𝗿𝗲𝘃𝗶𝗲𝘄𝘀: Identify exactly which downstream tables and objects could break from your merge request, so you don't accidentally take out the CEO's favorite dashboard.
2. 𝗧𝗲𝘀𝘁 𝗽𝗿𝗲𝘃𝗶𝗲𝘄𝘀: Compare data changes between production and MR branches to catch regressions, like a revenue metric unexpectedly dipping by 30%.
3. 𝗖𝗜/𝗖𝗗 𝗱𝗮𝘀𝗵𝗯𝗼𝗮𝗿𝗱: Track the result and impact of every historical merge request in one place, making incident retrospectives easier without trawling through different tabs.
4. 𝗙𝗶𝗻𝗲-𝗴𝗿𝗮𝗶𝗻𝗲𝗱 𝗰𝗼𝗻𝘁𝗿𝗼𝗹: Filter by tag, limit downstream testing depth, configure timeouts, and more to match your workflow. Tests should run as cheaply and quickly as you prefer.
5. 𝗖𝗼𝗺𝗲 𝗮𝘀 𝘆𝗼𝘂 𝗮𝗿𝗲: Works with both cloud and self-hosted GitLab instances, as well as both dbt Core and dbt Cloud workflows.

The goal is simple: prevent data quality issues before they hit production, not after a customer has been burned. Get started within minutes by reading our blog post and docs linked below 👇

#dataengineering #dataquality #cicd #gitlab #dbt
End-to-End CI/CD for Data Engineering with Azure Databricks

Building reliable data pipelines goes far beyond just writing code; it's about ensuring collaboration, automation, and governance across every environment: Dev → Test → Prod. This architecture represents a modern data engineering CI/CD workflow powered by Azure DevOps, Databricks, and Git.

Development Phase
Data engineers collaborate in IDEs like VS Code, IntelliJ, or Eclipse to build and test ETL pipelines. All notebooks, configurations, and scripts are version-controlled in Git repositories (GitHub, Azure Repos). Azure Databricks (Dev environment) connects with Azure Key Vault for securely managing secrets and credentials.

Integration & Testing Phase
Once changes are committed, Azure Pipelines automates builds, validation, and deployment to the Test resource group. This stage executes smoke tests, data validation checks, and integration tests to ensure pipeline reliability. Teams leverage Azure Databricks notebooks and Azure AD for secure testing and environment isolation.

Production Deployment
Successful pipelines are promoted to the Prod resource group with automated provisioning and configuration management. Key Vault continues to secure credentials, while Azure AD governs user access. Continuous monitoring and logging ensure that production workloads remain stable and optimized. (A sketch of such a pipeline follows below.)

The result: a scalable, secure, and automated CI/CD pipeline that accelerates data delivery, reduces manual overhead, and promotes strong DevOps practices in the data engineering ecosystem.

#Azure #Databricks #DataEngineering #CICD #DevOps #DataPipelines #AzureDevOps #GitHub #KeyVault #BigData #Automation #ETL #DataEngineer #C2C #SeniorDataEngineer
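The post doesn't name a specific deployment mechanism, so as one hedged illustration, here is roughly what the Test → Prod promotion could look like in an azure-pipelines.yml using Databricks Asset Bundles. The stage layout, the smoke_tests job key, and the Key Vault-backed variable-group names are assumptions, not the architecture's actual configuration:

```yaml
# azure-pipelines.yml (illustrative sketch, assuming Databricks Asset Bundles)
trigger:
  branches:
    include: [main]

stages:
  - stage: Test
    jobs:
      - job: deploy_and_validate
        pool:
          vmImage: ubuntu-latest
        variables:
          # Key Vault-backed variable group with workspace URL + credentials
          - group: databricks-test        # placeholder name
        steps:
          - script: curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
            displayName: Install Databricks CLI
          - script: databricks bundle deploy -t test
            displayName: Deploy bundle to the Test workspace
          - script: databricks bundle run -t test smoke_tests   # placeholder job key
            displayName: Run smoke and data validation tests

  - stage: Prod
    dependsOn: Test
    condition: succeeded()
    jobs:
      - job: deploy_prod
        pool:
          vmImage: ubuntu-latest
        variables:
          - group: databricks-prod        # placeholder name
        steps:
          - script: curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
            displayName: Install Databricks CLI
          - script: databricks bundle deploy -t prod
            displayName: Promote the same artifacts to Prod
```

In Azure DevOps, the Prod stage would typically also be gated behind an environment approval, which lives in the project settings rather than the YAML.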
So, I'm getting comfier with Databricks, which means you'll be seeing more posts from me about them going forward. 😁

Git-style branching for databases is something developers have wanted for a long time. It's always been architecturally awkward because traditional databases tie compute and storage together on the same instance: you can't branch without either duplicating data or accepting significant overhead.

Lakebase changes the underlying architecture. Because it separates compute from storage, copy-on-write database branching is actually possible. Create a branch and you get a fully isolated Postgres environment from the exact state of its parent at a point in time. Make changes. Merge or discard. Idle branches scale to zero.

The CI/CD story here is honestly the most interesting part. Pull requests can automatically provision a branch with production-like data, run integration tests against it, and tear it down on close. No shared test database state. No flaky tests from someone else's migration running at the same time.

For Databricks users building applications on PostgreSQL through Lakebase, this is a substantial improvement in development workflow. The pattern itself, an isolated branch-per-PR database environment, is worth understanding regardless of platform: once teams have it, they don't want to go back.

#Databricks #Lakebase #PostgreSQL #DatabaseDev #DataEngineering #CICD #DevWorkflow
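Platform aside, the branch-per-PR pattern is easy to picture as a CI workflow. The sketch below uses a hypothetical lakebase-cli with made-up create-branch/delete-branch commands purely as stand-ins; they are not the real Lakebase interface, so consult the Lakebase docs for the actual API calls:

```yaml
# .github/workflows/pr_db_branch.yml
# Branch-per-PR pattern sketch. The lakebase-cli commands are
# HYPOTHETICAL placeholders, not the real Lakebase interface.
name: PR database branch
on:
  pull_request:
    types: [opened, synchronize, closed]

jobs:
  integration_tests:
    if: github.event.action != 'closed'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Copy-on-write branch from production's current state:
      # cheap to create, fully isolated, scales to zero when idle.
      - name: Create database branch (hypothetical command)
        run: lakebase-cli create-branch --parent prod --name pr-${{ github.event.number }}
      - name: Run integration tests against the branch
        env:
          # Placeholder endpoint; in practice the create step would
          # return the branch's connection string.
          PGHOST: pr-${{ github.event.number }}.db.example.com
        run: pytest tests/integration

  teardown:
    if: github.event.action == 'closed'
    runs-on: ubuntu-latest
    steps:
      - name: Drop the branch on PR close (hypothetical command)
        run: lakebase-cli delete-branch --name pr-${{ github.event.number }}
```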