When leaders brainstorm over trackers instead of architectures 😅

If only pipelines ran as smoothly as the meetings about how to track them. As funny as it sounds, this happens far too often in data teams: hours spent debating Jira structures, story points, epics, and subtasks, while a pipeline quietly fails in production.

But behind the humor lies an important reminder:
→ 𝐺𝑟𝑒𝑎𝑡 𝑑𝑎𝑡𝑎 𝑒𝑛𝑔𝑖𝑛𝑒𝑒𝑟𝑖𝑛𝑔 𝑖𝑠𝑛'𝑡 𝑎𝑏𝑜𝑢𝑡 𝑝𝑒𝑟𝑓𝑒𝑐𝑡 𝑡𝑟𝑎𝑐𝑘𝑒𝑟𝑠—𝑖𝑡'𝑠 𝑎𝑏𝑜𝑢𝑡 𝑡ℎ𝑒 𝑟𝑖𝑔ℎ𝑡 𝑡ℎ𝑖𝑛𝑘𝑖𝑛𝑔.

Over the years, one pattern stands out: teams that obsess over tools often under-invest in architecture, while teams that anchor on architecture naturally simplify everything else: tooling, tracking, delivery.

Sharing a few learnings that have made a difference in my data engineering journey building robust data systems:

1. Think in systems, not tasks
Before assigning story points, ask:
→ What domain does this belong to?
→ What data contracts govern it?
→ Is this transformation even necessary?
Clear system thinking > endless subtasks.

2. Architecture over trackers
A well-defined:
→ Data model
→ Lineage flow
→ Orchestration pattern
→ Error strategy
removes 80% of ticket back-and-forth. Your Jira gets simpler because your architecture is clearer.

3. Invest in observability early
Strong quality checks, lineage, and alerts mean:
→ Faster debugging
→ Better collaboration
→ No 2 AM firefighting
Observability is invisible until you desperately need it. (A minimal sketch of such a check follows this post.)

4. Document why, not just what
Trackers show what you did. Architecture docs explain why. Future you will thank present you.

5. Reduce cognitive load
→ Simplified schemas.
→ Modular pipelines.
→ Automated steps.
Less time deciphering = less time debating story points.

Maturity isn't measured by tracker maintenance; it's measured by systems that don't require constant firefighting.

Here's what separates good data engineers from great ones:
→ Ask "what breaks if this fails?" before writing code
→ Think in layers, not monoliths
→ Build systems their junior teammates can debug
→ Optimize for the team inheriting their work, not just shipping fast
→ Know when NOT to over-engineer; right-sizing matters more than resume-driven development
→ Understand that 99% vs 99.9% uptime isn't a rounding error: it's millions in cost

👉 Remember: Your Jira board doesn't run your pipelines. Your architecture does. Spend your energy accordingly.

𝗕𝘂𝗶𝗹𝗱 𝗮𝗿𝗰𝗵𝗶𝘁𝗲𝗰𝘁𝘂𝗿𝗲 𝘁𝗵𝗮𝘁 𝘀𝗰𝗮𝗹𝗲𝘀, 𝗻𝗼𝘁 𝗲𝗻𝗱𝗹𝗲𝘀𝘀 𝗺𝗲𝗲𝘁𝗶𝗻𝗴 𝘁𝗮𝗹𝗲𝘀.
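To make point 3 concrete, here is a minimal sketch of the kind of cheap quality gate the post recommends running before the expensive transform. It assumes the pipeline payload is a pandas DataFrame; the table name, column rules, and print-based alert are hypothetical stand-ins for whatever observability stack you actually run.

```python
import pandas as pd

def check_orders(df: pd.DataFrame) -> list[str]:
    """Run cheap data-quality checks before the expensive transform."""
    failures = []
    if df.empty:
        failures.append("orders: no rows received")
    if df["order_id"].duplicated().any():
        failures.append("orders: duplicate order_id values")
    if df["amount"].lt(0).any():
        failures.append("orders: negative amounts")
    if df["customer_id"].isna().any():
        failures.append("orders: missing customer_id")
    return failures

def run_pipeline(df: pd.DataFrame) -> pd.DataFrame:
    failures = check_orders(df)
    if failures:
        # Hypothetical alert hook; swap in Slack, PagerDuty, etc.
        for msg in failures:
            print(f"ALERT: {msg}")
        raise ValueError("Quality gate failed; halting before transform")
    return df.assign(amount_usd=df["amount"].round(2))
```

Failing loudly at the gate is what turns a 2 AM firefight into a daytime ticket.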
How to Rethink Data Frameworks
Summary
Rethinking data frameworks means moving away from rigid, outdated data models and architectures in favor of approaches that adapt to changing business needs and technological advancements. A data framework is simply the structure or system that organizes, stores, and processes information within an organization—rethinking it involves making it more flexible, scalable, and easier to manage.
- Embrace adaptive design: Build your data systems to accommodate change, allowing them to evolve as business priorities and requirements shift.
- Push processing upstream: Standardize, clean, and transform data at its source to prevent duplicated efforts and reduce maintenance headaches down the line.
- Choose the right intervention: Assess whether your current data model needs simple updates, a significant remodel, or a complete rebuild based on its history, business context, and future demands.
-
We’ve built a system where every team hacks together their own data pipelines, reinventing the wheel with every use case. Medallion architectures, once a necessary evil, now feel like an expensive relic: layers of redundant ETL jobs, cascading schema mismatches, and duplicated processing logic.

Instead of propagating this mess downstream, shift it left to the operational layer. Do schema enforcement, deduplication, and transformation once, at the source, rather than five times in five different pipelines. Push processing upstream, closer to where the data is generated, instead of relying on a brittle patchwork of batch jobs.

Adam Bellemare’s InfoQ article (link below) lays it out clearly: multi-hop architectures are slow, costly, and error-prone. They depend on reactive data consumers pulling data, cleaning it, and shaping it after the fact. The alternative? Treat data like an API contract. Push standardization into the producer layer. Emit well-formed, semantically correct event streams that can be consumed directly by both operational and analytical systems, without the usual ETL contortions. (A minimal sketch of producer-side validation follows this post.)

The old way, letting every team fend for themselves writing brittle ETL for a dozen variations of the same dataset, creates a maintenance nightmare and is unfair to the data teams that get stuck disentangling the mess.

Shift left. Make clean, high-quality data a first-class product, not an afterthought. No one studied computer science so they could spend their work life cleaning data. So why are we still defending architectures built for the constraints of 20 years ago?

Check out Adam's article for more on this: https://lnkd.in/g27m5ZwV
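As a concrete illustration of "treat data like an API contract", here is a minimal sketch of producer-side enforcement. The event type, field names, and the stand-in emit function are all assumptions for illustration; a real producer would publish to Kafka or a similar log, but the point is that validation happens once, before anything leaves the source.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass(frozen=True)
class OrderCreated:
    """The producer-side contract: consumers can rely on these fields."""
    order_id: str
    customer_id: str
    amount_cents: int
    created_at: str  # ISO 8601, UTC

    def __post_init__(self):
        # Enforce the contract once, at the source, before anything is emitted.
        if not self.order_id:
            raise ValueError("order_id is required")
        if self.amount_cents < 0:
            raise ValueError("amount_cents must be non-negative")
        datetime.fromisoformat(self.created_at)  # must parse as a timestamp

def emit(topic: str, event: OrderCreated) -> None:
    # Hypothetical transport; in practice this would be a Kafka producer.
    payload = json.dumps(asdict(event))
    print(f"{topic} <- {payload}")

emit("orders.created", OrderCreated(
    order_id="o-1001",
    customer_id="c-42",
    amount_cents=1999,
    created_at=datetime.now(timezone.utc).isoformat(),
))
```

Downstream consumers then read well-formed events directly instead of each rebuilding the same cleanup logic.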
-
Instead of building data systems optimized for stability, we need systems architected for change.

The conventional wisdom about modern data governance sounds reassuringly methodical: treat data like a product, build for reuse, establish clear ownership, and watch costs fall as adoption rises. It's the kind of advice that feels both obvious and actionable, and yet it often falls short in practice for one simple reason: by the time you've built the perfect solution for today's requirements, tomorrow's requirements have already changed.

Data use cases refuse to behave like the static requirements that common frameworks assume. Yet most scaling advice treats use cases like building blocks that can be stacked methodically, one after another. In reality, they're more like living organisms, constantly evolving as new data sources emerge, regulations shift, business models pivot, and technologies create possibilities that didn't exist even a month ago.

This volatility breaks down many nice-sounding and seemingly logical data management frameworks, because striving for "reusable data", often celebrated as the holy grail of efficiency, carries hidden costs that are easy to overlook. Every shared dataset embeds assumptions about how it will be used. Every supposedly universal model introduces integration debt as teams bend it to their specific needs. The push for reuse can easily generate as much complexity as it eliminates, creating rigid systems that optimize for yesterday's requirements.

Generative AI intensifies these dynamics rather than resolving them. Yes, it makes engineers more productive at building pipelines, but it also democratizes data exploration. Non-technical teams can now articulate needs directly and prototype solutions in hours instead of weeks. This accessibility is powerful, but it also means the volume and variety of data demands will explode. If anyone can generate transformations on demand, then static datasets become less valuable than dynamic capabilities.

What emerges from this is the need for a new design principle: optimizing for constant change and adaptability. This means creating adaptive feedback loops that can test, validate, and retire data pipelines and use cases as quickly as they appear (a toy sketch of such a loop follows this post). It means treating consolidation and pruning with the same discipline we apply to creation. Most importantly, it means recognizing that in a world of moving targets, adaptability itself becomes the product.

Scaling data isn't about building better pipelines. It's about building organizations that can bend without breaking, where change isn't just managed, but mastered.
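One way to picture the "adaptive feedback loop" idea is a registry that retires pipelines with the same discipline used to create them. This is a toy sketch under assumed names and thresholds; a real implementation would read query logs or lineage metadata rather than a hard-coded list.

```python
# Toy sketch: flag pipelines for retirement when nobody has consumed
# their output recently. Threshold and registry contents are assumptions.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Pipeline:
    name: str
    last_consumed: datetime  # last time any downstream read the output

RETIREMENT_AFTER = timedelta(days=90)

def retirement_candidates(pipelines: list[Pipeline]) -> list[str]:
    now = datetime.now(timezone.utc)
    return [p.name for p in pipelines if now - p.last_consumed > RETIREMENT_AFTER]

registry = [
    Pipeline("daily_revenue", datetime.now(timezone.utc) - timedelta(days=2)),
    Pipeline("legacy_export_v1", datetime.now(timezone.utc) - timedelta(days=200)),
]
print(retirement_candidates(registry))  # ['legacy_export_v1']
```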
-
Does Your Data Model Need a New Coat of Paint or a Total Gut Job?

Data models aren’t static. They evolve as the business changes, adapting to new priorities, ways of working, technologies, and architectural patterns. But when someone new steps into an existing environment, the urge to rip everything out and start fresh can be strong. I’ve been there.

Before reaching for the wrecking ball, it’s important to pause and understand what’s already there and why it was built that way. Every odd or confusing piece of architecture usually has a reason (good or not so good). What looks messy now might still be valuable, and tearing it down without context risks breaking something that matters. So before making big changes, we owe it to ourselves to learn the history and intent behind the model. Once we understand the “why” behind what exists, we can make better decisions about how to move forward.

🛠️ Renovate: Sand It Down, Paint It Green
Renovation is the lightest lift. The existing data model still works but just needs some love. Renovate when:
🔹 Logic is sound, but a little hard to follow
🔹 Naming is messy
🔹 We’re prepping for new tooling or capabilities
Think: add documentation, simplify joins, standardize calculations, etc. We’re not changing the layout of our “data house,” just making it nicer to live in.

🧱 Remodel: Knock Down a Wall, Open Up the Space
Remodeling means keeping the core idea but rethinking how it’s structured. Not starting over, but admitting what worked five years ago isn’t working now. We might be ready to remodel if:
🔹 The model technically works but no one uses it
🔹 The business evolved and the data hasn’t caught up
🔹 Metrics differ across tools and teams
🔹 We are shifting platforms (e.g., Informatica to dbt)
This is a bigger lift and will break things. But it’s worth it when confusion outweighs clarity. Like redoing the kitchen: dust now, enjoyment later.

🔨 Rebuild: Level It, Lay a New Foundation
Sometimes a data model is beyond repair. No one trusts it, no one knows how it works, and fixes make it worse. Time to rebuild. Needed when:
🔹 The model is fundamentally wrong or full of legacy hacks
🔹 You’re doing a full data stack upgrade
🔹 You’ve patched it, and now it’s worse
🔹 You’re designing for AI-first features
This is a major project requiring stakeholder buy-in, a clear plan, and real business reasons. Rebuilding because “it feels cleaner” wastes months and leads nowhere.

Know what you’re dealing with, respect the work before, and choose the right intervention based on context. Tearing down something to promote your own agenda wastes time, money, and trust. Wise trade-offs protect resources and sanity, and keep data ecosystems resilient for the long run.

#DataStrategy #DataArchitecture #DataLeadership #DataCulture #DataModeling
-
Data silos aren’t just a tech problem; they’re an operational bottleneck that slows decision-making, erodes trust, and wastes millions in duplicated effort. But we’ve seen companies like Autodesk, Nasdaq, Porto, and North break free by shifting how they approach ownership, governance, and discovery.

Here’s the 6-part framework that consistently works (a governance-as-code sketch for point 5 follows this post):

1️⃣ Empower domains with a Data Center of Excellence. Teams take ownership of their data, while a central group ensures governance and shared tooling.
2️⃣ Establish a clear governance structure. Data isn’t just dumped into a warehouse; it’s owned, documented, and accessible with clear accountability.
3️⃣ Build trust through standards. Consistent naming, documentation, and validation ensure teams don’t waste time second-guessing their reports.
4️⃣ Create a unified discovery layer. A single “Google for your data” makes it easy for teams to find, understand, and use the right datasets instantly.
5️⃣ Implement automated governance. Policies aren’t just slides in a deck; they’re enforced through automation, scaling governance without manual overhead.
6️⃣ Connect tools and processes. When governance, discovery, and workflows are seamlessly integrated, data flows instead of getting stuck in silos.

We’ve seen this transform data cultures: reducing wasted effort, increasing trust, and unlocking real business value. So if your team is still struggling to find and trust data, what’s stopping you from fixing it?
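To illustrate point 5, here is a minimal sketch of governance-as-code: a policy check that could run in CI against a catalog export. The Table shape and the three policies are illustrative assumptions; the point is that violations surface automatically instead of living in a slide deck.

```python
# A minimal sketch of automated governance: policies as code, run against
# a hypothetical catalog export rather than enforced by hand.
from dataclasses import dataclass

@dataclass
class Table:
    name: str
    owner: str | None
    description: str | None

def policy_violations(tables: list[Table]) -> list[str]:
    violations = []
    for t in tables:
        if not t.owner:
            violations.append(f"{t.name}: no owner assigned")
        if not t.description:
            violations.append(f"{t.name}: missing description")
        if not t.name.islower():
            violations.append(f"{t.name}: naming standard is lower_snake_case")
    return violations

catalog = [
    Table("dim_customer", "crm-team", "One row per customer"),
    Table("TMP_Orders_v2", None, None),  # fails all three policies
]
for v in policy_violations(catalog):
    print("GOVERNANCE:", v)
```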
-
Leaders reference the People, Process & Technology framework, but never use it correctly. Moreover, nobody considers the Data component, which is irresponsible today. So I thought I would rethink the framework and come up with some new key questions:

𝐏𝐞𝐨𝐩𝐥𝐞 👥
1. What people do we need to deliver against our goals?
2. What skills should they have?
3. How do we foster a data-focused culture?

𝐏𝐫𝐨𝐜𝐞𝐬𝐬𝐞𝐬 🔄
1. What processes will allow us to optimise our workflows?
2. What governance structure can help facilitate success?
3. How do we embed data into our decisioning and ways of working?

𝐓𝐞𝐜𝐡𝐧𝐨𝐥𝐨𝐠𝐲 🛠️
1. What is the role of technology in our strategy and org direction?
2. What tools and technologies do we need to run our business? What do we need to gather insight?
3. How can we optimise technology adoption and usage?

𝐃𝐚𝐭𝐚 📊
1. How does data help us achieve the organisational goals?
2. What data do we need to drive better insights?
3. What is our strategy to use data and ensure it is of high quality?

Use these questions to get started thinking more holistically about crucial topics, especially data and data change. I wrote about People, Process, Tech and Data last summer, so check out the article for more on how to think strategically about this important topic!
-
🔹 From Storage to Strategy: Rethinking the System of Record

In EA 4.0, the System of Record isn’t just a source of truth. It’s the memory of a thinking enterprise.

In traditional enterprise architecture, the System of Record was an afterthought:
📁 A static datastore.
📊 A reporting foundation.
🛠️ A compliance checkbox.

But in Enterprise Architecture 4.0, the role of the System of Record is transformed. Why? Because in a world of intelligent systems and autonomous agents, data isn’t just queried: it’s interpreted, reasoned with, and acted upon.

👉 That means the System of Record must now be designed as cognitive infrastructure:
Governed not just for quality, but for meaning and trust
Modeled not just for storage, but for agentic decision-making
Integrated not just by pipelines, but by semantic context

EA 4.0 architects don’t just catalog data; they design the conditions for memory. If your enterprise can’t remember reliably, it can’t think responsibly.

💬 How are you rethinking your architecture for the age of autonomy? Are you using the System of Record? Have you evolved its traditional meaning into something that supports AI? Please share your insights and experiences below. 🙏

#EnterpriseArchitecture40 #EA40 #SystemOfRecord #DataGovernance
-
Had an interesting conversation with a peer recently about data fundamentals. They were focused on advanced analytics and AI, but were missing some basics. I drew them a simple diagram on a serviette and explained:

"Think of data like a library. First, you need books on the shelves (Availability). Then, you need to be able to reach those books (Accessibility). Finally, the books need to contain the information you're looking for (Utility)."

They went quiet. Sometimes the simplest frameworks are the most powerful. (A toy sketch of this three-part check follows this post.)

Availability: does it exist, and can we collect it consistently?
⚙️ The presence and readiness of data
⚙️ Whether it exists and can be collected
⚙️ Consistency and reliability of data flow
⚙️ Real-time vs batch implications

Accessibility: can the right people and systems get to it easily?
⚙️ How easily data can be retrieved
⚙️ Security and permissions
⚙️ Technical infrastructure to reach the data
⚙️ Format and compatibility

Utility: does it actually solve our problem or create value?
⚙️ The usefulness and value of the data
⚙️ Quality, integrity and accuracy
⚙️ Relevance to business objectives
⚙️ Actionable insights potential

Overlook any of these, and your data initiatives are likely to struggle. Nail all three, and you've got the foundation for something high-potential and powerful.

What's your experience? Which of these three tends to be the biggest challenge in your organisation?

#DataManagement #ChangeManagement #DataFoundations #DataProducts #DataTransformation #DataGovernance #DataEnablement
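For readers who like the framework operationalized, here is a toy sketch that scores a dataset on the three dimensions. Every field and check is an illustrative assumption; real checks would query your data platform rather than take hand-set flags.

```python
# Toy sketch of the library framework: score a dataset on Availability,
# Accessibility, and Utility before investing in analytics on top of it.
from dataclasses import dataclass

@dataclass
class Dataset:
    name: str
    exists: bool             # Availability: is the book on the shelf?
    fresh_within_sla: bool   # Availability: does it arrive consistently?
    readable_by_team: bool   # Accessibility: can we reach it?
    documented_schema: bool  # Accessibility: can we make sense of the format?
    answers_question: bool   # Utility: does it serve the business objective?
    passes_quality: bool     # Utility: is it accurate enough to act on?

def readiness(ds: Dataset) -> dict[str, bool]:
    return {
        "availability": ds.exists and ds.fresh_within_sla,
        "accessibility": ds.readable_by_team and ds.documented_schema,
        "utility": ds.answers_question and ds.passes_quality,
    }

report = readiness(Dataset("web_sessions", True, True, True, False, True, True))
print(report)  # {'availability': True, 'accessibility': False, 'utility': True}
```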