Data Structures First, or Enjoy Debugging the Apocalypse
Sometimes it is good to call a duck a duck: ignore the feathers, or group by them.

"In today’s engineering culture, where code is increasingly written for/by AI systems to consume rather than humans to maintain, vibe-driven development is on the rise. Fast, expressive, and iterative, it enables rapid prototyping. However, when left unsupervised, it risks breaking foundational design principles."

This is no longer just about clean code. It is about building systems whose data models are semantically aligned, structurally sound, and computationally efficient. AI systems do not infer intent; they learn from structure.

"Drawing on John Ousterhout’s principles, particularly the importance of managing complexity through deep modules, clear abstractions, and minimal surface area, this article advocates a data-first approach to system design: one that prioritizes clarity over cleverness, structure over speed, and long-term simplicity over short-term velocity."

1. DDD Is Not Multi-Layered Architecture

Domain-Driven Design (DDD) is often mistaken for multi-layered architecture. These are not equivalent.

Multi-layered architecture separates technical concerns such as presentation, service, and persistence. DDD, on the other hand, separates conceptual concerns. It models the business domain, not the codebase.

Key DDD principles include:

  • Bounded contexts that isolate models per domain
  • Aggregates that define transactional boundaries
  • Ubiquitous language shared across technical and non-technical stakeholders

A system can be layered and still violate DDD if it lacks semantic cohesion. DDD is about modeling meaning and encapsulating complexity behind simple interfaces. As Ousterhout puts it, “a good module hides complexity and presents a simple interface.”
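
As a rough illustration of an aggregate acting as a transactional boundary, here is a minimal Python sketch. The names Order, OrderLine, and max_total_cents are hypothetical and chosen only for the example; the point is that every change to the order lines passes through the aggregate root, so the invariant lives in one place.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderLine:
    """Value object: immutable, compared by value (hypothetical example)."""
    sku: str
    quantity: int
    unit_price_cents: int


@dataclass
class Order:
    """Aggregate root: the only entry point for changing order lines."""
    order_id: str
    max_total_cents: int
    _lines: list[OrderLine] = field(default_factory=list)

    def total_cents(self) -> int:
        return sum(line.quantity * line.unit_price_cents for line in self._lines)

    def add_line(self, line: OrderLine) -> None:
        # Invariants are checked here, not scattered across services.
        if line.quantity <= 0:
            raise ValueError("quantity must be positive")
        new_total = self.total_cents() + line.quantity * line.unit_price_cents
        if new_total > self.max_total_cents:
            raise ValueError("order would exceed its limit")
        self._lines.append(line)

    def lines(self) -> tuple[OrderLine, ...]:
        # Expose a read-only view so callers cannot bypass add_line.
        return tuple(self._lines)
```

A layered codebase could hold the same fields in a plain record and enforce the limit in a service class; the sketch only shows what it looks like when the rule travels with the model instead.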


2. Relational Modeling: Foundational but Not Universal

Relational modeling is a cornerstone of most systems, but it is not always the right abstraction.

Hierarchical or graph-based models are often more appropriate for recursive or dependency-heavy domains. Over-normalization can degrade performance and increase cognitive load. Generic schemas tend to fail in systems with multiple bounded contexts.

Relational design should be driven by domain semantics. In DDD, schemas emerge from the domain model. In Data-Oriented Design (DOD), layout is optimized for access patterns.
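
To make the trade-off concrete, the sketch below stores a small hierarchy twice: once as relational-style (id, parent_id) rows and once as a children index. The data and function names are made up for illustration. Finding all descendants from the rows needs one scan per level, much like repeated self-joins, while the index supports a direct traversal.

```python
from collections import defaultdict

# Relational-style rows: (id, parent_id). Hypothetical sample data.
ROWS = [
    ("root", None),
    ("a", "root"),
    ("b", "root"),
    ("a1", "a"),
    ("a2", "a"),
    ("a1x", "a1"),
]


def descendants_from_rows(rows, node):
    """Scan all rows once per level of the hierarchy."""
    found, frontier = [], {node}
    while frontier:
        next_frontier = {child for child, parent in rows if parent in frontier}
        found.extend(sorted(next_frontier))
        frontier = next_frontier
    return found


def build_children_index(rows):
    """Reshape the same rows into a parent -> children mapping."""
    children = defaultdict(list)
    for child, parent in rows:
        if parent is not None:
            children[parent].append(child)
    return children


def descendants_from_index(children, node):
    """Depth-first traversal that only touches relevant nodes."""
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            found.append(child)
            stack.append(child)
    return found
```

Both functions return the same set of nodes; what differs is how much of the data each one has to touch to get there.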

"The most important goal of software design is to manage complexity." ~ John Ousterhout

3. Data-Oriented Design: Optimize for Access

DOD focuses on how data is laid out and accessed, rather than how it is abstracted.

It emphasizes:

  • Contiguous memory layouts for cache efficiency
  • Reduced indirection to minimize latency
  • Modeling based on actual access patterns

DOD complements DDD. While DDD models meaning, DOD optimizes mechanics. Together, they reduce change amplification, where small changes ripple across the system.
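
The layout idea can be sketched in Python by contrasting an array-of-structs with a struct-of-arrays. Particle and ParticleColumns are hypothetical names. In CPython the performance gain is modest, since the effect is far more visible in C, Rust, or NumPy, so treat this as a picture of the layout rather than a benchmark.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Particle:
    """Array-of-structs element: each object lives separately on the heap."""
    x: float
    y: float
    mass: float


class ParticleColumns:
    """Struct-of-arrays: each field is one contiguous buffer of C doubles."""

    def __init__(self) -> None:
        self.x = array("d")
        self.y = array("d")
        self.mass = array("d")

    def append(self, x: float, y: float, mass: float) -> None:
        self.x.append(x)
        self.y.append(y)
        self.mass.append(mass)

    def total_mass(self) -> float:
        # Reads only the mass column; x and y are never touched.
        return sum(self.mass)


def total_mass_of_objects(particles: list[Particle]) -> float:
    # Follows a reference to every object just to read one field.
    return sum(p.mass for p in particles)
```

The access pattern "sum one field over all items" is what justifies the column layout here; a workload that always reads whole records might favor the other form.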


4. Algorithm–Data Structure Symbiosis

Algorithms are only as effective as the data structures they operate on.

Examples include:

  • Graph traversal requires graph models
  • Range queries require trees or indexes
  • Dependency resolution requires directed acyclic graphs

Choosing an algorithm before understanding the data model is premature. Structure enables strategy. Optimization without modeling leads to inefficiency.
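
Two of the pairings above can be sketched with Python's standard library, assuming the data is already in the right shape: a sorted list for range queries and a mapping from each node to its prerequisites for dependency resolution. The function names count_in_range and build_order are illustrative only.

```python
from bisect import bisect_left, bisect_right
from graphlib import CycleError, TopologicalSorter


def count_in_range(sorted_values, low, high):
    """Count values in [low, high] using binary search.

    Only correct because the input is kept sorted; the structure
    is what makes the O(log n) algorithm available.
    """
    return bisect_right(sorted_values, high) - bisect_left(sorted_values, low)


def build_order(dependencies):
    """Return an order in which every node comes after its prerequisites.

    `dependencies` maps each node to the set of nodes it depends on.
    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

If the values were unsorted, or the dependencies were stored as free-form text, neither function would apply without first rebuilding the structure.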


5. Operational Context Shapes Structure

Before finalizing architecture or algorithms, it is essential to understand:

  • Who consumes the data
  • What the access patterns are
  • What consistency and latency requirements exist
  • Whether the data is immutable, versioned, or ephemeral

These considerations inform decisions such as CRUD versus CQRS, event sourcing versus snapshots, and caching strategies. Good design minimizes cognitive load and begins with accurate data modeling.
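
As one example of how these questions play out, the sketch below contrasts replaying a full event log with resuming from a snapshot. The account domain and the Deposited and Withdrawn event types are assumptions made for the example, not part of any specific framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply(balance: int, event) -> int:
    """Pure state transition: current state plus one event gives new state."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise TypeError(f"unknown event type: {type(event).__name__}")


def replay(events, snapshot: int = 0) -> int:
    """Rebuild state from events, optionally starting from a snapshot."""
    balance = snapshot
    for event in events:
        balance = apply(balance, event)
    return balance
```

Replaying from the start keeps the full history auditable, while a snapshot trades some of that for lower read latency. Which one fits depends on the consistency and latency requirements listed above.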


6. Architecture Emerges from Data Contracts

Patterns such as dependency injection and service composition are only effective when data contracts are stable and explicit.

Abstracting behavior without understanding data leads to brittle systems. Composability depends on deterministic inputs and outputs.

Ousterhout advises designing “deep modules” that do a lot behind a simple interface. In most systems, that interface is the data contract.
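
A minimal sketch of that idea follows, with an explicit data contract as the interface of a deep module. PriceQuote, quote_price, the catalog, and the discount rule are all hypothetical; the caller sees one function and one well-typed result, while lookup, discounting, and validation stay hidden.

```python
from dataclasses import dataclass

# Hypothetical catalog: unit prices in cents.
_CATALOG = {"widget": 250}


@dataclass(frozen=True)
class PriceQuote:
    """The data contract: immutable and validated on construction."""
    sku: str
    amount_cents: int
    currency: str

    def __post_init__(self) -> None:
        if self.amount_cents < 0:
            raise ValueError("amount_cents must be non-negative")
        if len(self.currency) != 3 or not self.currency.isupper():
            raise ValueError("currency must be a 3-letter uppercase code")


def quote_price(sku: str, quantity: int) -> PriceQuote:
    """Simple interface; catalog lookup and discount rules live behind it."""
    unit_price = _CATALOG[sku]
    subtotal = unit_price * quantity
    # Assumed rule for the example: 10% off for 10 or more units.
    discount = subtotal // 10 if quantity >= 10 else 0
    return PriceQuote(sku, subtotal - discount, "USD")
```

Because the same inputs always produce the same PriceQuote, other components can compose against the contract without knowing how the price was computed.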


Why This Matters More in the AI Era

As systems increasingly generate data for AI models, structural clarity becomes critical.

AI models are mathematical abstractions. They require consistent, sparse, and well-typed data. Fixed schemas with empty fields are more learnable than variable schemas with noisy data. Poor modeling introduces bias, instability, and poor generalization.
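
One way to picture the fixed-schema point is a small normalization step that maps loosely shaped records onto a fixed, typed set of fields. The SCHEMA below and its field names are assumptions for illustration; missing fields become explicit None values and unknown keys are dropped.

```python
# Hypothetical fixed schema: field name -> expected Python type.
SCHEMA = {"user_id": str, "age": int, "country": str}


def to_fixed_row(record: dict) -> dict:
    """Project a record onto SCHEMA: same keys, same order, every time."""
    row = {}
    for name, expected_type in SCHEMA.items():
        value = record.get(name)  # missing fields become None
        if value is not None and not isinstance(value, expected_type):
            raise TypeError(f"{name} must be {expected_type.__name__}")
        row[name] = value
    return row
```

Every row that leaves this function has the same shape, so a downstream consumer never has to guess whether a field is absent, misspelled, or of a different type.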

"In the AI era, data modeling is model engineering. The structure you define becomes the substrate for learning."

Coders, Slow Down to Scale Up

"If you are building for scale, for AI, or for systems that will outlive your sprint cycle, model your data before you model your services. Choose structures that reflect semantics, not just syntax. Let architecture emerge from clarity, not convenience. Design deep modules with minimal surface area."

Do not just vibe. Design. The systems you are building today are the training data for the models of tomorrow.

More articles by Bhavya Teja R
