The Infrastructure Trap

How IaaS Locked Architecture in Place

💡 This is the story of a client who successfully managed their SaaS product for over a decade — scaling features, adapting to market shifts, and serving clients with stability and control. A few years ago, as the product matured and the user base grew, they made a strategic move: adopting an IaaS solution based on VMware ESX virtualization. VMware is a leading provider of virtualization technology, and ESX is its hypervisor platform — enabling multiple virtual machines to run on a single physical server with isolation, flexibility, and efficient resource use. It gave them the flexibility they needed at the time and felt like the right foundation to support further growth. And for a while, it was. They were proud of the solution, satisfied with the provider, and confident the right choice had been made. During the IT audit, the infrastructure setup appeared solid and aligned with the client's needs. Even now, with the benefit of hindsight, it was an appropriate decision for that stage of their growth.

⚠️ But then came the turning point: the IaaS provider announced the end of support for the ESX-based Virtual Data Center (VDC) offering the client had built upon.

Normally, this kind of transition isn’t a problem — especially when the provider offers a migration plan backed by technical documentation and testing. The provider did offer migration support — up to a point. Beyond that limit, things began to break down, and the client quickly found themselves without a truly supported path forward. And they weren’t alone: many other companies relying on the same IaaS offering faced the same situation — urgent decisions, limited options, and uncertain migration paths.

👉 Perhaps this situation sounds familiar — the kind of quiet pressure that creeps in when platforms evolve faster than your timeline.

👉 Suddenly, the client was no longer in control of the timeline. What had once been a stable foundation became a constraint — one that had to be addressed not on their terms, but under pressure.


What Is IaaS – and Why It Matters to Your Architecture

Infrastructure as a Service (IaaS) is a cloud computing model where a provider (such as AWS, Azure, or others) delivers virtualized computing resources — servers, storage, and networks — as a service.

The advantages are clear:

• No hardware to buy or maintain

• Fast, on-demand deployment

• Pay-as-you-go or subscription-based pricing

• Scalable infrastructure as your needs grow

But IaaS comes with trade-offs:

• Less control over the physical environment

• Performance can vary due to virtualization layers

• Network and security complexity can increase as your system scales

While IaaS offers freedom from physical infrastructure, it also introduces new forms of dependency — ones that can quietly shape or constrain your architecture over time. When relying on a managed platform, you inevitably have to adapt your architecture to fit the evolving directives and constraints of the selected ecosystem.


What Is VMware – and Why It Matters to Your Infrastructure

VMware is a leading provider of virtualization technology used by enterprises worldwide to create, manage, and automate virtual machines on physical servers.

🔧 Key components include:

• ESXi: The core hypervisor installed directly on physical servers

• vSphere: A centralized management platform for virtual environments

• NSX: VMware’s software-defined networking solution

• vCloud Director (VCD): A multi-tenant cloud interface used by many providers

Why businesses choose VMware:

• Highly mature, stable, and performant

• Rich feature set: snapshots, high availability, live migration (vMotion), etc.

• Trusted and widely adopted in enterprise environments

Challenges to be aware of:

• Licensing and infrastructure costs can be high

• Administration is complex and requires deep expertise

• Strong dependency on the VMware ecosystem

VMware offers robust enterprise-grade capabilities — but it also introduces strategic lock-in that can become a limiting factor when evolving your architecture or shifting platforms.


What Was the VDC Offering – and Why It Reached Its Limit

Between roughly 2014 and 2022, some providers offered a Virtual Data Center (VDC) service based on VMware’s enterprise virtualization stack. It gave clients a cloud experience similar to managing their own data center — with the benefits of abstraction and self-service.

🔧 The key components included:

• A vCloud Director console for managing virtual resources

• Virtual networks built on NSX-v and VXLAN

• Full control over VM provisioning, networking, and storage within an isolated tenant environment

It was a popular choice for teams needing dedicated virtual infrastructure without running physical servers.

⚠️ Why it was discontinued:

• NSX-v, the core networking layer used in this offer, was officially retired by VMware

• Providers began migrating customers to NSX-T or other alternative cloud offerings

This shift marked a significant change: organizations that had architected their systems around the VDC model were now facing forced migration — not just operationally, but architecturally.


What Came Next: The Replacement Looked Like the Solution

This wasn’t the end of VMware — it was the beginning of a new era for their ecosystem, and a migration path did exist for those ready to adapt and realign their architecture. The road was already mapped out — the direction was clear, even if the journey required careful navigation.

But It Wasn’t That Simple

At first glance, everything should have been under control. A replacement was proposed for the discontinued VDC service: a managed bare-metal offer based on VMware vSphere Enterprise Plus.

🔁 This solution was based on:

• VMware vSphere Enterprise Plus, providing advanced virtualization features

• Two dedicated physical hosts with 64 GB RAM each

• High-performance SSD storage

• Full support for VMware HA, DRS, and live migration (vMotion)

The advantages seemed compelling:

• Fully managed — no need to administer the hypervisor layer

• High performance from bare metal hardware

• Seamless integration with VMware's vSphere ecosystem

But major limitations quickly appeared:

• No automated migration path from the legacy VDC environment

• No vCloud Director interface — reducing visibility and control

• Networking was rebuilt around VLANs, not VXLANs — meaning legacy VDC workloads couldn’t communicate directly

Naturally, the higher fixed cost — with two hosts required by default — wasn’t a problem… as long as you were willing to ignore the price tag and simply bask in the comforting glow of enterprise-grade performance.

What looked like a smooth upgrade path turned out to be a fundamentally different platform — one that disrupted the client’s architecture, introduced friction, and forced a rethink of what control really meant.

But what was going wrong?

On paper, the transition looked manageable. But under the surface, critical incompatibilities emerged — starting with the network layer. The legacy VDC infrastructure relied on VXLAN overlays managed by NSX-v. These virtual networks were powerful: they supported multi-tenant isolation, flexible internal subnets, NAT, and firewalling — all without touching the physical fabric.

In contrast, the new setup was built on traditional VLANs — simpler, more conventional, but natively incompatible with VXLANs.

The result? No direct communication between machines in the old and new environments, even when they belonged to the same client or internal architecture. The outcome was strict network isolation, unless workarounds were introduced — which added risk, complexity, and operational overhead.

To enable data exchange or a phased migration between the two environments, the only viable workaround was to create a custom "bridge" VM — a virtual machine acting as a network gateway.

This bridge had to be carefully engineered:

• It required two network interfaces — one connected to the VXLAN environment (legacy VDC), and the other to the VLAN side.

• It needed to handle routing, NAT, or file transfers between the two otherwise incompatible zones.

• It had to be secure by design, since it was effectively punching a hole between two network segments that were never meant to communicate.

⚠️ The provider offered no specific support for this setup, arguing (not unreasonably) that its implementation depended too heavily on each client's custom architecture. In short, the responsibility for bridging the gap — literally and figuratively — fell entirely on the client.
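To make the bridge's role concrete, the sketch below renders the kind of forwarding and NAT rules such a dual-homed VM might apply, in nftables-style syntax. The interface names, subnet, and chain names are illustrative assumptions, not the client's actual configuration.

```python
# Sketch of the rule set a dual-homed "bridge" VM might apply.
# Interface names, subnets, and chain names are hypothetical examples.

def bridge_rules(vxlan_if: str, vlan_if: str, vxlan_net: str) -> list[str]:
    """Render nftables-style rules letting the legacy VXLAN segment
    reach the new VLAN segment, masquerading so VLAN-side hosts need
    no return routes to the legacy subnets."""
    return [
        # Allow traffic initiated from the legacy (VXLAN) side only
        f"add rule inet bridge forward iifname {vxlan_if} oifname {vlan_if} "
        f"ip saddr {vxlan_net} accept",
        # Allow only replies back toward the legacy side
        f"add rule inet bridge forward iifname {vlan_if} oifname {vxlan_if} "
        f"ct state established,related accept",
        # Hide legacy addresses behind the bridge's VLAN-side IP
        f"add rule inet bridge postrouting oifname {vlan_if} "
        f"ip saddr {vxlan_net} masquerade",
    ]

for rule in bridge_rules("ens192", "ens224", "10.10.0.0/16"):
    print(rule)
```

Restricting new connections to one direction keeps the hole between the two segments as narrow as possible — the "secure by design" requirement mentioned above.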

As the migration progressed, other complications surfaced — each adding to the operational burden and architectural tension.

⚠️ First, there were public IP address changes. The IPs assigned in the legacy environment were not retained. This required provisioning new addresses and reconfiguring DNS entries, firewall rules, SSL certificates, and access controls across all exposed services — including websites, APIs, and VPNs.
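An address change of this scale is easier to audit with a simple inventory check. The sketch below flags DNS records still pointing at retired addresses; the hostnames and IPs (taken from the documentation ranges 203.0.113.0/24 and 198.51.100.0/24) are placeholders, not the client's real data.

```python
# Sketch: flag DNS records that still resolve to retired public IPs.
# Hostnames and addresses are illustrative placeholders.

RETIRED_IPS = {"203.0.113.10", "203.0.113.11"}   # legacy VDC addresses

dns_records = [
    ("www.example.com", "A", "203.0.113.10"),
    ("api.example.com", "A", "198.51.100.20"),   # already moved
    ("vpn.example.com", "A", "203.0.113.11"),
]

stale = [(name, ip) for name, rtype, ip in dns_records if ip in RETIRED_IPS]
for name, ip in stale:
    print(f"update needed: {name} still points at retired IP {ip}")
```

The same inventory-and-diff pattern applies to firewall rules, SSL certificates, and VPN configurations.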

⚠️ There was also a loss of functionality. Moving from vCloud Director to vSphere meant losing several practical features: multiple snapshots, fast cloning, and the graphical network management interface. These changes made administration more technical and less accessible.

⚠️ Finally, without live migration capabilities, each VM had to be shut down, exported, and manually re-imported. This introduced unavoidable downtime and additional post-migration configuration steps.
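For planning maintenance windows, the downtime of such a cold migration can be estimated per VM. The throughput figures and the fixed reconfiguration overhead below are assumptions for illustration, not measured values.

```python
# Sketch: back-of-envelope downtime for a shutdown/export/import migration.
# Throughputs and the reconfiguration overhead are assumed values.

def downtime_hours(disk_gb: float, export_mbps: float,
                   import_mbps: float, reconfig_hours: float = 0.5) -> float:
    """Approximate offline time: exporting then re-importing the full
    disk image at sustained throughputs, plus fixed post-migration steps."""
    gb_to_mbit = 8 * 1024                      # 1 GB = 8192 Mbit
    export_h = disk_gb * gb_to_mbit / export_mbps / 3600
    import_h = disk_gb * gb_to_mbit / import_mbps / 3600
    return round(export_h + import_h + reconfig_hours, 2)

# A 200 GB VM over a sustained 1 Gbps link in each direction:
print(downtime_hours(200, export_mbps=1000, import_mbps=1000))  # → 1.41
```

Even under these optimistic assumptions, a modest VM is offline for over an hour — multiplied across an estate, a real scheduling constraint.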

🎯 One of the most critical gaps was the absence of personalized support. While the provider delivered technical guides, there was no meaningful advisory service included — or only through expensive, separate consulting engagements. This meant that clients were left to rethink and rebuild their architecture on their own, often without a clear view of best practices or constraints.

The result? Confusion, risk-prone decisions, and growing mistrust in the process. The provider’s involvement was limited to base infrastructure support — keeping the platform running — but without any commitment to helping clients adapt their actual systems. For organizations with complex or legacy workloads, this lack of guidance became a source of frustration and, in some cases, avoidable architectural debt.


What You Can (and Can’t) Expect from an IaaS Provider

To be fair, this situation isn’t entirely unexpected. The provider positioned itself as an IaaS vendor, and within that model, its responsibility is primarily focused on delivering and maintaining the underlying infrastructure — not on helping clients design or evolve their systems. From that perspective, the lack of architectural guidance wasn’t a failure, but a reflection of the service scope. Clients are expected to manage their own architectures, deployment strategies, and system design choices — and that’s a reasonable boundary in the IaaS model. But when infrastructure changes introduce deep architectural consequences, the line between hosting and system design gets blurry — and that's where many organizations began to feel the friction.


Our Response: From Infrastructure Constraint to Architectural Clarity

Focusing on Business Needs

Before making any technical choices, the focus shifted to a fundamental question:

👉 What does the business really need — now, in six months, in two years, and beyond?

It’s easy to let infrastructure decisions drive architecture. But long-term flexibility requires flipping the model — from "what can this platform do?" to "what should the system enable for the business?"

Priorities were mapped out across time horizons:

Immediate (0–6 months): Stability, secure migration, minimal service disruption.

Mid-term (6–24 months): Operational agility — faster deployments, better observability, and smoother onboarding.

Long-term (2–5 years): Platform independence, environment abstraction, and architecture resilient enough to evolve without rethinking everything.

This business-first lens helped avoid the trap of overreacting technically.


Simplifying the Stack: From Overprovisioned to Just-Right

🎯 Once the real business needs were clear, so was the reality: a private ESX cluster was no longer necessary. The previous setup — a full data center with 99.2 GHz of CPU, 128 GB of RAM, and 1.8 TB of disk, spread across 25 virtual machines — had grown organically. But not all of it still served a purpose. After reassessing, I determined that only five well-scoped VMs were needed to deliver the same value — with far less operational overhead. To support this new architecture, a dedicated VPN and private internal network were implemented, providing secure isolation without relying on legacy platform features.

🎯 No more complicated overlays or rigid dependencies — just a clean, understandable setup aligned with actual needs.

🎯 This wasn’t a rollback — it was a reset. A return to simplicity, guided by clarity.


Realigning Architecture — and Cutting Costs by Two-Thirds

What made this shift even more significant was the contrast with the migration path initially proposed by the IaaS provider. Their recommended replacement — a managed bare-metal setup — would have increased costs by 25% compared to the original infrastructure. And yet, after months of testing and partial migrations, no viable, supported solution ever emerged. Compatibility issues, tooling gaps, and operational trade-offs made it clear: this was a more expensive setup that delivered less.

Guided by my role as Fractional CTO, the team took a step back to reassess the situation. We re-evaluated, simplified, and took ownership.

The result?

👉 A clean, stable infrastructure tailored to real needs — running at just one-third the cost of the original setup. This wasn’t about chasing savings — it was about regaining clarity.

👉 And clarity, as this case proved, has a great ROI.
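Normalizing the original setup to 100 makes the two paths directly comparable; these are the relative figures from this case, not actual invoice amounts.

```python
# Relative cost comparison, normalized to the original setup = 100.
# Ratios from the case above, not real invoice amounts.

original = 100.0
proposed_bare_metal = original * 1.25   # provider's replacement: +25%
realigned = original / 3                # simplified architecture: one-third

saved_vs_proposed = 1 - realigned / proposed_bare_metal
print(f"saved vs proposed path: {saved_vs_proposed:.0%}")  # → 73%
```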


Conclusion: Simpler, Stronger, and in Control

This experience reminded everyone involved that infrastructure isn’t just about structure — it’s about posture. They could have stayed on the path laid out by the provider, accepting complexity and cost as the price of continuity. But instead, they paused, reflected, and realigned.

By stepping away from an overengineered solution and building around actual needs, they not only simplified the infrastructure — they regained control over architecture, cost, and strategic direction.

Today, the system is leaner, clearer, and better positioned to evolve — not because a trend or vendor recommendation was followed, but because the fit was questioned.

And that’s the biggest lesson:

👉 Infrastructure is never neutral. Either you design it — or it designs around you.


🚀 Facing a similar shift in your infrastructure? I help tech teams navigate architectural constraints and regain clarity — before pressure turns into paralysis.

📩 Let’s talk about how an audit or fractional CTO support could unlock your next phase.

Alexandre Chatton, Ambrosya Services


