The Fall from the Cloud
We need to talk about the elephant in the server room.
Enterprise IT teams are currently being crushed by unexpected cloud egress fees, storage bloat, and compute costs. Even with talented FinOps teams and advanced workload automation, some workloads remain shockingly expensive, challenging the overall value proposition of your enterprise cloud strategy.
The boardroom conversations are always the same. But there is one question that, for the last five years, no one has been willing to ask:
"Can we do this cheaper ourselves on-premises and still meet our SLAs and RTOs?"
For a long time, the prevailing wisdom was that moving 100% to the cloud was inevitable. No one wanted to be in the colocation business. The cloud offered incredible flexibility to scale and highly reliable services at what seemed to be a reasonable operational cost.
But the landscape has shifted.
The AI Real Estate Squeeze
Enter the AI boom. Hyperscalers are gobbling up high-quality data center space and power, dramatically increasing the cost of renting a cage in a Tier 3 facility. For the first time in 20 years, tech giants are literally planning to build nuclear reactors just to power their future AI data centers.
Twenty years ago, I ran a 600-server colocation cage in Weehawken, NJ. I had 68 racks and 11 Gigs of bandwidth at $22/sq. ft. and $200/month per 20-amp circuit. Today, those prices have likely increased by 10x.
Back then, our primary reason for colocation was networking: we needed massive internet bandwidth and thousands of public IPs. Today? Massive bandwidth is available in almost every metro area, delivered directly to your building.
It isn't businesses that are starved for power; it is the mega data centers.
The "Datacenter in a Box" The "Fall from the Cloud" won't be a return to sprawling, conventional on-prem data centers.
It will be a shift toward powerful, distributed edge computing: the "Datacenter in a Box."
Imagine a half-height rack featuring fully redundant compute, storage, and networking, with redundant power paths—all for less than a $100k capital expense. Over the past few years, I have rolled these mini data centers out to facilities requiring 24/7 mission-critical operations where performance could never be sacrificed, and where they needed to operate in "island mode" if cut off by a natural disaster. Running that specific scenario in the public cloud would have been completely unaffordable.
But Isn't This Complex to Build and Administer?
Not anymore.
The Hard Math
Let's look at the numbers. An average workload of 32 cores, 128 GB memory, and standard storage runs around $3,000 per month in most public clouds. Over a 3-year lifecycle, doing it yourself is demonstrably cheaper.
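Here is a back-of-envelope sketch of that comparison. The $3,000/month cloud figure and the roughly $100k rack capex come from this article; the monthly on-prem opex and the number of comparable workloads one rack can host are illustrative assumptions only, so plug in your own numbers.

```python
# Back-of-envelope 3-year TCO comparison.
# The $3,000/month cloud figure and ~$100k rack capex are from the article;
# the on-prem opex and workloads-per-rack values are assumptions.

MONTHS = 36

cloud_monthly = 3_000                 # USD per workload (article's figure)
cloud_total = cloud_monthly * MONTHS  # $108,000 per workload over 3 years

rack_capex = 100_000                  # USD, "Datacenter in a Box" (article)
rack_opex_monthly = 1_500             # USD, assumed colo/power/remote hands
workloads_per_rack = 4                # assumed consolidation ratio

rack_total = rack_capex + rack_opex_monthly * MONTHS
onprem_per_workload = rack_total / workloads_per_rack

print(f"Cloud, per workload over 3 years:   ${cloud_total:,.0f}")
print(f"On-prem, per workload over 3 years: ${onprem_per_workload:,.0f}")
```

With these assumptions the on-prem cost lands around $38,500 per workload against $108,000 in the cloud; even if you halve the assumed consolidation ratio, the on-prem figure stays well under the cloud bill.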
There Is Power in Distributed Systems
Distributed systems are far more powerful than monolithic systems. Give me a million gamers renting out their GPUs a few hours per day over buying and running a $40k Nvidia H100 GPU any day. The key is shaping our workloads to be distributed. Centralization can make things simpler, but you will pay a premium for that simplicity.
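To make the rent-versus-buy intuition concrete, here is a rough sketch. Only the $40k H100 price comes from the article; the rental rate, the consumer-GPU-to-H100 equivalence ratio, and the power cost are assumptions for illustration, not quotes from any real marketplace.

```python
# Rent-vs-buy sketch for GPU capacity. The $40k H100 price is from the
# article; every other number here is an assumed, illustrative value.

MONTHS = 36

h100_capex = 40_000            # USD (article's figure)
h100_power_monthly = 150       # USD, assumed power + hosting
owned_total = h100_capex + h100_power_monthly * MONTHS

consumer_rate_hr = 0.20        # USD/hr per consumer GPU, assumed
gpus_per_h100 = 8              # assumed: ~8 consumer GPUs match one H100

def rented_total(hours_per_month: int) -> float:
    """3-year cost of renting H100-equivalent consumer GPU capacity."""
    return consumer_rate_hr * gpus_per_h100 * hours_per_month * MONTHS

print(f"Owned H100, 3 years:              ${owned_total:,.0f}")
print(f"Rented equivalent, 24/7:          ${rented_total(730):,.0f}")
print(f"Rented equivalent, 4 hrs per day: ${rented_total(120):,.0f}")
```

Under these assumptions, owning and renting are roughly a wash at full 24/7 utilization, but renting wins decisively for bursty, part-time demand. That is exactly the argument for shaping workloads to be distributed.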
People will argue, "But it doesn't scale!"
My counterpoint: Most enterprise business applications do not scale up or down dramatically within a 3-year period. If you have workloads that do require massive elasticity, keep them in the cloud—but understand you are paying a premium for that flexibility.
The Infrastructure Warrior's Verdict
I am a big fan of renting infrastructure when you need it. I will gladly rent Google's Vertex AI for a massive, one-off workload rather than buy the hardware.
But the cloud offers countless services, and the bread and butter is still Compute and Storage. When you have critical, static workloads that must run 24/7, maybe it is time to reinvest in highly reliable, small-scale edge data centers.
Even in a highly political "Cloud First" organization, an Infrastructure Warrior must deliver value and security. Leaving on-premises edge computing off the table will dramatically increase your costs and leave you dangerously reliant on a single cloud vendor.
Sometimes, it's okay to fall from the cloud.
What are you seeing in your environments? Are you repatriating any workloads, or are you still pushing 100% to the public cloud? Let's debate in the comments.
NOTE: This article was originally published as a standalone piece. It was republished as part of the IIW newsletter.
Good article and perspective. In past lives, I've seen companies move toward cloud-agnostic strategies where workloads could easily be moved between clouds, whether private or public. This lets the organization optimize for application requirements. Most VM- and container-based workloads are fairly static, as you said, and can be orchestrated with similar tools (Ansible, Puppet, Terraform). This also opens up opportunities for N+X redundancy strategies across clouds for application resiliency. Networking is key to all of this working, as is having a place for "owned compute," though the form factors are smaller and more manageable than they once were. Other options for achieving this are "multi-cloud" products from VMware, Nutanix, or Red Hat, to name a few, since their hypervisors can run in either environment. That makes things easier to manage, but with a slight uptick in cost over managing your own orchestration engine.