Stack Smackdown: Battle of the backends (A Kiro project)
After learning about Kiro at AWS re:Invent, I was excited to give it a try by building a practical sample project. Having built a career in the corporate world by learning from open source projects, I also wanted to contribute something meaningful back to the developer community while I have some time on my hands between jobs. Right off the bat I must say: I'm not sure I'm ever going back to any other IDE unless I have to. I believe Spec Driven Development (SDD), the development paradigm Kiro forces you into, is truly the next big shift in any developer's journey. It doesn't replace tried-and-true patterns such as MVC or Domain-Driven Design (DDD); rather, it complements them as a style of development built for the age of Artificial Intelligence. While other IDEs support a similar style via plugins for Gemini, Amazon Q, etc., to me it feels nicer to work in a fully fledged, AI-first, next-generation IDE dedicated to this style of programming.
How Kiro Stands Out
For one, Kiro has an adorable mascot. It is an animated little ghost figure, reminiscent of some early video games from my childhood.
However, here is what makes Kiro revolutionary:
With the right setup you get a comprehensive end-to-end workflow that really makes a dev's life easier. From the little so far I've worked with Kiro, it's a remarkable and delightful experience that is almost downright addictive. And it's so optimized I was able to build my Stack Smackdown project by using about 1,000 credits (the first 500 were free).
What is Stack Smackdown?
For as long as I can remember in my journey as a software engineer and architect, I've always had a burning question on my mind:
"What is the best language or framework for a backend web service?"
This is the question I try to answer with my Stack Smackdown project.
As many developers know, the answer is different depending on who you ask; it is very subjective and more or less a matter of taste. I've come to realize that although I am personally a huge fan of Kotlin, much of my opinion about what makes Kotlin great rests on the fact that my career simply took me there by happenstance.
It was only much later that I realized much of what I loved about Java and Kotlin can be found in a plethora of other languages. Scala, for example, also has null safety and is interoperable with Java (Scala code can call existing Java libraries). So I imagine other developers' opinions on the best language are a bit like mine, in the sense that they're rooted in the way their journey simply took them there after one or two other languages just didn't "feel" right. So with this AI-assisted SDD project using Kiro, I sought a more concrete and empirical way to answer this question. Most importantly, I wanted an answer that actually matters in this day and age.
Syntax is Irrelevant
Only conciseness, readability, and performance matter anymore.
Looking at today's landscape, everything runs in the cloud, rented from major providers and platforms like Amazon/AWS, Google/GCP, and Microsoft/Azure. I now understand that the most important characteristic of any backend server framework is its overhead, in terms of performance and cost effectiveness. Every language and every framework today provides essentially the same things, only with different syntax. And with the advent of AI, memorizing syntax is becoming less important each day (though you still need to know enough to read the code AI outputs, debug it, and give it some direction). Any web service can now be written in any language, containerized with Docker, and deployed to any cloud environment. So it has dawned on me: what matters most now is the footprint of each language while performing identical work. Container image size impacts what you pay for container registries. Lower RAM and CPU utilization means you can select smaller instances and run identical workloads more cheaply. More verbose code burns through tokens faster in AI-assisted coding.
By teaming up with Kiro, I was able to easily venture off into other languages and frameworks, and begin to understand more deeply which language/framework/runtime gives me more bang for my buck.
The Technical Stack: 11 Frameworks
Stack Smackdown puts 11 languages / web frameworks through a comprehensive performance comparison. This isn't just about picking favorites. It's about understanding the real-world trade-offs that affect your cloud bill.
Native Compilation (Direct to Machine Code):
- Rust (Actix-web)
- Go (Gin)
- Dart (Shelf, AOT-compiled)

JVM Ecosystem (Bytecode → JIT Compilation):
- Java (SpringBoot)
- Kotlin (Ktor)
- Scala (Akka HTTP)

Other Runtimes:
- C# (ASP.NET Core)
- Node.js (Express)
- Python (Django)
- PHP (Laravel)
- Ruby (Sinatra)
Current State: Production-Ready Benchmarking Platform
The platform is fully operational, with comprehensive monitoring infrastructure built on cAdvisor for container-level metrics and Prometheus for collection and querying.
Each service implements an identical MVC architecture. For simplicity, the services currently host a single /health endpoint that only Docker Compose hits for health checks, ensuring a fair baseline comparison across all frameworks while they are mostly idle, returning a simple JSON response to a single client every few seconds. In the future I'd like to add more endpoints for various computational tasks, to compare other things like concurrency, CPU-bound vs. I/O-bound work, and behavior under load.
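To make that concrete, here is a dependency-free sketch of the kind of /health endpoint each service exposes, using only Python's standard library. The real services each use their own framework (Actix-web, Gin, SpringBoot, and so on), and the exact response body here is illustrative:

```python
# Dependency-free sketch of the kind of /health endpoint each service
# exposes. The real services use their own frameworks; the response body
# here is illustrative, not the project's actual payload.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "UP"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

def run(port: int = 8080) -> None:
    """Serve /health until interrupted (blocks the calling thread)."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Every framework in the comparison implements this same trivial contract, which is what makes the idle-footprint numbers comparable.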
The beauty of this setup is its simplicity - you only need Docker Desktop to run the entire platform locally and start comparing performance immediately.
Performance Dimensions: What Really Matters
1. Docker Image Size
Why it matters: Larger images mean higher storage costs in container registries, slower deployment times, and increased network transfer costs.
What I measured: Final container image size after optimization, ranging from ~150MB for native binaries to ~300MB+ for full-featured frameworks.
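If you want to reproduce this locally, the sizes can be pulled straight from the Docker CLI. A small Python sketch (`docker images --format` is a real flag; the image names in the test data are examples):

```python
# Sketch: grab local image sizes from the Docker CLI.
import subprocess

def parse_image_sizes(output: str) -> dict:
    # Each line looks like "repo:tag 150MB"; split on the last space.
    return dict(line.rsplit(" ", 1) for line in output.splitlines() if line)

def image_sizes() -> dict:
    out = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}:{{.Tag}} {{.Size}}"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_image_sizes(out)
```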
2. Memory Usage (RAM)
Why it matters: Memory is often the most expensive cloud resource. Lower RAM usage means you can run more services per instance or choose smaller, cheaper instance types.
What I measured: cAdvisor's container_memory_usage_bytes metric, which represents the total memory usage from the container's cgroup. This includes the application's resident memory (RSS) as well as page cache and kernel memory attributed to the container.
This gives us a comprehensive view of the container's total memory footprint, not just the application's heap usage. It's the same metric that Kubernetes uses for memory-based pod eviction decisions, making it highly relevant for real-world deployment scenarios.
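Under the hood this is an ordinary Prometheus instant query. Here's a sketch of fetching the metric for one container; the Prometheus address and the cAdvisor `name` label selector are assumptions to adjust for your own compose setup:

```python
# Sketch: fetch the current container_memory_usage_bytes for one service
# from Prometheus' HTTP API. The address and the cAdvisor `name` label
# are assumptions -- adjust them to match your own setup.
import json
import urllib.parse
import urllib.request

PROM_URL = "http://localhost:9090"  # assumed local Prometheus address

def memory_query(container: str) -> str:
    # cAdvisor exports one series per container, keyed by its name label.
    return f'container_memory_usage_bytes{{name="{container}"}}'

def instant_query(promql: str, prom_url: str = PROM_URL) -> dict:
    url = prom_url + "/api/v1/query?" + urllib.parse.urlencode({"query": promql})
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# With the stack running (container name is an example):
# result = instant_query(memory_query("rust-actix"))
# bytes_used = float(result["data"]["result"][0]["value"][1])
```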
3. CPU Utilization
Why it matters: CPU efficiency directly impacts your ability to handle concurrent requests and determines how many services you can co-locate on a single instance.
What I measured: cAdvisor's container_cpu_usage_seconds_total metric, processed through a Prometheus rate() function. This represents the cumulative CPU time (user and system) consumed by all processes in the container, converted into a per-second utilization rate.
This metric comes directly from Linux cgroups (for example, the usage_usec field of cpu.stat under cgroup v2) and measures actual CPU seconds consumed by all processes within the container. It's not measuring different concurrency models (threads vs. coroutines vs. event loops) - it simply measures total CPU time used, regardless of how the application achieves that work.
Important note: This is raw CPU utilization at the container level. A single-threaded application maxing out one core will show 100% utilization, while a multi-threaded application using 2.5 cores will show 250% utilization. The metric doesn't distinguish between different programming paradigms - it just measures total CPU consumption.
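Since rate() can look like magic, here is what it computes in miniature: the metric is a cumulative counter of CPU seconds, so utilization is just the slope between two samples.

```python
# What the Prometheus rate() function computes, in miniature: the metric
# is a cumulative counter of CPU seconds, so utilization is the slope
# between two samples.
def cpu_utilization(cpu_seconds_a: float, cpu_seconds_b: float,
                    t_a: float, t_b: float) -> float:
    return (cpu_seconds_b - cpu_seconds_a) / (t_b - t_a)

# 25 CPU-seconds consumed over 10 wall-clock seconds = 2.5 cores busy,
# i.e. the 250% utilization case described above.
assert cpu_utilization(100.0, 125.0, 0.0, 10.0) == 2.5
```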
4. Build Time
Why it matters: Faster builds mean shorter CI/CD pipelines, quicker developer feedback loops, and reduced infrastructure costs for build systems.
What I measured: Complete build time from source to deployable artifact, including dependency resolution and compilation.
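Reproducing this measurement is as simple as wrapping the build command with a wall-clock timer. A Python sketch (the docker build example in the comment is illustrative; each service in the repo has its own build command):

```python
# Sketch: wall-clock timing of a build command. The docker build example
# in the comment below is illustrative, not the project's actual script.
import subprocess
import time

def time_build(cmd: list) -> float:
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

# e.g. time_build(["docker", "build", "-t", "rust-actix", "services/rust"])
```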
5. Cyclomatic Complexity
Why it matters: This measures code maintainability and testing requirements. Lower complexity means easier debugging, fewer bugs, and reduced long-term maintenance costs.
What I measured: Decision points in the codebase (if statements, loops, switch cases) to assess how much framework boilerplate is required for identical functionality.
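The counting itself is mechanical. As an illustration of what gets counted, here is a sketch for Python source only (the real project spans 11 languages, each needing its own parser):

```python
# Sketch of the idea behind this metric: count decision points and add 1
# for the single straight-line path. This illustration parses Python
# source only; each of the 11 languages needs its own parser.
import ast

DECISION_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, DECISION_NODES) for node in ast.walk(tree))
```

Straight-line code scores 1, and every branch, loop, or boolean operator adds a path that tests must cover.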
6. Application Size (Character Count)
Why it matters: Source code size directly impacts development velocity, maintainability, and AI-assisted development costs. Smaller codebases are easier to understand, debug, and modify. In the era of AI coding assistants, character count also correlates with token usage and API costs for code analysis tools.
What I measured: Total character count across all source code files for each service implementation, measured in bytes.
This metric reveals framework overhead and language verbosity. A framework requiring 2,000 characters to implement the same functionality as another framework's 500 characters indicates higher development and maintenance costs. It also helps estimate AI development costs, as most AI coding tools charge based on token usage (roughly 4 characters per token).
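Here's a sketch of how such a measurement can be taken, along with the ~4-characters-per-token estimate mentioned above. The extension list and directory layout are assumptions; point it at your own services:

```python
# Sketch: total source size per service in bytes, plus a rough token
# estimate using the ~4-characters-per-token rule of thumb. The extension
# list and directory layout are assumptions.
from pathlib import Path

SOURCE_EXTS = (".rs", ".go", ".kt", ".java", ".scala", ".cs",
               ".js", ".py", ".php", ".rb", ".dart")

def app_size_bytes(service_dir: str, exts=SOURCE_EXTS) -> int:
    return sum(p.stat().st_size for p in Path(service_dir).rglob("*")
               if p.is_file() and p.suffix in exts)

def estimate_tokens(char_count: int, chars_per_token: int = 4) -> int:
    return char_count // chars_per_token

# e.g. the Rust figure quoted later: 2,178 characters is roughly 544 tokens
```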
This dimension complements cyclomatic complexity by measuring not just code complexity, but the total code volume required to achieve identical functionality across all 11 frameworks.
The Results: Data-Driven Insights
The comprehensive benchmarking reveals clear performance patterns that challenge conventional wisdom about backend framework selection. Here's what the data tells us:
Native Compilation Dominates Efficiency Metrics
Rust (Actix-web) emerges as the efficiency champion across multiple dimensions.
Go (Gin) follows closely with excellent resource efficiency.
Dart (Shelf AOT) proves Google's server-side vision with solid performance.
JVM Ecosystem: The Memory-Hungry Powerhouses
The JVM frameworks show a consistent pattern - excellent tooling and ecosystem, but at a significant resource cost:
Scala (Akka HTTP) represents the extreme end.
Java (SpringBoot) shows enterprise framework overhead.
Kotlin (Ktor) offers a middle ground in the JVM space.
.NET Runtime: Enterprise Efficiency
C# (ASP.NET Core) delivers Microsoft's enterprise-grade performance.
Interpreted Languages: Surprising Efficiency
Contrary to expectations, interpreted languages show competitive resource usage.
Node.js (Express) delivers impressive efficiency.
Python (Django) balances features with efficiency.
PHP (Laravel) shows surprising optimization.
Ruby (Sinatra) demonstrates minimalist framework benefits.
The Development Experience Trade-offs
Code Verbosity vs Performance: There's an inverse relationship between code size and runtime efficiency. Rust requires more characters (2,178) but delivers exceptional performance, while PHP achieves the same functionality with minimal code (1,716 characters) but higher deployment overhead.
Build Time Surprises: Go's 12.2-second build time stands out as unexpectedly slow for a compiled language, while Rust compiles to native code in just 1.04 seconds. This suggests Go's build process includes more comprehensive optimization or dependency resolution.
Complexity Patterns: Node.js achieves the lowest complexity (1) through JavaScript's event-driven model, while enterprise frameworks like SpringBoot and Go require higher complexity (14) for the same functionality.
Cloud Cost Implications
Based on typical cloud pricing models:
Most Cost-Effective: Rust, Dart, and Go offer the best resource efficiency, potentially reducing cloud costs by 60-80% compared to JVM frameworks.
Enterprise Sweet Spot: .NET Core (ASP.NET) provides a balanced approach with 25 MB memory usage and enterprise features, making it cost-effective for Microsoft-centric environments.
Development Velocity Leaders: Node.js and PHP minimize code complexity and build times, reducing development costs even if runtime costs are higher.
The data reveals that framework choice significantly impacts both development velocity and operational costs, with native compilation languages offering the best resource efficiency for high-scale deployments.
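To make the cost argument tangible, here is a back-of-the-envelope sketch. The price constant is hypothetical, not any provider's real quote; the point is that memory cost scales linearly with footprint, so a 4x smaller footprint is a 4x smaller bill.

```python
# Back-of-the-envelope memory cost model. The $/GB-hour figure below is
# hypothetical -- substitute your cloud provider's actual pricing.
HYPOTHETICAL_PRICE_PER_GB_HOUR = 0.005  # assumption, not a real quote

def monthly_memory_cost(ram_mb: float, hours: float = 730.0) -> float:
    """Cost of keeping `ram_mb` of memory provisioned for a month."""
    return (ram_mb / 1024.0) * HYPOTHETICAL_PRICE_PER_GB_HOUR * hours

# A service using 4x the RAM costs 4x as much to keep warm, and the gap
# compounds across every replica and environment you run.
```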
Testing Environment
Note: results are based on local testing using MacBook Pro M4 (12-core) with Docker containers. Performance characteristics may vary across different hardware and cloud environments.
Experience the Data Yourself
Thank you for joining me on this performance journey! I hope these empirical insights help you make more informed technology decisions - whether you're optimizing cloud costs or choosing your next project's stack.
Ready to explore? The complete Stack Smackdown platform is open source on GitHub. You can:
Found value in this analysis? A GitHub star ⭐ helps other developers discover these performance insights.
Built with Kiro: This entire project showcases Spec Driven Development in action - demonstrating how AI-first development tools can accelerate complex technical projects while maintaining rigorous engineering standards.