Microservices and SASE
It's time to catch up on everything that has changed in the world while my head was buried in Zscaler! As I get my hands dirty, I am looking forward to a bunch of Aha! moments, some Hmm? moments... and possibly also some... really?! moments.
So this is my first blog, on Microservices, Docker and Service Chaining. Definitely an Aha!, but also some head-scratchers. (If you are not a tech geek, just read the lines in bold. :)
A bunch of companies seem to be working on a "highly efficient microservices based architecture" to build SASE. Since Docker and I were not the closest of friends (I had not touched code in a while!), I took it upon myself to get educated in the ways of containers and microservices (and serverless... for a future blog).
And so the first Aha!... Docker IS super cool. It is an application developer's dream, and it provides amazing R&D efficiency.
I could compose an application with a fairly secure network overlay connecting its various components within a single day! Combined with AWS Fargate, writing and hosting applications becomes hundreds of times easier and extremely fast. What provided real geek excitement: on my Mac, I could cross-compile a Go program into a Linux binary and run it with just a Makefile (yes... it's what I know :)) without installing ANY toolchains!
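For the curious, that cross-compile trick needs no special tooling at all: the Go toolchain targets other platforms via two environment variables. A minimal sketch (the binary and package names here are mine, not from any real project):

```shell
# Go cross-compiles natively: set GOOS/GOARCH and build. No Linux toolchain
# needs to be installed on the Mac. CGO is disabled so no C cross-compiler
# is required either, yielding a static Linux binary.
GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o app-linux .

# Confirm the artifact type without leaving macOS.
file app-linux   # an ELF 64-bit x86-64 Linux executable
```

Drop those two lines into a Makefile target and the whole build is one `make` away, which is presumably what the Makefile above was doing.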
Having gotten happy with Docker, I figured I should see what it takes to build a SASE inline service. And that's where I had to do some head-scratching.
If I want to construct the typical network stack that Gartner's SASE needs, each TCP or UDP flow has to pass through many and varied services. That requires packets to move from one microservice to another, carrying with them the state produced by the previous service. For example, the first microservice may establish identity (user or service) for a flow, and the next microservice then does DPI in the context of that user or service.
I know I can spawn all these services in an instant with Docker. However, the big question is the cost of message passing over Docker overlay networks. And it looks like that cost is CRAZY: Docker networking appears to need about 4X the CPU cycles of a native Linux TCP stack with sockets (which is itself not the fastest thing one can build). Finding that astonishingly suboptimal, I searched around and found a 2019 USENIX paper that shows the same outcome ("Tackling Parallelization Challenges of Kernel Network Stack for Container Overlay Networks", Jiaxin Lei et al., HotCloud 2019).
iPerf is a common tool used to measure packet and throughput performance on Unix. The researchers showed that iPerf gets only about 1/5th the throughput over a Docker overlay for a single connection. Since Docker enables near-instant horizontal scale, the way to overcome the aggregate throughput (or packets-per-second) problem is to create many parallel microservice instances. And herein lies the fallacy of designing SASE around horizontal scaling alone.
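A rough version of that comparison is easy to reproduce yourself. This is a sketch, not the paper's methodology: the `networkstatic/iperf3` image and `perf-net` name are my choices, and a true overlay network requires swarm mode, so a user-defined bridge is shown as the simpler stand-in:

```shell
# Baseline: iperf3 client and server sharing the host network stack,
# so Docker networking is not in the data path.
docker run -d --name iperf-srv --network host networkstatic/iperf3 -s
docker run --rm --network host networkstatic/iperf3 -c 127.0.0.1
docker rm -f iperf-srv

# Same test container-to-container over a user-defined Docker network.
# (For a real overlay: docker swarm init, then
#  docker network create -d overlay --attachable perf-net)
docker network create perf-net
docker run -d --name iperf-srv --network perf-net networkstatic/iperf3 -s
docker run --rm --network perf-net networkstatic/iperf3 -c iperf-srv
docker rm -f iperf-srv && docker network rm perf-net
```

Comparing the two reported bitrates gives a feel for the per-connection overhead the paper quantifies.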
Throughput and latency are not the same. Latency is how long a single packet takes to traverse the chain of services, and it does not change no matter how many extra instances you spin up. Instances only create more capacity to handle more packets in parallel. But when a user is waiting for a file to download, or a server is waiting for an API call to return, they are waiting on that one transaction. Spinning up instances does not make it faster. The Docker stack will still take 4X the time of plain Linux to pass packets through each microservice.
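A back-of-the-envelope calculation makes the distinction concrete. The per-hop numbers below are made up for illustration, not measurements; only the 4X ratio comes from the discussion above:

```shell
#!/bin/sh
# Illustrative (made-up) per-hop costs, in microseconds.
NATIVE_US=50      # per-hop cost through a plain Linux socket path
OVERLAY_US=200    # ~4X that cost through a container overlay hop
HOPS=5            # number of chained SASE microservices in the flow's path

echo "native chain latency:  $((NATIVE_US * HOPS)) us"    # 250 us
echo "overlay chain latency: $((OVERLAY_US * HOPS)) us"   # 1000 us

# Doubling the instance count doubles packets handled per second,
# but every individual packet still pays the full chain latency.
```

Horizontal scaling changes the denominator (packets per second), never the numerator a single transaction experiences.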
I am surprised by this inefficiency. My conjecture is that the open source community has been working hard to make Docker super efficient for hosting applications, updating them, and instantly scaling them to handle millions of users. That is the big market. Very few are thinking about making containers scale for inline network processing. My experiment did not use SR-IOV, but I feel that too is geared toward a container running a web server receiving packets at line rate. It is not natively designed for chaining multiple services into an inline network service (one that neither consumes nor generates data). If some of you have already figured this out, do send me a note with your views.
So my conclusion so far on using Docker for inline network processing:
Docker enables fast development and instant horizontal scaling. You can throw more CPUs at the problem and push many more packets, but the time it takes each packet to get through the stack is still 4X that of a standard Linux socket. It also means 4X the CPUs compared to an optimized network stack, which would potentially put margin pressure on those services.
So an "efficient" microservices based SASE can definitely scale horizontally to support millions of users using millions of CPUs. But if those users truly use all the services…
... All million of them will still be waiting for that page to load.
So Docker for building applications : 100% AHA!
Docker and Microservices for massive efficiency in engineering R&D : 100% AHA!
Docker and Microservices Chaining for network services for SASE ? HMM!