Overcommitment and Risks on MSA
( This article is transcribed from my book "Designing and Building Solid Microservice Ecosystems", from the "Avoid Overcommitting" recommendations section, p. 482 and onwards. )
There is no doubt that the microservice approach helps us accelerate the digital transformation process by providing a flexible and robust methodology for building software. One characteristic of this approach is that microservices can be highly specialized, leading to very fine-grained implementations in which microservices become extremely specialized and atomic. Even though a certain degree of specialization is important and required ( it is inherent to MSA ), specialization should be analyzed from the very beginning of a project, when we still have time to think about domain partitioning and microservice sizing during brainstorming sessions; failing to do so can lead, in the mid to long term, to an irreversible condition known as "overcommitment" of the digital transformation approach.
There’s a design recommendation that usually goes unnoticed, or at least is not explored as thoroughly as it should be. This recommendation can be summarized, more or less, in the sentence below, and is based on the Microservices Premium ( Martin Fowler, 2015 ) design principle [1]:
Do not choose the microservice approach as your first design option unless you have already gone through a process in which you exhausted all your well-known and available options – including a monolith – and assessed and discarded all the alternatives in terms of cost, complexity, resources, project, pace of change, infrastructure complexity, and other holistic areas.
An MSA approach has its own pros and cons, and incremental complexity is one of the reasons the approach should be thoroughly understood before embarking on it, taking into account not only short-term goals but also those targeted at the mid and long term. As we will see in the paragraphs below, complexity plays a vital role when evaluating the scope of the digital transformation ( the area or areas of the business we want to transform into microservices ) against our committed capacity; poor evaluation and planning of our existing capacity can lead to an overcommitment condition that may put all or part of our initiative at risk.
If we compare MSA with monoliths, monoliths are usually more predictable and can deliver value faster ( quick wins ), with little to no initial effort, simply because of their "siloed" architecture: applications can be deployed on top of platforms or runtimes that are already provided, which drastically reduces set-up time, and there is no need to partition an application into separate services. Services live together, so we do not need to worry about distributed deployment as we would with microservices. Developers can write their own functions and make them automatically available to everyone writing another section of the code, with little effort. The coarser-grained nature of monoliths also keeps the number of deployed components to a minimum, which greatly reduces the costs associated with maintaining and operating the infrastructure. Summing up all of these benefits, the siloed approach associated with monoliths translates into reduced complexity and simpler execution.
Setting up a microservice environment, depending on the size of the initiative, usually involves a high initial effort, especially when it comes to setting up the infrastructure and the cross-cutting services and capabilities. Beyond the challenges associated with the technology, a typical microservice initiative consisting of just a few microservices – and honoring the two-pizza rule for team sizing – will demand two teams of 4 to 8 members each, depending on the functional size of the capabilities being implemented.
All of these aspects can make a microservice approach expensive at the beginning of the project, while the results obtained in the short term may not be so promising, especially because much of the effort is devoted to laying down the "building blocks" on top of which the target microservice ecosystem will be built.
We can define efficiency, in operational terms, as the quotient between the results obtained and the complexity of the implementation, where complexity represents the cardinality associated with the number of resources and the effort involved:
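Since the original formula appears as part of a diagram that is not reproduced in this transcription, a minimal formulation of the definition above ( the notation is mine ) would be:

```latex
\text{Efficiency} = \frac{\text{Results obtained}}{\text{Complexity}},
\qquad
\text{Complexity} \approx f(\text{number of resources}, \text{effort})
```

The same result delivered with more resources or more effort yields a lower efficiency value; this is the quantity tracked across the phases described next.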
In a typical MSA, efficiency is initially low, because much of the effort goes into setting up the "building blocks" and the core services of the so-called "outer architecture", on top of which microservices – the "inner architecture" – are built and deployed later; no real business capability gets implemented in this phase. ( Identified as "1" on the diagram above. )
As the project moves forward, business capabilities start being developed and deployed into the system, and once development teams gain traction on the initiative by deploying new business microservices, efficiency starts showing signs of ramping up. ( Identified as "2" on the diagram above. )
As complexity increases, the distributed nature of microservices allows the solution to adapt to its growing size, by increasing the number of microservices – and the number of teams involved – and by leveraging the flexibility of the infrastructure and its components. Efficiency increases along with complexity up to a point ( identified as "3" on the diagram above ) where efficiency starts to drop, no matter how well we have implemented our solution; this point is known as the "maximum commitment threshold" or "hard limit" for microservices. There are multiple reasons why such a point may be reached, acting as "backpressure" points; performance limitations that are intrinsic to the design of individual components are a typical example.
This kind of performance limitation cannot be solved simply by scaling the number of resources, because the limitation is intrinsic to the design of the software component; the problem is organic, not related to resource availability. In the best-case scenario, such situations can be addressed by deep tuning of the affected components, but in the worst-case scenario nothing can be done in the short term, blocking the further introduction of new features from the business perspective.
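As a side illustration of why adding resources cannot overcome a design-intrinsic limit ( this analogy is mine, not part of the book's text ), Amdahl's law bounds the achievable speedup when a fraction 1 − p of the work is serialized by the component's design, no matter how many instances N we deploy:

```latex
S(N) = \frac{1}{(1 - p) + \frac{p}{N}}
\quad\Longrightarrow\quad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}
```

The serial fraction imposed by the design caps the benefit of scaling, which is precisely the "organic" nature of the limitation described above.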
Once the overcommitment limit is reached, productivity and efficiency continue to decrease until a certain point where they stabilize again ( identified as "4" on the diagram above ); if complexity keeps increasing past this point, efficiency tends towards zero, as performance degrades to the point where system usability is compromised by latency issues and lack of responsiveness. It is important to note that, with a monolithic approach, efficiency drops to zero rapidly once the overcommitment limit is passed, since complexity can no longer be handled in a timely manner due to the monolith's lack of elasticity; that is where an MSA approach can still make a difference, because microservices can still be handled independently.
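Since the diagram referenced above is not reproduced in this transcription, the following Python sketch is a toy model of the four phases just described. Every curve shape, threshold, and constant is an assumption chosen purely for illustration, not a measurement from any real system:

```python
# Toy model of the efficiency-vs-complexity curves described above.
# All thresholds and constants are illustrative assumptions whose only
# purpose is to reproduce phases 1-4 and the monolith comparison.

def msa_efficiency(complexity: float, hard_limit: float = 60.0) -> float:
    """Efficiency of a microservice ecosystem as complexity grows."""
    if complexity <= hard_limit:
        # Phases 1-2: slow start while the "outer architecture" is built,
        # then a ramp-up as business microservices get delivered.
        return (complexity / hard_limit) ** 2
    if complexity <= hard_limit + 10:
        # Phase 3: past the "maximum commitment threshold" efficiency drops.
        return 1.0 - 0.04 * (complexity - hard_limit)
    if complexity <= hard_limit + 30:
        # Phase 4: efficiency stabilizes at a lower plateau.
        return 0.6
    # Beyond the plateau, growing complexity pushes efficiency towards zero.
    return max(0.0, 0.6 - 0.01 * (complexity - hard_limit - 30))

def monolith_efficiency(complexity: float, hard_limit: float = 30.0) -> float:
    """Monoliths deliver quick wins early but collapse fast past their limit."""
    if complexity <= hard_limit:
        # Quick wins with little to no initial effort.
        return 0.5 + 0.5 * (complexity / hard_limit)
    # No elasticity: efficiency tends rapidly to zero past the limit.
    return max(0.0, 1.0 - 0.08 * (complexity - hard_limit))

if __name__ == "__main__":
    print(f"{'complexity':>10} {'monolith':>9} {'MSA':>6}")
    for c in range(0, 101, 10):
        print(f"{c:>10} {monolith_efficiency(c):>9.2f} {msa_efficiency(c):>6.2f}")
```

Running the sketch prints a small table in which the monolith wins early on ( the quick wins ) while MSA sustains efficiency much deeper into the complexity range before its own hard limit and plateau kick in.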
We should plan our microservice strategy accordingly, making sure all our business requirements and capabilities are in place before we reach that overcommitment point; otherwise, our project may be at risk. Sometimes it is better to reduce our scope than to promise something that exceeds our capacity, and that needs to be considered even before the first microservice gets written and deployed.
That said, an exercise should be conducted by all the involved parties and stakeholders to determine where exactly the overcommitment point resides. Once this has been agreed upon, we can plan accordingly: what our scope will be, where we want to be in the short, mid, and long term, and what the roadmap is to reach those objectives.
Of course, such an exercise is not exactly simple, and it should be part of an initial feasibility study that, once completed, is made visible to everyone for consistency and a unified vision. Sometimes this kind of feasibility study is delegated to a task force of delivery leads, architects, and business specialists, in the form of an advisory or discovery process, so that the convergence of their different visions leads to a more concise and exact plan and roadmap for the upcoming months or years. The lack of such a plan and roadmap will unfailingly expose the project to overcommitment risk.