Think like a Python, not a Tiger
Strangle Your Enemy?
Before we think about software, let’s take a moment to think about how lethal adversaries work in nature. In the animal kingdom, few predators use strangulation to kill their prey. That’s because in mammals, oxygen is carried in the blood, and when air supply is cut off, the prey can remain alive for perhaps 7-10 minutes using the oxygen that’s still in their blood. If at any point during that time they manage to gasp for air, the victim’s life may be saved.
Predators generally prefer to use a faster, messier method of attack that results in the loss of blood from their prey, which turns out to be a more effective approach, because if key arteries are slashed, a subject can bleed out in under two minutes. Perhaps that’s why so many top predators evolved to have sharp claws and teeth.
Yes, there are a few exceptions. Constrictor snakes such as the Boa and the Python use the slower technique. Some predators do clamp their jaws on a victim's neck to subdue them under the cover of darkness. For example, goats are killed this way by their predators. Still, large predators such as Lions, Tigers, Bears, Sharks, and Orcas, along with numerous smaller predators, all rely on teeth and claws for hunting and self-defense.
Modernizing software is a far cry from hunting, yet perhaps our natural tendencies draw from our experience as a predatory species. Might we rather spill blood than strangle a life-threatening adversary? I think so. I’m strongly attracted to the idea of eliminating problematic software systems by starting over fresh. I think it’s way easier to write new software than to first figure out what someone else was thinking when they wrote something long ago. I view this "tiger" thought as similar in nature to our hunting preferences as a predatory species. However, experience has taught me differently. Although I would rather just declare “tech bankruptcy” on an old system and replace it with a shiny new one, it turns out that there’s a better way. It’s known by software architects as the Strangler Pattern.
As a reality check with my social media followers, I ran a survey asking what approach has worked for them for replacing an old but critically important system. According to the survey, 50% of respondents prefer the Strangler Pattern, as it has worked well for them. I was a bit surprised by this. I expected the preference to be more pronounced. A whopping 27% of respondents prefer starting over fresh. Perhaps those are the ones who still think the way I do naturally. They think like tigers.
The Strangler Pattern is a way of modernizing a software system by gradually replacing it part by part until the original system is completely gone. Using it involves placing a facade in front of the old system, between it and its clients, that allows you to peel away parts of the old system incrementally. For example, if a system implements an API, you might replace it noun by noun, and/or verb by verb. The old system is replaced one component at a time. If the system does not have an API, you first adjust the client to access the existing system through an API, introduce the API facade, and then begin replacing aspects of the API incrementally. Sometimes writing a new client is necessary in order to introduce that API.
Typically the first things that are replaced in a Strangler Pattern approach are user authorization and preference settings. These don't require much knowledge about the system's features, and replacing them first gives teams time to fully research the feature list and find the right chunks for subsequent replacement increments.
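To make the facade idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the class name `StranglerFacade`, the resource names, and the stand-in handler functions are all hypothetical, and a real facade would usually be an HTTP proxy or API gateway rather than an in-process object. The point it shows is that clients only ever talk to the facade, and each "noun" is switched from the old system to its replacement one at a time.

```python
from typing import Callable, Dict, Iterable

# A handler takes a request string and returns a response string.
# Both the type and the handlers below are hypothetical stand-ins.
Handler = Callable[[str], str]


def legacy_system(request: str) -> str:
    """Stand-in for the old system; in practice the facade would proxy to it."""
    return f"legacy handled {request}"


def new_auth_service(request: str) -> str:
    """Stand-in for the first replacement component."""
    return f"new auth service handled {request}"


class StranglerFacade:
    """Sends each request to a replacement if one is registered, else to legacy."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._replacements: Dict[str, Handler] = {}

    def replace(self, resource: str, handler: Handler) -> None:
        """Peel one resource (one 'noun') away from the old system."""
        self._replacements[resource] = handler

    def handle(self, resource: str, request: str) -> str:
        """Route a request; anything not yet replaced still goes to legacy."""
        handler = self._replacements.get(resource, self._legacy)
        return handler(request)

    def fully_replaced(self, resources: Iterable[str]) -> bool:
        """True once every listed resource has a replacement registered."""
        return all(r in self._replacements for r in resources)


if __name__ == "__main__":
    facade = StranglerFacade(legacy_system)
    facade.replace("auth", new_auth_service)  # first increment
    print(facade.handle("auth", "login"))     # served by the replacement
    print(facade.handle("orders", "list"))    # still served by the old system
    print(facade.fully_replaced(["auth", "orders"]))
```

Once `fully_replaced` reports true for every resource the old system served, nothing is routed to it anymore and it can be switched off.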
Why is this pattern so popular, and why am I recommending it? The answer is really all about risk. Any time you make changes to a system, you assume a risk that your changes may not work as intended. If you introduce change in big chunks (like the start-over-fresh approach), then you also assume risk in a big chunk. By comparison, if you introduce change in small increments, you only assume exposure to risk in small increments. You’re still exposed to the same amount of total risk over the life of the project, but the potential impact of each released change is diminished, so your risk exposure at any one time is dramatically reduced. So, if you’re responsible for the reliability of the critical system you are replacing, gradual replacement is definitely the way to go.
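A little arithmetic may help here. The sketch below uses made-up numbers and two simplifying assumptions of mine: every release has the same chance of failing, and the damage a failed release can do is proportional to how much of the system it changes. Under those assumptions the expected loss over the whole project comes out the same either way, but the most that any single release can cost is ten times smaller with increments.

```python
from fractions import Fraction
from typing import List

# Illustrative assumption: each release fails with probability 1 in 10.
P_FAIL = Fraction(1, 10)

# Illustrative assumption: impact is measured in arbitrary "units of damage".
big_bang: List[int] = [100]          # one release replacing everything
incremental: List[int] = [10] * 10   # ten releases, each a tenth of the system


def expected_loss(p_fail: Fraction, impacts: List[int]) -> Fraction:
    """Sum of (failure probability x impact) over every release."""
    return sum((p_fail * impact for impact in impacts), Fraction(0))


def worst_single_release(impacts: List[int]) -> int:
    """Largest damage any one failed release can cause."""
    return max(impacts)


if __name__ == "__main__":
    for name, plan in (("big bang", big_bang), ("incremental", incremental)):
        print(name,
              "expected loss:", expected_loss(P_FAIL, plan),
              "worst single release:", worst_single_release(plan))
```

Real projects will not be this tidy, since smaller changes are often also less likely to fail and easier to roll back, which would tilt the comparison further toward increments.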
When you use Strangler Pattern, you don’t need to fully understand the internal workings of the entire legacy system. Admittedly you still do need to know all the features of the software so you can reproduce them as gradual replacements. What you don’t necessarily need is a person who knows, or documentation that explains, the implementation of the features. Perhaps you don’t have the source code for the system anymore, or you don’t employ anyone with the necessary skills to adapt it. Maybe you have the source code, but you don’t have a way to build working binaries for it anymore, or it only runs on equipment that’s past end-of-support and won’t meet your regulatory constraints.
Maybe your views align with the 19% of respondents who prefer to refactor the old system in place. Sometimes this approach makes sense. Maybe you still have all the source code, and still employ people who understand it, and the software is written in a language you can support over the long term, and you have a secure environment to run it in. Maybe the system is not terribly complex, and refactoring can be done relatively quickly or easily. These are all good reasons to refactor in place. It’s possible to conduct your refactoring efforts incrementally and gradually. Refactored systems normally stay in the same language they were originally written in, and remain supported by the same people who wrote them to begin with. I argue that even in these cases, Strangler Pattern still offers additional benefits and should probably still be preferred.
Why? Technically speaking, Strangler Pattern is a form of refactoring. It takes an API-centric approach to the refactoring. Using this approach results in an API for your service, which is a resource that will make it easier to integrate with other systems, which will help you to be more creative and more agile. Those who add APIs to their services find it much easier to collaborate with partners, as that API may be a clean and convenient point of integration between complementary services.
My favorite reason why you should prefer the Strangler Pattern is that it lends itself to adoption of a microservice system architecture. Each incremental improvement may be introduced by a new microservice. The benefits of this system architecture are widely recognized. They include efficient scaling, improved resiliency and reliability, simplified maintenance, and added business agility due to simplification of software releases. Monolithic legacy systems are typically hard to upgrade without taking them offline, whereas microservice architecture systems are typically upgraded without interrupting service to clients.
Next time you are faced with modernizing a critically important software service, try to think more like a Python, and less like a Tiger. Consider using the Strangler Pattern to gradually replace that system over time rather than replacing it wholesale or refactoring it in place.
I have applied both strategies. It all boils down to the risk and the customers' appetite for it.
If your only option to modernize a system is manually rewriting it, then this type of divide and conquer strategy certainly reduces risk. However, as Ashwin Krishnan mentions, this strategy will extend the life of the original system and double the maintenance cost from a platform perspective. If you have better tools to execute the modernization with a high level of automation, you should go for the Tiger approach. You are starting with a running system, i.e. the "perfect specification"! If your initial goal is to change the technology stack and maintain the functionality of the system, then an automatic migration that generates native code on the target platform is the best solution. It will allow you to turn off the original technology stack, move your dev team to the newer technology and gain all of the advantages of being on a modern platform. Then you can continue the normal evolution and eventual refactoring of your system. The approach I am suggesting works very well when you are running a mission-critical system on unsupported, very expensive or insecure platforms. On the other hand, if you have the luxury of time then the Python approach is certainly an option.
Still bear the scars from Python2 to Python3 migration. Would rather think like a Tiger any time at this point :)
I once had a Research colleague tell me that even though version 1 of the software was "crap" (highly buggy), they could completely re-write version 2 from scratch and it would be "far better". While I suspect it would have been better, I'm glad the decision to preserve pieces and only re-write a subset was finally accepted. The Strangler approach was successful. I can't say that if we'd spilled blood it would have failed, but it would have been messier. Great analogy and lesson. Thanks for sharing the insight, Adrian!
Would be way easier with https://code.store give it a try