Overcoming the Dangers of Unit Testing Legacy Code
One of the least glamorous aspects of software engineering is the maintenance of legacy code. It involves reading some else's work, it's usually only bug fixes and in the end it's just not sexy working with technologies which are <gasp> three years old.
We've all gone in to maintain that old application written by "that guy", who left to join an Eco-startup in Brazil. It was a rushed job, he knew he was leaving and his heart wasn't into it. So the code is now a mess of half-baked requirements, hacks labeled "// don't touch this line, or it don't work", singletons and static calls for everything and for some reason, no Unit Testing at all.
A well tested project is like being a Jedi knight. Testing gives you a sixth sense, the ability to feel a tremor in the Force when you made a change and you've broken something else in your application. Broken production leads to anger and we all know where that road goes to.
Rarely do we have the time to stop everything and apply testing to an existing application. Together with my team at Behalf.com I've devised a few rules of thumb for engaging untested code or creating tests for code that we introduce into legacy applications. My team and I work on PHP and Javascript projects, so YMMV for other platforms.
The 1st rule of legacy testing: Rule of Accumulated Baseline. Our approach forgoes absolute coverage from the get-go. Don't test code you haven't changed or added to. Instead focus on the code that will actually change. We assume the existing code to behave in a "correct" or at least "acceptable" manner - one which we do not intend to prove as a baseline. Instead, as we slowly add tests, we will establish a new baseline of our own for the future.
The 2nd rule of legacy testing: Rule of Isolation. When adding new classes avoid the urge to conform to the existing shortcuts of static function calls and other anti-patterns. Instead, be exceptionally rigid in isolating your new logic class with dependency injections. Strive for maintainability and testability in your new logic. Take this opportunity to "do it right".
The 3rd rule of legacy testing: Rule of Static Insulation. If you find yourself unable to avoid using untestable code (a static call for example) wrap the this code in a Callable (Lambda or Anonymous function, functor, callback ...). Then provide for your callable by Lazy-Loading it into a class member and using it from there. This way, you can provide a way to preset the member with something else - like a testing mock, etc.
The 4th rule of legacy testing: Rule of Factory Override. Classes that have their initialisation built-in are generally unwieldy in testing. The "Built-in Factory" anti-pattern requires much environment setup and leaves no room for mocking. When modifying such a class we found it is ok to override them in a test-double. Create a test-version of said class by extending it and overriding the constructor or the initiator method. Then proceed to test normally.
Wait, what?!
Rule #4 sounds like a horrible aberration. No self-respecting unit test can allow itself to modify code it tests. Remember our disclaimer though: we are not testing the existing code, only changes to it. We must assume the initialisation code has been tested on production long enough so that any issue was already solved or mitigated. Unless you are now willing to take this class and its dependencies apart, in many cases there is no easy way to introduce proper testing so a compromise must be made.
The 5th rule of legacy testing: Rule of the Delegator. When adding new code or breaking up complex legacy code, encapsulate your code and changes into a Delegate. A delegator class is a class which assigns an intimate task to another, Helper class. This is a very fancy way of saying that all of the code you want tested is moved into another class. The Delegator (the original legacy) class internally instantiates, initialises and then calls the Helper class methods.
In rule #5 avoid the urge to just dump the whole legacy functionality into a Helper class. All that old extra code is not going to help your case. Pick your battles, choose those pieces of logic that you are actually changing and will need testing. Remember the Rule of Accumulated Baseline: test only code you actually changed.
Legacy code takes creativity and some bendy thinking to prosper. It takes patience and an ability to accept the code for what it is - an application that is valuable in its current form. Recall that someone just like you wrote this code, sinking his hopes and aspirations into the logic so that it will work well. Like all of us, he is only human.
The code you write right now will soon be Legacy code too. Be sure to leave behind something that you are proud of.
Great post!
Great Post, I recently added to TFS the comparison of code coverage between builds and making valid builds if percentage increased or remained the same. not allowing code coverage to deteriorate, I will share your Post with our developers.