Black Box Testing and Unit Testing
Ask developers, and most will tell you that unit tests are important for maintaining product quality, and how great TDD is when you use them. They'll tell you how quickly they find out when they've broken something in the code.
At the other end we have black box tests, which are essentially end-to-end tests for the public components, in both the back end and the front end, but not including the UI components (in the front end, think of view model tests, not buttons).
In this article, whenever I mention "E2E tests", I mean black box tests that don't involve the UI controls, and those E2E tests should be separate for FE and BE. Testing the UI controls and layout is outside the scope of this article, though it matters in its own right.
I, for one, find unit tests less useful, and think that end-to-end tests are the way to go.
Now before you come after me with your torches and pitchforks, let me clarify. I know it's a provocative sentence.
Now that I've got your attention, let's go deeper and understand what I'm trying to say.
What is the "unit" in unit tests that we're testing? That's a pretty vague term. It's common to treat a class as that unit. Some will tell you that the unit tests are for the lowest classes in the hierarchy, and tests for higher classes are integration tests. Some will tell you that integration tests are completely different things. Ask 5 people and you'll get 7 answers.
Essentially, we can choose anything to be that unit and treat it as a black box: something closed that we can inspect only from its externals, not its internals.
So let's suppose that any class in the hierarchy can be considered our unit. In that case, I'm totally pro unit tests, but only for the highest classes in the hierarchy!
When I say "highest classes", I mean classes that we treat as public, like our services, and components that we ship and distribute to others. These components can be a zip library that we sell, an image converter that we develop and share with other teams to use, and so on. Let's call them Public Boxes.
The classes, or "units", that are exposed externally to the world should absolutely have all of their methods tested, but I think testing is less valuable for the internal classes that compose them. Having unit tests on those can be problematic.
Here are several of my arguments:
The behavior is what matters
Basically, as long as you can ensure that the public methods of your public boxes do what they're supposed to do and behave as expected, your system (which is composed of the "public box" services) behaves as expected, and you're good to go.
This approach is called Behavior-Driven Development, and it puts the focus on the system continuing to behave properly, not on particular methods of some internal classes.
The book "Growing Object-Oriented Software, Guided by Tests" talks about "internal quality" vs. "external quality". External quality is what naturally comes to mind: expected behavior and stability. Internal quality is anything related to the code being decoupled, easy to maintain, having few dependencies and so on.
External quality can be achieved through E2E tests, because that's what tests the behavior of the system and ensures its stability (remember that E2E tests here mean Black-box tests: testing the APIs of the services, not the UI).
Internal quality can be achieved through unit tests, because having unit tests pushes the code to be decoupled, have fewer dependencies, and other goodies.
The book has a nice graph to depict this:
Both internal and external qualities are extremely important! But, I believe that internal quality can be achieved by other means, like a good design (+review) and best practices.
So why not use both unit and E2E tests?
First of all, our resources are limited. Every test we create of one kind means we don't create a test of the other kind.
I see 2 main problems with unit tests:
- They take a lot of maintenance. I believe that the internal structure of the system should be fluid and change together with the system. A healthy architecture should be flexible, letting its internal classes be merged, split, reformed, created and deleted without any problems. This means that the unit tests will take a lot of maintenance effort to keep up with the rapidly changing classes they test. And it brings me back to a previous point in this article: if the system is still behaving as expected, that's what we need. Why should a refactoring operation make us rewrite any tests?
- The second problem is the flip side of the first. If refactoring the code and keeping a fluid and flexible architecture means a lot of test maintenance, what will probably happen eventually is that developers become reluctant to change existing internal classes, which prevents healthy growth of the architecture. They'll feel sorry to lose tests for existing methods, even if we don't need the methods anymore.
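To make the refactoring argument concrete, here is a minimal Python sketch. Every name in it (`WordCounter`, `_Normalizer`, `_Tokenizer`) is made up for illustration and does not come from any real project. The single test talks only to the public box, so the two internal classes could be merged, renamed or deleted without touching the test.

```python
from typing import Dict, List


class _Normalizer:
    """Internal detail (hypothetical): free to be merged, split or deleted."""

    def normalize(self, text: str) -> str:
        return text.lower()


class _Tokenizer:
    """Internal detail (hypothetical): no test refers to it directly."""

    def tokenize(self, text: str) -> List[str]:
        return text.split()


class WordCounter:
    """Hypothetical public box: the only class the tests know about."""

    def __init__(self) -> None:
        self._normalizer = _Normalizer()
        self._tokenizer = _Tokenizer()

    def count(self, text: str) -> Dict[str, int]:
        counts: Dict[str, int] = {}
        words = self._tokenizer.tokenize(self._normalizer.normalize(text))
        for word in words:
            counts[word] = counts.get(word, 0) + 1
        return counts


def test_counts_words_case_insensitively() -> None:
    # Only the public method is exercised; the internals stay invisible.
    assert WordCounter().count("Tests test TESTS") == {"tests": 2, "test": 1}
```

If `_Normalizer` and `_Tokenizer` were later folded into one class, this test would keep passing unchanged, whereas unit tests written against each internal class would have to be rewritten.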
By the way, in many cases the behavior tests will be similar to unit tests (read the 7th paragraph on how the unit we test can be the public service). For example, if I have an API that receives a string and a language name, and translates the string to the specified language, I may have a behavioral test that sends "Hello" and "Hebrew" and makes sure that "שלום" is returned, or sends "Hello" and "abc" and makes sure an unidentified-language error occurs. In these cases these are the behaviors we expect, similar to unit tests. The main thing is that we test only the public components.
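The translation example could look roughly like this in Python. This is only a sketch: `translate` and `UnknownLanguageError` are hypothetical names, and a small lookup table stands in for whatever the real service does, just to keep the example runnable.

```python
import unittest


class UnknownLanguageError(ValueError):
    """Hypothetical error for an unidentified target language."""


# Stand-in data for the sketch; a real service would do actual translation.
_PHRASES = {("Hello", "Hebrew"): "שלום"}
_LANGUAGES = {"Hebrew"}


def translate(text: str, language: str) -> str:
    """Hypothetical public API: translate text into the named language."""
    if language not in _LANGUAGES:
        raise UnknownLanguageError(language)
    return _PHRASES[(text, language)]


class TranslateBehaviorTest(unittest.TestCase):
    def test_hello_is_translated_to_hebrew(self) -> None:
        self.assertEqual(translate("Hello", "Hebrew"), "שלום")

    def test_unknown_language_is_rejected(self) -> None:
        with self.assertRaises(UnknownLanguageError):
            translate("Hello", "abc")
```

Saved as a module, the tests can be run with `python -m unittest`. Both tests go through the public function only and say nothing about how the translation is implemented.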
Finally, with 97% of our tests being E2E tests, we ran a code coverage tool and the coverage was almost 70%, which is a lot! So if you define your E2E tests properly, your coverage should be fine.
There are, however, some caveats worth mentioning.
Behavior tests usually require experience to write properly. Also, they take much more time to run, because you basically load your system. There are ways to work around this, like running only the affected tests (though how can you be sure which tests weren't affected by the changes?) and using mocks to prevent IO operations. E2E tests are not perfect, but I believe that their value is higher than that of unit tests for internal components.
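One way to keep IO out of a behavior test is to hand the public box an in-memory stand-in for its storage. The sketch below assumes the service receives its storage through the constructor. `Storage`, `InMemoryStorage` and `GreetingService` are hypothetical names chosen for illustration.

```python
from typing import Dict, Protocol


class Storage(Protocol):
    """Hypothetical interface the service uses for its IO."""

    def read(self, key: str) -> str: ...


class InMemoryStorage:
    """Test double that replaces disk or network access with a dict."""

    def __init__(self, items: Dict[str, str]) -> None:
        self._items = items

    def read(self, key: str) -> str:
        return self._items[key]


class GreetingService:
    """Hypothetical public box that normally reads templates from real storage."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def greet(self, name: str) -> str:
        template = self._storage.read("greeting_template")
        return template.format(name=name)


def test_greeting_uses_stored_template() -> None:
    storage = InMemoryStorage({"greeting_template": "Hello, {name}!"})
    service = GreetingService(storage)
    assert service.greet("Dana") == "Hello, Dana!"
```

The test still exercises the service through its public method, but no file or network call is made, so it runs about as fast as a unit test.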
Basically, you are saying that instead of the classic test pyramid you use the ice-cream cone (https://alisterbscott.com/kb/testing-pyramids/#Software_Testing_Ice-Cream_Cone) :-)
As a side note, when you use BDD you aren't necessarily writing E2E tests. Just as you can test "GIVEN the File menu WHEN the Open item is clicked THEN the Select file dialog box is launched", or "GIVEN an opened Select file dialog WHEN a file is chosen THEN the file is displayed on screen", which are actions done by the customer, you can also use BDD to code the internal behaviour, considering the different levels of your API. In other words, you can totally have "GIVEN a loaded file WHEN the header is invalid THEN an exception is thrown" or "GIVEN an old version of the file WHEN loaded THEN the file version is updated", which aren't strictly behaviours asked for by the customer or E2E tests.
Personally, when I build libraries used by other developers within the company, it's my duty to make sure I fulfill their requirements but also to have a defined behavior for everything else. If they ask me for a method that divides two numbers, I must also define what happens when they send a null divisor, even though their requirement was dividing two non-zero values.
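The division example can be written as GIVEN/WHEN/THEN tests at the library level. The Python sketch below uses a hypothetical `divide` function. Raising an exception for a missing or zero divisor is an assumption made for illustration, and returning a default value would be an equally valid defined behavior.

```python
from typing import Optional


def divide(dividend: float, divisor: Optional[float]) -> float:
    """Hypothetical library function with defined behavior for bad input."""
    if divisor is None:
        # Assumed policy: a missing divisor is a caller error.
        raise ValueError("divisor must not be None")
    if divisor == 0:
        # Assumed policy: a zero divisor is reported explicitly.
        raise ZeroDivisionError("divisor must not be zero")
    return dividend / divisor


def test_given_two_non_zero_values_when_divided_then_quotient_is_returned() -> None:
    assert divide(10, 4) == 2.5


def test_given_a_null_divisor_when_divided_then_value_error_is_raised() -> None:
    try:
        divide(10, None)
    except ValueError:
        return
    raise AssertionError("expected ValueError")


def test_given_a_zero_divisor_when_divided_then_zero_division_is_raised() -> None:
    try:
        divide(10, 0)
    except ZeroDivisionError:
        return
    raise AssertionError("expected ZeroDivisionError")
```

Only the first test covers what the callers asked for. The other two pin down the behavior for everything else.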