Testing Tricks When The Testers Are All Gone: How About Simple Oracles?
Testing skills to the rescue today.
Another developer on the team wanted to use a publicly available .NET object that is supposed to be thread safe, but in a way that is frowned upon. It looked like it SHOULD have worked because of the object involved, even though the code was violating principles of thread safety. What he wasn't sure of was how to know whether it was safe or not.
So I coached him on testing using simple oracles and showed him how to increase the probability of failure with a very easy validation mechanism. This wasn't exactly Testing 101. This was more like Testing 285: the problem under test was more advanced, but the technique, once understood, was easy. Five minutes later he had proof that what he wanted to do was going to break, and he decided to go with the less performant, but necessary, implementation.
Simple Oracles:
I got this idea from Harry Robinson, a long-time tester at Microsoft (currently at https://www.garudax.id/in/harry-robinson-a705391/ ). The problem is that some coding problems are difficult to implement, and sometimes the relationship of inputs to results is very complex and large. The test code is then burdened with having to create a set of tests sufficient to find bugs in this code while avoiding the same sort of complex tangles and messes as the code itself. The trick is to find cases where verification is easier than implementation.
Harry's simple oracle example: Square Root
Implementing square root isn't the most straightforward problem. Philosophers and mathematicians spent hundreds of years figuring out the fastest and most accurate ways to derive square roots. The input domain is infinite, the result set is infinite, and the precision requirements are subjective and complicated (round it to how many decimal places?). The temptation is to choose one of two poor testing approaches:
- implement a parallel solution in test and compare actual solution results to test solution results
- create a catalog of known inputs and outputs
The first solution runs the risk of the test code imitating the bugs of the product code. The second solution doesn't scale well over the size of the problem set.
But there is a much simpler way to test. Take advantage of the mathematical principle that "SquareRoot(X) * SquareRoot(X) = X". Write test code that iterates over any set of values for X (random, minimum to maximum at highest precision, whatever) and confirms that for every input X, "SquareRoot(X) * SquareRoot(X) == X" (within a precision tolerance). There are exceptions (negative numbers, etc.), but those exceptions can easily be dealt with once the simple test code is written. This allows for coverage of a massive input domain very quickly with simple test code.
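A minimal sketch of this oracle in Python (the post gives no code; `math.sqrt` stands in for the implementation under test, and the trial count and tolerance are arbitrary choices):

```python
import math
import random

def sqrt_oracle_test(trials=10_000, rel_tol=1e-9):
    """Oracle check: for any x >= 0, sqrt(x) * sqrt(x) should equal x
    (within floating-point tolerance). No reference table of known
    answers and no parallel implementation needed."""
    for _ in range(trials):
        x = random.uniform(0.0, 1e12)   # any sampling scheme works here
        root = math.sqrt(x)             # the implementation under test
        assert math.isclose(root * root, x, rel_tol=rel_tol), \
            f"oracle violated: sqrt({x})**2 gave {root * root}"

sqrt_oracle_test()
```

Note that the test never computes a square root itself; it only checks that the result, multiplied by itself, round-trips back to the input.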
The trick in this case is to take advantage of conditions where, if you know the result first, you can easily construct the steps that ought to get to it. Rather than taking the preliminary steps and anticipating the result - which is much more difficult - go the other way and build the steps from the result, which is often easy.
Back to my story - The Developer's Problem:
He was trying to maintain a list of items that had a key (some string, like "smith" or "jones", something used to look things up) and a list of values (such as "A,B,C" or "B,E,H,I"). The code would look up the current value by its key and add something to the list. The concern was what would happen if two different pieces of code were to add different values to the item at the same key at the same time. For example, if "jones" was currently set to ("A") and two different pieces of code were adding "B" and "C" to the list and happened to do so at the same time, would that generate the expected result (which should be either "A,B,C" or "A,C,B" - order does not matter as long as all are there), or would it drop one of the values (yielding either "A,B" or "A,C")?
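The hazardous pattern looks roughly like this (a hypothetical Python sketch for illustration; the original code was .NET, and the names here are invented):

```python
# Hypothetical sketch of the read-modify-write pattern in question.
# Even if `table` itself is "thread safe", this sequence is not atomic:
# two threads can both read the same old list, each append its own value,
# and then the last writer wins - the other value is silently dropped.
def add_value(table, key, value):
    current = table.get(key, [])    # thread A and thread B both read ["A"]
    table[key] = current + [value]  # one writes ["A","B"], the other ["A","C"]
```

With a single caller this behaves perfectly, which is exactly why the bug is so easy to miss.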
My suggestion to test the code:
- generate expected result sets of random values of length N (e.g. ("A","B","C","H","W"...))
- create a validation routine that compares one set against another to ensure all items in set1 are also in set2
- create something that, given a set, would call the method to add with as much concurrency as possible AND with a constrained set of keys (forcing probability of collision)
- pass the resulting set from the prior step along with the input set to the validation routine
- do this over and over with relatively large values of N until you find a case where the comparison says the input and result sets are not equal
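The steps above might be sketched like this in Python (the original was .NET; `add_under_test` stands in for the developer's method, and every name and parameter here is an assumption for illustration):

```python
import random
import string
import threading

def generate_values(n):
    """Step 1: an expected result set of n random values."""
    return [''.join(random.choices(string.ascii_uppercase, k=4))
            for _ in range(n)]

def validate(expected, actual):
    """Step 2: every item in the expected set must appear in the actual set."""
    return set(expected) <= set(actual)

def add_under_test(table, lock, key, value):
    # Stand-in for the method under test. With the lock, the
    # read-modify-write is atomic; removing the `with lock:` line
    # reintroduces the race the developer was worried about.
    with lock:
        current = table.get(key, [])
        table[key] = current + [value]

def run_trial(n=500, keys=("smith", "jones")):
    """Steps 3-5: hammer a constrained set of keys from many threads
    (forcing probability of collision), then compare what came out
    against what went in."""
    expected = generate_values(n)
    table, lock = {}, threading.Lock()
    threads = [threading.Thread(target=add_under_test,
                                args=(table, lock, random.choice(keys), v))
               for v in expected]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    actual = [v for values in table.values() for v in values]
    return validate(expected, actual)
```

With the lock in place a trial should always pass; the point of the exercise is that without it, repeating the trial a dozen or so times with large N is enough to catch a dropped value.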
The code to do this is short and easy to write, and it sounds obvious when someone describes it. The secret to it is starting with the expected results and working backward from there. That wasn't immediately apparent.
And the results were cool. The test code ran about a dozen times, with values of N around 500 or so, and it was somewhere around the 10th or 11th iteration that it found a failure. So while he found the failure pretty fast, this wasn't a problem that would have reliably manifested unless the conditions were constrained as described above. Meaning in production it would have been a nasty mystery bug that occurred intermittently, would have been hard to even realize had happened, and would have taken a long time to figure out.
I place this at "Testing 285" because despite the simplicity of the technique, the problem did require some understanding of concurrency and the relationship of inputs to hitting a bug.
And... why?
For those reading this far, I am thinking in public on this topic because it relates to thoughts I am still working on regarding the role of testing in a combined engineering environment. This is an example of a problem that is very well suited for a developer to tackle, but whose analysis comes very much from the "how do I make it fail easily and with higher probability?" discipline of test engineering. The principle was easy to teach, and the developer demonstrated the skill immediately. It calls into question the true magnitude of the principle "a developer cannot test their own code." I believe there is truth to that, but I also believe the true extent of that truth has been exaggerated over the years.