Why I’m Right and the Software Industry is Wrong
I wrote this originally as an internal article, primarily to provoke a bit of thought and debate and partially for cathartic reasons. I was pleasantly surprised by the responses (well reasoned arguments both for and against), so thought it deserved a wider audience. Opinions expressed are solely my own blah blah...
(Slight) apologies for the clickbait title, but an emotive subject deserves an emotive title.
This blog post focuses on code comments (or rather the lack of them), a somewhat dry topic that for some reason drives strong opinions and can be like a red flag to the proverbial bull.
Over the last couple of weeks, I’ve been asked to help review source code written by a third party for acceptance into support/maintenance by one of our teams. As has happened a couple of times before in my career, this is a completely cold handover (no interaction between the authoring team and the supporting teams), and when one opens up Pandora’s box of source code it’s a complete nightmare. There are no comments or documentation anywhere. No inline comments. No Javadoc comments. Nothing at all. Not a single bloody thing.
Note that for the purposes of this post I’m not going to make much of a distinction between inline comments and documentation comments, because for sensibly short methods/classes there’s often not a huge amount to distinguish between them. I have no problem with well-Javadoc’d code (or the equivalent for your language of choice) with nothing further inline.
The Conversation
When I’ve been running a development team and this has come up on my watch, the conversation usually goes something like:
Me:
“Code review failed, you’ve not written any comments.”
Developer:
“Yeuch. Comments.”
“Nobody writes those. If your code is self-descriptive you don’t need comments.”
“Comments just get out of date. They’re confusing and less helpful than if they’re not there.”
Me:
<Scowls>
Developer:
<Sulks>
The Benefits
Clearly, I’m a proponent of augmenting the source code with additional information that may not be self-evident. It’s an unfashionable view, but after being round the block a few times with the nightmare of cold-handover-of-uncommented-code I’m convinced I’m right. Certainly, I believe that if you aim to write quality code that endures for a long period of time, you should be thinking about the people who pick it up years after you’ve moved on to other things.
So why am I right?
Here’s the case I put to people in the scenario above (I do try to avoid the whole scowl / sulk thing if possible):
- Your code is not as good as you think it is. Ouch. This applies to everyone. We have off-days, we get tired, we work to deadlines, we context-switch, we make mistakes. Even the best developers sometimes slip up. “I don’t write comments because code should be self-documenting” is right up there with “I don’t write tests because code should be bug free” and “I don’t do peer reviews because code should be right first time”. “Should be” and “is” are not the same.
- Help the reader. As a consumer of your code, I can probably tell what it does if it has been reasonably well written. What I can’t tell is why it does something that way or how I should use it. Monoliths of imperative code are being broken up. Indirection, decoupling and asynchronous code have become increasingly commonplace, so even telling what something does can sometimes involve cross-referencing a whole bunch of independent chunks. Rather than putting all of this parsing effort onto the maintainer, a couple of well-placed signpost comments can greatly reduce the cognitive load on the reader. I think this is particularly important with recent trends towards greater terseness in languages.
- Comments improve design. When you comment you take pause from the stream-of-consciousness of coding, you stop to think about what your intent is, and you write this intent down. From my experience uncommented code is much more likely to break the single-responsibility-principle than code that incorporates comments. Such comments can also act as a helpful reminder to maintainers of the code as to what that single responsibility is/was, so as to not bend it too far.
- Cognitive bias of the author. No matter how hard you try, when writing code to be self-descriptive you are subconsciously fusing what you’ve written with the wealth of information you keep in your head. It is self-descriptive to you. Peer reviews don’t really help pick this up, as often the reviewer has similar internalised knowledge. Comments can be a useful mechanism for externalising some of the information that is in the collective consciousness of the team. This isn’t just a coding thing: there are plenty of examples of people going snow-blind with regular English-language documents, where the thing they think they’ve written isn’t the thing they’ve actually written.
- Staying true to the original intent. Code changes. You may refactor it. Other team members may work on it after you. They may be less skilled or less experienced. By making a statement of intent for the code you give others a better chance of understanding that intent and of keeping with that intent. Critically, self-descriptive code only tells you what it does, not what it was intended to do. A discrepancy between a comment and the code can be a good signpost of a bug (not always, but often).
- Paradigms change. What is self-evident now based upon the latest language features and paradigms may not be quite so obvious once the technology-du-jour changes. Today’s commonplace might become tomorrow’s anti-pattern and evolutionary dead-end. (Singletons anyone?)
- Efficiency. I really don’t want to have to read every line of your code to understand what it does. Well-documented code provides an inbuilt human-readable encapsulation mechanism.
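As a rough illustration of the "help the reader" and "original intent" points above, here is a minimal Python sketch. The function name, its parameters and the timeout scenario in the docstring are all hypothetical, invented purely for this example. The docstring states why the code behaves as it does and how to call it, which the one-line body cannot say for itself.

```python
def retry_delays(attempts: int, base_seconds: float = 0.5,
                 cap_seconds: float = 8.0) -> list[float]:
    """Return the wait (in seconds) before each retry of a failed call.

    Intent: back off exponentially so a struggling service gets room to
    recover, but never wait longer than cap_seconds, because (in this
    hypothetical scenario) callers give up shortly after that.

    Usage: pass the number of retries you are prepared to make and sleep
    for each returned delay in turn.
    """
    # The cap is part of the stated intent, not an optimisation: removing
    # min() would leave the code "working" while contradicting the docstring.
    return [min(base_seconds * 2 ** i, cap_seconds) for i in range(attempts)]
```

If a later edit quietly dropped the `min()`, the docstring would still promise a cap, and that mismatch is exactly the kind of signpost that can point a maintainer at a bug.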
Counter-Arguments
One counter-argument I’ve faced that I don’t really buy is that comments can get out of step with the code, which increases confusion and makes them less valuable than no comments at all. I’d argue that this is no more likely than method names, class names or variable names getting out of sync (i.e. the same self-descriptive aspects being used in lieu of comments). With a reasonable level of care and some basic peer-scrutiny this shouldn’t happen, or at the very least should occur so infrequently as to not be material.
The other thing frequently pointed out to me as a counter-argument is the case where comments are tautological with the code. I’m not advocating anything like “Open the file”, “Increment the counter”, “Set the value to foo” comments. What I’m looking for is supplementary insight into intent and usage. Helpful signposts for “how” and “why” rather than “what” are key to good commentary.
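To make the contrast concrete, here is a small Python sketch of the same code commented two ways. Both functions and the id scheme are hypothetical and exist only for illustration. The first set of comments restates the "what"; the second supplies the "why" and "how to use it".

```python
def next_id_tautological(ids):
    # Get the maximum id
    highest = max(ids, default=0)
    # Add one
    return highest + 1


def next_id_signposted(ids):
    # Why max() + 1 rather than len() + 1: once any earlier record has been
    # deleted, len() + 1 can collide with an id that is still in use
    # (e.g. ids 1 and 3 remain, and len() + 1 gives 3 again).
    # How to use: pass every id currently stored; an empty collection
    # yields 1, the first id in this assumed scheme.
    highest = max(ids, default=0)
    return highest + 1
```

The code is identical in both versions. Only the second tells a maintainer why the "obvious" simplification would be a mistake.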
In the Wild
I had a bit of a look around some industrial-strength open-source repositories to try to understand why my views seemed so out-of-kilter with my perceptions of the industry. Taking a (very quick) poke at some meaty source code repositories in Java, Python and JavaScript, I observed two general approaches:
- Well-documented classes, methods, functions etc with occasional inline comments to provide additional explanation of the implementation
- Really well-documented classes, methods, functions, etc with no inline comments but implementation notes included at the API level
Well that’s interesting. Turns out that comprehensively-documented and insightfully commented code isn’t just the preserve of old fogies like myself. Possibly because the innate nature of large open source projects is that they need to be supportable and consumable by the broadest possible audience for them to thrive.
Maybe the Software Industry isn’t wrong.
Maybe I need to change the title of this post.
I’ve never met a piece of code that could tell me why it does what it does. Comments are not an option...
I fully agree with you, since I recently had exactly the same experience during a software audit :( In the end, since it was a SaaS, we agreed that the best documentation was a screen copy, which showed which information was necessary and what the presentation was supposed to be... quite unusual, but suited to this unique case. With my own KM2 tool and original Python preprocessor, comments are first-class citizens and serve as a tool for laying out more clearly the hierarchical decomposition of problems into smaller parts. Pictures and decision tables are also allowed, which greatly simplifies explanations and case analysis.
How about taking it up a level? Me: "where are your design documents?", developer: "the code is the design", me: "looks like we will need some stories to refactor and remove some tech debt then!"
With you on this, for all the reasons cited. I don't know which aspect of the "industry" is to blame. Some of it may lie with TDD, since comments aren't necessary to pass tests. Flame on.