Software Archeology

Peter Clark

Published Jan 10, 2021

My recent project involved taking over a high-profile trading application whose authors had left the organisation, leaving little more than the source code behind. The task was simple enough: change the curve-building methodology to support Alternative Reference Rate (ARR) discounting. However, the complete lack of documentation meant that I had to deduce its functionality by running the application and clicking on buttons (and praying that none of them affected the traders' positions).

The code, written in C#, used a number of modern idioms: lambda functions, LINQ, dependency injection and especially multi-tasking. It was impressively fast, but it made the common mistake of assuming that the behaviour of the code was self-evident. Given a complete lack of code commentary, working out how it worked and what to change to implement the new functionality was tricky. I have effectively become a software archeologist, deducing the motivation of the previous author from the merest hint in a variable name or a class interface definition.

Even with a competent debugger, identifying why a lambda function failed when it chained four or five layers of LINQ invocations and ran across multiple parallel tasks was challenging for me. Maybe I am getting too old for this, but I suspect there is no one else within the organisation who could take on this project.

Since I gave up assembler programming I have developed three rules that I have found effective for modern programming languages. No class (file) should be more than 500 lines long, no method (function) should be more than 40 lines long, and a third of a file should be devoted to comments or white space. I admit I often breach these guidelines, but every breach serves as a warning that I should consider refactoring the code.
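A minimal sketch of how two of these rules could be checked automatically, assuming a crude line-based notion of "comment" (a line starting with `//`, `/*`, `*` or `#`). The names `FileReport` and `analyse` are hypothetical, and the 40-line method rule is left out because it needs a real parser for the language in question.

```python
from dataclasses import dataclass

# Thresholds taken from the rules above.
MAX_FILE_LINES = 500
MIN_QUIET_SHARE = 1 / 3  # share of comment or blank lines

# Assumption: a line is commentary if, once stripped, it starts with one of these.
COMMENT_PREFIXES = ("//", "/*", "*", "#")


@dataclass
class FileReport:
    total_lines: int
    quiet_lines: int  # comment lines plus blank lines

    @property
    def quiet_share(self) -> float:
        # An empty file has nothing to measure, so report zero.
        if self.total_lines == 0:
            return 0.0
        return self.quiet_lines / self.total_lines

    def warnings(self) -> list[str]:
        """Return one message per breached guideline (empty if none)."""
        found = []
        if self.total_lines > MAX_FILE_LINES:
            found.append(f"file has {self.total_lines} lines (guideline {MAX_FILE_LINES})")
        if self.quiet_share < MIN_QUIET_SHARE:
            found.append(f"only {self.quiet_share:.0%} comments or white space")
        return found


def analyse(source: str) -> FileReport:
    """Count total lines and comment-or-blank lines in a source text."""
    lines = source.splitlines()
    quiet = sum(
        1
        for line in lines
        if not line.strip() or line.strip().startswith(COMMENT_PREFIXES)
    )
    return FileReport(total_lines=len(lines), quiet_lines=quiet)
```

Calling `analyse(text).warnings()` on the contents of a source file returns the breached guidelines as a list of messages. In keeping with the spirit of the rules, the output is a prompt to consider refactoring, not a hard failure.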

So why do I demand so much commentary? However much documentation is produced in planning and designing a program, the eventual functionality is defined by the code, which may have strayed from the design as requirements changed or shortcomings in the proposed design were identified. Design documents are rarely updated after delivery. So by associating commentary with the code there is at least some chance that the two stay aligned.

Commentary about what is being done, such as adding a to b, is not really helpful. What is required is the functional intention associated with a code fragment. This allows others to compare what is being executed with the author's intention. A tricky bit of code may be intentionally rewriting the code itself, but most people will assume that overwriting code is a bug. If the intention was to do this, then say so. More generally, having classes document how they operate with others gives the software archeologist a pointer to where to look next.
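As a small illustration of the difference (not taken from the application described above), the two functions below are identical apart from their comments. The function names and the stated intent are hypothetical; the continuous-compounding convention is assumed purely for the example.

```python
import math


def discount_factor_what(rate: float, years: float) -> float:
    # Multiply rate by years, negate, and take the exponential.
    # (Restates the code; tells the reader nothing new.)
    return math.exp(-rate * years)


def discount_factor_why(rate: float, years: float) -> float:
    # Intent (hypothetical): rates are treated as continuously compounded
    # so that discount factors over adjacent periods can simply be
    # multiplied together. A change of compounding convention should be
    # made here, not in the callers.
    return math.exp(-rate * years)
```

The second comment tells a later maintainer which assumption the line depends on and where a change of methodology would have to be made, neither of which can be read from the arithmetic alone.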

We all write perfect code that never goes wrong and is absolutely clear to everyone except the fool asking needless questions. However, I have, at least once, gone back to the version control system to find out which idiot had hacked my code so badly, and found that I was the culprit. After six months, even I do not necessarily remember the motivation for some functionality I may have written.

So please, can we include more rather than fewer comments in our code? It helps others to recognise your brilliance and may keep the code operational for far longer than if the 'Not Invented Here' mob have its opacity as an excuse to kill it.

Glen Lalonde 5y

My first job was working on code that was more than 10 years old, converting a K&R C compiler to ANSI C. After that I realized just how important it was to write maintainable code.

Tom Pickering 5y

This is very true, and whoever it is they are lucky to have you looking at their code. My rule is that I tell people "don't write code that only the smart people on the team can understand - because when they all leave for better jobs who will look after it?".
