Connecting linear algebra, rounding errors, structural analysis, and the legacy of numerical computing at Arup c. 1968
A recent question from a colleague, about a common problem encountered when solving systems of linear equations, sent me diving deep into historic publications in the area and, in particular, led me to discover how advanced the thinking at Arup was way back in 1968.
The finite element method is one of the biggest success stories in the field of scientific computing - it has probably been used in the design of every skyscraper, bridge, or aircraft built in the last half century. A key computational step in the modelling process is solving a large system of linear equations, and it's here that the problem I mentioned above arises.
Finite element models, usually created in environments like Oasys GSA, are complicated beasts, with datasets involving millions or even billions of variables in the equations. Occasionally, inadvertent errors in creating the models lead to potentially erroneous answers when solving their equations. The most overt of these problems is singularity, along with its insidious cousin, ill-conditioning.
Singularity and ill-conditioning are both intimately related to rounding errors in computers. A singularity makes it impossible to solve a system of linear equations. Ill-conditioning is subtler: a matrix is ill-conditioned if a small change to it can make it singular. Arup's academic collaborator Professor Nick Higham, FRS, has an excellent blog post on this topic that I'll link to below.
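To make "a small change can make it singular" concrete, here is a minimal sketch in NumPy, with illustrative numbers of my own choosing (nothing to do with GSA or the 1968 paper). A matrix that is a hair's breadth away from an exactly singular one has an enormous condition number, meaning small errors in the data can be enormously amplified in the answer:

```python
import numpy as np

# A matrix that is a tiny perturbation away from an exactly singular one
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-12]])

# The condition number measures how much the solution of A x = b can be
# amplified by small errors in the data; here it is astronomically large
print(np.linalg.cond(A))  # ~4e12

# Remove the perturbation and the matrix becomes exactly singular
print(np.linalg.det(np.array([[1.0, 1.0],
                              [1.0, 1.0]])))  # 0.0
```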
The relationship between modelling errors and singularities is well understood amongst practitioners today. For instance, one reason systems become singular is that parts of the finite element model have no stiffness, i.e., they form a mechanism (there's a small sketch of this below). But, like many researchers, I occasionally like to play historian, so my goal was to travel through time and discover how much we knew about all this, and for how long. And that's when I discovered this gem from John Blanchard, in an old Arup publication all the way back in 1968. I've linked the article below - the quote appears on p9.
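Here's that mechanism sketch, again with made-up numbers of my own: a single spring with neither end restrained can translate as a rigid body, so its stiffness matrix is exactly singular, and restraining one end removes the mechanism:

```python
import numpy as np

k = 5.0  # spring stiffness (arbitrary units)

# Stiffness matrix of a single axial spring with both ends free:
# the whole model can translate as a rigid body, i.e. it is a mechanism
K = k * np.array([[1.0, -1.0],
                  [-1.0, 1.0]])
print(np.linalg.det(K))  # 0.0: singular, so K u = f has no unique solution

# Restrain one end (delete its row and column) and the mechanism is gone
K_restrained = K[1:, 1:]
print(np.linalg.det(K_restrained))  # 5.0: non-singular, solvable
```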
To explain this insight, I'll have to tell you about how computers work with real numbers.
Instead of the infinite precision that real numbers demand, digital computers can only work with finite precision using the floating point system. The solution procedure, based on a beautiful algorithm called Cholesky factorisation, systematically introduces zeros into the matrix, reducing it to a triangular shape from which the equations are easy to solve. Introducing those zeros involves plenty of divisions by the matrix elements that sit on the diagonal, and this is where the problem arises. Working in finite precision, if your matrix is ill-conditioned you can easily end up dividing by some very tiny numbers, which produces answers that are extremely large. This xkcd joke summarises the situation succinctly.
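Here's a minimal sketch of that tiny-pivot effect, using a deliberately ill-conditioned 2x2 matrix of my own invention (real stiffness matrices are vastly larger, but the mechanism is the same):

```python
import numpy as np

eps = 1e-12
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + eps]])  # symmetric positive definite, but only just

L = np.linalg.cholesky(A)  # A = L @ L.T
print(L[1, 1])  # ~1e-6: the second diagonal entry (pivot) is tiny

f = np.array([1.0, 1.0 + 1e-9])  # a load vector with a tiny "modelling error"

# Solving A u = f via the factor divides by that tiny pivot twice:
y = np.linalg.solve(L, f)    # forward substitution through L
u = np.linalg.solve(L.T, y)  # back substitution through L.T
print(u)  # roughly [-999, 1000]: a tiny load error amplified a trillion-fold
```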
For the end-user of the finite element software, these large answers would make it appear as though their structural element had been loaded by an infinite force producing infinite movement, and this is what Blanchard's quote is trying to convey.
To appreciate why Blanchard's insight is amazing for its time, we have to imagine what was known about algorithms and computational mathematics back in 1968. James H. Wilkinson FRS, mathematician and Turing Award winner, had only just finished laying the foundations of rounding error analysis, and information didn't travel as fast then as it does now. William Kahan, another Turing Award winner, had not yet led the standardisation of floating point arithmetic (the IEEE 754 standard arrived in 1985). Implementations were all over the place, with rounding behaviour and accuracy varying across computer manufacturers. As such, much of the background knowledge that I've hand-waved about above simply didn't exist. John Blanchard appears to have been a structural engineer, programmer and mathematician rolled into one. Piecing all this together, it seems to me that his genius lay in connecting what he saw in the user's simulation model with what was going on at the deepest level of the machine hardware - without much of the theoretical background that we can now build on!
If that's not cool, I don't know what is!
If you've stumbled across an 'ahead of its time' idea in a similar vein, I'd love to hear about it.
Next time (part 2 of 2) - Blanchard on computational complexity, NP-hardness, graphs, and closing the loop on modelling errors with machine learning.
If you’ve read this far and are motivated to work on highly challenging computational problems, check out these vacancies in my team:
Arup Journal article: https://www.arup.com/perspectives/publications/the-arup-journal/section/the-arup-journal-1968-issue-1
Blog by Professor Nick Higham: https://nhigham.com/2020/03/19/what-is-a-condition-number/