Why do complex software systems often fail?
The rise of safety-critical systems from simple enterprise software
In just a few decades, we have seen many simple enterprise web apps evolve into business-critical systems. Some of them quickly became safety-critical systems. So any software we build for the enterprise today has the prospect of evolving toward some form of safety criticality. Looking at the Internet of Things and the Big Data paradigm, this becomes even more evident.
If you don’t have safety criticality in your business software now, you will have it in the future.
I will quote Dietrich Dörner’s words from The Logic of Failure here.
Complexity is the label we will give to the existence of many interdependent variables in a given system
Failures and complex software systems
These systems are hazardous by their very nature. The frequency of failures can be reduced, but the complexity of enterprise processes itself keeps giving rise to new kinds of failure.
Complex systems are heavily, and for the most part successfully, defended against failures. To defend them, enterprises surround them with backups and other safety systems.
Further defenses are built against the human element through training and education of enterprise users. To add another layer of security, various organizational, institutional and regulatory defenses are built up in the form of policies, procedures, certifications and team training. A wall of defense is built up.
Often, these defenses are designed around single failures, which, considered individually, are easy to safeguard against. These layers of fail-safes work really well for organizations with simple software requirements, and operations are carried out without any hassle.
Great complexity places high demands on a planner's capacity to gather information, integrate findings, and design effective actions
A single point of failure isn’t the big deal...
But in complex systems, safeguarding against single-point failures is not enough. Small, independent failures often combine in a complex environment to create catastrophic system failures. This is what separates complex software systems from the rest: it is almost impossible to run a complex software solution without multiple flaws present.
The nature of these flaws changes dynamically, driven by new technology being continuously integrated, a constantly changing workplace, and the ongoing efforts to eradicate the failures themselves.
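The combinatorics behind this are worth making concrete. The Monte Carlo sketch below (with illustrative numbers I have assumed, not measured from any real system) models a system whose individual flaws are rare and individually defended, yet whose chance of several flaws coinciding over a year is high:

```python
import random

# Illustrative sketch (all numbers are assumptions, not measurements):
# a system carries many small, independent latent flaws. Each flaw is
# individually rare and individually defended, but a catastrophic
# outage happens only when several flaws fire together.

N_FLAWS = 50        # latent flaws present in the system
P_FLAW = 0.01       # chance a single flaw activates on a given day
CATASTROPHE_AT = 3  # outage when this many flaws coincide
DAYS = 365
TRIALS = 200        # simulated years

def year_has_outage(rng: random.Random) -> bool:
    """Simulate one year; return True if enough flaws ever coincide."""
    for _ in range(DAYS):
        active = sum(rng.random() < P_FLAW for _ in range(N_FLAWS))
        if active >= CATASTROPHE_AT:
            return True
    return False

rng = random.Random(42)
rate = sum(year_has_outage(rng) for _ in range(TRIALS)) / TRIALS
print(f"Estimated chance of a multi-flaw outage within a year: {rate:.0%}")
```

On any single day a three-flaw coincidence is unlikely, but over a year it becomes nearly certain under these assumed numbers. Hardening each flaw in isolation does not remove this coincidence risk; only reducing the number of latent flaws or decoupling them does.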
These systems still work
Despite the flaws, these complex systems work, and people make them function properly – albeit in a somewhat broken manner. Anyone who thinks these flaws should all be pre-identified and diagnosed has clearly never been part of building complex software processes. Software operation and usage have become extremely dynamic, with even the basic components (organizational, technological and human) failing and changing continuously.
Root-cause analysis becomes fundamentally flawed here, mainly because it tries to isolate a single “cause” of an event when there are multiple contributors. Attributing a failure to one root cause is not technically correct, and it discards a fuller understanding of how the software system actually failed.
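A toy model makes the problem with a single “root cause” visible. In the hypothetical sketch below (the contributor names are invented for illustration), an outage requires three conditions to hold at once; each is necessary, none is sufficient, so no single factor can honestly be labeled the root cause:

```python
# Hypothetical sketch: an outage that requires three contributing
# conditions at once. The names are invented for illustration.

contributors = {
    "stale_config": True,
    "retry_storm": True,
    "failover_disabled": True,
}

def outage(conditions: dict) -> bool:
    # The failure occurs only when every contributor lines up.
    return all(conditions.values())

print("Outage with all contributors present:", outage(contributors))

# Removing any single contributor prevents the outage, so each one
# is "the cause" by the counterfactual test - which means none is.
for name in contributors:
    patched = {**contributors, name: False}
    print(f"Outage if only {name!r} is fixed:", outage(patched))
```

Every contributor passes the counterfactual test (“would the outage have happened without it?”), so picking one of them as the root cause is an arbitrary choice, not an analytical conclusion.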
Complexity is not an objective factor but a subjective one. Super Signals reduce complexity, collapsing a number of features into one. Consequently, complexity must be understood in terms of a specific individual and his or her supply of super signals. We learn super signals from experience, and our supply can differ greatly from another individual's. Therefore there can be no objective measure of complexity
While this post talks about software system failures, I will soon cover building complex adaptive systems that can deal with multiple points of failure and with dynamic change in general.
Have any thoughts to share? Drop a comment!