The High Cost of Ignoring Software Deployment Risks
This week, several long-accepted risks in software development and deployment materialised on a global scale. A faulty update from CrowdStrike brought the #BSOD front and centre, prompting media claims worldwide of an 'internet outage'.
This resonates with points I recently discussed in response to "IT Underperforms Again, Again" by Prof. Bent Flyvbjerg, where I noted: "...risk management is often ineffective, with senior stakeholders willing to embrace substantial risks. The real challenge arises when they are reluctant to accept the consequences if these risks materialise..."
In the development and release chain, numerous risks are accepted on the assumption that they are unlikely to occur. We understand that monitoring and security products operate at critical system levels to prioritise access and protect the system. However, errors at this level can cause severe, unrecoverable failures—yet antivirus and security updates are often treated as low-risk deployments when they are better characterised as low-probability, high-impact.
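To make that distinction concrete, a rough expected-loss comparison shows why 'low probability' does not mean 'low risk'. The figures below are illustrative assumptions, not CrowdStrike data:

```python
# Illustrative expected-loss comparison; all figures are assumptions,
# not CrowdStrike data. Risk is commonly scored as probability x impact,
# so a rare but catastrophic failure can outweigh a frequent, minor one.

risks = {
    "routine app bug":    {"probability": 0.05,  "impact_usd": 50_000},
    "kernel-level fault": {"probability": 0.001, "impact_usd": 1_000_000_000},
}

for name, r in risks.items():
    expected_loss = r["probability"] * r["impact_usd"]
    print(f"{name}: expected loss = ${expected_loss:,.0f}")

# routine app bug: expected loss = $2,500
# kernel-level fault: expected loss = $1,000,000
```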
For such a widespread problem to have occurred, there must have been systemic failures in multiple places.
So how do we mitigate these risks? Vendors often use an early-release, 'eat our own dog food' approach, deploying software to their own teams ahead of the public release. It is better to cause an outage of your internal systems than to wipe $16B off your market cap.
This approach really should be applied throughout the entire development journey, from beta to general availability. Microsoft's ring deployment methodology formalises such a phased rollout and is used across the Windows and Office ecosystems, allowing users to choose their release track. If a similar methodology had been implemented here, the scale of the problem could have been significantly reduced, likely affecting only non-critical systems.
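As a rough sketch of how ring-based promotion works—the ring names, fleet shares, and bake times below are illustrative assumptions, not Microsoft's or CrowdStrike's actual configuration:

```python
from dataclasses import dataclass

# A minimal sketch of ring-based deployment. Ring names, population
# shares, and bake times are illustrative assumptions.

@dataclass
class Ring:
    name: str
    share_of_fleet: float   # fraction of machines in this ring
    bake_time_hours: int    # minimum soak time before promoting further

ROLLOUT_RINGS = [
    Ring("ring0-internal-dogfood",  0.001, 24),  # vendor's own machines first
    Ring("ring1-early-adopters",    0.01,  48),  # opted-in customers
    Ring("ring2-broad-noncritical", 0.25,  72),  # general fleet, non-critical
    Ring("ring3-critical-systems",  1.00,   0),  # everything else, last
]

def deploy(update_id: str, ring: Ring) -> None:
    # Placeholder for the real push mechanism.
    print(f"Deploying {update_id} to {ring.name} "
          f"({ring.share_of_fleet:.1%} of fleet)")

def telemetry_healthy(ring: Ring) -> bool:
    # Placeholder: in practice, check crash rates, boot loops, etc.
    return True

def roll_out(update_id: str) -> None:
    """Advance ring by ring, stopping at the first sign of trouble."""
    for ring in ROLLOUT_RINGS:
        deploy(update_id, ring)
        # A real system would sleep/poll for ring.bake_time_hours here.
        if not telemetry_healthy(ring):
            print(f"Halting rollout of {update_id} at {ring.name}")
            return

roll_out("example-update-001")
```

The key property is that the blast radius grows only after each earlier, smaller ring has soaked without incident—an internal-only failure is an embarrassment; a ring3 failure is a headline.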
Some issues are excusable, for instance where complex race conditions make problems hard to pinpoint. But when a significant number of Windows machines suffer critical failures from a single software update, it is a clear indication of poor risk management.
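A common safeguard here is an automated circuit breaker that halts a staged rollout once crash telemetry crosses a threshold. A minimal sketch, with the threshold chosen purely for illustration:

```python
# A minimal sketch of a crash-rate circuit breaker for staged rollouts.
# The threshold is an illustrative assumption, not any vendor's value.

CRASH_RATE_THRESHOLD = 0.002   # halt if >0.2% of updated machines crash

def should_halt_rollout(machines_updated: int, crash_reports: int) -> bool:
    """Return True if the observed crash rate exceeds the threshold."""
    if machines_updated == 0:
        return False
    return (crash_reports / machines_updated) > CRASH_RATE_THRESHOLD

# Example: 50 crashes out of 10,000 updated machines -> 0.5% -> halt.
print(should_halt_rollout(10_000, 50))   # True
```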
A couple of thoughts: having implemented various ERP systems, including system and IT upgrades, I learnt early in my career why there is a test system, a pre-production system (an exact copy of the production environment) and the production system itself. You do not want bugs entering your production environment. Rigorous testing of both the bright side (are things going the way they should?) and the dark side (what might go wrong?) has to be done in both the test and the pre-production system, including consideration of all knock-on risks (dominoes falling), before even thinking of entering the production environment.

If you are still uncertain, use a pilot system, where any impact can be ring-fenced. Does this give absolute certainty? No, it doesn't, but it clearly increases your chances of success. Planning and testing is EVERYTHING.

Finally, it may be that protocols were followed, but that those protocols were no longer sufficient or did not cover the current state of the system environment. Too many times I have seen outdated protocols applied. Risk management is a continuous process and one should never feel safe. Some positive paranoia and scepticism is always helpful.
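In CI/CD terms, that discipline maps to explicit promotion gates. A minimal sketch—the environment names and check functions are assumptions for illustration, not any specific vendor's pipeline:

```python
# A minimal sketch of staged promotion gates: a build only reaches
# production after passing checks in test and pre-production.
# Environment names and check functions are illustrative assumptions.

def run_functional_tests(env: str, build: str) -> bool:
    """'Bright side' checks: does the build do what it should?"""
    print(f"[{env}] functional tests for {build}")
    return True  # placeholder result

def run_failure_injection(env: str, build: str) -> bool:
    """'Dark side' checks: what happens when things go wrong?"""
    print(f"[{env}] failure-injection tests for {build}")
    return True  # placeholder result

def promote(build: str) -> None:
    for env in ("test", "pre-production"):
        if not (run_functional_tests(env, build)
                and run_failure_injection(env, build)):
            raise RuntimeError(f"{build} failed gates in {env}; stopping")
    print(f"{build} cleared all gates -> deploy to production")

promote("build-1234")
```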
Yes and no 😉 See here: https://www.garudax.id/posts/flyvbjerg_software-development-activity-7220379023149838337-pdIw?utm_source=share&utm_medium=member_ios