Risk Explained
for people who build and run things
If you operate or manage technical systems, then you already deal with changes. To keep systems running or improve performance, you carry out different kinds of tasks. Some are routine maintenance. Others involve changes like upgrades, modifications, repairs, or replacements.
Every time you make a change, there’s a chance that something won’t go as planned — a setting might be missed, a part might not fit, or an unexpected interaction might cause trouble.
This article is for people working in these environments — engineers, technicians, or managers — who are responsible for or directly involved in that kind of technical work.
1. What is risk?
Before we can handle risks effectively, we need a clear understanding of what risk is. The international standard ISO 31000 defines risk as “the effect of uncertainty on objectives.” It’s a concise definition — but still abstract in practice. A more hands-on version comes from Dr. David Hillson:
uncertainty that matters
In other words, for something to be a risk, it must be uncertain — and it must matter to us. That’s why all risks involve uncertainty, but not all uncertainties are risks. The distinction is important because it helps us separate risks from their likely causes.
Example: We are planning to apply the latest security patch to all Windows Server 2025 machines. Some servers have pending disk errors. These disk errors are not risks; they are known issues, or certainties. But they are a likely cause of a real risk: the servers may fail to restart cleanly after patching. That is the risk, an uncertainty that matters, because it could affect uptime or performance. The same risk could have other likely causes as well.
So, is it important to know the difference between a risk and a likely cause? Absolutely — if you want to work systematically with risk and avoid serious trouble. Confusing the two can lead you to miss real uncertainties that can hurt you, or waste time fixing things that aren’t actually risks at all.
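For readers who keep risk notes in scripts or tooling, the distinction can be made explicit in how a risk is recorded. The following is only a minimal sketch based on the patching example above; the `Risk` class, its field names, and `format_risk` are hypothetical and not part of any standard or tool.

```python
from dataclasses import dataclass, field


@dataclass
class Risk:
    """An uncertainty that matters: it may happen, and it affects an objective."""
    description: str
    affected_objective: str
    # Known issues (certainties) that could trigger the risk. They are causes, not risks.
    likely_causes: list[str] = field(default_factory=list)


def format_risk(risk: Risk) -> str:
    """Render the risk and its likely causes as one readable line."""
    causes = "; ".join(risk.likely_causes) or "none identified yet"
    return (
        f"Risk: {risk.description} (affects {risk.affected_objective}). "
        f"Likely causes: {causes}."
    )


# The patching example: the disk errors are recorded as a cause, the failed restart as the risk.
restart_risk = Risk(
    description="Servers may fail to restart cleanly after patching",
    affected_objective="uptime",
    likely_causes=["pending disk errors on some servers"],
)

print(format_risk(restart_risk))
```

Keeping causes in their own field makes it harder to log a known issue as if it were the risk itself, and leaves room to add other likely causes for the same risk later.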
2. Why is risk important?
Changes to technical systems don’t always go as planned — and when they don’t, the impact can be serious. Client data shows that around 70% of all incidents are caused by our own changes. According to Gartner, that number rises to 85% for performance-related incidents. This means that identifying and reducing risks before making a change is one of the most effective ways to avoid disruption.
If outcomes like system performance, availability, security, or functionality are important, we must actively think ahead about what might go wrong.
Because risks haven’t happened yet, we have the chance to act before small uncertainties turn into big problems. By identifying and dealing with uncertainties early, we give ourselves the best possible chance to maintain performance, avoid disruptions, and achieve our goals.
Without structured risk thinking, you're leaving success to chance — and that’s a gamble with the company’s money.
3. How do we manage risks?
Managing risk doesn’t need to be complicated. At its core, it’s a simple and logical way of thinking. It builds on a few powerful questions that help us stay in control and avoid surprises.
We begin by asking: What are we trying to achieve? This defines the task or change we are going to make, for example, opening a firewall port while preserving performance and security.
Then we ask: Does this task need a closer look? Not everything does. If it’s routine, well understood, and low impact, we can often just go ahead and execute. But if there’s any real potential for uncertainty or consequences, we ask the next questions.
That brings us to: What might affect us? This helps us identify the uncertainties — the things that could influence how the task plays out.
Next, we ask: Which are the big ones? That’s how we identify the risks — the uncertainties that could affect something important.
Finally, we move to action with two questions: What can we do about it? and What do we do about it? The first lists our options, and the second commits us to one of them. Together they help us reduce uncertainty, avoid trouble, and give the task the best possible chance of success.
By asking and answering these questions systematically, we stay proactive and protect the availability, performance, and reliability of our systems and services.
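The sequence of questions can be sketched as a small screening routine. This is an illustration under stated assumptions, not a prescribed implementation: the function names, the `Uncertainty` class, and the two example uncertainties for the firewall change are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Uncertainty:
    description: str
    affects_something_important: bool


def needs_closer_look(routine: bool, well_understood: bool, low_impact: bool) -> bool:
    """Question 2: only routine, well-understood, low-impact tasks skip the review."""
    return not (routine and well_understood and low_impact)


def big_ones(uncertainties: list[Uncertainty]) -> list[Uncertainty]:
    """Question 4: the risks are the uncertainties that affect something important."""
    return [u for u in uncertainties if u.affects_something_important]


# Question 1: what are we trying to achieve?
objective = "Open a firewall port while preserving performance and security"

# Question 3: what might affect us? (hypothetical entries for illustration)
uncertainties = [
    Uncertainty("The new rule may expose more hosts than intended", True),
    Uncertainty("The change window may start a few minutes late", False),
]

if needs_closer_look(routine=False, well_understood=True, low_impact=False):
    risks = big_ones(uncertainties)
else:
    risks = []  # routine work: go ahead and execute

for risk in risks:
    # Questions 5 and 6 (what can we do, what do we do) remain a human decision.
    print(f"Risk for '{objective}': {risk.description}")
```

The code only filters and lists. Deciding what to do about each risk is deliberately left to the people responsible for the change.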
This is also where approaches start to differ depending on your domain. For technical operations, see "Operational Risk Handling - good practice for avoiding technical problems", a method aimed at managers, engineers, and technicians who make decisions in real time and under real constraints. It builds on practical experience as well as the risk theory introduced earlier, and brings the two together in a structured, field-ready method.
Acknowledgement
This document draws on insights and best practices from several respected sources, including: