The Dev...Ops Paradox
A paradox is a statement that, despite apparently valid reasoning from true premises, leads to a seemingly self-contradictory or a logically unacceptable conclusion - Wikipedia
Most traditional enterprise IT organisations keep their development (level 3) and operations (level 2) departments separate. For decades this has been the best-laid plan for many organisations and, for the most part, a tried and tested approach.
This article is a simple observation of the paradox in this approach when the aim is to prioritise both speed to value and the reliability of a system.
Motivation
Every Ops team needs certain information to do their job well. They need to know when there is a failure and where that failure originates. For example, is it a failure within the system or in an external dependency? Their job is to keep the system operating and, if there is a failure, to get the mean time to resolution (MTTR) as low as possible. The Ops team's motivation is clear... Reliability.
Every Dev team's goal is clear as well. Build features, deliver value, fast, then repeat! They are not typically rewarded for the reliability of the system they are building. The reasons vary: the system may not have reached production yet or, for a production system, the manager in charge of development (new features) may not be the same manager in charge of operations (reliability). The Dev team's motivation is clear... More value.
Ability
We have established what the Ops team needs to do their job well, but who is best placed to provide it? Who is best placed to define failure modes and monitoring, and to create alerts and dashboards? It's typically the Dev team, not the Ops team:
- The Dev team already has access to and knowledge of the code. It's quicker for them to add logging, error handling and alerting than for someone else. Oh, and not to mention they are already in the code writing features!
- The Dev team built the system. They know all the details of the business logic, the acceptance criteria and the failure modes.
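To make this concrete, here is a minimal sketch of the kind of change a developer could make while already in the code: tagging each logged failure with where it originated, so whoever is on call can tell an internal fault from an external dependency problem at a glance. It assumes a Python service. The names `FailureOrigin`, `ExternalDependencyError`, `classify_failure` and `run_with_failure_context` are hypothetical, and treating timeouts and connection errors as "external" is an assumption that a real team would tune to their own system.

```python
import logging
from enum import Enum

logger = logging.getLogger("service")


class FailureOrigin(Enum):
    INTERNAL = "internal"  # a fault inside our own system
    EXTERNAL = "external"  # a fault in something we depend on


class ExternalDependencyError(Exception):
    """Hypothetical error raised when a call to an outside service fails."""


def classify_failure(exc: Exception) -> FailureOrigin:
    # Assumption: timeouts and connection errors point at a dependency.
    if isinstance(exc, (ExternalDependencyError, TimeoutError, ConnectionError)):
        return FailureOrigin.EXTERNAL
    return FailureOrigin.INTERNAL


def run_with_failure_context(operation, name: str):
    """Run operation(); on failure, log where it originated, then re-raise."""
    try:
        return operation()
    except Exception as exc:
        origin = classify_failure(exc)
        logger.error("operation=%s origin=%s error=%r", name, origin.value, exc)
        raise
```

An alert or dashboard could then filter on the `origin` field, which is exactly the information the Ops team was asking for in the first place.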
The Paradox
So by keeping the Dev team and the Ops team separate we have created:
- One team that is motivated to "fix the things" and reduce MTTR but without the context and ability to build the tools to achieve this goal... Ops.
- One team that has the context and ability to build the tools to reduce MTTR and in turn enable a more reliable system but without the motivation to do so... Dev.
Therefore, in the traditional split between Dev (level 3) and Ops (level 2) teams, valid reasoning from once-true premises has led to a seemingly self-contradictory conclusion.
Conclusion
Want to stop the cycle and increase the reliability of your system, giving you the confidence to deliver value faster? Try giving your Ops responsibilities to your Dev team and see what amazing improvements they come up with!
Need some more convincing on how a DevOps mindset improves speed and reliability?
Amazingly written, Luke. I see the summary leans towards Dev doing more Ops work. But would there be an option for Ops to also take part in some of the Dev work? Traditionally, as we know, these two realms have always been separate in both directions. Thanks in advance.