Monitoring and Observability

This post tracks the evolution of “monitoring” into “observability”: how and when did monitoring pick up enough additional capabilities to be called “observability” and become part of the “DevOps” stack?

Unconventionally, the TL;DR is at the bottom of this article.

Let's cut to the chase. I have divided the post into the following sections:

  • What is Monitoring
  • Origin of the term — “Observability”
  • What is Observability
  • TL;DR — the difference between Observability and Monitoring.


What is Monitoring

Monitoring was the traditional approach to keeping an eye on infrastructure and applications. It was an Operations practice: the debugging steps were usually known in advance and followed an up/down approach (you logged in to a system, checked certain logs, followed the ITSM process, etc.). The tools provided one-dimensional information about infra/app failures. Applications were hosted on-prem and were mostly monolithic. Data centers (DCs) did not scale dynamically, so intersystem dependencies were easy to see and monitor. We knew the kinds of events and failures to expect. Alarms blipped on dashboards, and alerting was handled manually at first and was automated over time.
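To make the "known failures, up/down" idea concrete, here is a minimal sketch of a static threshold check in Python. The check names, values, and thresholds are all hypothetical and are not taken from any particular monitoring tool; the point is only that each check answers one predefined question.

```python
from dataclasses import dataclass


@dataclass
class Check:
    name: str         # hypothetical check name
    value: float      # latest sampled value
    threshold: float  # static limit chosen in advance by an operator


def evaluate(checks):
    """Return the names of checks whose value exceeds its static threshold."""
    return [c.name for c in checks if c.value > c.threshold]


# Hypothetical samples; a real system would poll hosts or agents for these.
checks = [
    Check("cpu_percent", 97.0, 90.0),
    Check("disk_used_percent", 71.0, 85.0),
    Check("http_5xx_per_min", 12.0, 5.0),
]

alarms = evaluate(checks)
print(alarms)  # ['cpu_percent', 'http_5xx_per_min']
```

Each alarm says that something crossed a line, but not why, which is roughly the one-dimensional view described above.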


Origin of the term — “Observability”

Twitter first mentioned this term in 2013. The term is usually said to be borrowed from control theory. Informally, it can be understood as: “Can you understand what is happening inside the system?”

Evolution of Observability

As the cloud took center stage and microservices and APIs became the building blocks of the software stack, the older monitoring approach proved inadequate. The new systems were intricate and loosely coupled, and they introduced many layers of correlation. So many things could go wrong that debugging them required real slice-and-dice and visualization capabilities. Aggregating logs and then correlating them to analyze a problem lets engineers see “what’s happening inside the system”.

Hence, Observability became the approach to operating and debugging increasingly complex distributed systems.

It relies on the following four pillars:

  • Monitoring
  • Alerting/visualization
  • Distributed systems tracing infrastructure
  • Log aggregation/analytics
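As a rough illustration of how log aggregation and tracing work together, the sketch below groups already-aggregated log records by a shared trace ID and reconstructs the path of each failing request. The record fields (`ts`, `service`, `trace_id`, `level`, `msg`) and the service names are assumptions made up for this example, not the schema of any specific product.

```python
from collections import defaultdict

# Hypothetical log records, as if already collected from several services.
logs = [
    {"ts": 3, "service": "payments", "trace_id": "t-1", "level": "ERROR", "msg": "card declined"},
    {"ts": 1, "service": "gateway", "trace_id": "t-1", "level": "INFO", "msg": "POST /checkout"},
    {"ts": 2, "service": "orders", "trace_id": "t-1", "level": "INFO", "msg": "order created"},
    {"ts": 1, "service": "gateway", "trace_id": "t-2", "level": "INFO", "msg": "GET /health"},
]


def correlate(records):
    """Group records by trace_id and order each group by timestamp."""
    by_trace = defaultdict(list)
    for rec in records:
        by_trace[rec["trace_id"]].append(rec)
    return {tid: sorted(recs, key=lambda r: r["ts"]) for tid, recs in by_trace.items()}


def failing_traces(traces):
    """Return the service path of every trace that contains an ERROR record."""
    return {
        tid: [r["service"] for r in recs]
        for tid, recs in traces.items()
        if any(r["level"] == "ERROR" for r in recs)
    }


traces = correlate(logs)
print(failing_traces(traces))  # {'t-1': ['gateway', 'orders', 'payments']}
```

Unlike a single up/down alarm, the correlated view shows which request failed and which services it passed through before failing.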


TL;DR — the difference between Observability and Monitoring.

1: “Observability”, according to this definition, is a superset of “monitoring”.
2: Observability has not replaced monitoring; rather, it is an approach that provides better Operations and also supports DevOps capabilities.
3: Monitoring is a tooling solution that lets operators monitor the state of their systems. Observability goes a step further and provides rich visualization and flexible slice-and-dice capabilities for debugging complex systems.
4: Observability is everyone’s responsibility, whereas monitoring was the realm of system engineers and dedicated monitoring teams.

