What’s wrong?

Fault analysis is a key area I work with every day, both as CTO of a company whose sole purpose is to help our customers identify potential problems in their IT, and as a manager, since various challenges, problems or just plain bad luck among staff have a tendency to pop up.

So trying to understand what’s wrong is a really fun proposition, and it’s interesting to see that most of the time the cause isn’t what you first believed it to be. You’re usually wrong, and this is true for technical problems as well as human ones.

More information = better answers?

Well, I would argue that this very much depends on a few critical points:

Q: Is the information structured in any way or is it reasonably possible to become structured?

In today's information overflow, regardless of the problem, structuring the information in one way or another is a huge and often cumbersome task. If you’re faced with a lot of information/data from many different sources in different formats, a big challenge (and time thief) will be simply getting your arms around the data.

Q: What is the source of the information/data?

Just like IRL, it's easy to mix up data with gossip or even opinions, and that can be very misleading. This is often an issue for IT operations, as most users of IT systems have a lot of opinions. That can be good in a “feedback” way, but bad if it eats up a lot of time arguing back and forth. In the monitoring space, we are fortunate to be able to identify the sources of the data pretty comprehensively. However, most systems and applications are configured by humans, and we can all have a bad day or change something while unaware of how it’s integrated with other systems, resulting in "bad data" being sent upstream.

(However, a properly functioning monitoring system can be a great tool to debunk “opinions”, since one of its core features is to provide performance numbers over time!)
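As a small illustration of checking an opinion such as "the system has been slow all week" against numbers over time, here is a minimal sketch. The sample values, the function name deviates and the two-standard-deviation tolerance are all assumptions made up for this example, not something taken from any particular monitoring product.

```python
from statistics import mean, pstdev

# Hypothetical daily average response times in milliseconds (invented sample data).
baseline_ms = [210, 198, 205, 220, 202, 215, 208]
this_week_ms = [212, 207, 199, 218, 204]


def deviates(baseline, recent, tolerance=2.0):
    """Return True if the recent mean is more than `tolerance` standard
    deviations away from the baseline mean (an assumed, simplistic rule)."""
    spread = pstdev(baseline)
    return abs(mean(recent) - mean(baseline)) > tolerance * spread


# With these numbers the recent week sits well inside the normal spread.
print(deviates(baseline_ms, this_week_ms))
```

A real system would of course look at more than a mean, but even a crude comparison like this moves the conversation from opinions to measurements.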

Q: Can you visualize data/information from distributed and mixed sources in a simple and easily understandable way?

This is a fast-growing challenge today, as IT continues to play a natural and massive role in every part of an enterprise. Managers and staff responsible for P&L are actively engaged in improving processes, services or the general customer experience in their departments (this applies to both internal and external services - we want to improve, it’s just human nature - look at the explosion of DevOps).

In parallel, we have the “Hybrid IT Explosion”: in short, IT-specific services and products are currently procured in lots of different places, by different people or companies - and it all comes together in a service or as part of a service delivery.

So, we have multiple people with different roles, all wanting to get a better understanding of “the problem”, who then turn to the huge amount of data that has accumulated in the monitoring system.

There is no single answer to this challenge. There are, however, some basic requirements on the monitoring side, such as:

  • Ability to take in multiple sources, e.g. agent-based, agentless, WMI, syslog, API, traps etc., and transform the incoming formats into a generic, and thus easily comparable, format
  • Ability to use the above information to create your dashboards, reports, trend analysis, alarm triggers etc. regardless of its origin
  • Ability to feed all this out via a REST API, so that you can use 3rd party software to further enhance your analysis or visualisation
  • A good relationship with your users, i.e. continuously getting feedback on how your standard visualisation can be improved
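To make the first two requirements a bit more concrete, here is a minimal sketch of transforming two differently shaped inputs into one generic record and then comparing them regardless of origin. The Metric schema, the payload layout, the log line format and all function names are hypothetical and only serve as illustration; a real monitoring system will have its own formats.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Metric:
    """Generic record that every source is converted into (hypothetical schema)."""
    source: str
    host: str
    name: str
    value: float
    timestamp: datetime


def from_agent(payload: dict) -> Metric:
    # Assumed agent payload: {"host", "metric", "value", "ts" (epoch seconds)}
    return Metric(
        source="agent",
        host=payload["host"],
        name=payload["metric"],
        value=float(payload["value"]),
        timestamp=datetime.fromtimestamp(payload["ts"], tz=timezone.utc),
    )


def from_kv_line(line: str) -> Metric:
    # Assumed log line: "<ISO timestamp> <host> <name>=<value>"
    stamp, host, pair = line.split()
    name, raw_value = pair.split("=", 1)
    return Metric(
        source="log",
        host=host,
        name=name,
        value=float(raw_value),
        timestamp=datetime.fromisoformat(stamp),
    )


def highest_per_host(metrics: list[Metric], name: str) -> dict[str, float]:
    """Highest value of one metric per host, regardless of where it came from."""
    result: dict[str, float] = {}
    for m in metrics:
        if m.name == name:
            result[m.host] = max(m.value, result.get(m.host, m.value))
    return result


metrics = [
    from_agent({"host": "db01", "metric": "cpu_load", "value": "0.40", "ts": 1704110400}),
    from_kv_line("2024-01-01T12:00:00+00:00 web01 cpu_load=0.75"),
    from_kv_line("2024-01-01T12:05:00+00:00 web01 cpu_load=0.90"),
]
print(highest_per_host(metrics, "cpu_load"))
```

Once everything is in the same shape, dashboards, reports and alarm triggers only need to understand one format, which is the whole point of the normalisation step.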

To complicate this challenge, many modern monitoring systems have really good UX, but as the system grows and more devices and applications are added, you will run into “the scalability challenge”, and that will definitely affect where and how you can visualize your information/data.

Of course, there are many more talking points and concerns in the everlasting challenge of answering the “What is wrong?” question. Please feel free to add a comment, as there is no right answer to what’s wrong :)

Have a problem-free day!

/ Jan
