IT Operations Analytics (ITOA) - The data driven approach to IT Operations
www.eccecustomer.com

According to Gartner's estimates, by 2017 around 15% of enterprises will be using IT Operations Analytics (ITOA) on their huge volumes of IT operational data to discover complex patterns and gain better insights, whether to reduce 'run-the-business' costs or to identify opportunities for 'grow-the-business' transformations. This, in effect, ushers in a paradigm shift in IT Operations Management, from the current "tools-driven" approach to a "data-driven" one. The current tools-driven approach relies on silos of non-integrated tools, which fail to deliver the agility required to manage the ever-growing complexity of hybrid cloud platforms and dynamic data centers.

Simply put, ITOA applies big data principles: it extracts structured as well as unstructured data (availability data, performance data, machine data, service management data) from diverse sources (application logs, monitoring tools, syslogs, APM tools, event managers, ITSM tools etc.), stores and indexes it, and finally correlates and analyzes it to yield deeper business insights and proactive indicators of potential or impending business-impacting events.
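
As a rough illustration of the extract, store and index steps described above, the Python sketch below normalizes records from two hypothetical sources (a monitoring tool and an ITSM tool) into one common event shape and indexes them by host. The `Event` class, the field names and the sample records are all assumptions made for illustration; they do not come from any specific ITOA product.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """Common shape for records coming from different sources (illustrative)."""
    timestamp: datetime
    source: str   # e.g. "monitoring", "itsm"
    host: str
    message: str


def from_monitoring(record: dict) -> Event:
    # Hypothetical monitoring payload: {"ts": ISO-8601 string, "node": ..., "alert": ...}
    return Event(datetime.fromisoformat(record["ts"]), "monitoring",
                 record["node"], record["alert"])


def from_itsm(record: dict) -> Event:
    # Hypothetical ITSM ticket: {"opened_at": ISO-8601 string, "ci": ..., "summary": ...}
    return Event(datetime.fromisoformat(record["opened_at"]), "itsm",
                 record["ci"], record["summary"])


def index_by_host(events: list[Event]) -> dict[str, list[Event]]:
    """Group events per host, each group sorted by time, ready for correlation."""
    index: dict[str, list[Event]] = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        index[event.host].append(event)
    return dict(index)


# Made-up sample records for the two hypothetical sources
events = [
    from_itsm({"opened_at": "2015-06-01T10:05:00", "ci": "db01", "summary": "Orders page slow"}),
    from_monitoring({"ts": "2015-06-01T10:01:00", "node": "db01", "alert": "Tablespace 95% full"}),
    from_monitoring({"ts": "2015-06-01T09:30:00", "node": "web01", "alert": "CPU 40%"}),
]
index = index_by_host(events)
for host, host_events in index.items():
    print(host, [e.message for e in host_events])
```

Once every source is mapped to the same shape, the later correlation and analysis steps can work on one timeline per host instead of one format per tool.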

Without getting too deep into the technicalities, let me summarize some of the best use cases that ITOA can potentially address:

1. Quickly find the root cause of a business outage: It is possible to crawl data from multiple sources and correlate them over a time window in near real time to get better visibility into business-impacting issues and quickly identify the layer to which the root cause is attributable. It thus becomes easier to tell whether the root cause is a recent change to a configuration item, a faulty load balancer, a highly utilized tablespace or high resource consumption by a rogue long-running batch process. Traditional event managers lack these powerful correlation features.

2. Topology-based event visualization: Overlay IT or business events on a topology map that maps applications to infrastructure, thereby providing a powerful visualization of business impacts and helping to quickly identify a possible root cause. This improves operational visibility.

3. Identify repetitive patterns of issues: Identify and learn the patterns that lead to different kinds of business outages caused by availability, performance or capacity issues. Subsequently, these patterns can be used to predict an impending failure before it actually happens.

4. Dynamic baselining: Using machine learning techniques, it is possible to baseline the behaviour or consumption patterns of different IT resources such as business workload, CPU, memory, I/O etc. This information can be used to reset alerting thresholds or rebalance compute capacity.

5. Identify unauthorized changes: Changes to configuration items made without a valid RFC can be tracked and identified.

6. Ensure environmental consistency & release validation: Configuration consistency can be tracked in real time between (i) IT assets within a cluster/application group and (ii) IT assets in DC and DR for the same application group. Also, with faster release cycles, it is necessary to validate the correctness of a release in an automated manner to prevent a business outage immediately after the release. This becomes extremely important as we get into a hybrid and dynamic landscape with faster deployment cycles and move closer to the DevOps paradigm.

7. Identify business events: Business events can be defined along the nodes of a business service spanning multiple applications. Rule-based thresholds can then be applied to detect outages/breaches for those business events.

8. Identify automation opportunities: From service management (incident & service request) data, it is possible to identify patterns of high-volume, low-severity tickets that need only procedural steps for resolution. These are ideal candidates for either shift-left or automation, e.g. password resets, user on-boarding, various provisioning requests etc.

9. Track pockets of End of Life (EoL)/End of Support (EoS) devices and their performance & impacts: In every IT landscape there are pockets of EoL/EoS assets whose availability and performance need close monitoring.

10. Track operational effectiveness: The various operational metrics related to the standard ITSM and infrastructure management processes and their SLAs/OLAs can be tracked. This can help to identify weak spots in the processes and opportunities for their improvement.

11. Identify opportunities for rationalization/optimization/consolidation and transformation: Valuable insights can be derived from piles of complex data, which can help in identifying opportunities for rationalization/optimization/consolidation of IT assets as well as transformation initiatives.
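
To make the time-window correlation from the first use case a little more concrete, here is a minimal Python sketch. It collects the events raised in the minutes leading up to an outage and ranks the layers by how many events each one raised. The event fields, the sample data and the 15-minute window are assumptions for illustration, and counting events per layer is only a crude stand-in for the correlation a real ITOA platform would perform.

```python
from collections import Counter
from datetime import datetime, timedelta


def events_before_outage(events, outage_time, window_minutes=15):
    """Return events that fall in the window leading up to the outage, oldest first."""
    start = outage_time - timedelta(minutes=window_minutes)
    in_window = [e for e in events if start <= e["time"] <= outage_time]
    return sorted(in_window, key=lambda e: e["time"])


def suspect_layers(events, outage_time, window_minutes=15):
    """Rank layers by how many events they raised inside the window."""
    window = events_before_outage(events, outage_time, window_minutes)
    return Counter(e["layer"] for e in window).most_common()


# Illustrative sample data; in practice these records would come from change
# records, monitoring tools, logs and so on.
events = [
    {"time": datetime(2015, 6, 1, 9, 0), "layer": "network", "text": "Link flap on switch"},
    {"time": datetime(2015, 6, 1, 9, 50), "layer": "database", "text": "Config change on db01"},
    {"time": datetime(2015, 6, 1, 9, 55), "layer": "database", "text": "Tablespace 95% full"},
    {"time": datetime(2015, 6, 1, 9, 58), "layer": "application", "text": "Batch job running long"},
]
outage = datetime(2015, 6, 1, 10, 0)

ranking = suspect_layers(events, outage)
print(ranking)
```

With this sample data the database layer comes out on top, since both the configuration change and the tablespace alert fall inside the window, while the earlier network event is excluded.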

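Dynamic baselining can be sketched in a similar spirit. The Python example below uses a simple statistical stand-in (the mean of recent readings plus k standard deviations) rather than a machine learning model, and flags readings that exceed the threshold learned from the preceding window. The window size, the value of `k` and the CPU readings are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def dynamic_threshold(history, k=3.0):
    """Upper alert threshold derived from recent observations: mean + k * stdev."""
    if len(history) < 2:
        raise ValueError("need at least two observations to build a baseline")
    return mean(history) + k * stdev(history)


def find_anomalies(samples, window=5, k=3.0):
    """Flag samples that exceed a threshold learned from the preceding `window` values."""
    anomalies = []
    for i in range(window, len(samples)):
        threshold = dynamic_threshold(samples[i - window:i], k)
        if samples[i] > threshold:
            anomalies.append((i, samples[i]))
    return anomalies


# Illustrative CPU utilization readings (percent) with one spike at index 7
cpu = [41, 43, 40, 42, 44, 43, 41, 90, 42, 43]
anomalies = find_anomalies(cpu)
print(anomalies)  # [(7, 90)]
```

Because the threshold follows the recent behaviour of the resource instead of a fixed value, the same code adapts to a server that normally runs at 40% and to one that normally runs at 80%. A more robust version would keep flagged spikes out of later baselines so that one outlier does not inflate the next threshold.
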
 ***

Disclosure: The author is an employee of Tata Consultancy Services Ltd. The opinions expressed herein are the author's own and do not reflect those of the company.

Sarthak Da, this is a good, to-the-point overview of ITOA. This domain is always about live data and getting meaningful insight out of the humongous number of data points in any business domain. Today organizations are feeling the waves of cloud and big data, and everyone will sooner or later have to plan a shift to the cloud.

Very nicely summarized points.

Great article...completely aligned with...

Interesting and thought-provoking read. This throws up many new possibilities in the world of analytics and can help organisations pre-empt major failures in their data centers.
