Back to the basics
Based on an interesting discussion I had over the last few days.
Getting uncontrolled infrastructure growth under control means going back to the basics before you build a complex CMS. A solid foundation needs the following building blocks (I simplify a lot in this post):
Discovery – comes in many shapes and forms, but the focus here should be on simplicity rather than complexity.
Must have: ICMP/ping; SNMP; WMI (for Windows); SSH (for *nix)
Must deliver: IP, Hostname, Serial, OS, installed Products (for License Management)
Should have: return components (might be a challenge for network devices)
If you are adventurous, a simple Python script can deliver all of the above, and your discovery needs would be covered.
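As a rough idea of what such a script could look like, here is a minimal sketch that covers only the ICMP/ping step and the IP and hostname fields. The names `DiscoveryRecord`, `ping`, `reverse_lookup` and `discover` are made up for this illustration. Serial, OS and installed products would need SNMP, WMI or SSH queries on top, which are left out here.

```python
import platform
import socket
import subprocess
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class DiscoveryRecord:
    """Hypothetical record holding the fields this sketch can fill."""
    ip: str
    hostname: Optional[str] = None
    reachable: bool = False


def ping(ip: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo request through the system ping binary."""
    if platform.system() == "Windows":
        # Windows: -n is the count, -w the timeout in milliseconds.
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), ip]
    else:
        # Linux iputils: -c is the count, -W the timeout in seconds.
        # Other platforms interpret -W differently, so treat this as an assumption.
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), ip]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 2,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def reverse_lookup(ip: str) -> Optional[str]:
    """Resolve the hostname for an IP, or return None if there is no record."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def discover(
    ips: Iterable[str],
    probe: Callable[[str], bool] = ping,
    lookup: Callable[[str], Optional[str]] = reverse_lookup,
) -> List[DiscoveryRecord]:
    """Probe each IP and look up the hostname only for devices that answer."""
    records = []
    for ip in ips:
        up = probe(ip)
        hostname = lookup(ip) if up else None
        records.append(DiscoveryRecord(ip=ip, hostname=hostname, reachable=up))
    return records
```

Calling `discover(["192.0.2.10", "192.0.2.11"])` would return one record per address. The probe and the lookup are passed in as parameters so they can be swapped for SNMP or SSH based checks later without touching the loop.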
There are products out there that can discover even complex relationships. Considering the time required to amend and configure both your infrastructure and the product itself before it delivers, technical and business services can be built manually in a more efficient manner.
Monitoring Software – this is where you should put in some money, as it is the centrepiece for getting the infrastructure under control.
Must have: Up/Down of device, HDD quota
- If you are not in a singular infrastructure, it must also provide Up/Down of services.
Should have: the ability to collect network / traffic data and monitor databases
- Anything else that might be useful for your infrastructure, but might not necessarily raise an alert.
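To make the must-haves a bit more concrete, here is a minimal sketch of an HDD quota check and a service Up/Down check using only the Python standard library. The function names and the 90% default threshold are assumptions for this example, and a real monitoring product does far more than this.

```python
import shutil
import socket


def disk_over_quota(used_bytes: int, total_bytes: int, quota_pct: float = 90.0) -> bool:
    """Return True when the used share of the disk reaches the quota."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes >= quota_pct


def check_disk(path: str = "/", quota_pct: float = 90.0) -> bool:
    """Check the filesystem that contains `path` against the quota."""
    usage = shutil.disk_usage(path)
    return disk_over_quota(usage.used, usage.total, quota_pct)


def service_is_up(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Treat a service as Up if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

A TCP connect only shows that something is listening on the port, not that the service behind it is healthy, so it is the simplest possible reading of "Up/Down of services".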
Event Management – the main bridge between your monitoring and the service desk.
Your monitoring software might collect plenty of alerts, but your event bridge must be smart enough to let only the Up/Down and HDD quota alerts pass; anything else can stay in your monitoring software.
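The filtering rule of such an event bridge could be sketched like this. The `Alert` class, the alert type names and `forward_to_service_desk` are hypothetical; a real monitoring tool will have its own alert format that would need mapping first.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Assumed alert type names; only these are allowed to reach the service desk.
FORWARDED_TYPES = {
    "device_up",
    "device_down",
    "service_up",
    "service_down",
    "disk_quota",
}


@dataclass(frozen=True)
class Alert:
    """Hypothetical, simplified alert as it might arrive from monitoring."""
    source: str
    alert_type: str
    message: str


def forward_to_service_desk(alerts: Iterable[Alert]) -> List[Alert]:
    """Keep only Up/Down and HDD quota alerts; everything else stays in monitoring."""
    return [alert for alert in alerts if alert.alert_type in FORWARDED_TYPES]
```

Keeping the allowed types in one explicit set keeps the bridge easy to audit: anything not listed there never becomes a ticket.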
With the pillars built (I'm sure a lot of people will completely disagree with the very simple build suggested here), we can start to build the foundations of our CMS by applying analytics.
Analytics will provide governance over our monitoring by using the discovery data and, in combination with configuration rules, even compliance. As the discovery data are also used to build the CIs in the CMS, the event data (alerts) are consistent with the service desk data.
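One very simple form of this governance is comparing what discovery found with what monitoring actually watches. The sketch below assumes both sides can be reduced to a list of hostnames; `monitoring_gaps` and the result keys are names invented for this example.

```python
from typing import Dict, Iterable, List


def monitoring_gaps(discovered: Iterable[str], monitored: Iterable[str]) -> Dict[str, List[str]]:
    """Compare discovery data with the monitoring inventory.

    'unmonitored' are devices discovery found that monitoring does not watch.
    'undiscovered' are devices monitoring watches that discovery never saw.
    """
    discovered_set = set(discovered)
    monitored_set = set(monitored)
    return {
        "unmonitored": sorted(discovered_set - monitored_set),
        "undiscovered": sorted(monitored_set - discovered_set),
    }
```

Both lists are worth a look: the first shows blind spots in monitoring, the second points to stale monitoring entries or devices that discovery cannot reach.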
Discovery and monitoring will most likely provide a lot more data that can be used in analytics to support problem management (restructuring the alerts and combining them with the discovery data) as well as capacity and availability management, and this would only be the tip of the iceberg.
Now that we have the basic pillars in place and have built our foundation, we can start to expand the discovery and monitoring capabilities and, at the same time, get a better grip on the infrastructure.
“In a chronically leaking boat, energy devoted to changing vessels is more productive than energy devoted to patching leaks.” -Warren Buffett