SNMP data collection modes within the Network Management Software LoriotPro


Introduction 

Most SNMP network management software, LoriotPro included, offers several ways to collect status and performance indicators from monitored devices. This article explores these different ways, with their advantages and disadvantages. LoriotPro is primarily an SNMP manager in the strict sense of the SNMP standard, but it can also collect data of any type using other protocols, databases, files, web access, XML, etc. This document mainly covers SNMP collections, but other types of data can be collected using the same modes.

The choice of an SNMP collection mode (method) is directly related to the following questions:

  1. How quickly should network administrators be informed of a malfunction? (responsiveness)
  2. How many devices have to be monitored and what are their access times? (milliseconds or seconds)
  3. How many performance indicators or statuses have to be collected on the devices? It is likely that not all monitored indicators have the same criticality.

Based on this information it will be easier to determine:

  1. Whether the mode of data collection must be passive, active or a combination of both.
  2. The period of the data collections in active mode.

Passive mode versus active mode

In passive mode, the SNMP manager (LoriotPro) waits to receive SNMP packets informing it of a change of status or an anomaly. In the SNMP standard these packets are called Traps or Notifications. On receipt, the packets are analyzed and, if they carry values judged to be outside the thresholds, alarms are generated and the monitoring dashboards are updated.

In active mode, it is the SNMP manager (LoriotPro) that regularly polls the equipment. If there is no response, or a response with values judged to be outside the thresholds, alarms are generated and the monitoring dashboards are updated.

Advantages of the passive mode:

  • Economical in bandwidth. If there are no anomalies on the equipment, there is no flow generated.
  • Alarms and anomalies are immediately reported to the manager and administrators

Drawbacks of the passive mode:

  • If packets are lost in the network, or the connection link itself is lost, the SNMP manager is not informed and anomalies can go unnoticed. SNMP Trap packets use the UDP protocol, which offers no delivery guarantee, so a Trap dropped by a congested router is never retransmitted. This drawback largely disappears when the device supports SNMP Inform, which retransmits when no acknowledgment is received. The risk is also reduced with LoriotPro because its regular PING polling raises an alarm when the connection with an IP device is lost.
  • Some devices do not have an SNMP agent capable of sending Traps, or the available Traps do not allow relevant and exhaustive monitoring.
  • Only statuses can be monitored. Flow and trend values cannot be captured, or only with difficulty, except on RMON-capable devices.
  • Since the packets come from many sources, access security is more difficult to manage and the firewall rules become more complex.
  • Trap formats vary widely from one device to another, which makes filters complex to set up and Traps difficult for the SNMP manager to process.
  • Filters in place can only be tested by simulation (available with LoriotPro) or, in rare cases, when the equipment can generate Traps on demand.

Benefits of active mode:

  • Far less sensitive to packet loss and, above all, alarms can be raised when the connection with the monitored device is lost.
  • Collection of any type of variable at regular intervals
  • Collection frequency adjustable according to the criticality of the device or indicator being monitored.
  • Repetition of the collection in case of failure (retry)
  • Adjustable abort time in case of non-response (Timeout)
  • Richness of performance indicators (MIB Object) available
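
The retry, timeout and threshold ideas listed above can be illustrated with a minimal Python sketch. It is not LoriotPro code: `poll_with_retry`, `evaluate` and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever actually sends the SNMP request.

```python
from typing import Callable, Optional


def poll_with_retry(fetch: Callable[[float], Optional[float]],
                    timeout_s: float = 2.0,
                    retries: int = 2) -> Optional[float]:
    """Call fetch(timeout_s) up to 1 + retries times.

    `fetch` is a hypothetical callable that performs one request and either
    returns a value, returns None, or raises TimeoutError on non-response.
    """
    for _ in range(1 + retries):
        try:
            value = fetch(timeout_s)
        except TimeoutError:
            value = None
        if value is not None:
            return value
    return None  # every attempt failed


def evaluate(value: Optional[float], threshold: float) -> str:
    """Turn a polled value into a status, as an active poller would."""
    if value is None:
        return "no-response"  # the device did not answer: raise an alarm
    return "alarm" if value > threshold else "ok"
```

With a timeout of 2 seconds and 2 retries, an unreachable device occupies the poller for up to 6 seconds, which is why these two parameters are worth tuning per device.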

Drawbacks of active mode:

  • It generates more traffic on the network
  • It may overly stress the monitored device.

After choosing a mode of collection, active or passive, the volume of collections, the frequency of collections and the expected reactivity will guide the choice of collection tools.

Here are the different collection tools that this document explores in the following chapters:

  • Collection of SNMP Trap and Notification (passive mode, very responsive, close to real time, and for large volumes)
  • Collection with asynchronous Poller (active mode, very responsive, close to real time, and for large volumes)
  • Collection with Bulk Threshold Control plugins (active mode, low responsiveness and unpredictable update delay, for an average volume of collected data)
  • Collection with ActiveView dashboards (active mode, low responsiveness and unpredictable update delay, for an average volume of collected data)
  • Collection with Audits (active mode, very responsive, close to real time, and for large volumes)
  • Collection with Audits and Global Objects (active mode, very responsive, close to real time, and for large volumes)
  • Scheduled collection in Task Scheduler (active mode)

Collection with SNMP Trap & Notification (passive mode)

Principle

LoriotPro collects the Traps or SNMP Notifications sent by the devices into a receiving tray, where they are displayed in order of arrival. A filter configured for each type of Trap makes it possible to count them. The counters can then be used to change the status of graphic objects in ActiveView dashboards. Filters can also be used to send alarms to administrators (sound, mail, SMS, etc.).
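
The filter-and-count principle can be sketched in a few lines of Python. This is an illustration only, not LoriotPro's filter engine: `TrapFilter` and `count_traps` are hypothetical names, and a trap is reduced to a (source address, trap OID) pair. The two OIDs are the standard linkDown and linkUp notifications.

```python
from collections import Counter
from dataclasses import dataclass

LINK_DOWN = "1.3.6.1.6.3.1.1.5.3"  # standard linkDown notification OID
LINK_UP = "1.3.6.1.6.3.1.1.5.4"    # standard linkUp notification OID


@dataclass(frozen=True)
class TrapFilter:
    """Hypothetical filter: matches one trap OID from one source or any ("*")."""
    name: str
    source: str
    trap_oid: str

    def matches(self, source: str, trap_oid: str) -> bool:
        return self.source in ("*", source) and self.trap_oid == trap_oid


def count_traps(traps, filters) -> Counter:
    """Run every received trap, in arrival order, through every filter."""
    counters = Counter()
    for source, trap_oid in traps:
        for trap_filter in filters:
            if trap_filter.matches(source, trap_oid):
                counters[trap_filter.name] += 1
    return counters
```

A dashboard object could then turn red as soon as the counter of a "link down" filter becomes non-zero, which is the role the counters play in the text above.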

 Advantages

  • Simplicity of concept, easy to understand
  • Responsiveness of the solution: administrators are notified almost in real time.

Drawbacks

  • It is a passive mode of collection (see the introduction).
  • All devices must be configured to send their Traps to the SNMP manager.
  • Some SNMP agents cannot send Traps, or their Traps carry too little information (poor design). The quality and relevance of supervision depend on the richness of the SNMP agents in the equipment.
  • A separate filter must be set up for each Trap and each device, which can lead to significant configuration time.
  • Trap filter processing is sequential, so the delay between reception and filter action cannot be predicted.

Collection with asynchronous Poller (active mode)

Principle

LoriotPro runs a background process (Poller Process) in charge of checking, at regular intervals, the connections with the IP devices declared in its directory. The color statuses change depending on the responses returned by the devices: green indicates that the device responds to SNMP requests, blue that it responds to PING, yellow and red that the connection is lost. This process is asynchronous and has two separate threads, one for sending requests and the other for parsing all responses. Additional SNMP collections on various objects, specific to each monitored device, can also be performed during polling. The returned values are recorded in files for use by other modules of the software.
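
The split between a sending thread and a parsing thread can be sketched as follows. This is a simulation under stated assumptions, not the LoriotPro Poller: `run_poll_cycle`, `status_color` and the `fake_network` dictionary are hypothetical, no packet is actually sent, and yellow is left out for brevity.

```python
import queue
import threading


def status_color(snmp_ok: bool, ping_ok: bool) -> str:
    """Simplified color mapping taken from the description above."""
    if snmp_ok:
        return "green"
    if ping_ok:
        return "blue"
    return "red"


def run_poll_cycle(devices, fake_network):
    """Poll all devices once with one sender thread and one receiver thread.

    fake_network maps a device name to (snmp_ok, ping_ok); a device missing
    from it never replies. A real poller would end the cycle on a timeout,
    here a None sentinel marks the end.
    """
    replies = queue.Queue()
    status = {name: "red" for name in devices}  # lost until a reply arrives

    def sender():
        for name in devices:
            # Fire and forget: the sender never waits for an answer.
            if name in fake_network:
                replies.put((name, fake_network[name]))
        replies.put(None)

    def receiver():
        while (item := replies.get()) is not None:
            name, (snmp_ok, ping_ok) = item
            status[name] = status_color(snmp_ok, ping_ok)

    threads = [threading.Thread(target=sender), threading.Thread(target=receiver)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return status
```

Because the sender never blocks on a reply, the time needed to issue all requests does not depend on how slow or silent individual devices are.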

Advantages

  • It is an active mode of collection
  • The sending of requests is not dependent on the response time of the network and the device. A very large number of devices can be polled in a very short time.
  • The SCI files describing the SNMP objects to collect facilitate the configuration of similar devices, i.e. devices having the same SNMP objects with the same indexes.
  • Batch configuration is possible thanks to a dedicated module (Bulk Configuration).
  • Devices that do not respond, even with long timeouts, do not affect the performance of the collection process.

Drawbacks

  • Exploiting the collected SNMP data requires specific development: the generated CSV files must be read and analyzed.

Collection with the Bulk Threshold Control plugin (active mode)

Principle

This module (PLUGIN) makes it possible to synchronously collect, one after another, SNMP objects on one or more devices. The collected values are then compared to thresholds to trigger alarms (called EVENT in the LoriotPro software). The Event is then used to alert an administrator of the anomaly by various means, dashboard, sound, email, SMS, etc.

The threshold comparison can be performed on SNMP objects of type integer, gauge, counter and string.

Advantages

  • Simple implementation and configuration via the graphical interface
  • Each device in the directory can be associated with the module (plugin), which is then in charge of that device's own collections. The module has a dedicated thread.
  • Template files describing the SNMP objects to collect and their thresholds make it easier to configure similar devices, i.e. devices having the same SNMP objects with the same indexes.

Drawbacks

  • The time required to collect SNMP objects is unpredictable. The declared list of SNMP objects to collect is scanned sequentially, so the total time required is the sum of the collection times of each object. The collection time of an object is itself made up of the network delay (Round Trip Time, RTT) and the response time of the equipment.
  • If a device in the list is not reachable, it delays the collection process by the timeout, i.e. the maximum response time allowed, which is 2 seconds by default.
  • SNMP collection configuration is performed manually for each device and for each SNMP object through the plugin GUI. This process can be long and tedious if there is a large number of devices and objects.
  • The calculation of the thresholds to apply may require a preliminary analysis, especially for counter-type SNMP objects. The collection period must then be taken into account to convert the difference (delta) between two successive values into a rate, which is the classic case when calculating the throughput of network interfaces.
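
The delta calculation for counter-type objects mentioned in the last drawback can be sketched in Python. The function names are hypothetical. The sketch assumes that a decreasing value means the counter wrapped around exactly once, which cannot be told apart from a counter reset after a device reboot.

```python
def counter_rate(prev: int, curr: int, period_s: float, bits: int = 32) -> float:
    """Per-second rate between two successive readings of an SNMP counter.

    Assumption: if curr < prev, the counter wrapped exactly once at 2**bits.
    """
    if period_s <= 0:
        raise ValueError("period_s must be positive")
    delta = curr - prev
    if delta < 0:
        delta += 2 ** bits  # compensate for one wrap-around
    return delta / period_s


def throughput_bps(prev_octets: int, curr_octets: int,
                   period_s: float, bits: int = 32) -> float:
    """Interface throughput in bits per second from an octet counter."""
    return 8 * counter_rate(prev_octets, curr_octets, period_s, bits)
```

A threshold is then compared against the computed rate rather than against the raw counter value, so the same threshold stays valid when the collection period changes.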

Collection with ActiveView dashboard (active mode)

Principle

ActiveView monitoring dashboards are LoriotPro modules (plugins) that can display network topology maps, operating availability status, front and rear device views, maps and layouts, and functional synoptics. ActiveViews are created manually and individually, or from templates. They can also be created dynamically by scripts in the LUA language.

ActiveViews contain graphic objects whose visual appearance, mainly the background color, depends on the return value of an expression. This expression can be an SNMP collection, an error counter or alarm report, a LUA script, and so on.

ActiveViews collect the values associated with each graphic object sequentially. The more objects a visual contains, the more time it takes to collect the values and update the objects. ActiveViews have two threads associated with two processing queues. By default all collections are placed in the first processing queue. If a device processed in the first queue does not respond to requests (timeout), it delays the collection process and is then moved to the second queue. It is put back into the first queue when it answers queries again.
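
The two-queue behaviour described above can be sketched as a simple rebalancing step. This is an interpretation of the text, not ActiveView's actual implementation: `rebalance` and the `responds` callable are hypothetical, and the two threads are not modelled.

```python
from typing import Callable, List, Tuple


def rebalance(fast: List[str], slow: List[str],
              responds: Callable[[str], bool]) -> Tuple[List[str], List[str]]:
    """Return the (first, second) queues after one collection pass.

    A device that times out in the first queue is demoted to the second one.
    A device in the second queue that answers again is promoted back.
    """
    new_fast: List[str] = []
    new_slow: List[str] = []
    for device in fast:
        (new_fast if responds(device) else new_slow).append(device)
    for device in slow:
        (new_fast if responds(device) else new_slow).append(device)
    return new_fast, new_slow
```

The point of the second queue is that a silent device costs the first queue at most one timeout before it is moved out of the way.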

Advantages

  • Simple implementation and configuration via the graphical interface
  • The color of the graphic object is conditioned by a set of comparison rules on the value of the collected SNMP objects.
  • An alarm can be generated by these same comparison rules
  • The content of an ActiveView visual can be created with the LUA scripting language integrated in LoriotPro. This makes it possible to reproduce similar and repetitive visuals quickly during configuration, and to update the visuals very quickly in a changing environment. This possibility is limited to some editions of LoriotPro: Extended Edition & Broadcast Edition. It requires programming skills and LUA language training.
  • With LoriotPro Broadcast Edition, the attributes of the graphic objects of the ActiveView can be linked to Global Object values (see next chapter). The Global Object database is updated by dedicated processes managed by a pool of threads (max 800 threads). This pool supports resource reservation logic to guarantee response times.

Drawbacks

  • Sequential processing does not allow the ActiveView to be updated within a controlled time frame. The refresh time is at least the sum of the individual collection times (RTT).
  • Each graphic object is configured manually. This process can be long and tedious if there is a large number of objects in the view and a large number of views. This disadvantage no longer applies when LUA scripts are used to create the ActiveView, as specified in the advantages.

Collection with Audits (active mode)

Principle

Audits are LoriotPro plugins that are attached to IP devices declared in the directory. They use LUA scripts to perform collections. The scripts can carry out SNMP collections, but also collections of other types using other protocols, databases, files, web access, etc.

Each audit runs in a thread of its own. The number of threads that can run concurrently on a system depends on its CPU model and power. In addition, a 64-bit Windows operating system is able to manage hundreds of threads on a multi-core processor.

But unlike other LoriotPro plugin modules, audit threads are not assigned statically and permanently, but dynamically from a pool (Thread_pool). At the launch of LoriotPro, the threads of the pool share the collection and processing (LUA script) of all the audit modules declared in the directory.
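
The idea of a shared pool of threads serving many audits can be sketched with Python's standard `concurrent.futures` module. This only illustrates the principle: `run_audits` is a hypothetical name, and each audit is reduced to a plain callable instead of a LUA script.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_audits(audits: Dict[str, Callable[[], object]],
               pool_size: int = 4) -> Dict[str, object]:
    """Run every audit once on a fixed-size pool of worker threads.

    Threads are not bound to one audit: each worker picks up the next
    pending audit as soon as it becomes free.
    """
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = {name: pool.submit(audit) for name, audit in audits.items()}
        return {name: future.result() for name, future in futures.items()}
```

With this arrangement the number of threads stays bounded by the pool size even when the number of audits declared in the directory grows.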

Advantages

  • All collections can be carried out under controlled time constraints, even on thousands of devices: every 15 seconds in Extended Edition, every second in Broadcast Edition.
  • A collection function library is available (LUA script)

Drawbacks

  • Audits require programming skills and LUA training if the collections are not in the library.
  • Audits are available only in the Extended Edition & Broadcast Edition of LoriotPro.
  • Audits must create events or files in order to be exploited by other modules such as ActiveView.

Collection with Audits & Global Object (active mode)

Principle

The principle is the same as that presented in the previous chapter, but it adds a feature available in the Broadcast Edition of LoriotPro. This feature called Global Object allows you to store the values of the collected SNMP objects and make them accessible from the other LoriotPro modules.


 

As a reminder, data collection on the monitored devices is mainly performed with the SNMP protocol. As mentioned earlier, this protocol is used to retrieve status and performance indicators through the SNMP agents on devices and systems. Agent response times are quite unpredictable, so we cannot really predict how long a LoriotPro collection process will need to get a response. Each collection process has a maximum timeout beyond which it considers that the agent is not responding. If a single process is responsible for these collections and performs them sequentially, performance may fall short, given that a hundred collections can take anywhere from a few seconds to several minutes.

Let's summarize the context: the collections, which we call "tasks", have highly variable execution times ranging from a few milliseconds to several seconds. In addition, we wish to carry out collections periodically and at tight intervals (polling period), of the order of one second for certain performance indicators.

Principle implemented: all the tasks (collections) to be carried out are grouped together in a single program, which specifies for each task the type of collection to be performed (for example an SNMP GET on a MIB object). SNMP objects are used as the example here, but other types of collection are possible: extracting from log files, querying SQL databases, reading Trap counters, etc. Note that collections can also read global variables already in memory, which makes correlation processing possible.

A variable number of processes can be attached to this program to perform the tasks. In principle, the more tasks there are to perform and the higher the retry rate, the more processes are required. As many processes as necessary (LoriotPro's Plugin Audit Process) are instantiated to achieve almost parallel processing.

To simplify the configuration, an Audit process (902) is provided. One or more Audits can be responsible for processing global objects that are defined in the same group.


Here is a simplified example with two processes in charge of three collections. The collections are carried out at different time intervals, and their durations are also assumed to vary. Both audit processes take on collections as they become available. As soon as a process has finished its job, it goes through the list of collections to be made and claims the first one whose polling period has expired and which is not already allocated.
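
The claiming rule described in this paragraph can be sketched in Python. `Collection` and `CollectionList` are hypothetical names, time is passed in explicitly to keep the sketch deterministic, and the choice to restart the polling period when the collection finishes is an assumption the source does not confirm.

```python
import threading
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Collection:
    name: str
    period_s: float          # polling period
    next_due: float = 0.0    # time at which the collection is due again
    claimed: bool = False    # True while a process is working on it


class CollectionList:
    """Shared list of collections from which worker processes claim work."""

    def __init__(self, collections: Iterable[Collection]):
        self._items = list(collections)
        self._lock = threading.Lock()

    def claim(self, now: float) -> Optional[Collection]:
        """Return the first due, unclaimed collection, or None."""
        with self._lock:
            for collection in self._items:
                if not collection.claimed and collection.next_due <= now:
                    collection.claimed = True
                    return collection
        return None

    def release(self, collection: Collection, now: float) -> None:
        """Mark the collection as done and schedule its next run."""
        with self._lock:
            collection.claimed = False
            # Assumption: the period restarts at completion time.
            collection.next_due = now + collection.period_s
```

Each worker would loop on `claim`, run the collection, then call `release`. The lock guarantees that two workers never claim the same collection.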


The two-process example above uses a deliberately disproportionate ratio between processing time and polling period. Usually this ratio is about 1 to 100: when an SNMP agent is working properly, a collection takes a few tens of milliseconds and the polling intervals are between 1 and 15 seconds. Processing delays may occur if collection times increase or if the number of collections grows.

Ideally, the sum of the ratios between the execution time and the polling period of all the collections should be less than the number of collection processes available for their processing.
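
This sizing rule can be written as a small check. The function names are hypothetical and the sketch simply applies the rule as stated: each collection contributes its execution time divided by its polling period, and the total must stay below the number of processes.

```python
from typing import Iterable, Tuple


def required_load(collections: Iterable[Tuple[float, float]]) -> float:
    """Sum of execution_time / polling_period over all collections.

    Each item is (execution_time_s, polling_period_s).
    """
    return sum(exec_s / period_s for exec_s, period_s in collections)


def is_sustainable(collections: Iterable[Tuple[float, float]],
                   processes: int) -> bool:
    """True if the available processes can keep up with the polling periods."""
    return required_load(collections) < processes
```

For example, 200 collections of about 50 ms each, polled every second, give a load of about 10, so more than 10 collection processes would be needed. In practice some headroom is useful because timeouts and retries inflate execution times.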

The values of the collections are then stored in a block of global objects directly in memory. These objects are accessible from everywhere within LoriotPro, and more particularly from an ActiveView visual.

Advantages

  • Collection of thousands of SNMP objects within strict time constraints
  • Collection processes are separated from display processes, which allows high performance on both the volumes collected and the display speeds.
  • Because collection processes are distinct, ActiveView visualization interfaces can be very dynamic and close to real time.
  • Each audit (script) can be dedicated to monitoring a type and model of device and reused multiple times.

Drawbacks

  • Audits require programming skills and LUA language training.
  • Requires a system that supports multithreading

Scheduled collection in task scheduler

From version 8 of LoriotPro, a task scheduler is available. It comes in the form of a calendar-style graphical interface in which collection tasks can be programmed. Collections can thus be planned days or even months in advance and can recur daily, weekly, monthly, etc. Collections based on LUA scripts can be simple and involve a few SNMP objects, or complex and integrate many collections on many devices and/or correlation.

Advantages

  •  Ideal for data collections for inventory or report generation

Drawbacks

  • Totally unsuitable for device monitoring with small polling intervals.

Summary

The choice of a data collection mode for monitoring an IP equipment infrastructure is dictated by technical constraints. Two main ones stand out: the volume of data to be collected and the frequency of these collections. The LoriotPro software has several collection modes to adapt to all these constraints, even demanding ones.


More articles by Florent Brisson
