A network monitor uses many mechanisms to get data about the status of its devices and interconnected links. In some cases, the monitoring station and its agents collect data from devices. In others, the devices themselves report information to the station and agents. Our monitoring strategy should use the right means to get the information we need with the least impact on the network's performance. Let's have a look at the tools available to us.
We'll start with pull methods, where the network monitoring station and agents query devices for relevant information.
SNMP, the Simple Network Management Protocol, has been at the core of query-based network monitoring since the early 1990s. It began with a simple method for accessing a standard body of information called a Management Information Base, or MIB. Most equipment and software vendors have embraced and extended this body with varying degrees of consistency.
Further revisions (SNMPv2c, SNMPv2u and SNMPv3) came along in the 1990s and early 2000s. Between them, these added some bulk information retrieval functionality and improved privacy and security.
SNMP information is usually polled from the monitoring station at five-minute intervals. This allows the station to compile trend data for key device metrics. Some of these metrics will be central to device operation: CPU use, free memory, uptime, &c. Others will deal with the connected links: bandwidth usage, link errors and status, queue overflows, &c.
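As a rough illustration of how polled values become trend data, the sketch below turns two samples of an interface octet counter (such as the standard `ifInOctets` object) into an average utilization figure for the interval. The function names, the sample values and the 100 Mbit/s link speed are assumptions made for the example, and retrieving the counters from a device is left out.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit SNMP counter wraps to zero after 2^32 - 1


def counter_delta(previous: int, current: int, modulus: int = COUNTER32_MODULUS) -> int:
    """Return the increase between two counter samples, allowing for one wrap."""
    return (current - previous) % modulus


def link_utilization(prev_octets: int, curr_octets: int,
                     interval_seconds: float, link_bps: float) -> float:
    """Return average utilization (0.0 to 1.0) over one polling interval."""
    bits = counter_delta(prev_octets, curr_octets) * 8
    return bits / (interval_seconds * link_bps)


if __name__ == "__main__":
    # Hypothetical samples taken 300 seconds apart on a 100 Mbit/s link.
    util = link_utilization(1_000_000_000, 1_750_000_000, 300, 100_000_000)
    print(f"{util:.1%}")  # prints 20.0%
```

The modulo handles a single counter wrap between polls. On fast links a 32-bit counter can wrap more than once in five minutes, which this sketch cannot detect.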
We need to be careful when setting up SNMP queries. Many networking devices don't have a lot of spare processor cycles for handling SNMP queries, so we should minimize the frequency and volume of retrieved information. Otherwise, we risk degrading network performance through the monitoring itself.
SNMP is an older technology, and the information we can retrieve with it can be limited. When we need information that isn't available through a query, we have to resort to other options. Often, scripted access to the device's command-line interface (CLI) is the simplest method. Utilities like expect or scripting languages like Python or Go can run commands and filter the CLI output down to the data we need.
As with SNMP, we need to be careful about taxing the devices we're querying. CLI output is usually much more detailed than an SNMP response and requires more work on the part of the device to produce it.
Push methods are the second group of information gathering techniques. With these, the device sends the information to the monitoring station or its agents without first being asked.
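To make the filtering step concrete, here is a minimal Python sketch that pulls interface status and error counts out of captured CLI text. The sample output and both regular expressions are invented for illustration. Real output varies by vendor and software version, and collecting the text from the device (over SSH, for example) is not shown.

```python
import re

# Hypothetical CLI output; real devices format this differently.
SAMPLE_OUTPUT = """\
GigabitEthernet0/1 is up, line protocol is up
  5 minute input rate 2000 bits/sec, 3 packets/sec
  0 input errors, 0 CRC, 0 frame
GigabitEthernet0/2 is down, line protocol is down
  5 minute input rate 0 bits/sec, 0 packets/sec
  17 input errors, 12 CRC, 0 frame
"""

INTERFACE_RE = re.compile(r"^(\S+) is (up|down), line protocol is (up|down)$")
ERRORS_RE = re.compile(r"^\s+(\d+) input errors, (\d+) CRC")


def parse_interfaces(text: str) -> dict:
    """Return {interface: {"status", "input_errors", "crc"}} from CLI text."""
    results = {}
    current = None
    for line in text.splitlines():
        header = INTERFACE_RE.match(line)
        if header:
            # A new interface section starts; later lines belong to it.
            current = header.group(1)
            results[current] = {"status": header.group(2)}
            continue
        errors = ERRORS_RE.match(line)
        if errors and current is not None:
            results[current]["input_errors"] = int(errors.group(1))
            results[current]["crc"] = int(errors.group(2))
    return results


if __name__ == "__main__":
    for name, fields in parse_interfaces(SAMPLE_OUTPUT).items():
        print(name, fields)
```

Keeping only the few fields we need also means we can ask the device for the narrowest command that contains them, which keeps the load on it down.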
SNMP has a basic push model where the device sends urgent information to the monitoring station and/or agents as events occur. These SNMP traps cover most changes in most categories that we want to know about right away. For the most part, they trigger on fixed occurrences: interface up/down, routing protocol peer connection status, device reboot, &c.
RMON, Remote Network MONitoring, was developed as an extension to SNMP. It puts more focus on the information flowing across the device than on the device itself and is most often used to define conditions under which an SNMP trap should be sent. Where SNMP will send a trap when a given event occurs, RMON can have more specific triggers based on more detailed information. If we're concerned about CPU spikes, for example, we can have RMON send a trap when CPU usage goes up too quickly.
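The CPU spike case can be sketched in a few lines of Python. This is a simplified model loosely based on RMON-style alarms, not the RMON specification itself: it compares the change between consecutive samples against a rising threshold, and it will not fire a second rising event until the change has dropped back to a falling threshold. The function name, the thresholds and the sample values are assumptions chosen for the example.

```python
def threshold_events(samples, rising, falling, use_delta=True):
    """Return (index, kind, value) events using rising/falling hysteresis."""
    events = []
    # Assumption: start as if already below the falling threshold, so a quiet
    # device raises no event until its first rising crossing.
    last_event = "falling"
    previous = None
    for index, sample in enumerate(samples):
        if use_delta:
            if previous is None:
                previous = sample  # the first sample only sets the baseline
                continue
            value = sample - previous
            previous = sample
        else:
            value = sample
        if value >= rising and last_event != "rising":
            last_event = "rising"
            events.append((index, "rising", value))
        elif value <= falling and last_event != "falling":
            last_event = "falling"
            events.append((index, "falling", value))
    return events


if __name__ == "__main__":
    cpu_percent = [10, 12, 50, 90, 91, 92, 95, 60]  # hypothetical samples
    # Alert when CPU climbs 30 points or more between samples.
    print(threshold_events(cpu_percent, rising=30, falling=5))
    # prints [(2, 'rising', 38), (4, 'falling', 1)]
```

The second large jump (from 50 to 90) produces no event because the alarm has not yet re-armed. That is the behaviour that keeps a sustained spike from flooding the station with traps.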
Most devices can send a log stream to a remote server for archiving and analysis. By tuning what is sent at the device level, we can relay operational details to the monitoring station in near real time. The trick is to keep this information filtered at the transmitting device so that only the relevant information is sent.
Some devices, particularly the Linux-based ones, can run scripts locally and send the output to the monitoring server via syslog or SNMP traps. Cisco devices, for example, can use the Tool Command Language (TCL) or Embedded Event Manager (EEM) applets to provide this function.
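Filtering of this kind is usually expressed as a severity threshold. Syslog messages start with a priority value in angle brackets, equal to the facility times eight plus the severity, where severity runs from 0 (emergency) to 7 (debug). The Python sketch below applies the same rule a device-side logging level would; the function names and the sample messages are made up for illustration.

```python
import re

PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_pri(message: str):
    """Return (facility, severity) from a syslog line, or None if there is no PRI."""
    match = PRI_RE.match(message)
    if not match:
        return None
    # PRI = facility * 8 + severity
    return divmod(int(match.group(1)), 8)


def keep_relevant(messages, max_severity=4):
    """Keep messages whose severity number is at most max_severity.

    Lower numbers are more severe, so 4 keeps warnings and anything worse.
    """
    kept = []
    for message in messages:
        decoded = decode_pri(message)
        if decoded is not None and decoded[1] <= max_severity:
            kept.append(message)
    return kept


if __name__ == "__main__":
    stream = [
        "<187>router1: interface Gi0/1 changed state to down",  # local7.err
        "<190>router1: configuration saved",                    # local7.info
        "<34>router1: authentication failure",                  # auth.crit
    ]
    for line in keep_relevant(stream):
        print(line)
```

With the default threshold, the informational message is dropped and the other two are kept.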
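On a Linux-based device, a local check might look something like the following Python sketch. It measures disk usage and, above a threshold, sends a bare-bones syslog message over UDP. The collector address, the `diskcheck` tag, the 90 percent threshold and the choice of facility are all assumptions for the example, and the message omits the timestamp and hostname fields that a fuller syslog implementation would include.

```python
import shutil
import socket

LOCAL7 = 23   # syslog facility number for local7
WARNING = 4   # syslog severity number for warning


def build_syslog_message(facility: int, severity: int, tag: str, text: str) -> str:
    """Build a minimal syslog line: <PRI>tag: text, with PRI = facility*8 + severity."""
    return f"<{facility * 8 + severity}>{tag}: {text}"


def disk_used_percent(path: str = "/") -> float:
    """Return the percentage of the filesystem at path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def send_syslog(message: str, host: str, port: int = 514) -> None:
    """Send one message to a syslog collector over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (host, port))


if __name__ == "__main__":
    COLLECTOR = "192.0.2.10"  # hypothetical monitoring server (documentation address)
    THRESHOLD = 90.0          # hypothetical alert threshold, in percent
    used = disk_used_percent("/")
    if used >= THRESHOLD:
        text = f"root filesystem {used:.0f}% full"
        send_syslog(build_syslog_message(LOCAL7, WARNING, "diskcheck", text), COLLECTOR)
```

Run from a scheduler such as cron, a script like this only generates traffic when there is something to report.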
Which technologies are you considering for your network monitoring strategies? Are you sticking with the old tried and true methods and nothing more, or are you giving some thought to something new and exciting?