Hi
I have been noticing gaps in our graphs again. We have been fairly active lately adding and removing devices, though not in large numbers: about 6 devices are removed and 6 new ones added each day, so the number of elements in the database has not been increasing.
The total number of elements is now 8369 on our single polling engine.
I have checked the database and looked at the polling engine, which shows a polling completion percentage of 99.07%.
Our collection settings are as follows:
Polling interval for nodes, interfaces and volumes: 180 sec each
Statistics collection interval:
nodes = 15 min
interfaces = 10 min
volumes = 15 min
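As a rough sanity check on the numbers above, here is a minimal Python sketch of the average status-poll rate those settings imply. It assumes all 8369 elements are polled once per 180 sec interval and spread evenly across it; the function name is made up for illustration and has nothing to do with how the polling engine schedules its work internally.

```python
def polls_per_second(element_count: int, interval_seconds: float) -> float:
    """Average polls per second needed to visit every element once per interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return element_count / interval_seconds


# Figures taken from this post: 8369 elements, 180 second polling interval.
status_rate = polls_per_second(8369, 180)
print(f"average status polls per second: {status_rate:.1f}")
```

The same function can be applied to the statistics collection intervals once the split of nodes, interfaces and volumes is known; I have not listed that breakdown here.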
I checked the database entries for some of the interfaces whose graphs had gaps at the 15 min interval in the web GUI. The interface traffic detail confirms the polls were missed. It also seems to keep happening mostly to the same interfaces. I thought the polling cycle was randomized to prevent exactly this?
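For anyone wanting to repeat this kind of check, the following is a minimal Python sketch of how missed polls can be spotted in a list of sample timestamps exported for one interface. The sample times are invented for illustration, and the 1.5x tolerance is an assumption on my part, not a value taken from the product.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (start, end) pairs where two consecutive samples are further
    apart than tolerance * expected_interval."""
    ordered = sorted(timestamps)
    limit = expected_interval * tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Hypothetical sample times for one interface on a 10 minute collection
# interval; the samples at minute 30 and minute 40 are missing.
base = datetime(2024, 1, 1, 0, 0)
samples = [base + timedelta(minutes=m) for m in (0, 10, 20, 50, 60)]

gaps = find_gaps(samples, timedelta(minutes=10))
for start, end in gaps:
    print(f"gap from {start} to {end} ({end - start})")
```

Running the same check per interface over several days would show whether the gaps really do cluster on the same interfaces.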
Also, on the SQL database server I have noticed that the Avg. Disk Queue Length is almost constantly at 100, with the highest number of I/O reads and writes coming from the SQL process (not surprising really). I'm fairly certain that Syslog messages are a major contributing factor.
NOTE:
I increased the statistics collection interval on the interfaces to 15 min for about 4 hours yesterday to see if there would be any improvement. The polling completion percentage actually dropped to 98.746%, so I have set it back to 10 min, but the percentage has stayed in the 98% range.
Looking at the interface traffic detail from a few days earlier (when the polling completion % was at 99.07) and seeing a number of interfaces that were barely being polled once an HOUR leads me to doubt the accuracy of the polling completion % calculation or its source.
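To illustrate why an overall figure around 99% does not rule out a few badly starved interfaces, here is a small Python sketch. I do not know how the polling completion % is actually calculated, so this simply assumes it is total polls received divided by total polls expected; the interface names and counts are invented.

```python
def completion_percent(actual: int, expected: int) -> float:
    """Percentage of expected polls that were actually recorded."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return 100.0 * actual / expected


# Hypothetical hour of data: 100 interfaces, 6 polls expected each
# (10 minute interval). 99 interfaces are fully polled, one only once.
expected_per_interface = 6
actual = {f"if{i}": 6 for i in range(99)}
actual["if99"] = 1

overall = completion_percent(sum(actual.values()),
                             expected_per_interface * len(actual))
per_interface = {name: completion_percent(count, expected_per_interface)
                 for name, count in actual.items()}
worst_name, worst_pct = min(per_interface.items(), key=lambda kv: kv[1])

print(f"overall completion: {overall:.2f}%")
print(f"worst interface: {worst_name} at {worst_pct:.2f}%")
```

Under that assumption the aggregate stays above 99% even though one interface is polled only once an hour, so the overall percentage and the per-interface detail are not necessarily in conflict.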
I know I have seen the missed polls in the database, and I know that the avg disk queue length will not affect the polling engine itself. But is it possible that the avg disk queue length is creating a bottleneck that prevents polling statistics from being written to the database, causing both the gaps in the graphs and the gaps in the interface traffic detail query?
Any suggestions?