snmp traffic excessive?

Some of our Linux nodes (and maybe all of them) seem to receive tens to hundreds of SNMP requests per second from SolarWinds while being polled for the usual things like a few volumes, CPU and memory, plus hardware health and one process monitor.

Is this normal? If so, how do I convince our network engineers this is not a threat? They're turning off SNMP services on some of the critical nodes for fear of "flooding our network with SNMP traffic".

Thanks!

Screen Shot 2016-09-08 at 1.15.05 PM.png

Sampling of entries from /var/log/messages/ logs:

Aug 14 03:45:35 LinuxNodeHP snmpd[6915]: Connection from UDP: [127.0.0.1]:38044->[127.0.0.1]

Aug 14 03:45:35 LinuxNodeHP snmpd[6915]: Connection from UDP: [127.0.0.1]:36210->[127.0.0.1]

Aug 14 03:45:35 LinuxNodeHP snmpd[6915]: Connection from UDP: [127.0.0.1]:39611->[127.0.0.1]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:45:50 LinuxNodeHP snmpd[6915]: Connection from UDP: [127.0.0.1]:39377->[127.0.0.1]

Aug 14 03:45:50 LinuxNodeHP snmpd[6915]: Connection from UDP: [127.0.0.1]:39377->[127.0.0.1]

Aug 14 03:45:50 LinuxNodeHP snmpd[6915]: Connection from UDP: [127.0.0.1]:46720->[127.0.0.1]

Aug 14 03:45:50 LinuxNodeHP snmpd[6915]: Connection from UDP: [127.0.0.1]:52632->[127.0.0.1]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

Aug 14 03:46:02 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

  • A lot of this data is retrieved by the Universal Device Poller (UnDP), and it looks to be configured incorrectly. Make sure the polling interval on the UnDP is set correctly. In tests I have done, I have set it to no less than 15 minutes, as anything less will cause strain on the Orion server and, in one case, caused hostnames to drop off. That should alleviate any complaints from network admins as well and reduce the traffic on the network.

    pastedImage_0.png

  • Could you elaborate, what do you mean by "configured incorrectly"? Polling too frequently? Why is it "incorrect"?

    Sorry if I wasn't clear: I'm not looking to decrease SNMP traffic until I know what is normal and what is a genuine threat. The question wasn't "how do I decrease SNMP traffic" but more "is this type of traffic normal for the type of polling I do?" and "is this a genuine threat to the network's performance and uptime?"

    Or, in other words, "how many 'snmpd / Connection from UDP' log entries are considered normal for average stat collection by SolarWinds on an average Linux node?"

    Notes

    • We do need to collect stats on our nodes very frequently;
    • polling intervals only affect ICMP traffic on nodes with ICMP type "uptime" polling if I am not mistaken
    • stats collection is what generates SNMP traffic, to my knowledge;

    Alot of this data is retrieved by universal device poller and it looks to be configured incorrectly.

  • Well, you need to check the pollers themselves to determine the polling interval. It looks configured incorrectly if it is hogging your network and looks to be polling too frequently. The UnDP pollers will poll and retrieve every 15 minutes but will retrieve the last 15 minutes of data in its entirety, so nothing will be missed. If you poll too frequently, you could also strain your Linux hosts. You initially had a screenshot of the Edit Node screen in your question, but you need to go to Universal Pollers to check the frequency of the pollers. Some information is retrieved by Orion itself, but that is every 10 minutes, so you could effectively have two SNMP collections going on at frequent intervals, which will cause issues like this.

  • It looks configured incorrectly if it is hogging your network and looks to be polling too frequently.

    What exactly is giving you the impression that the traffic is hogging our network? Did you run Wireshark on it while I was asleep? :) (Please read the original question. It looks to me like you didn't.)

    Screen Shot 2016-09-08 at 3.15.32 PM.png

  • Yes, I have read your question very clearly, sir, but you don't seem to understand that the polling mechanisms on the UnDP and Orion work independently. If it is polling in seconds, then it has not been configured correctly, and I have explained to you how to do it the right way. SNMP collection at intervals of seconds is not a good thing. If you want to concentrate on everything I say outside the answer I have given you, that is your decision. I have already explained to you that polling by the second is incorrect and how to fix it, with a screenshot. I am extremely sorry I can't be of more assistance to you.

    Good day.

  • can you capture the data to see what is actually being polled?

    it might be that what you're seeing is that multiple values are being polled simultaneously i.e the cpu, memory, network interfaces, etc.

    e.g. a burst of queries every 120 seconds

    some of those queries are internal to the box, and finally, if you're logging every request to SNMP then that is generating more IO than the SNMP queries are...

    Unless you are on a very constrained network you can't possibly be flooding the network with SNMP queries, seriously your network engineers need to find something else to worry about.

    How many kbps of traffic are you talking about here? The average size of an SNMP GET is well under 100 bytes, so you're talking about less than 20 × 100 × 8 ≈ 16 kbps.

    I see that there is a 15-second gap between a burst of packets --

    where are you counting the tens and hundreds of thousands of packets? because they are not in the logs you posted.

    /RjL
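
The back-of-the-envelope estimate in the reply above can be reproduced with a few lines of Python. This is only a sketch: the 20 requests per second and 100 bytes per request come from that reply, while the function name and the 100 Mbps link speed are assumptions made purely for illustration.

```python
def snmp_bandwidth_bps(requests_per_second: float, bytes_per_request: int) -> float:
    """Rough upper bound on SNMP request bandwidth, in bits per second."""
    return requests_per_second * bytes_per_request * 8


# Figures taken from the reply above: ~20 requests/s at up to 100 bytes each.
estimate_bps = snmp_bandwidth_bps(20, 100)
print(f"{estimate_bps / 1000:.1f} kbps")  # 16.0 kbps

# Assumed link speed (hypothetical, not from the thread): 100 Mbps.
link_bps = 100_000_000
print(f"{estimate_bps / link_bps * 100:.3f}% of the assumed link")
```

Responses are typically larger than requests, so the real figure would be somewhat higher, but it stays in the same order of magnitude.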

  • Your statistics polling is causing the issue more than likely - set that to 3-5 minutes to see what that does.... but usually I see that set somewhere between 5-9 minutes for Stats polling.

  • it might be that what you're seeing is that multiple values are being polled simultaneously i.e the cpu, memory, network interfaces, etc.

    e.g. a burst of queries every 120 seconds

    some of those queries are internal to the box, and finally, if you're logging every request to SNMP then that is generating more IO than the SNMP queries are...

    Didn't have a chance to work on it today (busy day), so it may have to wait till Monday. The daily stats look like this: 300K SNMP-related log entries in total, 280K of them related to SolarWinds, which comes out to 3+ entries per second, i.e. not a lot.

    If I can get Linux process monitor (it's CentOS 6.5) to display per-process totals for packets and bytes sent/received - then I'll have a better idea of the fraction of the total traffic SNMP traffic takes.

    The fact is, the Solarwinds server, collecting lots of stats on over 200 nodes single-handedly with 1m intervals - isn't choking on SNMP traffic (SQL server is another matter) - so a single node shouldn't either.

    And yes, decreasing SNMP log verbosity on these Linux nodes is very much due.

    Thanks for the response!
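
To get per-source and per-second counts like the ones quoted above, the snmpd entries can be tallied with a short script. The following is a minimal sketch that assumes the syslog line format shown in the original post. The regular expression and the `summarize` helper are illustrative names, not part of any SolarWinds or Net-SNMP tooling.

```python
import re
from collections import Counter

# Matches lines such as:
# Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+snmpd\[\d+\]:\s+"
    r"Connection from UDP: \[(?P<src>[^\]]+)\]:\d+"
)


def summarize(lines):
    """Count snmpd 'Connection from UDP' entries per source IP and per second."""
    per_source = Counter()
    per_second = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            per_source[match.group("src")] += 1
            per_second[match.group("ts")] += 1
    return per_source, per_second


# Sample lines copied from the original post.
sample = [
    "Aug 14 03:45:35 LinuxNodeHP snmpd[6915]: Connection from UDP: [127.0.0.1]:38044->[127.0.0.1]",
    "Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]",
    "Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]",
]

per_source, per_second = summarize(sample)
print(per_source.most_common())
print(max(per_second.values()), "entries in the busiest second")
```

For a real log, the same function can be fed the open file object for `/var/log/messages` instead of the sample list. Dividing a day's total by 86,400 seconds gives the daily average, e.g. 300,000 / 86,400 ≈ 3.5 entries per second.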

  • That was an incorrectly phrased question on my part. Turns out, the issue isn't with the traffic per se - but with logging UDP requests such as the ones in the original post.

    Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]
    Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]
    Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]
    Aug 14 03:45:46 LinuxNodeHP snmpd[6915]: Connection from UDP: [10.11.12.218]:49919->[10.11.12.192]

    To reduce and/or eliminate those, the following (or a similar) line should be present in the SNMP configuration file (usually /etc/snmp/snmpd.conf):

    dontLogTCPWrappersConnects yes

    The line is actually present in the default configuration file - we were overwriting it with our custom one that didn't have it. Lesson learned.

    Once we added the line, those UDP entries stopped flooding the logs.
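
Since the root cause here was a custom configuration file that silently dropped the directive, a small check run before deploying a custom `snmpd.conf` could catch the same mistake. The sketch below is one possible way to do that. The `has_directive` helper is hypothetical, and treating `yes`, `true`, and `1` as enabled values with case-insensitive directive names is an assumption about how snmpd parses its configuration, so adjust it to match your Net-SNMP version.

```python
def has_directive(conf_text: str, directive: str = "dontLogTCPWrappersConnects") -> bool:
    """Return True if the directive is present, uncommented, and enabled."""
    for raw_line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        # Assumption: directive names are matched case-insensitively.
        if parts[0].lower() != directive.lower():
            continue
        # Assumption: these values are the ones treated as "enabled".
        if len(parts) > 1 and parts[1].lower() in ("yes", "true", "1"):
            return True
    return False


good_conf = "dontLogTCPWrappersConnects yes\n"
bad_conf = "# dontLogTCPWrappersConnects yes\n"

print(has_directive(good_conf))
print(has_directive(bad_conf))
```

In practice the text would come from reading the candidate file (usually `/etc/snmp/snmpd.conf`, as noted above) before it is pushed to the nodes.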