Hi,
I am using SolarWinds 9, and I want to configure a mail alert for devices that
are not responding to SNMP requests (but are still alive).
Is this possible?
As a temporary workaround, I have set a rule to check the node's severity level.
A severity of 0 is good, anything else is an issue.
Trigger when all apply:
    Vendor is equal to Windows
    Node Status is equal to UP
Trigger when any apply:
    Severity is not equal to 0
When the rule triggers, an email is sent to me telling me to check SNMP.
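For anyone unsure how the two condition groups combine, here is a minimal Python sketch of the rule above. The `Node` class and its field names are hypothetical and only stand in for the values Orion exposes; this is not Orion's alert engine, just the same logic written out.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical field names; Orion's actual schema may differ.
    vendor: str
    status: str
    severity: int


def should_alert(node: Node) -> bool:
    # "Trigger when all apply" group: every condition must hold.
    all_group = node.vendor == "Windows" and node.status == "Up"
    # "Trigger when any apply" group: at least one condition must hold.
    any_group = any([node.severity != 0])
    return all_group and any_group
```

With this logic, a Windows node that is Up with a non-zero severity triggers the alert, while the same node at severity 0 does not.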
'param'
Great question!
This issue is constantly biting me and I need to find a solution for my own loss of SNMP - thank you for bringing this topic up.
I just found something in BASIC ALERTS which might be able to handle this and I'll be testing it today for myself and let you know how it goes.
I don't know if this statement is accurate but I seem to recall someone saying you can't have both BASIC and ADVANCED ALERTS working for Orion alerts. It's one or the other. I have never used BASIC alerts - I went straight to ADVANCED alerts for our needs. Does anybody know if this statement is correct about the use of alerts?
lchance,
I have a couple of basic alerts configured for a few "oddball" devices that report up/down status, and they run just fine alongside all of our advanced alerts.
Hope this helps!
What I set up in BASIC alerts is not working out for me.
I brought up Wireshark and filtered on a node that I know has recently stopped responding to SNMP polling.
What I'm seeing is ICMP - PORT UNREACHABLE (SNMP port 161).
Does anyone know if Orion offers, anywhere in its monitoring, the capability to intercept this type of ICMP packet? Or is that a Feature Request?
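Outside of Orion, the same symptom can be checked from a script. The sketch below is only an illustration, not an Orion feature: it sends an SNMPv1 GetRequest for sysDescr.0 over a connected UDP socket and reports whether the agent answered, the host returned ICMP port unreachable, or nothing came back. The function name `probe_snmp` is made up for this example, and the community string "public" baked into the packet is an assumption; a different community would need a re-encoded packet.

```python
import socket

# Pre-encoded SNMPv1 GetRequest for sysDescr.0 (1.3.6.1.2.1.1.1.0),
# community "public" (an assumption; adjust for your environment).
SNMP_GET_SYSDESCR = bytes.fromhex(
    "302602010004067075626c6963a019020101020100020100"
    "300e300c06082b060102010101000500"
)


def probe_snmp(host: str, port: int = 161, timeout: float = 2.0) -> str:
    """Return 'responding', 'port_unreachable', or 'timeout'."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # connect() on a UDP socket lets the OS deliver ICMP errors
        # (such as port unreachable) to this socket as exceptions.
        sock.connect((host, port))
        try:
            sock.send(SNMP_GET_SYSDESCR)
            sock.recv(4096)
            return "responding"
        except (ConnectionRefusedError, ConnectionResetError):
            # Linux/macOS raise ConnectionRefusedError; Windows reports
            # the same ICMP error as a connection reset.
            return "port_unreachable"
        except socket.timeout:
            return "timeout"
```

A node that answers ping but returns `port_unreachable` here matches what the Wireshark capture shows. A `timeout` is less conclusive: it can mean a wrong community string, an access list, or a hung agent.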
SW can correct me if I'm wrong, but I believe that if you set your trigger up to alert if a device is in 'Unknown' status, this will do the trick. If a device is unpingable, then it will show up as 'Down', but if it is pingable and cannot be polled for whatever reason, it goes into 'Unknown' status. At least, this is what I've seen.
This isn't quite how "unknown" works. I had to check with Dev myself to be sure. For a node, "Unknown" occurs after a node has been re-managed (after being unmanaged) but not yet polled. For an interface, it may mean that the interface indexes have changed but we haven't found the new indexes yet, or the node may not be responding to SNMP.
For the original question, I think detecting that the device is up but not responding to SNMP could be done with APM, using a port monitor. I don't think it's possible within NPM.
I've not had this happen on routers or switches, only on servers. What I've done is set up a volume alert that says: if a volume goes into an unknown state, let me know about it. An unknown volume doesn't mean SNMP has stopped 100% of the time, but it is a clue as to when it does stop.
Any solutions? Is SolarWinds looking into this issue?
Did you try Denny's suggestion above?
I've tried it and it does not appear to be the solution.
It would also be nice to include the SNMP timestamp in the CPU gauge. It's funny: this system hasn't had SNMP running for over a week, and yet the gauge shows CPU running at 62%.
WOW!!!
95% of NPM's usability comes from the data collected via SNMP running on a node, but NPM can't tell you if SNMP data collection is failing for a node?
And to top that off, SolarWinds' response is to purchase an APM license for every node for which NPM collects SNMP data?
I feel for those of you who do not have a need for, or cannot afford, APM. Especially those of you who are using NPM to monitor Windows or UNIX/Linux class systems, which have the most issues with SNMP freezing or dying.
Well... you are somewhat right. I wouldn't say it so bluntly, but yes.
I wouldn't WANT to monitor SNMP with APM. I don't want to have any network nodes in my APM.
If SNMP stops working on a node, that in itself might not be a serious issue. But if during that time an interface generates a lot of traffic and/or errors, or a volume on a server reaches 100%, that is critical, and we would never be alerted to it.
Personally, I will try out justty's approach and set that alert to fire only during business hours.
We're working on SNMP-based status polling in an enhanced version of the Orion poller that's in dev now. If you're interested in being part of the CTP, please send me an email.
Wow, seeing what param brought to our attention and what justty posted about the CPU Gauge really caused us some concern. We didn't realize this SNMP state wasn't able to be monitored / alerted effectively. We've now been going through all our nodes to confirm they are working correctly.
In addition to having NPM be intelligent enough to alert when SNMP is not responding on a node but ping is, I think the gauges need display logic to reflect the "no SNMP data" status as well (a negative 1, perhaps?).
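To make the gauge suggestion concrete, here is a minimal Python sketch of that display logic. It is only an illustration of the idea, not how Orion's gauges work; the function name `gauge_value`, the `NO_DATA` sentinel, and the 20-minute staleness threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Optional

NO_DATA = -1  # sentinel for "no SNMP data", as suggested above


def gauge_value(
    last_value: Optional[float],
    last_snmp_poll: Optional[datetime],
    now: datetime,
    max_age: timedelta = timedelta(minutes=20),  # assumed threshold
) -> float:
    # Never polled successfully: nothing trustworthy to display.
    if last_value is None or last_snmp_poll is None:
        return NO_DATA
    # Last successful SNMP poll is too old: hide the stale reading.
    if now - last_snmp_poll > max_age:
        return NO_DATA
    return last_value
```

Under this logic, a CPU reading of 62% whose last successful SNMP poll was a week ago would display as -1 instead of 62.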
Chris, add us to the poller CTP list, if you are still taking requests. We are running NPM 10 RC3.