Implemented

SNMP Trap & Syslog Rules Overhaul

In my opinion, these two items have been neglected by SolarWinds for many years.  We use SNMP trapping extensively within my organization, and every rule we have to create is an arduous process.  Several aspects of both functions should be improved.

1.  Copy/Paste rule creation.  With alerts, we can take a similar alert and make a copy of it, altering the rule to suit our needs.  This capability doesn't exist in the SNMP or Syslog rules.  Each rule must be built from scratch.  For example, I have multiple rules that are exactly the same except for one specific OID for Netscaler traps.  If the OID equals one of our web servers, we send it to the web team; if it is one of our Exchange servers, we send it to our messaging team, and so on.  However, we have to build each of these rules manually.

2.  Import/Export actions.  In alerts you can import/export an action for use within another alert.  This functionality is missing from the Syslog/SNMP rules.

3.  Enhanced ordering.  At my last count I have 160+ SNMP rules.  These rules are ordered top-down.  When I create a new rule, it is placed at the bottom.  If I need that rule to go to the top, I have to click my mouse 160 times to get it there (no wonder I've had carpal tunnel surgery on both hands).  A drag-and-drop feature would solve this issue.

My first three requests I would think should be relatively simple because these features exist today within other components of SW.  The 4th I assume would be a little trickier to accomplish.

4.  Treat SNMP/Syslog rules like alerts that must be acknowledged (if desired).  Right now, if I get an SNMP trap that I consider critical, it sends an email.  It is not treated like an alert that requires acknowledgement.  I understand this would be a much greater challenge because you would need well-defined reset scenarios.
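
For illustration only, the copy-and-tweak workflow from item 1 and the reordering from item 3 can be sketched in Python.  Everything here is hypothetical: the rule dictionary layout, the host names, the e-mail addresses, and the `clone_rules` and `move_to_top` helpers are assumptions made for the sketch, not SolarWinds data structures or APIs.

```python
from copy import deepcopy

# Hypothetical rule layout (an assumption, not a SolarWinds structure).
TEMPLATE_RULE = {
    "name": "NetScaler trap",
    "conditions": {"trap_type": "netscaler", "oid_value": None},
    "actions": [{"type": "email", "to": None}],
}

# Hypothetical mapping: the OID value that differs per rule -> owning team.
ROUTES = {
    "web01": "web-team@example.com",
    "exch01": "messaging-team@example.com",
}


def clone_rules(template, routes):
    """Build one rule per route by copying the template and changing
    only the OID value and the e-mail recipient."""
    rules = []
    for oid_value, recipient in routes.items():
        rule = deepcopy(template)  # deep copy so the template stays untouched
        rule["name"] = f"{template['name']} ({oid_value})"
        rule["conditions"]["oid_value"] = oid_value
        rule["actions"][0]["to"] = recipient
        rules.append(rule)
    return rules


def move_to_top(rules, index):
    """Return a new list with rules[index] moved to the first position,
    i.e. one operation instead of one click per position."""
    reordered = list(rules)
    reordered.insert(0, reordered.pop(index))
    return reordered


rules = clone_rules(TEMPLATE_RULE, ROUTES)
rules = move_to_top(rules, len(rules) - 1)  # newest rule goes first
```

The point of the sketch is that only the differing fields are stated per rule; everything else comes from the template.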

I know that I am in the minority on this, as it seems that many other members of the community rely less heavily on traps, but they are a part of our environment and they aren't going away.  As they continue to grow, I will be forced to look for alternatives to SW in this space if SW doesn't evolve these areas.  I have been using SW for 6 years and I have seen little to no improvement in these two areas.  I had hoped that the acquisition of Kiwi would bring some nice improvements, but alas, that isn't the case.

Parents
  • One additional comment on the whole syslog and trap rules thing in general, and I guess this goes for Advanced alerts as well.

    I'd like to be able to truly import/export my alerts.  Specifically, I'd like to create an alert, export it, make some minor changes (database-driven, of course), and then bulk import the new pile of alerts with complete actions, which today very often have to be set up manually per alert.  Take trap handlers: I create one that tracks down one specific trap, but I have a list of OIDs and, in the case of Advanced alerts, a list of ideas for the alerting points I want to start with.  In the case of traps, there might be one trap that I need to parse literally 200 times with a different rule just to get the behavior I need, based on what I can *actually* pick out of the trap details.

    Another thing is the handling of traps when they come in.  The only way I could get detailed trap data into the traps was to translate them against an external DB running on the SQL server.  It's pretty janky, to be perfectly frank, and the data is in the source MIB, which happens to be a Turin-->Force10-->Dell product.  Other trap and syslog use cases can be worked around, but in a truly inferior fashion.

    Probably the biggest part of the problem is the limited parsing engine in the first place, and the fact that ANY part of NPM has pretty much no sophistication about deltas over time or quantity-based alerting.  The complexity of the relationships between events, rules, and actions simply doesn't allow alerts to be crafted that would cut a broader swath.  This makes efforts at keeping them manageable at scale terrifically important, of course, but the real issue is that SW can't quantify anything relating to "time or quantity over time" in the product, except for simplistic single-OID statistical values.  Imagine: if I get 16 syslog messages, I may not care, but if I get 16K, I'm really going to care, even though I may have NO defined alerts of my own.  Watching statistical deltas in events like that is crucial to monitoring a network, and today it's largely done by the seat of our pants, relying on SW heavily.
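
To make the "16 versus 16K syslog messages" idea concrete, here is a minimal Python sketch of quantity-over-time alerting using a sliding window.  The `RateWatcher` class, its parameters, and the sample numbers are all made up for illustration; nothing here is an NPM feature or API.

```python
from collections import deque


class RateWatcher:
    """Flags when more than `threshold` messages arrive within the
    last `window_seconds` seconds (hypothetical helper)."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._times = deque()  # arrival times still inside the window

    def observe(self, timestamp):
        """Record one message arrival (seconds, non-decreasing) and
        return True if the count in the window exceeds the threshold."""
        self._times.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop arrivals that have aged out of the window.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times) > self.threshold


# Usage: alert when more than 1000 messages arrive within 60 seconds.
watcher = RateWatcher(window_seconds=60, threshold=1000)
quiet = watcher.observe(0.0)  # a single message is well under the threshold
```

No per-message rule is needed for this kind of check; only the arrival rate matters.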

    Just to digress slightly more, consider deltas over time on basic port utilization.  If usage on a port normally runs at, say, 70/55Mb and it suddenly spikes to 90/90, that's a delta change we could never have predicted.  I did consider offloading this task to a perl framework that would watch this stuff, set a custom property value for base values, and then have an alert trigger off of a percentage increase in actual over base, but it's honestly a bit overly complicated, and it's something you guys should be doing for us.
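
The percentage-over-base check described above comes down to a small calculation.  A rough Python sketch follows; the function names and the 50% limit are assumptions for illustration, and the stored base value stands in for the custom property idea.

```python
def percent_over_baseline(actual_mbps, baseline_mbps):
    """Percentage increase of the actual rate over the stored base rate."""
    if baseline_mbps <= 0:
        raise ValueError("baseline must be positive")
    return (actual_mbps - baseline_mbps) / baseline_mbps * 100.0


def exceeds_baseline(actual_mbps, baseline_mbps, max_increase_pct):
    """True when actual exceeds base by more than max_increase_pct percent."""
    return percent_over_baseline(actual_mbps, baseline_mbps) > max_increase_pct


# The example from the text: base 70/55 Mb, spike to 90/90.
# The 50% limit is an arbitrary value chosen for the sketch.
first_alert = exceeds_baseline(90, 70, max_increase_pct=50)   # about 28.6% over base
second_alert = exceeds_baseline(90, 55, max_increase_pct=50)  # about 63.6% over base
```

With a 50% limit, only the direction whose base is 55 would trigger, which shows why the base has to be tracked per direction rather than per port.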

    I'm kind of sick of the fact that screen-watching and stare and compare are pretty much the only option for catching many critical types of network events.  This product isn't supposed to be about bio-automation.

    I have already digressed, and could most certainly keep turning this particular feature request into a huge review of everything it could touch, but I'm trying to exercise a modicum of restraint ;)

    HTH,


    Peter

  • Export to TXT/CSV from the web instead of just PDF.

  • Yeah, unfortunately that's a cross-product feature request.  Being able to send scheduled reports as CSV would be a huge plus, but I can solve it with cron and perl nicely.

    Peter

Comment Children
  • As I was looking through the open feature request ideas, I was all bent out of shape for a second because I didn't see it in the list of ideas open for voting.

    However, we are finally in a "What we're working on" state, which is absolutely amazing.  Big shout out to Mike Driskell for articulating this request so eloquently and everyone who contributed to making it happen inside and outside of SW.

    Really great to see.