ADVANCED ALERTING ENGINE ARCHITECTURE DESIGN

Version 12

    Having worked within the SolarWinds environment for eleven years, I've always struggled with alert management. It seemed that every time an alert email needed a format change, or a custom property field had to be added to the alert logic, I spent two to four days editing the hundreds of alerts I was managing. After identifying a system limitation, I was forced to change the way we leveraged the SolarWinds alerting engine.

     

    This document outlines the configuration and design that we implemented to extend the SolarWinds NPM alerting engine to meet our needs. This implementation allowed us to reduce the number of enabled production alerts actively managed by administrators from approximately 1000 to 27. Both the architecture and this document remain under continued review and improvement.

     

    It is my hope that this document will help other administrators leverage the SolarWinds NPM alerting engine and save them time and effort.

     

    Thank you.

     

    (Please note that this document is relevant to NPM version 11.0.1 or older)

     

    ((4/4/2015: Support for WPM transactions as group members has been delayed pending further testing))

     

    ((4/15/2015: BUG FIX - An error in the trigger coding has been corrected. It prevented the first of two time spans entered into "z_n_exception" from being evaluated, so the alerts were evaluating only the second of the two time spans. A follow-on bug introduced by that fix has also been corrected.))

     

    ((7/18/2017: Due to an employment change, I have not been able to further develop or maintain this document since 5/16/2015. If I have the opportunity to work with this framework again, development will continue.))