In this next installment introducing the new features of the upcoming Orion NPM 10.1 release, I want to cover Dependencies.  Those of you who have been using Orion NPM for some time know that, to set up dependencies for alert suppression, you had to use the Advanced Alert Manager.  This approach had various issues and complications, which was one of the drivers behind adding this feature.

 

For most folks, the primary use case for dependencies is alert suppression, but there are many more.  For example, you can group elements into logical groups, such as by location or service.

 

Scenario:  
My Orion server resides in my Corporate Data Center in San Francisco, CA and polls sites located all around the world.  I want to be notified if any of my devices go down, so I have a general Orion alert set up to page me if a node goes down.  So what happens if the core routing device to one of those remote locations goes down?  If you do not have dependencies set up, you will get a flood of pages/emails, one for each node you are monitoring at that site.  Not very helpful when you are trying to find the root cause!

 

If you had a dependency set up that established this parent-child relationship, you would get only one alert, for the single routing node at that location, pinpointing the problem immediately.

 

Let’s walk through setting this scenario up.

 
      
  1. On the Settings page there is a new option under Node & Group Management called “Manage Dependencies”.
    image
  2. Select “Add new dependency” and choose the parent element, which can be a node, interface or group.  I will cover Groups in more detail in another post, but in essence this feature lets you statically or dynamically group elements into a container, and status rolls up based on the elements in that group.  In our example above, the parent node is the core router for one of my remote sites.
    image
  3. Select the children, or downstream devices, of the parent element; each can be a node or group.
    image
  4. That’s it. You have created your first dependency.  If that parent element goes down, the child elements will go into a new state, “Unreachable”.

    image

       

    From a UI standpoint, that is how you set up a dependency.  Let’s walk through some of the details of how things work under the covers.

       

    As I mentioned above, we're introducing a new status called "unreachable" in this version.  This means that if you've configured your alerts to trigger when things go "down", they won't trigger when objects are set to the new "unreachable" status.
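To make the effect on alerting concrete, here is a minimal Python sketch. The `Status` enum, the `should_page` function and the node names are hypothetical, chosen only for illustration; none of this is Orion's alerting engine. It assumes an alert whose trigger condition is simply "status equals down".

```python
from enum import Enum


class Status(Enum):
    UP = "up"
    DOWN = "down"
    UNKNOWN = "unknown"
    UNREACHABLE = "unreachable"


def should_page(status: Status) -> bool:
    """Hypothetical alert condition: fire only when status is exactly 'down'."""
    return status is Status.DOWN


# Hypothetical remote site: the core router is down, so the devices
# behind it have been marked unreachable rather than down.
site = {
    "core-router": Status.DOWN,
    "switch-1": Status.UNREACHABLE,
    "server-1": Status.UNREACHABLE,
}

# Only the root cause matches the alert condition.
paged = [name for name, status in site.items() if should_page(status)]
print(paged)
```

Because the children never reach "down", an existing "node is down" alert pages once for the router instead of once per device.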

       

    There are two types of dependencies:   

       
          
    • Implicit dependency:  E.g. when a server is "down", don't alert on the applications on that node.  Mark them "unreachable" to avoid redundant alerts.  This happens automatically, without any configuration or work on your part.
    • Explicit dependency:  E.g. when a WAN link is "down", don't alert on all the nodes behind that WAN link.  Mark them "unreachable" to avoid redundant alerts.  This requires manually configuring a dependency relationship.
       

    More details on how this will work:

       
          
    • Implicit dependencies
      • NPM (volumes, interfaces)
        • When setting volume status to unknown, Node status will be checked. If Node status is down or unreachable, volume status will be changed to "unreachable".
        • When setting interface status to unknown, Node status will be checked. If Node status is down or unreachable, interface status will be changed to "unreachable".
      • APM (applications, monitors)
        • When setting application/monitor to down or unknown, Node status will be checked. If Node status is down or unreachable, application and monitor status will be changed to "unreachable".
      • IPSLA (operations)
        • [need to wait for SP1 avail] When setting operation status to unknown, Node status will be checked. If Node status at either end of the operation is down or unreachable, operation status will be changed to "unreachable".  NOTE: We will not make any changes to operation status when setting to down since for IPSLA this means we CAN poll the results, but the operation itself could not collect data.
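The implicit rules above boil down to a small lookup: which incoming statuses get rewritten depends on the element type. The sketch below is one way to express that in Python. The function names, the string constants and the `REWRITABLE` table are assumptions made for illustration and do not reflect how Orion implements the check internally.

```python
UP, DOWN, UNKNOWN, UNREACHABLE = "up", "down", "unknown", "unreachable"

# Incoming statuses that are rewritten to "unreachable" when the host
# node is down or unreachable, per element type (taken from the list above).
REWRITABLE = {
    "volume": {UNKNOWN},
    "interface": {UNKNOWN},
    "application": {DOWN, UNKNOWN},
    "monitor": {DOWN, UNKNOWN},
}


def implicit_status(element_type, new_status, node_status):
    """Return the status to store for an element hosted on a node."""
    node_lost = node_status in (DOWN, UNREACHABLE)
    if node_lost and new_status in REWRITABLE[element_type]:
        return UNREACHABLE
    return new_status


def implicit_operation_status(new_status, source_node_status, target_node_status):
    """IPSLA operations look at the nodes at both ends and only rewrite 'unknown'."""
    node_lost = any(
        s in (DOWN, UNREACHABLE) for s in (source_node_status, target_node_status)
    )
    if node_lost and new_status == UNKNOWN:
        return UNREACHABLE
    return new_status


print(implicit_status("volume", UNKNOWN, DOWN))        # rewritten
print(implicit_status("application", DOWN, UP))        # left as-is
print(implicit_operation_status(DOWN, DOWN, UP))       # 'down' is never rewritten
```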
       
          
    • Explicit dependencies
      • NPM (nodes)
        • When setting status on a node to down or unknown, check for membership in a dependency relationship.  If the dependency indicates the node should be "unreachable", set the status to "unreachable".
      • APM (applications, monitors)
        • When setting status on an application/monitor to down or unknown, check for membership in a dependency relationship (it would be part of a Group that is a child). If the dependency indicates the app or monitor should be "unreachable", set the status to "unreachable".
      • IPSLA (operations)
        • [need to wait for SP1 avail] When setting operation status to unknown, check for membership in a dependency relationship (it would be part of a Group that is a child). If the dependency indicates the operation should be "unreachable", set the operation status to "unreachable".  NOTE: We will not make any changes to operation status when setting to down since for IPSLA this means we CAN poll the results, but the operation itself could not collect data.
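For explicit dependencies, the check is against a configured parent rather than the hosting node. Here is a rough Python sketch of the node case. The `PARENT_OF` mapping, the `explicit_status` function and the node names are hypothetical, and the sketch assumes a child is marked unreachable only when its configured parent is down, which is the case described in the walkthrough above.

```python
UP, DOWN, UNKNOWN, UNREACHABLE = "up", "down", "unknown", "unreachable"

# Hypothetical explicit dependencies: child node -> parent node.
PARENT_OF = {
    "sf-remote-switch": "sf-remote-router",
    "sf-remote-server": "sf-remote-router",
}


def explicit_status(node, new_status, current_status, parent_of):
    """Return the status to store for a node that is about to change state.

    current_status maps node names to their last known status.
    """
    # Only down/unknown transitions consult the dependency relationship.
    if new_status not in (DOWN, UNKNOWN):
        return new_status
    parent = parent_of.get(node)
    # Assumption: the dependency says "unreachable" when the parent is down.
    if parent is not None and current_status.get(parent) == DOWN:
        return UNREACHABLE
    return new_status


current = {"sf-remote-router": DOWN}
print(explicit_status("sf-remote-switch", DOWN, current, PARENT_OF))
print(explicit_status("sf-remote-router", DOWN, current, PARENT_OF))
```

The router has no parent, so it stays "down" and triggers the alert, while the switch behind it is stored as "unreachable".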
 

So, what about automatic dependencies?  While this isn’t in 10.1, using topology to provide automatic dependencies (or at the very least dependency recommendations) is absolutely the next logical step for us and something we’re exploring.  As we get further along with our research in this area, we’d love to talk to you about what you’d like to see.  Please post a comment if you’d be interested in being involved in the feedback process!