
Tips for Defining Groups and Dependencies - a running list?

At the risk of being presumptuous, I thought it might be valuable to condense some of the knowledge that has been disseminated so far regarding the great new feature in Orion NPM version 10.1 - Groups and Dependencies.

It is easy to see that this topic will be a source of much discussion for the foreseeable future, so how about we Thwack users start compiling a list of essential tips to remember as we use this feature?

I'll start with two that seem obvious and correct and others can add to it or correct me:

(1) A Group can consist of any number and type of elements, including other groups.

(2) In order for a child to be seen as "unreachable", ALL of its parents have to be down or unreachable. (Thanks Karlo!)

 

Anyone else care to add something?

  • borgan--

    This sounds like a great idea. I will speak with development and see if we can create a forum for it. Can't promise anything but I will check.

    Thanks,

    M

  • Hi borgan--

    Spoke with dev. and no forum is required for it so just keep posting threads here.

    Also, here's Karlo's thread, which provides a great description of Groups/Dependencies.

    M

  • I'll add one I "think" is correct.

    A parent cannot be a member of its own child group. Makes logical sense, right?
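
The rule above amounts to saying that dependencies must not form a loop. As a rough illustration only, here is a minimal Python sketch that checks for such a loop before a new dependency is added. The `Dependencies` mapping, the `creates_cycle` function, and the node names are all hypothetical; this is not how Orion stores or validates dependencies.

```python
from typing import Dict, Set

# Hypothetical model: each parent name maps to the set of members in its
# child group. None of these names come from Orion itself.
Dependencies = Dict[str, Set[str]]


def creates_cycle(deps: Dependencies, parent: str, child: str) -> bool:
    """Return True if making `child` depend on `parent` would loop back.

    A loop exists when `parent` is already reachable by walking down
    the existing dependencies from `child`, or when both are the same object.
    """
    stack = [child]
    seen: Set[str] = set()
    while stack:
        current = stack.pop()
        if current == parent:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(deps.get(current, set()))
    return False


deps: Dependencies = {
    "CoreRouter": {"BranchSwitch"},
    "BranchSwitch": {"AccessPoint"},
}

# AccessPoint already sits below CoreRouter, so it cannot become its parent.
print(creates_cycle(deps, parent="AccessPoint", child="CoreRouter"))  # True
# Printer has no existing relationship to CoreRouter, so this is allowed.
print(creates_cycle(deps, parent="CoreRouter", child="Printer"))      # False
```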

  • I'll post another thing I think I have learned about groups and dependencies.

    Make sure your dependencies are correctly defined to match your actual network topology. If you don't, and Orion can find an alternate path to a child object that is down, the object will show as down rather than "unreachable" even if the defined parent is down at the time.

    Does that seem correct?

  • No, not completely.

    Here is what is accurate:

    1. You have defined a parent-child relationship.
    2. The parent goes down.
    3. From the Orion server there is another path in your network to poll the child device, and it is up.
    4. We won't mark it unreachable; it will remain up.

  • Brandon is correct with the above statement and let me explain why:

    Node A is a parent of Node B, but Orion can get to Node B through some route that does not pass through Node A.

    Node A goes down, but Node B does not so Orion can still ping it.  Node A Down - alert fired.  Node B Up - no alert.

    Now if Node B does go Down while Node A is down, Node B will be marked Unreachable and no Down alert will fire.  That is not accurate given the actual topology: you can't tell whether Node B is actually Down or whether the alternate route has failed, so you don't know whether to go reset Node B or check some other part of the network.

    Hopefully this helps.
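
To make the alternate-route scenario concrete, here is a small Python sketch that models a network as a graph and checks whether a poller could still reach Node B when Node A is down. The topology, the `AltRouter` name, and the `reachable` function are assumptions made up for illustration; Orion does not expose anything like this, and the sketch only shows why a dependency that ignores the second path misdescribes the network.

```python
from collections import deque
from typing import Dict, List, Set


def reachable(topology: Dict[str, List[str]], source: str, target: str,
              down: Set[str]) -> bool:
    """Breadth-first search from source to target, skipping nodes that are down."""
    if source in down or target in down:
        return False
    queue = deque([source])
    seen = {source}
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in topology.get(node, []):
            if neighbor not in seen and neighbor not in down:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Hypothetical topology: Node B can be polled through Node A or AltRouter.
topology = {
    "Orion": ["NodeA", "AltRouter"],
    "NodeA": ["NodeB"],
    "AltRouter": ["NodeB"],
}

# Node A is down, but the alternate route still works, so B stays pollable.
print(reachable(topology, "Orion", "NodeB", down={"NodeA"}))               # True
# Only when every path is down does B become genuinely unreachable.
print(reachable(topology, "Orion", "NodeB", down={"NodeA", "AltRouter"}))  # False
```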

  • Thanks Karlo and Brandon.

    So, in the scenario where Orion has an alternate path to B when both A and B are down, the only way to get a down alert on B is by designating the alternate path as a second parent to B. In that case, one of two parents will be up, therefore allowing Orion to see B as truly down.

    Do I have that right?

  • If I may add one more thing to this discussion please?

    What is the step by step process by which Orion processes a node down alert with Dependencies in place?

    When a node is non-responding after the node warning interval expires, does Orion then immediately check to see if the node is dependent on one or more parents? Then if all parents are also down, set the status of the node to unreachable?

    Is that the order of things?

  • Hi Patriot,

    To answer your questions:



    So, in the scenario where Orion has an alternate path to B when both A and B are down, the only way to get a down alert on B is by designating the alternate path as a second parent to B. In that case, one of two parents will be up, therefore allowing Orion to see B as truly down.

    Do I have that right?



    Correct.  You will want to define the two parents either with two separate dependency definitions, or by creating a group for the parents, setting the status rollup of the group to Best or Mixed, and making that group the parent in a single dependency definition.
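
As a rough illustration of why the rollup mode of the parent group matters, here is a minimal Python sketch. It assumes a simplified model with only three statuses, and it assumes that Best returns the best member status, Worst returns the worst, and Mixed returns Warning whenever the members disagree. The `rollup` function is hypothetical and is not an Orion API.

```python
from typing import List

# Simplified status ordering, best to worst (an assumption for this sketch).
ORDER = ["Up", "Warning", "Down"]


def rollup(member_statuses: List[str], mode: str) -> str:
    """Combine member statuses into one group status under the given mode."""
    if mode == "Best":
        return min(member_statuses, key=ORDER.index)
    if mode == "Worst":
        return max(member_statuses, key=ORDER.index)
    if mode == "Mixed":
        distinct = set(member_statuses)
        # All members agree: use that status. Otherwise report Warning.
        return distinct.pop() if len(distinct) == 1 else "Warning"
    raise ValueError(f"unknown rollup mode: {mode}")


# Node A is down, the alternate-path device is up.
parents = ["Down", "Up"]
for mode in ("Best", "Mixed", "Worst"):
    print(mode, rollup(parents, mode))
```

Under these assumptions, Best and Mixed keep the parent group out of the Down state while one path is still up, so a non-responding child is reported as Down. With Worst, a single failed parent would put the whole group Down and the child would be treated as Unreachable.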



    If I may add one more thing to this discussion please?

    What is the step by step process by which Orion processes a node down alert with Dependencies in place?

    When a node is non-responding after the node warning interval expires, does Orion then immediately check to see if the node is dependent on one or more parents? Then if all parents are also down, set the status of the node to unreachable?

    Is that the order of things?



    The algorithm used for determining if a Node status should be Unreachable is this:

    1. Is the Node Down (we have already done Fast Polling and set the Node Status to Warning)? Yes: go to step 2. No: exit the algorithm.
    2. Does the Node have any parents? Yes: go to step 3. No: keep the Node Status as Down and exit the algorithm.
    3. Are all the parents Down? Yes: set this Node's status to Unreachable and exit the algorithm. No: go to step 4.
    4. Have we waited an additional polling cycle to be sure the parents aren't Down? Yes: keep the Node Status as Down. No: set the Node Status to Warning and wait until the next poll.

    Once we write the new status to the database then the next Advanced Alert poll will pick up the change and alert on any Down status Nodes.  Basic alerts will trigger almost immediately after the algorithm finishes.

    Let me know if I can explain this better.
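
The four steps above can be sketched as a single function. This is only a reading of the steps as described in this thread, not Orion's actual code. The function name, its parameters, and the use of `None` for "leave the status unchanged" are assumptions. The sketch also counts an Unreachable parent as down, following tip (2) at the top of the thread.

```python
from typing import List, Optional


def evaluate_node(node_down: bool, parent_statuses: List[str],
                  waited_extra_cycle: bool) -> Optional[str]:
    """Return the new status for a node, or None to leave it unchanged."""
    # Step 1: only nodes that failed fast polling are evaluated.
    if not node_down:
        return None
    # Step 2: with no parents there is nothing to suppress, so the node is Down.
    if not parent_statuses:
        return "Down"
    # Step 3: every parent is down (or itself unreachable), so the node
    # cannot be judged and is marked Unreachable.
    if all(status in ("Down", "Unreachable") for status in parent_statuses):
        return "Unreachable"
    # Step 4: at least one parent looks up. Wait one more polling cycle
    # before committing to Down, holding the node at Warning meanwhile.
    return "Down" if waited_extra_cycle else "Warning"


print(evaluate_node(True, ["Down"], waited_extra_cycle=False))        # Unreachable
print(evaluate_node(True, ["Down", "Up"], waited_extra_cycle=False))  # Warning
print(evaluate_node(True, ["Down", "Up"], waited_extra_cycle=True))   # Down
```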

  • Yes, Karlo, that helps, but Step 4 is new to me. Are you saying that if a child has more than one parent and not all of them are down, Orion will wait another polling cycle to verify the status of each parent before setting a status for the child?

    After you answer that one, I have another question. Thanks for your patience.