
Dependency groups and multiple pollers: mixed results (nodes Down and nodes Unreachable) when a parent group goes down

We created dependency groups for all remote sites, and we use multiple pollers. We ran into the following problem: when a parent group goes down, we get a mix of nodes marked Down and nodes marked Unreachable. It seems that the issue shows up whenever the child nodes are not on the same poller as the parent nodes. Is it a bug, or just assumed functionality that is not in SolarWinds (yet)?
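If it helps to pin this down, here is a minimal sketch (assuming the Orion SDK's Python client, orionsdk, and a reachable SWIS endpoint; the hostname, credentials, and status-code values are placeholders/assumptions to verify against your own install) that lists every node with its assigned polling engine and current status, so you can check whether the Down vs. Unreachable split lines up with which poller each node lives on:

```python
# Minimal sketch, assuming the orionsdk Python client and a reachable SWIS
# endpoint on the main Orion server. Hostname and credentials are placeholders.
from orionsdk import SwisClient

swis = SwisClient("orion.example.local", "admin", "password")

# Typical Orion status codes: 1 = Up, 2 = Down, 12 = Unreachable (verify in your install).
rows = swis.query(
    "SELECT n.Caption, n.Status, e.ServerName AS PollingEngine "
    "FROM Orion.Nodes n "
    "INNER JOIN Orion.Engines e ON e.EngineID = n.EngineID "
    "ORDER BY e.ServerName, n.Caption"
)["results"]

for r in rows:
    print(f"{r['PollingEngine']:<20} {r['Caption']:<25} status={r['Status']}")
```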

RJJ-ISSUE-005

  • Response from support: the SWIS versions should be the same. In our case all versions are the same. What next?

  • I want to make sure that we are talking about the same thing.

    • You have multiple polling engines (call them Orion1 and Orion2).
    • You have two groups with a dependency between them (call them "Parent Group" and "Child Group").
    • You have nodes assigned across both polling engines (Node1A, Node1B, Node1C are assigned to Orion1; Node2A, Node2B, Node2C are assigned to Orion2).
    • In the groups, you have a mix of nodes ("Parent Group" has Node1A; "Child Group" has the rest).
    • When the "Parent Group" goes down, the members of the child group end up in various states (but they should all be "Unreachable").

    Is the above correct?

  • Hi KMSigma (and thanks for your time),

    Almost, it should be:

    Parent group P1 contains Node1A.

    Child group C1 contains Node1B, Node1C, Node2A, Node2B, and Node2C.

    When parent group P1 is down:

    Node1B and Node1C are Unreachable.

    Node2A, Node2B, and Node2C are Down.
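
    To make the expected outcome explicit, here is a small check along the same lines (again assuming the orionsdk Python client with placeholder credentials, the usual Orion status code 12 for "Unreachable", and the Orion.Container / Orion.ContainerMembers entities as they appear in SWQL Studio; "C1" is the child group name from the scenario). It lists every member of the child group and flags anything not reported as Unreachable while P1 is down:

    ```python
    # Small check, same assumptions as the sketch above (orionsdk client,
    # placeholder hostname/credentials). "C1" is the child group's name; the
    # Unreachable status code of 12 is the usual Orion value (verify locally).
    from orionsdk import SwisClient

    swis = SwisClient("orion.example.local", "admin", "password")
    UNREACHABLE = 12

    rows = swis.query(
        "SELECT c.Name AS GroupName, cm.Name AS MemberName, cm.Status "
        "FROM Orion.Container c "
        "INNER JOIN Orion.ContainerMembers cm ON cm.ContainerID = c.ContainerID "
        "WHERE c.Name = 'C1'"
    )["results"]

    for r in rows:
        flag = "ok" if r["Status"] == UNREACHABLE else "UNEXPECTED"
        print(f"{r['MemberName']:<25} status={r['Status']:<4} {flag}")
    ```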

  • At first glance (without testing) I'd think this is a bug.  Is this for Orion + Additional Polling Engines?  (I'm trying to rule out Enterprise Operations Center as a potential culprit)

  • Interesting dilemma. I have a similar loadout but have not seen that issue, though out of habit I do not split nodes for a remote site across different pollers. If I need to take a poller offline, then I will split those nodes up. Unless you have HA for pollers, that is about the best you can do. Based on the examples above, it seems that the node in the parent group did not(?) get marked as unreachable? I would consider recreating one of the groups (and child groups) just in case something funky happened with it in the DB. Similar to the occasional node that wants to be removed and re-added: rare, but it happens.
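
    Before recreating the groups, it might also be worth pulling the dependency definitions straight out of SWIS to see whether anything looks off in the DB. A hedged sketch along the same lines (assuming the orionsdk Python client, placeholder hostname/credentials, and the Orion.Dependencies entity with Name/ParentUri/ChildUri columns as shown in SWQL Studio):

    ```python
    # Sketch for eyeballing the dependency definitions before recreating anything.
    # Same assumptions as above: orionsdk Python client, placeholder hostname/credentials.
    from orionsdk import SwisClient

    swis = SwisClient("orion.example.local", "admin", "password")

    deps = swis.query(
        "SELECT d.Name, d.ParentUri, d.ChildUri "
        "FROM Orion.Dependencies d "
        "ORDER BY d.Name"
    )["results"]

    for d in deps:
        print(d["Name"])
        print("  parent:", d["ParentUri"])
        print("  child: ", d["ChildUri"])
    ```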