Trouble with dependencies not working correctly

We have set up dependencies for over 1500 of our stores, each with the following network devices: Host, VM, SD-WAN appliance, AP, FW, and Switch.

The dependency typically looks like this:

Parent              Child
SD-WAN appliance    Switch
Switch              Switch Children GROUP (members include the host and AP(s))
Host                VM

We have a ServiceNow integration that creates tickets for our NOC when these devices fail. When a store goes completely down (loses internet connectivity) and a dependency doesn't work correctly, the NOC gets flooded with tickets, which is very problematic. When the SD-WAN appliance (the parent of everything) goes down, the child nodes don't go into an unreachable state. Sometimes moving all the related store nodes to the same polling engine will kick everything into an unreachable state, but by that time it's too late - the tickets have already been created.

Is there anything I can tweak to fix this?
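
To make the expected behavior concrete, here is a minimal Python sketch of the outcome described above. It is a simplified model, not the Orion API: the node names, the `DEPENDENCIES` map, and the rule "any failed ancestor makes a node Unreachable" are all assumptions for illustration.

```python
# Hypothetical parent -> children edges for a single store (not real Orion data).
DEPENDENCIES = {
    "sdwan": ["switch"],
    "switch": ["host", "ap1"],
    "host": ["vm"],
}


def build_parent_map(dependencies):
    """Invert the parent -> children map into child -> parent."""
    parent_of = {}
    for parent, children in dependencies.items():
        for child in children:
            parent_of[child] = parent
    return parent_of


def effective_status(node, polled_down, parent_of):
    """Return 'Up', 'Down', or 'Unreachable' for a node.

    Assumption: a node that fails polling is Unreachable if any ancestor
    also failed polling; otherwise it is Down.
    """
    if node not in polled_down:
        return "Up"
    ancestor = parent_of.get(node)
    while ancestor is not None:
        if ancestor in polled_down:
            return "Unreachable"
        ancestor = parent_of.get(ancestor)
    return "Down"


def nodes_to_ticket(polled_down, dependencies):
    """Only nodes that are genuinely Down should raise a ticket."""
    parent_of = build_parent_map(dependencies)
    return sorted(
        node
        for node in polled_down
        if effective_status(node, polled_down, parent_of) == "Down"
    )
```

With working dependencies, a full store outage in this model yields a single ticket for the SD-WAN appliance, while an isolated AP failure still produces its own ticket.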

  • Do you have the SD-WAN as the Parent of your Child Group of APs and Hosts?

    If so, what are the entity counts on your polling engines, specifically the one you are moving nodes from? Also, if you edit your groups and check their refresh timing, are they set to more than the default 1 minute?

  • Thanks for your reply.

    The SD-WAN (CGX) is the parent of the switch, and then the switch is the parent of the Switch Children Group like so:

    [Image: rmullal_0-1594324693989.png]

    I would expect everything under the CGX to be unreachable. (We don't alert off the FW; some stores' APs are just plugged into the FW rather than the switch.)

    I just spot-checked a few groups, and the refresh frequency is 60 seconds. Is that OK, or do you think it should be shorter?

  • A 60-second group refresh is the default and is fine to work with, as long as your Node Warning Level is still 120 seconds (which is also the default).

    Also make sure that no other, larger group of nodes has been set up as a parent for any reason. Then check the Node Details page of one of the children and confirm that the expected parent entity shows properly under 'All Dependencies for this Node'.
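
The timing relationship in the reply above can be sketched roughly as follows. This is a simplified model under stated assumptions, not how Orion is documented to evaluate dependencies: it assumes a child is marked Down once the Node Warning Level has elapsed after its first failed poll, and becomes Unreachable only if the parent is already recorded as Down by then. The function and parameter names are hypothetical.

```python
def child_state_at_deadline(parent_marked_down_s, child_first_failed_poll_s,
                            node_warning_level_s=120):
    """Return the state the child ends up in when its warning window expires.

    All times are seconds since the start of the outage.
    Assumption: the child leaves Warning at first failed poll + warning level.
    """
    deadline = child_first_failed_poll_s + node_warning_level_s
    if parent_marked_down_s <= deadline:
        return "Unreachable"  # parent status known in time: alert suppressed
    return "Down"             # parent status arrived too late: ticket created


def refresh_fits_window(group_refresh_s, node_warning_level_s):
    """True if a group status refresh can occur inside the warning window."""
    return group_refresh_s < node_warning_level_s
```

In this model the default settings fit (`refresh_fits_window(60, 120)` is true), so if children still go Down rather than Unreachable, the parent's status is probably being recorded late for some other reason, such as the nodes being polled by different engines.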