We ran into a situation last night, and I am having trouble figuring out why it happened. We have a regular "Node Down" alert set up. On top of that, we also have group monitoring set up that combines a couple of those nodes and sends an alert when both of the nodes are down.
Our network engineer was doing some maintenance and scheduled maintenance mode on 2 nodes. I can see in the log that he did in fact do that. Now, we also have a group set up for those 2 nodes that alerts if BOTH of them are reporting as down at the same time. No problems there... however, during the maintenance we still received an alert from the group, because the group was reporting as "down". Why would these nodes report anything if they were placed in "maintenance mode"?
Thank you for any insight...
That is definitely a logical perspective if the group only contains objects of the same type in the same role. However, I think some people use groups more broadly. For example, some people use groups to show all physical devices at a site location. If the status indicator of such a group were to stay green even when a muted object is down, it wouldn't accurately reflect the state of the site as a whole.
As an aside, this example brings us to a philosophical part of the conversation: should a green LED indicate that "there are no alerts" or that "everything is good"? I'll bet @adatole or @aLTeReGo has interesting thoughts on this!
Maintenance mode could actually be one of two completely different things: unmanaged or muted.
Muting stops the specific node object from triggering any alerts, but the node still gets polled and its status still changes as normal. The group is a different object, so it still has whatever alerts you set up for it. There isn't currently an option to mute a group.
Unmanaging probably would have worked the way you had hoped in this case: both nodes would have changed to unmanaged, so the group would have inherited that same status as well, as long as your alert doesn't also trigger on unmanaged groups.
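To make the difference concrete, here is a small toy model in Python. It is only a sketch of the behavior described above, not SolarWinds code or its API: the `Node` and `Group` classes, the status strings, and the rollup rule (the group takes its members' status when they all agree, otherwise "Mixed") are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a monitored node."""
    name: str
    status: str = "Up"    # "Up", "Down", or "Unmanaged"
    muted: bool = False   # muting suppresses this node's own alerts only

    def fires_alert(self) -> bool:
        # A muted node is still polled and can be Down, but stays silent.
        return self.status == "Down" and not self.muted


@dataclass
class Group:
    """Hypothetical stand-in for a group, which is a separate object."""
    name: str
    members: List[Node]
    muted: bool = False

    def status(self) -> str:
        # Assumed rollup rule: inherit the status when all members agree.
        statuses = {n.status for n in self.members}
        if len(statuses) == 1:
            return statuses.pop()
        return "Mixed"

    def fires_alert(self) -> bool:
        # The group alert looks at the group's status and the group's own
        # mute flag. It never consults the members' mute flags.
        return self.status() == "Down" and not self.muted


# Scenario from this thread: both nodes muted, then both go down.
a = Node("core-1", status="Down", muted=True)
b = Node("core-2", status="Down", muted=True)
pair = Group("core-pair", [a, b])
print(a.fires_alert(), b.fires_alert(), pair.fires_alert())

# Same maintenance, but with the nodes unmanaged instead of muted.
a.status = b.status = "Unmanaged"
print(pair.status(), pair.fires_alert())
```

In this model the muted nodes stay quiet while the group still alerts, which matches what happened during the maintenance window. With both nodes unmanaged, the group becomes unmanaged too and its "down" alert has nothing to trigger on.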
Thank you for your quick reply. I should probably have mentioned this in my original post: I understand the difference, and I would like to retain the polling, which is why 99% of the time we place nodes in maintenance mode by muting them instead of unmanaging them.
What I am reading is that, despite the node being muted, it will still fire an alert when it is down if it is part of a group? That doesn't make sense to me. Wouldn't that node be muted everywhere, even if it is part of a group?
Also, there seems to be an option to mute a group: if I go to Manage Groups, hover over the group, and select "Command", I can then click on "Mute Alerts".
No, the node should not trigger an alert, but the group is another object completely, and the group fires its own alerts. I didn't dig into that part of the UI to check for muting groups, but if the option is there then that would solve the issue. I don't know if that button will mute the nodes and the group or just the group, so you'll want to test it.
It doesn't make sense to me either. If you are muting the alerts on a node, SolarWinds should be able to check whether a group member is muted or not. If a member node is muted, SolarWinds should suppress the alert on the group.
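The check suggested here could look roughly like the following Python sketch. To be clear, this is not how SolarWinds behaves today and it does not use any SolarWinds API; the function name `group_alert_should_fire` and the `(status, muted)` tuples are hypothetical, chosen only to show the proposed rule.

```python
from typing import List, Tuple


def group_alert_should_fire(
    members: List[Tuple[str, bool]],
    group_muted: bool = False,
) -> bool:
    """Proposed rule: a 'group down' alert is suppressed when the group
    itself is muted OR when any member node is muted.

    members is a list of (status, muted) pairs, one per member node.
    """
    if group_muted or not members:
        return False
    all_down = all(status == "Down" for status, _ in members)
    any_member_muted = any(muted for _, muted in members)
    return all_down and not any_member_muted


# Both nodes down and muted for maintenance: the group alert is dismissed.
print(group_alert_should_fire([("Down", True), ("Down", True)]))

# Both nodes down with nobody muted: a real outage, so the alert fires.
print(group_alert_should_fire([("Down", False), ("Down", False)]))
```

One design question this raises is whether a single muted member should silence the whole group or whether every member has to be muted. The sketch uses "any member", following the suggestion above, but for groups that mix many kinds of devices that may hide real outages.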