Below I describe three different ways of composing a "Node Down" alert. Based on your experience, which way would you choose to compose an alert trigger for a group of multiple nodes?
I can see pros and cons to each method.
Option 1
Create a custom property on the node called "GroupName" and set it to "HostGroupName"
for all the nodes that run this application.
Type of Property to Monitor : NODE
- Trigger Alert when all of the following apply
- Node Status is not equal to Up
- Node.CustomProperty.GroupName contains HostGroupName
Option 2
Set the trigger condition to match on the specific list of nodes.
Type of Property to Monitor : NODE
- Trigger Alert when all of the following apply
  - Node Status is not equal to Up
  - Trigger Alert when any of the following apply
    - Node Name is equal to node1
    - Node Name is equal to node2
    - Node Name is equal to node3
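To make the nesting in Option 2 explicit, here is a minimal Python sketch of the logic as I read it: an outer "all" group containing the status test and an inner "any" group of node names. The Node class and option2_triggers function are hypothetical names for illustration only, and the sketch assumes the condition is evaluated once per node. It does not model how the product itself generates alerts.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical stand-in for a monitored node; not a product API.
    name: str
    status: str


def option2_triggers(node: Node) -> bool:
    """Outer 'all' group: status test AND the inner 'any' group of names."""
    return all([
        node.status != "Up",
        any(node.name == watched for watched in ("node1", "node2", "node3")),
    ])


nodes = [
    Node("node1", "Down"),
    Node("node2", "Down"),
    Node("node3", "Up"),
    Node("node9", "Down"),  # down, but not in the explicit list
]

# Names of the nodes for which the trigger condition holds.
triggered = [n.name for n in nodes if option2_triggers(n)]
print(triggered)
```

In this sketch node9 is ignored because it is not in the explicit list, which is the maintenance cost of Option 2: every new host means editing the trigger.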
Option 3
Set the trigger condition to match on the nodes that run this application
Type of Property to Monitor : APM: Application
- Trigger Alert when all of the following apply
- Node Status is not equal to Up
- Application Name contains AppName
Discussion Option 1
In order to make option one work for cases where a host might belong to more than one "GroupName", the trigger condition that tests "Node.CustomProperty.GroupName" must be composed using the "contains" test.
By doing this, multiple "GroupNames" could be stored in the single custom property given appropriate delimiters.
Example:
Set: Node.CustomProperty.GroupName = [email] [ftp] [www]
Test: Node.CustomProperty.GroupName contains "[ftp]"
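The reason for the delimiters can be shown with a small Python sketch. It only mimics a plain substring "contains" test; the group_matches helper and the "[sftp]" group are hypothetical and exist just to show the false match that the brackets prevent.

```python
def group_matches(custom_property: str, group: str) -> bool:
    """Mimic a 'contains' test against a bracket-delimited group token."""
    return f"[{group}]" in custom_property


group_name = "[email] [ftp] [www]"
other_host = "[sftp] [www]"  # hypothetical host that is NOT in the ftp group

print(group_matches(group_name, "ftp"))   # delimited test on a member host
print(group_matches(other_host, "ftp"))   # delimited test on a non-member host
print("ftp" in other_host)                # undelimited test on the same host
```

Without the brackets, a test for "ftp" would also match a host tagged only with "sftp", so the delimiters need to be part of both the stored value and the test string.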
Discussion Option 2
This trigger set uses explicit node names in the evaluation of the trigger.
Question: If node1 AND node2 go down, will 2 separate alerts be generated?
Discussion Option 3
By using the application template as a grouping field, the trigger is dynamic regarding which nodes are alerted on, and it is simple to understand.
A problem occurs when formulating the alert actions. Since the trigger is matching on an "APM: Application" property,
all the normal variables used in the alert texts will need to be fully qualified.
Example: ${Caption} references in emails and syslog messages must be changed to ${Node.Caption}
In reality, I believe that all variables should be fully qualified ALL the time. This avoids the confusion of using a field like ${Caption} when authoring alert texts and then finding that the alert contains the name of a volume instead of the name of the node.
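To illustrate why the unqualified form is ambiguous, here is a rough Python sketch of variable expansion. It is not how the product resolves variables; the expand function, the context layout, and the sample captions are all assumptions made up for the example. The idea is only that ${Caption} resolves against whatever object triggered the alert, while ${Node.Caption} always names the node.

```python
import re


def expand(template: str, context: dict) -> str:
    """Replace ${Name} or ${Object.Name} by walking the dotted path in context."""
    def lookup(match: re.Match) -> str:
        value = context
        for part in match.group(1).split("."):
            value = value[part]
        return str(value)

    return re.sub(r"\$\{([\w.]+)\}", lookup, template)


# Hypothetical context for an alert that triggered on an application object.
app_context = {
    "Caption": "AppName",           # caption of the triggering object
    "Node": {"Caption": "node1"},   # caption of the node it runs on
}

unqualified = expand("${Caption} is down", app_context)
qualified = expand("${Node.Caption} is down", app_context)
print(unqualified)
print(qualified)
```

With the same template text, only the fully qualified variable is guaranteed to produce the node name regardless of which object type the trigger matched on.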