I'm pretty new to SolarWinds Orion. I have the SAM and NPM modules and we are trying to get alerting set up. Our needs are pretty simple: tell us when a server or network device is down or has a problem, and email us. I understand the concepts of alerting, as I've used other monitoring software, but Orion's alerting has me confused. I have a few questions.
- When setting up a new node in SAM, you can specify thresholds for CPU, RAM, response time and packet loss. Are these supposed to generate alerts when a threshold is reached? How are they related to alerting, if at all?
- I have set up site groups, and each site group depends on the router for that site. We recently had a router outage (power issue) and I got alerts for every object being monitored at that site, including ping latency and ping loss. Am I missing something here, or was the group supposed to suppress alerts when the main dependency went down? I double-checked and the dependency seems to be set up correctly.
- I haven't changed any of the default alerts, but I am getting a lot of generic alerts. For example, I see a few "Node is in warning or critical state" alerts, but I have no idea what caused them because the Alerts page gives no further information. This isn't very useful, especially for an email notification. Is there anything I can do about this?
- Is there any way to set alerts for only a group of devices without manually adding the devices to a group? For instance, I want ping latency monitoring for all my network devices and some of our servers, but not for everything. The list of network devices may change, and I don't want to have to update the group every time we replace a device. I looked into dynamic groups, but didn't see a way to tell them to include only certain devices.
- I keep getting alerts for devices where we've already acknowledged the problem. Can I tell SolarWinds not to alert again for that device? Also, is there a way to tell SolarWinds not to alert repeatedly on a device that is going up and down? One notice that it is not working right is enough.
I've looked through the documentation and watched some of the labs videos, but they don't explain a whole lot, and alerting in SolarWinds is confusing when you're coming from something else.