Our environment has several teams that use SolarWinds, and each team is responsible for creating and implementing its own application monitoring templates. One of our teams started using baseline thresholds for alerting and has gotten a lot of unwanted alerts. They also believe that the baselines are getting lost after SolarWinds maintenance.
- Is there anything that would cause a baseline to be outright lost? I understand they can change, but I'm specifically asking about any condition that would result in a loss of the baseline.
- What specifically causes an alert from a metric that uses baseline thresholds? My understanding is that if the statistical pool has an average of, let's say, 70 and a standard deviation of 5, then a value of 75 or higher would result in an alert. Is that a correct understanding of how this works, or is there other math at play?
- It seems like there are some use cases where you don't really want to use baselines. I'm thinking of things like CPU, where you might not care about utilization unless it gets above 85% or so. Is this an accurate assessment?
- Likewise, are there some use cases where people have found baselines to be beneficial? I'm trying to suss out where they should (and should not) be used.
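
To make the model from my second question concrete, here is a minimal Python sketch of what I'm assuming happens. This is only my mental model, not SolarWinds' actual calculation: the function names, the sample pool, and the multiplier `k` (how many standard deviations above the average the threshold sits) are all hypothetical. My example corresponds to `k = 1`; if the product uses a larger multiplier, the same value would not alert.

```python
from statistics import mean, pstdev


def baseline_threshold(samples, k=1.0):
    """Assumed model: threshold = average + k * standard deviation.

    `k` is a hypothetical multiplier; it is not taken from SolarWinds docs.
    """
    return mean(samples) + k * pstdev(samples)


def would_alert(value, samples, k=1.0):
    """Return True if `value` meets or exceeds the assumed threshold."""
    return value >= baseline_threshold(samples, k)


# Made-up statistical pool with an average of 70 and a standard deviation of 5.
pool = [65, 75, 65, 75]

print(baseline_threshold(pool))        # threshold under my assumption (k = 1)
print(would_alert(75, pool))           # the case from my question
print(would_alert(75, pool, k=2.0))    # same value if the multiplier were 2
```

If the real math differs (a different multiplier, separate warning and critical levels, or time-of-day buckets), that would go a long way toward explaining the unwanted alerts.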