
SRM Critical/Warning events: missing feature to control event triggering

We recently deployed SRM in our environment and found that our dashboard is flooded with Critical and Warning events for every category of storage array.

We tried setting the thresholds from the dynamic baseline, but the number of triggered alerts did not seem to drop. The problem is that SRM compares values against the thresholds every second and logs each breach as a Critical or Warning event. Because we have many storage arrays, the dashboard is always cluttered with a very large number of these unhelpful events, which also overload our SolarWinds server and database.

There is no way to control the triggering of these events, either by adding a wait time or by turning them off as you can in the alert settings. We presented this case to SolarWinds support, but they were unable to help and recommended that we submit a feature request.

So this is my second post on the subject, following my original post three days ago on the feature request page. I am posting it again here to reach a wider audience and gather opinions and comments on the problem.

The feature request is for the ability to control event triggering in SRM by adding a wait time and/or the ability to disable it. I would also like to hear how the SolarWinds community copes with this problem.
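To make the request concrete, here is a minimal sketch of the behaviour I have in mind. It is plain Python, not SolarWinds code: the class name `SustainedThreshold` and its fields are made up for illustration, and nothing here reflects how SRM actually evaluates thresholds. The idea is that an event is logged only once the value has stayed above the threshold for a configurable wait time, and that the check can be disabled entirely.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SustainedThreshold:
    """Hypothetical rule: fire only after a breach lasts `wait_seconds`."""
    threshold: float
    wait_seconds: float
    enabled: bool = True  # the requested "turn it off" switch
    _breach_started: Optional[float] = field(default=None, init=False)
    _fired: bool = field(default=False, init=False)

    def observe(self, value: float, timestamp: float) -> bool:
        """Return True once per sustained breach, otherwise False."""
        if not self.enabled:
            return False
        if value <= self.threshold:
            # Back under the threshold: forget the breach and re-arm.
            self._breach_started = None
            self._fired = False
            return False
        if self._breach_started is None:
            self._breach_started = timestamp
        sustained = timestamp - self._breach_started >= self.wait_seconds
        if sustained and not self._fired:
            self._fired = True
            return True
        return False


# Made-up samples: (seconds since start, latency in ms).
samples = [(0, 25.0), (60, 30.0), (120, 10.0), (180, 25.0), (480, 26.0), (540, 27.0)]
rule = SustainedThreshold(threshold=20.0, wait_seconds=300)
events = [t for t, v in samples if rule.observe(v, t)]
print(events)
```

With these samples the short spike in the first two minutes produces no event because the value recovers before the wait time elapses. Only the second breach, which lasts the full 300 seconds, is logged, and it is logged once rather than on every comparison.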

  • Yes, they already have this ability in their SAM application components, and they recently added it to Node thresholds (CPU, Memory, etc.). They're also rolling it out to Volume and Interface thresholds, so hopefully they bring it to SRM too. I wouldn't be too hopeful that it is coming soon, though: I doubt SRM is high on their priority list, because it isn't nearly as popular as NPM.

    My hope is that eventually every single threshold in every Orion module has a time-based modifier available.  

  • Thanks for your comments and info.

    It is good to hear that they are working on adding this feature. What is the point of having two engines comparing actual performance against thresholds? If an alert is already available for a given metric, there is no need to compare again and trigger an event. The alert offers a much greater degree of customization for the environment.

    I also hope the rollout to SRM will be given considerable weight, considering SRM's advantage: it offers much more granular storage performance data, linked to server and application performance, for troubleshooting problems and identifying the root causes of service degradation through PerfStack analysis.
