I have a similar task for my company's maintenance windows. As you mentioned, disabling and then re-enabling advanced alerts will cause any existing alerts to retrigger. More importantly, it resets the age of the alert, which may affect SLA requirements. Instead of disabling alerts, I found that another way to avoid email floods is to simply redirect the email actions to an SMTP server that does not exist. The actions will still kick off (good for logging assets affected by the maintenance window), but the emails will die in the ether when the SMTP relay cannot be reached. Below is the SQL query I use:
-- Point every email action at an unreachable SMTP address for the maintenance window.
DECLARE @oldServerName VARCHAR(255)
DECLARE @newServerName VARCHAR(255)
SET @oldServerName = 'WHATEVER_THE_IP_OF_YOUR_SMTP_RELAY_IS'  -- the real relay
SET @newServerName = '0.0.0.0'                                -- placeholder that cannot be reached
UPDATE [orion].[dbo].[ActionDefinitions]
SET [Target] = REPLACE(CAST([Target] AS VARCHAR(MAX)), 'SMTPSERVER:' + @oldServerName, 'SMTPSERVER:' + @newServerName)
WHERE [Target] LIKE '%SMTPSERVER:%'
-- Review the updated actions.
SELECT * FROM [orion].[dbo].[ActionDefinitions]
Then when the window is over, swap the values for old and new and run it again.
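If you want to rehearse the swap and the swap-back before touching the Orion database, here is a minimal sketch in Python against an in-memory SQLite table. The table layout and the sample Target strings are made up for illustration (real ActionDefinitions rows will look different), and the helper names are hypothetical. It only shows that replacing and then reversing the 'SMTPSERVER:' value leaves the rows exactly as they started.

```python
import sqlite3


def make_demo_db():
    """Build a scratch table loosely modeled on ActionDefinitions (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ActionDefinitions (ID INTEGER PRIMARY KEY, Target TEXT)")
    conn.executemany(
        "INSERT INTO ActionDefinitions (Target) VALUES (?)",
        [
            # Sample values are invented; only the SMTPSERVER: token matters here.
            ("TO:noc@example.com;SMTPSERVER:10.1.1.25;PORT:25",),
            ("TO:dba@example.com;SMTPSERVER:10.1.1.25;PORT:25",),
            ("LOGFILE:C:\\alerts.log",),  # non-email action, must stay untouched
        ],
    )
    conn.commit()
    return conn


def swap_smtp_server(conn, old_server, new_server):
    """Rewrite 'SMTPSERVER:<old>' to 'SMTPSERVER:<new>'; return the number of rows updated."""
    cur = conn.execute(
        "UPDATE ActionDefinitions "
        "SET Target = REPLACE(Target, 'SMTPSERVER:' || ?, 'SMTPSERVER:' || ?) "
        "WHERE instr(Target, 'SMTPSERVER:' || ?) > 0",
        (old_server, new_server, old_server),
    )
    conn.commit()
    return cur.rowcount


def targets(conn):
    """Return all Target values in ID order."""
    return [row[0] for row in conn.execute("SELECT Target FROM ActionDefinitions ORDER BY ID")]


if __name__ == "__main__":
    conn = make_demo_db()
    before = targets(conn)

    # Start of window: point email actions at the dead address.
    print(swap_smtp_server(conn, "10.1.1.25", "0.0.0.0"), "rows redirected")
    print(targets(conn))

    # End of window: swap old and new and run it again.
    swap_smtp_server(conn, "0.0.0.0", "10.1.1.25")
    print("restored:", targets(conn) == before)
```

One thing the rehearsal makes obvious: the swap-back only restores things cleanly if no action was already pointing at the placeholder address before the window started.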
Has anyone heard of a way to throttle outgoing emails globally on the server when conditions cause a storm of alerts? I have run into situations (okay, "caused" is a better word) where thousands of emails go out due to the wording of the alert. My Exchange god can block them eventually, but is there a way for the NPM software or the server to catch the situation and stop itself, similar to the way that an SNMP trap alert can be told to pause for so many minutes after seeing a certain number of traps?
Hi Steve, the above requirement looks like a typical maintenance requirement. Why do you want to handle it through Alerting?
Why not use the Unmanage Schedule Utility? Unmanage all the devices in the list for the defined maintenance period (the utility takes a start time and the number of hours you want the nodes unmanaged, which gives the end date). Unmanaging and re-managing the nodes is well suited to such requirements.
There are several other ways to achieve this apart from Unmanage/Remanage or alerting. Let me know whether the above approach suits your scenario.
Thanks in advance
Vinay,
I cannot speak for Steve, but I think that the Unmanage function has one major limitation. While it successfully suppresses alerts, it also stops all polling and data collection for the unmanaged nodes. I would prefer to have an option to choose whether to stop all monitoring and alerting or just stop alerting for specific nodes. Maybe we want to see how the specific nodes are doing while they're being worked on, but don't want to get bombarded with alerts. This is my "two cents" contribution; I hope it helps.