
Problems with alerts after conversion to web-based alert manager

Hello,

We recently upgraded our SolarWinds NPM and migrated the alerts to the new web-based alert manager.

Since we did this, one of our alerts has been acting strangely.

We have an alert that emails us when a node's receive or transmit bandwidth reaches 90%.

The alert still works fine when the trigger condition of 90% is met.

However, since upgrading, this alert also triggers seemingly at random even though the trigger condition hasn't been met, so we receive emails for nodes at 0% or 17% utilisation.

Capture.PNG

Has anyone else had this problem?

Thanks

Luke

  • The best way to troubleshoot is to include the actual interface utilisation values in the message body. These will be your trigger values. Can you confirm which variable you have used to show the actual utilisation (it will be in the form of ${...}), and also confirm what values are reported back at the time of the alert?

  • Here is some body HTML to start with:

    <b>Transmit</b>

    |__ Bandwidth: ${Interface.OutBandwidth as Bandwidth}

    |__ Current: ${Interface.Outbps as Bandwidth}bps, ${Interface.OutPercentUtil}%

    |__ Discards: ${OutDiscardsThisHour}

    |__ Errors: ${OutErrorsThisHour}

    <b>Receive</b>

    |__ Bandwidth: ${Interface.InBandwidth as Bandwidth}

    |__ Current: ${Interface.Inbps as Bandwidth}, ${Interface.InPercentUtil}%

    |__ Discards: ${InDiscardsThisHour}

    |__ Errors: ${InErrorsThisHour}

  • Hi Alex

    Name of Action

    NetPerfMon Event Log : Interface ${NetObjectName} on node ${NodeName} received at ${Interface.InPercentUtil}% of its utilization, which triggered this alert. ${SQL:SELECT Macro FROM NetFlowAlertMacros WHERE ID='InInterfaceDetailsLink'}

    Message to send to Network Performance Monitor Event Log

    Interface ${NetObjectName} on node ${NodeName} received at ${Interface.InPercentUtil}% of its utilization, which triggered this alert. ${SQL:SELECT Macro FROM NetFlowAlertMacros WHERE ID='InInterfaceDetailsLink'}

    It then sends an email with the body as follows:

    <font face="calibri" size="3"><font color="red">${NodeName}</font> received at ${Interface.InPercentUtil}% of its utilization, which triggered this alert. ${SQL:SELECT Macro FROM NetFlowAlertMacros WHERE ID='InInterfaceDetailsLink'}</font>

    Thanks for your help

  • Hi Luke,

    Could you paste my previous HTML extract into the email body of the alert and, once it has triggered, paste the resulting email body here so that we can see the actual values? Also, switch the email body type to HTML (there is a radio button for it).

  • Hi Alex,

    I will try that now, Thanks

    It's not just the email alert that shows the wrong % utilisation.

    The actual alerts in the event log show it too. See below the 0% and 3% alerts, despite the rule being set at 90%:

         16/07/2015 11:42     Interface FastEthernet0/0 · Verizon Business MPLS Circuit; VPN: RensburgSheppards; SITE: gui on node rensbu-guildford-2271465 transmitted at 0% of its utilization, which triggered this alert.   
         15/07/2015 19:16     Interface FastEthernet0/0 · Verizon Business MPLS Circuit; VPN: RensburgSheppards; SITE: gui on node rensbu-guildford-2271465 transmitted at 97% of its utilization, which triggered this alert.   
         15/07/2015 00:31     Interface FastEthernet0/0 · Verizon Business MPLS Circuit; VPN: RensburgSheppards; SITE: gui on node rensbu-guildford-2271465 received at 96% of its utilization, which triggered this alert.   
         14/07/2015 18:05     Interface FastEthernet0/0 · Verizon Business MPLS Circuit; VPN: RensburgSheppards; SITE: gui on node rensbu-guildford-2271465 transmitted at 0% of its utilization, which triggered this alert.   
         11/07/2015 14:42     Interface FastEthernet0/0 · Verizon Business MPLS Circuit; VPN: RensburgSheppards; SITE: gui on node rensbu-guildford-2271465 transmitted at 3% of its utilization, which triggered this alert.
  • Hmmmm, yep, I can see it now.... Let's take for example the one at 16/07/2015 11:42. Can you navigate to your interface utilisation chart and click on [export]? Then select Time Period {Last 7 Days} and set the Sample Interval as low as possible. Then click on Export to HTML at the bottom and, once it has loaded, analyse every single polling sample around this time. It may be that you had a spike which triggered the alert, but the value actually reported came from the next poll... Just speculating...

  • Ok so just had a look at what you mentioned and there are no spikes anywhere near the 0% alerts, the lowest sample interval I can set is 5 minutes.

    I'm thinking it has to be a bug

  • Yeah man, something is not quite right... I guess your next move is to log a ticket with support.

  • Had lots of problems switching from the old alerting to the web alerting. Mainly to do with the variables used in the original alerts not being compatible with the web alerting, and having to enter email addresses again in the actions if they were separated by a semi-colon.

  • OK thanks for trying Alex

    I will submit a ticket to support.
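
For anyone wanting to repeat the check discussed above (comparing each event log entry against the polled samples around it), here is a minimal Python sketch. It is not a SolarWinds tool: the function names, the 15-minute window, and the idea that the chart export has already been loaded as a list of (timestamp, percent) pairs are all assumptions made for illustration. The event log line format is taken from the entries pasted earlier in the thread.

```python
import re
from datetime import datetime, timedelta

THRESHOLD = 90  # percent; the trigger condition described in the thread

# Matches event log lines such as:
# "16/07/2015 11:42  Interface ... transmitted at 0% of its utilization, ..."
EVENT_RE = re.compile(
    r"(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2})\s+.*?"
    r"(?P<direction>transmitted|received) at (?P<pct>\d+)% of its utilization"
)


def parse_event(line):
    """Return (timestamp, direction, reported_percent), or None if no match."""
    m = EVENT_RE.search(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%d/%m/%Y %H:%M")
    return ts, m.group("direction"), int(m.group("pct"))


def samples_near(samples, when, window_minutes=15):
    """Keep the (timestamp, percent) samples within the window around `when`."""
    window = timedelta(minutes=window_minutes)
    return [(ts, pct) for ts, pct in samples if abs(ts - when) <= window]


def explain_alert(line, samples, threshold=THRESHOLD, window_minutes=15):
    """Classify one event log line (hypothetical labels, for illustration).

    "expected"     - the reported value itself meets the threshold
    "spike-nearby" - reported value is low, but a nearby sample meets it
    "unexplained"  - nothing near the alert time reaches the threshold
    "unparsed"     - the line does not look like an alert entry
    """
    event = parse_event(line)
    if event is None:
        return "unparsed"
    when, _direction, reported = event
    if reported >= threshold:
        return "expected"
    nearby = samples_near(samples, when, window_minutes)
    if any(pct >= threshold for _, pct in nearby):
        return "spike-nearby"
    return "unexplained"
```

Entries that come back as "unexplained" match what Luke describes (no spike anywhere near the 0% alerts), and are the ones worth attaching to a support ticket. Note that with a 5-minute minimum sample interval, a short spike between polls would not show up in the export, so "unexplained" here does not rule one out.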