Level 12

Alert for response time - use average or current?

Working on an alert for response time using NPM v10.7. If I am correct, the warning and critical response time thresholds that I can set on the Node edit properties page are related to "current" response time. If my trigger condition is pointed to the variable Response Time and the Trigger must be sustained for 4 minutes (2 polls), why am I not seeing any alerts when I have a node that has been above the critical threshold for several polls in a row?
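To make the "sustained for 4 minutes (2 polls)" condition concrete, here is a minimal sketch of one way such a trigger could be evaluated. It is not NPM's alert engine. The function name `first_alert_poll`, the 300 ms threshold, and the sample values are all hypothetical, and it assumes "sustained" means a run of consecutive polled values above the threshold.

```python
def first_alert_poll(values, threshold, polls_required):
    """Return the index of the poll at which the alert would fire, or None.

    Assumption: the alert fires once `polls_required` consecutive polled
    values exceed `threshold`; any value at or below it resets the count.
    """
    streak = 0
    for index, value in enumerate(values):
        streak = streak + 1 if value > threshold else 0
        if streak >= polls_required:
            return index
    return None


POLL_INTERVAL_MIN = 2   # polling interval implied by "4 minutes (2 polls)"
SUSTAIN_MIN = 4         # how long the trigger must be sustained
polls_required = SUSTAIN_MIN // POLL_INTERVAL_MIN  # 2 polls

# Hypothetical response times in ms against a hypothetical 300 ms threshold.
history = [120, 350, 280, 340, 360, 410]
print(first_alert_poll(history, threshold=300, polls_required=polls_required))
```

With these made-up values, the isolated 350 ms reading does not fire because the next poll drops back under the threshold. The alert would fire at index 4, the second of two consecutive readings above 300 ms.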

0 Kudos
13 Replies
Level 10

aLTeReGo, what's the difference between average response time and current response time? Over what time frame is the average taken?

Also, I can see the two column names below in the alert trigger configuration. What's the difference between them?

pastedImage_0.png

Thanks

Amrita

0 Kudos

It depends on the context and where it's used. If you select this metric in Alerting, the last polled value is used, so no average is applied. If you select this same metric in reporting and apply a timeframe such as the last 30 days, the value is averaged over those 30 days.

So if I have to create an alert on "node response time is higher than 20 ms for the last 30 minutes", what do I use? Current response time? Does the database store the current response time value for each poll, or just the last poll, and then start averaging?

If there is only one value for current response time, then my alert condition of "higher than 20 ms for X minutes" will never fire and is only good for a last-poll condition.

If it is based on average response time, then I am afraid the averaging would smooth out the spikes and never capture a value that stays high for a sustained period.

Please suggest.

Thanks

0 Kudos

Correction to my statement above. A quick peek at the source code reveals that Min, Max, and Average Response Times are calculated based upon the last 10 polled values.

0 Kudos

So as long as my wait time is shorter than the last 10 polls, I can use current response time? Which of the options below should I be using if I am polling

Response Time Capture.JPG

Also confused between:

Response Time Capture 1 .JPG

It is a critical alert, and I don't want to have to explain to my management why the alert did not trigger when needed. Your help will be appreciated.

Thanks

0 Kudos

'Response Time' is the last polled value. The Average, Minimum, and Maximum are based upon the last 10 values. Depending upon what you're most interested in, you will need to choose the value that's best for you. 'Average' is the most common as current 'Response Time' and Min/Max values can be heavily influenced by a single value.
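As an illustration only, the sketch below computes the four values over a rolling window of the last 10 polls, as described above. The helper `rolling_stats` and the sample data are hypothetical, and the real product may differ in details such as how it handles fewer than 10 polls.

```python
from collections import deque
from statistics import mean

WINDOW = 10  # number of polls the Min/Max/Average are said to be based on


def rolling_stats(samples, window=WINDOW):
    """Yield (current, minimum, maximum, average) after each poll."""
    recent = deque(maxlen=window)  # oldest value drops out automatically
    for value in samples:
        recent.append(value)
        yield value, min(recent), max(recent), mean(recent)


# Hypothetical response times in ms: steady 10 ms with one 200 ms spike.
polls = [10, 10, 10, 10, 200, 10, 10, 10, 10, 10]
for current, lo, hi, avg in rolling_stats(polls):
    print(f"current={current:>3}  min={lo:>3}  max={hi:>3}  avg={avg:6.1f}")
```

In this made-up series the current value returns to 10 ms one poll after the spike, the maximum stays at 200 ms for as long as the spike remains in the window, and the average ends at 29 ms. That is the sense in which a single value weighs heavily on current and Min/Max but less on the average.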

0 Kudos

So if I need to wait for 30 minutes (and I poll every 2 minutes), current response time is out of the question, since it only stores the last polled value and is meaningless for my alerting.

Using Average is not ideal either, because a few low readings can pull the average down and hide an otherwise consistently high value.

Also, in my screenshot above I see two Average Response Time values. Which one is more suitable for the alert condition I have already described: node response time polled value is consistently higher than 20 ms for the last 30 minutes?
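A rough sketch of the arithmetic behind this concern, assuming the 10-poll window mentioned earlier in the thread and the 2-minute polling interval stated here. The function `window_checks` and the sample series are hypothetical. Whether an NPM alert can actually be conditioned on the minimum in this way is something to verify in the product.

```python
POLL_INTERVAL_MIN = 2    # polling interval stated in the post
WINDOW_POLLS = 10        # window size stated earlier in the thread
THRESHOLD_MS = 20
TARGET_MIN = 30          # "for the last 30 minutes"

window_minutes = WINDOW_POLLS * POLL_INTERVAL_MIN    # minutes the window spans
polls_for_target = TARGET_MIN // POLL_INTERVAL_MIN   # polls in 30 minutes


def window_checks(values, threshold=THRESHOLD_MS, window=WINDOW_POLLS):
    """Return (avg_above, min_above) for the last `window` values."""
    recent = list(values)[-window:]
    avg_above = sum(recent) / len(recent) > threshold
    min_above = min(recent) > threshold  # true only if every value is above
    return avg_above, min_above


print(window_minutes, polls_for_target)
print(window_checks([25] * 9 + [5]))    # one dip below the threshold
print(window_checks([5] * 9 + [200]))   # one large spike
print(window_checks([25] * 10))         # consistently above
```

Under these assumptions the window spans 20 minutes, while 30 minutes would need 15 polls. The average exceeds 20 ms in all three series, including the one with a single spike, whereas the minimum exceeds 20 ms only when every poll in the window did.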

0 Kudos

Yes, that's right!

Since both the 'average response time' and 'current response time' variables are available when setting up a trigger condition, it's really confusing what 'average response time' means here.

Thanks for the quick help !

0 Kudos
Level 17

Would you perhaps be able to post a screenshot of the trigger condition?

0 Kudos

Well I finally got an alert but I should not have. The current response time in the alert message was 349 ms, which is NOT greater than either the warning or critical thresholds assigned on the Edit Properties page.

So why did I get an alert?

0 Kudos

test image.jpg

0 Kudos

Here is the chart of response time data for the last hour

0 Kudos

Here it is: 6-6-2014 5-01-45 PM.jpg

0 Kudos