Ever since upgrading last week from 2019 to 2020, when a node goes down it sends the event "x has stopped responding (request timed out)" every 3 minutes until the node comes back up. Also, the "Node is down" event often does not trigger. Right now I'm looking at a node that is hard down, but it's in warning and has been for 30 minutes, whereas on 2019 it would have shown as down within a minute.
When a node is down, I don't want it to populate the event log with "Request timed out" every 3 minutes.
And why isn't the "Node is down" event triggering when the node is down?
Thank you very much for your help!
EDIT: I've attached a photo which I think outlines the issue better. See how long it took to send "Node is down" and how many timeouts it got before that?
Node warning level is set to 120 seconds.
And here are the screenshots you asked for. It's just the default "Node is Down". I also added a SWQL query image where you can see event type 1 triggering multiple times on the same node, along with the timestamps. I removed the message column only to hide our node names.
Thanks for taking a look at this. It's been driving me a bit crazy for a week. I can also look further down in that query to see how it behaved before the 2020 update, and it was normal.
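In case it helps anyone compare before and after the upgrade, here is a rough Python sketch of how the query results could be checked for repeats. It assumes the rows have been exported as (node, event type, timestamp) tuples; the function name, the tuple layout, and the sample rows are all made up for illustration and are not taken from the actual query or database.

```python
from collections import defaultdict
from datetime import datetime


def repeat_gaps_minutes(events, event_type=1):
    """Group events of one type by node and return the gaps, in minutes,
    between consecutive occurrences on each node.

    events: iterable of (node, event_type, timestamp) tuples.
    This layout is an assumption, not the real query output.
    """
    by_node = defaultdict(list)
    for node, etype, ts in events:
        if etype == event_type:
            by_node[node].append(ts)

    gaps = {}
    for node, stamps in by_node.items():
        stamps.sort()
        gaps[node] = [
            (later - earlier).total_seconds() / 60
            for earlier, later in zip(stamps, stamps[1:])
        ]
    return gaps


# Made-up sample rows, not real event data
sample_events = [
    ("node-a", 1, datetime(2020, 6, 1, 10, 0)),
    ("node-a", 1, datetime(2020, 6, 1, 10, 3)),
    ("node-a", 1, datetime(2020, 6, 1, 10, 6)),
    ("node-a", 5, datetime(2020, 6, 1, 10, 7)),
    ("node-b", 1, datetime(2020, 6, 1, 10, 1)),
]

gaps = repeat_gaps_minutes(sample_events)
print(gaps)
```

A node that logs the same event over and over shows up with a list of evenly spaced gaps, while a node that logged it once gets an empty list.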
I've seen this when the next poll time is set to a date in the past. Can you look at the Nodes table in the DB for a node experiencing this and see what its next poll time is?
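If there are a lot of nodes to go through, something like the Python sketch below could flag the ones whose next poll time is already in the past. It assumes the relevant rows have been exported from the Nodes table as (caption, next poll time) pairs; the function name, the pair layout, and the sample values are hypothetical and only there to show the comparison.

```python
from datetime import datetime, timezone


def nodes_with_stale_next_poll(rows, now=None):
    """Return the captions of nodes whose next poll time is before `now`.

    rows: iterable of (caption, next_poll) pairs, where next_poll is a
    timezone-aware datetime. This layout is an assumption about how the
    Nodes table rows were exported.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return [caption for caption, next_poll in rows if next_poll < now]


# Made-up sample rows and reference time, not real data
reference_time = datetime(2020, 6, 1, 12, 0, tzinfo=timezone.utc)
sample_rows = [
    ("node-a", datetime(2020, 6, 1, 12, 2, tzinfo=timezone.utc)),
    ("node-b", datetime(2020, 5, 20, 8, 0, tzinfo=timezone.utc)),
]

stale = nodes_with_stale_next_poll(sample_rows, reference_time)
print(stale)
```

Any node it returns has a next poll time in the past and would be worth a closer look.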
I checked several of them and their next poll times currently are all in the future.
I'll check this if I see the issue again. Two days ago I changed the node warning level from 120 seconds to 90 seconds, and I haven't seen the issue since. I have no idea whether that fixed it or whether it was something else, but so far so good.