If you want to change the polling interval for a single node, you do that on the Node Properties page (go to node details, click node properties, and adjust it there).
If you want to change the polling interval for everything in the system (not advised), go to the polling settings, but after changing it you have to click the button that says "apply to all" (or something like that - I'm going from memory).
For polling (meaning ping / up-down) monitoring, the process is:
- one check (ping for the IP, SNMP get for the status of an interface or volume) every 120 seconds (unless you change it)
- if that check fails,
- the system is put into a "warning" state
- SolarWinds does a "rapid ping" check, where it sends 1 check every 5 seconds until 10 in a row come back up or down.
- after 10 bad checks (50 seconds) the system is marked down
- now SolarWinds goes back to checking every 120 seconds.
- If that check comes back different (i.e. "up" if the node is down) then the same rapid check occurs until 10 checks return a consistent result.
While you can change the polling cycle (change 120 seconds to something else - down to 10 seconds at the lowest) you can't affect the rapid check mode.
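To put numbers on the sequence described above, here is a rough sketch in Python. It is only arithmetic on the figures quoted in this post (120-second polling, 5-second rapid checks, 10 checks in a row). The function and parameter names are made up for illustration and are not anything from SolarWinds.

```python
def time_to_down(poll_interval=120, rapid_interval=5, rapid_checks=10):
    """Return (best, worst) seconds between a node failing and being marked down.

    Assumes the model described in the post: a regular check every
    poll_interval seconds, then rapid_checks rapid checks spaced
    rapid_interval seconds apart once a regular check fails.
    """
    rapid_phase = rapid_interval * rapid_checks
    # Best case: the node fails just before the next regular check.
    best = rapid_phase
    # Worst case: the node fails just after a successful regular check.
    worst = poll_interval + rapid_phase
    return best, worst


print(time_to_down())                  # (50, 170)
print(time_to_down(poll_interval=10))  # (50, 60)
```

With the defaults, a node would be marked down somewhere between 50 and 170 seconds after it actually fails, depending on where in the 120-second cycle the failure lands.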
As for using one polling cycle during the day and another at night, you can 100% accomplish this using the Orion SDK. I wrote about it and did a video on it. Those resources are here:
My understanding is that the Node Warning Level (which also defaults to 120 secs) is used to determine the maximum time that a non-responding node will remain in Warning status before it is set to Down status. It has been explained to me that during this time, the node is polled every ten seconds until it either 1) responds, or 2) the 120 secs (or whatever the setting is) expires at which point the status changes to Down. We could both be wrong, but we can't both be right in our understanding of how this works. I have also understood that it is not a good idea to have a Node Warning Level setting that is greater than the polling interval.
What say you?
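The behaviour described in this reply can be sketched in a few lines of Python. This assumes the Node Warning Level works as stated here (a fixed window, with fast polls every ten seconds while the node is in Warning). The names below are illustrative only and do not come from the product.

```python
def node_status(seconds_unresponsive, warning_level=120):
    """Status of a node that has not answered for the given number of seconds.

    Assumption: the node stays in Warning until warning_level seconds
    have passed without a response, then it is set to Down.
    """
    if seconds_unresponsive <= 0:
        return "Up"
    if seconds_unresponsive < warning_level:
        return "Warning"
    return "Down"


def fast_polls_in_window(warning_level=120, fast_poll_interval=10):
    """Number of fast polls that fit into the warning window."""
    return warning_level // fast_poll_interval


print(node_status(60))         # Warning
print(node_status(120))        # Down
print(fast_polls_in_window())  # 12
```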
This is the problem with being old sometimes - you remember old things. I quoted what was true back when I looked into it, many moons ago. However, you are correct:
That said, the essence of what I was trying to say was still correct - you need to understand the polling/fast polling intervals to understand how long it will take for a node to be "down".
As I mentioned, the polling cycle (for the system overall or for each node individually) can be tuned.
As YOU mentioned, the "fast polling" cycle (for the system overall) can be tuned.
We had a similar problem with our Orion deployment reaching 96% of the maximum polling rate, and we needed to add many more nodes.
To lower the polling rate, we first tried to increase the time between polls in the global settings. But after that, the warning was still showing up with 96% of the maximum polling rate reached.
I spot-checked some nodes and their polling intervals were still at the standard settings (extra polling settings?).
Therefore, instead of reconfiguring every single node, I connected to the SQL Server instance and changed the polling intervals in the database table NodesData:
First, I stopped all Orion services before making any changes.
Second, I made a full backup of our SolarWindsOrion database.
You can check the current polling intervals with the following statement (columns: RediscoveryInterval, PollInterval, StatCollection):
Change RediscoveryInterval to '90' on all nodes where it is set to '30':
UPDATE NodesData
SET RediscoveryInterval = '90'
WHERE RediscoveryInterval = '30'
Change PollInterval to '60' on all nodes where it is set to '20':
UPDATE NodesData
SET PollInterval = '60'
WHERE PollInterval = '20'
Change StatCollection to '30' on all nodes where it is set to '10':
UPDATE NodesData
SET StatCollection = '30'
WHERE StatCollection = '10'
After that, I started all Orion services and spot-checked some nodes; the polling intervals were set to the new values.
Furthermore, after a new deployment health check, the "Orion deployment reaching 96% polling rate" warning was gone.
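If you want to see what this kind of guarded update does before touching a production database, here is a small sketch in Python. It uses an in-memory SQLite table as a stand-in, not the real Orion SQL Server database. The table and interval column names are taken from this post; the NodeID column and the sample rows are made up for the demonstration.

```python
import sqlite3

# In-memory stand-in for the NodesData table (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE NodesData ("
    "NodeID INTEGER PRIMARY KEY, "
    "RediscoveryInterval INTEGER, "
    "PollInterval INTEGER, "
    "StatCollection INTEGER)"
)
conn.executemany(
    "INSERT INTO NodesData VALUES (?, ?, ?, ?)",
    [(1, 30, 20, 10), (2, 30, 20, 10), (3, 60, 120, 15)],  # sample rows
)

# (column, old value, new value) pairs matching the changes described above.
changes = [
    ("RediscoveryInterval", 30, 90),
    ("PollInterval", 20, 60),
    ("StatCollection", 10, 30),
]

with conn:  # commit all three updates together
    for column, old, new in changes:
        # Only rows still holding the old value are touched.
        conn.execute(
            f"UPDATE NodesData SET {column} = ? WHERE {column} = ?",
            (new, old),
        )

rows = conn.execute(
    "SELECT NodeID, RediscoveryInterval, PollInterval, StatCollection "
    "FROM NodesData ORDER BY NodeID"
).fetchall()
print(rows)  # [(1, 90, 60, 30), (2, 90, 60, 30), (3, 60, 120, 15)]
```

Node 3 keeps its own values because none of them match the old defaults, which is the point of the WHERE clause: nodes that were tuned individually are left alone.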