
Polling & Time out settings

We have a number of monitors that appear to be timing out. Most of our monitors have the polling frequency and the timeout set to the same value (1 minute seems to be the normal setting). However, we also have a number of monitors with a polling frequency of 1 minute and a timeout of 5 minutes. This appears to be causing a number of timeouts and false critical/down conditions (most of these monitors are PowerShell scripts).

Is there a best practices guide or is there anyone who has an idea as to what the standard configuration should be?

  • Default polling interval and timeout values are 300 seconds, or 5 minutes. Depending upon the number of components in an application and their type, one minute polling may be unrealistic. You may also find that your poller is simply unable to handle the load, depending upon the total number of components being polled in one minute intervals. Have you looked at your polling details page under [Settings -> Polling Engines]? Are any of the highlighted values extremely high, possibly approaching 100%?

    Polling Details.png

  • My SAM Application Polling Rate is hovering around the 47% mark. Other than that, the numbers look good. My main question is: is it typical to have the polling frequency and the timeout set to the same value, or to have the timeout greater than the polling frequency?

  • Hello,

    For historical reasons, the job timeout is the configured timeout multiplied by 4 (this value is configurable in a config file), so a 5 minute timeout means a 20 minute job timeout. If this job timeout is exceeded, the application goes to the "Unknown" state with the error message "Job cancelled by scheduler." So this job timeout doesn't cause any false down or critical alerts.

    Lukas Belza (SolarWinds development)
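
To make the arithmetic above concrete, here is a minimal sketch in Python. The multiplier of 4 comes from the post; the constant and function names are hypothetical, since the thread does not name the config key.

```python
# Multiplier taken from the post above; the real config key name is not given
# in the thread, so this constant name is purely illustrative.
JOB_TIMEOUT_MULTIPLIER = 4


def job_timeout_seconds(timeout_seconds, multiplier=JOB_TIMEOUT_MULTIPLIER):
    """Return the scheduler job timeout derived from the configured timeout."""
    return timeout_seconds * multiplier


for timeout in (60, 300):
    total = job_timeout_seconds(timeout)
    print(f"{timeout}s timeout -> {total}s job timeout ({total / 60:.0f} min)")
```

With the default 300 second timeout this gives 1200 seconds, the 20 minutes mentioned above.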

  • The short answer is yes, it's perfectly normal that the polling frequency and timeout values would be the same. In fact, this is the default behavior.

  • Any idea why our PowerShell scripts time out? We seem to have a bigger issue with those monitors than with any other.

  • PowerShell scripts can take longer to start up (spawn the PowerShell.exe process), load the necessary cmdlets, and execute than you've allowed with 1 minute polling. This is especially true where WinRM is involved. It's also important to note that the timeout value applies to the application as a whole, and not just one component monitor within the application, so applications that have a greater number of component monitors will take longer to run (and thus need a higher timeout value) than those that only have one or two components. The default values of 300/300 work best in almost every scenario. I would recommend polling less often: roughly double the current polling interval and timeout values and see if this significantly reduces the number of timeouts you're receiving.
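
As a rough illustration of why the application-wide timeout matters, the sketch below adds up per-component run times and compares the total with the timeout. It assumes the components run one after another, which the thread does not confirm, and the durations are made-up numbers rather than measurements.

```python
def fits_in_timeout(component_seconds, timeout_seconds):
    """Return (total run time, whether it fits), assuming sequential components."""
    total = sum(component_seconds)
    return total, total <= timeout_seconds


# Hypothetical run times for three PowerShell components in one application.
durations = [20, 25, 30]

for timeout in (60, 120):
    total, ok = fits_in_timeout(durations, timeout)
    verdict = "fits" if ok else "would time out"
    print(f"total {total}s against a {timeout}s timeout: {verdict}")
```

Here three components that each finish well inside a minute still add up to 75 seconds, so a 60 second timeout is too tight while the doubled value of 120 seconds is not.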

  • We were FINALLY able to start investigating this issue in more depth. It appears that the MONITOR is not timing out; rather, a piece of the script is timing out. A number of our PS scripts make web calls via .NET. The error that we have been able to trap is "web request timed out". This happens within the script and at varying points. This appears to be an issue with Orion and the PowerShell component.

    We performed a number of troubleshooting steps and collected data that help us prove this. We ultimately wrote an application that runs the PowerShell scripts so that we could isolate the issue. We now have in place a very basic PS script that runs the standalone executable. The executable runs the PS monitor scripts with detailed logging.

    Does anyone have any idea why our Orion server seems to be having issues with PS scripts? We are running them locally rather than remotely.

    Thanks
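
For anyone wanting to try the same kind of isolation, a wrapper along these lines might be a starting point. This is not the poster's executable, only a minimal Python sketch of the idea: run a command outside Orion, enforce a timeout, and record how long it took. The function name and the returned fields are hypothetical.

```python
import subprocess
import sys
import time


def run_logged(command, timeout_seconds):
    """Run a command, enforce a timeout, and return timing and output details."""
    start = time.monotonic()
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_seconds
        )
        outcome = {
            "exit_code": result.returncode,
            "stdout": result.stdout.strip(),
            "stderr": result.stderr.strip(),
            "timed_out": False,
        }
    except subprocess.TimeoutExpired:
        # The child process is killed by subprocess.run when the timeout expires.
        outcome = {"exit_code": None, "stdout": "", "stderr": "", "timed_out": True}
    outcome["elapsed_seconds"] = round(time.monotonic() - start, 3)
    return outcome


# Stand-in command so the sketch runs anywhere. For a real monitor script the
# command would be something like
# ["powershell.exe", "-NoProfile", "-File", "monitor.ps1"] (hypothetical path).
print(run_logged([sys.executable, "-c", "print('ok')"], timeout_seconds=30))
```

Comparing the elapsed time and error output from runs like this against what Orion reports can help show whether the delay is in the script itself or in how it is launched.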

  • Depending on how many PowerShell component monitors you have assigned, you could be running into a PowerShell session limit, which offhand I believe is 5 simultaneous PowerShell sessions locally. Don't quote me though. There is a limit, but I can't recall the exact number Windows enforces.

  • Each monitor had one PS component... HOWEVER... there were 10 monitors. Could this be an issue with the Orion server not being able to handle XX number of PS monitors within a certain polling period?

  • SAM has no such limitation. The only limits on parallel PowerShell sessions come from the Windows operating system itself. They are designed to limit potential misuse by worms/viruses and to control resource consumption, and they can be raised as needed on a machine-by-machine basis.

    PowerShell Limit.png
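
To see how a session limit could turn into timeouts, here is a small simulation in Python. It assumes 10 scripts that each take 30 seconds and a limit of 5 parallel sessions; the 5 is the unverified figure quoted earlier in the thread, and the 30 second run time and 45 second timeout are hypothetical.

```python
import heapq


def finish_times(job_seconds, max_parallel):
    """Return when each job finishes if at most max_parallel can run at once."""
    # Each heap entry is the time at which one session slot becomes free.
    slots = [0.0] * max_parallel
    heapq.heapify(slots)
    finished = []
    for duration in job_seconds:
        start = heapq.heappop(slots)  # wait for the earliest free slot
        end = start + duration
        heapq.heappush(slots, end)
        finished.append(end)
    return finished


times = finish_times([30] * 10, max_parallel=5)
late = [t for t in times if t > 45]  # hypothetical 45 second timeout
print(times)
print(f"{len(late)} of {len(times)} scripts finish after 45 seconds")
```

Under these assumptions the second batch of five scripts has to wait for a free session and finishes at 60 seconds, even though no individual script is slow. Raising the limit to 10 in the same sketch lets every script finish at 30 seconds.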