WMI causing Orion Web Console to crash

When a server that is polled using WMI goes down, SW keeps trying to poll that node on every port until it reaches port exhaustion, which causes the Orion Web Console to crash. This never happened on previous versions, but ever since upgrading to 2020.2.1 I have had this issue any time a server goes down. Is there a way to make SW stop trying to poll a node using WMI when it is simply unreachable? Or is it better to deploy the agent on Windows servers rather than using WMI for this reason?

This is the DCOM error that shows up in the event logs when SW is trying to poll a Windows Server that is down:

[Screenshot of the DCOM error: MasonLan_0-1610637607791.png]
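
For anyone trying to confirm port exhaustion on the polling engine, here is a minimal Python sketch (not part of Orion) that summarizes the text output of `netstat -ano`. It assumes the standard Windows column layout (Proto, Local Address, Foreign Address, State, PID). The function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def _tcp_rows(netstat_output: str):
    """Yield (remote_address, state) for each TCP row in netstat text."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed row shape: TCP <local addr:port> <remote addr:port> <STATE> <PID>
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            yield fields[2], fields[3]


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections by state (TIME_WAIT, SYN_SENT, ...)."""
    return Counter(state for _, state in _tcp_rows(netstat_output))


def connections_per_remote(netstat_output: str) -> Counter:
    """Count non-listening TCP connections per remote host."""
    counts = Counter()
    for remote, state in _tcp_rows(netstat_output):
        if state == "LISTENING":
            continue
        # Strip the port; rsplit keeps bracketed IPv6 addresses intact.
        counts[remote.rsplit(":", 1)[0]] += 1
    return counts
```

Feeding it the captured output, for example `connections_per_remote(text).most_common(5)`, shows which remote hosts hold the most local ports. A large number of SYN_SENT or TIME_WAIT rows toward a single down server would be consistent with the behavior described above, although it would not prove the cause on its own.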

  • We are polling 970 nodes total split between two polling engines. 566 of those are servers. 321 are being polled using WMI.

    Main Polling Engine: 16 CPU(s), 28 GB Memory

    Additional Polling Engine: 4 CPU(s), 16 GB Memory

    SQL Server: 8 CPU(s), 64 GB Memory

    Both polling engines have a polling completion of 100%, and the polling rate is around 50% on both.

    Do you think it would make a difference if I moved all Windows servers using WMI over to my additional polling engine, rather than polling them from my main polling engine where the Orion website is hosted?

  • Those specs and stats look pretty good.

    I was thinking that polling from the web server was causing your problem. But do I hear you right, that WMI polling is from a different server than the web console?

    How many servers that you poll are down at any one time? If it's only a few, then this WMI thing could be a red herring. You may need to open a case to look deeper.

  • Sorry, I may have been unclear.

    Currently all servers using WMI polling are being polled from the main polling engine which also hosts the web console. It sounds like moving those nodes over to the additional polling engine might help resolve the issue?

    There are not usually very many Windows Servers down at a time, but every time the web console has crashed, I have seen lots of DCOM errors for a Windows Server using WMI in the event logs just like the one included in my original post. 

  • Well, I wouldn't guarantee that moving the WMI-polled servers off the server that hosts the web console would solve it completely. You can probably stop the web console from crashing, but you may still get browser lag if any servers still polled from the web server are down.

    My experience is that even a few WMI nodes being down simultaneously, each with multiple objects being polled, puts a load on the web server. I think web performance got worse with the 2020 releases, but I haven't had a chance to track down anything specific.

    So I think you may be better off opening a case, and getting to the bottom of it all.

  • Perfect. I will move the WMI nodes to my APE and change polling to slightly longer intervals. I have a ticket open and am waiting to hear back from SW, as they are understandably busy right now. Thanks for your suggestions.
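
As a side check while the ticket is open, a script run outside Orion can tell whether a WMI node is reachable at all before anything attempts DCOM against it. The sketch below is only an illustration, not an Orion feature or setting. It assumes that a TCP connect to port 135 (the RPC endpoint mapper that WMI over DCOM contacts first) is a good enough signal. The function name and the 2-second timeout are arbitrary choices.

```python
import socket


def rpc_endpoint_reachable(host: str, port: int = 135, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    A refused, timed-out, or unroutable connection returns False. This only
    checks the endpoint mapper; it does not authenticate or run a WMI query.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Each call opens at most one short-lived connection, so looping it over the WMI node list costs far fewer ports than repeated failed polls. A result of False for a node that Orion still shows as polled via WMI would line up with the DCOM errors in the event log.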