Greetings All.
Running a Solarwinds Self-Hosted solution, version 2024.1: one MPE (Main Polling Engine) with several APEs (Additional Polling Engines). Some of the APEs live at remote sites (remote = accessed over a WAN link).
Chasing a problem where every few days Solarwinds slows to a crawl and eventually stops working completely (errors generated navigating between screens).
Looking under SAM >> Orion Observability - Main Polling Engine we have a component monitor called TCP Port Usage Count.
Reviewing TCP Port Usage over several days, we see a pattern where the port count climbs steadily until the MPE stops reporting on itself altogether (i.e. a gap in the graph data). The high TCP port counts and the gaps line up with the times that Solarwinds "rolls over and dies" (our team descriptor for the problem). Rebooting the MPE restores SW functionality to normal.
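To narrow down which process is holding the ports, one option is to snapshot the connection table on the MPE and group it by state and owning PID. The following is only a rough sketch, not anything from Solarwinds: it assumes the MPE is a Windows server where `netstat -ano` is available, and the function names (`parse_netstat`, `snapshot`) are made up for illustration.

```python
import subprocess
from collections import Counter


def parse_netstat(text):
    """Count TCP rows in `netstat -ano` output by state and by owning PID.

    Assumes the Windows column layout: Proto, Local, Foreign, State, PID.
    """
    by_state = Counter()
    by_pid = Counter()
    for line in text.splitlines():
        fields = line.split()
        # TCP rows have exactly five columns; UDP rows have no State column
        # and the header lines do not start with "TCP", so both are skipped.
        if len(fields) == 5 and fields[0].upper() == "TCP":
            by_state[fields[3]] += 1
            by_pid[fields[4]] += 1
    return by_state, by_pid


def snapshot():
    """Run netstat on the local machine and return (by_state, by_pid)."""
    result = subprocess.run(
        ["netstat", "-ano"], capture_output=True, text=True, check=True
    )
    return parse_netstat(result.stdout)
```

Calling `snapshot()` on a schedule and logging `by_pid.most_common(5)` along with the state totals should show whether one PID's count keeps growing, and whether the growth is in ESTABLISHED, TIME_WAIT or CLOSE_WAIT connections. The PID can then be matched to a service name in Task Manager.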
A team member has opened up a support ticket with Solarwinds to chase the issue from that angle.
I welcome any ideas from the community for what might cause an ever-increasing number of TCP ports in use by an MPE.
Thanks!