I would concentrate on finding out why the automatic, Orion-managed schedule for DB maintenance is crashing.
Check swdebugMaintenance.log in c:\ProgramData\Solarwinds\Logs\Orion after a nightly run and see what issues are being logged. I suggest you first increase the logging level using the Log Adjuster utility on the Orion server so you get more visibility.
It could simply be a permissions issue or an internal application issue, but the log is the place to find out which.
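If the log turns out to be long, a small script can pull out the lines worth reading. This is only a rough sketch: the path is the one mentioned above, but the keywords (ERROR, FATAL, WARN, Exception) are my assumption about how problems show up in the file, and `find_problem_lines` is just a helper name made up for illustration. Adjust both to match what your log actually contains.

```python
from pathlib import Path

# Path as given in the post above; adjust if your install differs.
LOG_PATH = Path(r"c:\ProgramData\Solarwinds\Logs\Orion\swdebugMaintenance.log")

# Assumed markers of trouble; the real log layout may use different wording.
KEYWORDS = ("ERROR", "FATAL", "WARN", "Exception")


def find_problem_lines(lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(keyword in line for keyword in keywords):
            hits.append((number, line.rstrip()))
    return hits


if __name__ == "__main__":
    if LOG_PATH.exists():
        # errors="replace" avoids a crash if the file has odd bytes in it.
        with LOG_PATH.open(encoding="utf-8", errors="replace") as handle:
            for number, text in find_problem_lines(handle):
                print(f"{number}: {text}")
    else:
        print(f"Log not found at {LOG_PATH}")
```

Run it on the Orion server the morning after a nightly maintenance run, and compare the output before and after raising the log level.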
I didn't know there was an automatic schedule. Where do I see this?
All I know is that it starts to crash on random pages (usually OK after a refresh or two) and gets progressively worse until we run DB maintenance manually.
We also run Solarwinds in the most complicated infrastructure ever, as we have a single instance serving 3 regions, each with its own poller. The region it's hosted in seems to have fewer problems like this; the issues affect the other regions a lot more, where the database is not locally hosted.
I don't know if this has any bearing on the possible cause. I also know that we couldn't completely resolve the crashing issues when we had a ticket open directly with SW.
We used to get frequent notifications that our database needed maintenance, even with the nightly maintenance scheduled. We found a hotfix to address this: Orion Platform 2015.1.2 Hotfix 5 (SolarWinds Worldwide, LLC. Help and Support).
The fix is listed specifically under HotFix2, but any later hotfix should address it too (assuming it is the same issue we had). As for where the schedule is: under Settings, Polling Settings, there is a Database section with an option to set the time at which the maintenance runs. Hope this works for you.
We contemplated moving some of our pollers "out into the field", away from the NPM server and database, but opted to keep them all centrally located. While we don't get the "local latency" in the reports, we just have to know our WAN latency is there as a baseline. We had a lot of discussion on this and decided that performance on the backend was too important to take that risk. Plus, the APE can't do anything if it loses connectivity back to the main data center, so it didn't buy us much from that standpoint. I guess the benefit would be that if the APE is offline, there wouldn't be a lot of alerts when the WAN link goes down.