I'm sure it's been discussed to death, but after looking at the sub FAQ and perusing subreddit search results for 10 minutes I didn't find anything helpful.
My org has been using SolarWinds Server and Application Monitor (SAM) for a long time; it works adequately but fails in some areas.
I'm looking for suggestions of potential replacements that could have the following capabilities:
Not Nagios (used it before, seems great for a smaller env, and great for a large one if you have an FTE or two to run it)
Ability to scan an IP range and automatically add any found devices as monitored systems
Ability to scan an AD OU and automatically add any found devices as monitored systems
Ability to look at vCenter and automatically add any unmonitored VMs as monitored systems
Ability to run PowerShell commands against it to add/remove monitored nodes, mute/unmute alerting, configure node settings, configure node data
Ability to send customizable email alerts when a node down/critical state is encountered; and when the alert state is cleared (or ability to call API scripting when an alert state is detected/cleared)
Ability to monitor Windows Server 2003 through current (preferably agentless), AIX, Linux (RHEL and Ubuntu), SQL Server 2012 through current, Exchange, ESXi/vCenter, AHV/Prism/Prism Central, and HPE, Dell and Nutanix hardware
HA capability
Scalability to monitor ~4000 systems with multiple monitored components on each system
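To make the discovery items above concrete, here is a rough Python sketch of "scan an IP range and add whatever isn't monitored yet". Everything in it is an assumption for illustration: the names `tcp_probe` and `discover_unmonitored`, the choice of TCP port 135 as a liveness check, and the idea that the monitored set is just a collection of IP strings. It is not how SAM or any other product does discovery.

```python
import ipaddress
import socket
from typing import Callable, Iterable, List


def tcp_probe(host: str, port: int = 135, timeout: float = 0.5) -> bool:
    """Hypothetical liveness check: True if a TCP connect to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def discover_unmonitored(
    cidr: str,
    monitored: Iterable[str],
    probe: Callable[[str], bool] = tcp_probe,
) -> List[str]:
    """Return responding addresses in `cidr` that are not already monitored."""
    network = ipaddress.ip_network(cidr, strict=False)
    known = set(monitored)
    found = []
    for ip in network.hosts():
        addr = str(ip)
        # Skip nodes we already monitor before spending time probing them.
        if addr not in known and probe(addr):
            found.append(addr)
    return found
```

Something like `discover_unmonitored("10.0.0.0/24", already_monitored)` would give the list of addresses to hand to whatever "add node" call the monitoring tool exposes. The AD OU and vCenter cases are the same diff with a different source of candidates.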
I know I'm probably looking for a lot, but we get nearly all of this out of SolarWinds SAM (we don't seem to get AHV/Nutanix monitoring, or API scripting on alert state detect/clear), and the system goes ass up every month when we reboot the SAM servers for Windows updates.
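For the "call a script when an alert state is detected/cleared" part, this is roughly the behavior being asked for, as a minimal Python sketch. The class name `AlertTracker`, the `"down"`/`"cleared"` labels, and the callback signature are all made up for illustration, and the sketch assumes a node that has never been seen starts out healthy.

```python
from typing import Callable, Dict


class AlertTracker:
    """Fires a callback only when a node changes state, not on every poll."""

    def __init__(self, on_change: Callable[[str, str], None]) -> None:
        self._on_change = on_change
        self._is_up: Dict[str, bool] = {}

    def update(self, node: str, is_up: bool) -> None:
        # Assumption: a node with no history is treated as previously up.
        previous = self._is_up.get(node, True)
        self._is_up[node] = is_up
        if previous and not is_up:
            self._on_change(node, "down")
        elif not previous and is_up:
            self._on_change(node, "cleared")
```

The callback is where the email or API call would go, so the same hook covers both the "node down" and the "alert cleared" notifications without sending repeats while a node stays down.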
I don't see anything in that list that SolarWinds doesn't already do.
I am also not aware of anything else on the market that covers all of these things.
If your SolarWinds install dies every time you install Windows Updates, then it sounds like your install is sick.