As it has been explained to me, the "Status" of a Nutanix cluster does not reflect the state of the SolarWinds monitors. Instead, it shows the state of the 800+ Nutanix monitors, summed up in a single API call to the cluster for "Health Status". I would like to be able to configure how that status is reported. Example: Cluster Status Via API or Cluster Status From SAM Monitors.
We use SolarWinds to create system health dashboards based on basic criteria we choose. For our day-to-day monitoring, we care about CPU, RAM, and disk utilization, and we would like to see at a glance that those numbers are within parameters. Unfortunately, what we see is the health status of the cluster based on all of the Nutanix monitors. In the case below, the second cluster is perfectly healthy. Logging in to the Nutanix Prism Element console for that cluster shows that we have a single VM with a misconfigured NIC. Yet SolarWinds shows the whole cluster in a critically unhealthy state.
We rely on Nutanix Prism Element and Prism Central to monitor the minutiae. We would like SolarWinds to report only on the items that SolarWinds can actually see, with a criticality that we can adjust.
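To make the request concrete, here is a rough Python sketch of the behavior I have in mind. It is not based on any real SolarWinds or Nutanix API: the names `Status`, `Threshold`, `THRESHOLDS`, and `cluster_status`, the `source` switch, and the threshold percentages are all made up for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical status levels, ordered so the worst one is the highest."""
    UP = 0
    WARNING = 1
    CRITICAL = 2


@dataclass(frozen=True)
class Threshold:
    warning: float   # percent utilization that triggers a warning
    critical: float  # percent utilization that triggers a critical


# Assumed, adjustable thresholds for the three metrics we care about.
THRESHOLDS = {
    "cpu": Threshold(warning=80, critical=90),
    "ram": Threshold(warning=85, critical=95),
    "disk": Threshold(warning=75, critical=90),
}


def metric_status(value: float, threshold: Threshold) -> Status:
    """Map one utilization percentage to a status."""
    if value >= threshold.critical:
        return Status.CRITICAL
    if value >= threshold.warning:
        return Status.WARNING
    return Status.UP


def cluster_status(metrics, source="monitors", api_health=None,
                   thresholds=THRESHOLDS) -> Status:
    """Return the cluster status from the configured source.

    source="api":      pass through the health status reported by the cluster.
    source="monitors": worst status among the monitored utilization metrics.
    """
    if source == "api":
        if api_health is None:
            raise ValueError("api_health is required when source='api'")
        return api_health
    if source != "monitors":
        raise ValueError(f"unknown source: {source!r}")
    return max(
        (metric_status(value, thresholds[name])
         for name, value in metrics.items() if name in thresholds),
        default=Status.UP,
    )


# A cluster whose utilization is fine but whose API health is critical
# (for example, because of one misconfigured VM NIC).
metrics = {"cpu": 42.0, "ram": 61.0, "disk": 55.0}
print(cluster_status(metrics, source="monitors", api_health=Status.CRITICAL).name)
print(cluster_status(metrics, source="api", api_health=Status.CRITICAL).name)
```

With the "monitors" source, the cluster above would show as up on the dashboard, while the "api" source would keep today's behavior and show it as critical.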
Because we can't configure this, the techs looking at the dashboard ignore the SolarWinds Nutanix Cluster Status: 99% of the time, the warning or critical status is exaggerated.
All hosts are healthy. Cluster health shows only as "Critical", with no explanation of why it is critical. At this point you need to log in to Prism Element for the cluster to see what is causing the critical state. In this case, a single VM has a misconfigured NIC, which causes SolarWinds to report the whole cluster in a critical state.