Is there a way to pull a report or create a dashboard of servers that are missing data points on one or more components?
The issue is that we monitor over 2,000 servers in our environment, and occasionally some of them stop reporting data for various components, such as CPU or memory utilization, even though the node still shows as up because it responds to ping requests. Restarting the SNMP service on the affected server usually resumes monitoring, but at this point it is very difficult, if not nearly impossible, for us to know which servers are not providing these data points unless we accidentally come across one, or someone specifically asks for a CPU/memory utilization graph and we notice that the data is not actually reaching the monitoring server.
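To make the request concrete, the check we are after is roughly the logic below. This is only a hypothetical Python sketch: the `find_stale_nodes` function, the `last_seen` structure, the server names, and the 30-minute threshold are all made up for illustration, and it assumes the last-poll timestamp per component can be exported from the monitoring tool somehow, which is exactly the part we do not know how to do.

```python
from datetime import datetime, timedelta


def find_stale_nodes(last_seen, now, max_age=timedelta(minutes=30)):
    """Return {node: [stale components]} for nodes that answer ping
    but have at least one component with no recent data point."""
    stale = {}
    for node, info in last_seen.items():
        if not info["ping_ok"]:
            # Node is down entirely; normal node-down alerting covers this case.
            continue
        old = sorted(
            component
            for component, ts in info["metrics"].items()
            if ts is None or now - ts > max_age
        )
        if old:
            stale[node] = old
    return stale


# Hypothetical sample data: timestamp of the last data point per component.
now = datetime(2024, 1, 1, 12, 0)
last_seen = {
    "srv-a": {"ping_ok": True,
              "metrics": {"cpu": now - timedelta(minutes=5),
                          "memory": now - timedelta(minutes=5)}},
    "srv-b": {"ping_ok": True,
              "metrics": {"cpu": now - timedelta(hours=3),
                          "memory": None}},
    "srv-c": {"ping_ok": False,
              "metrics": {"cpu": None, "memory": None}},
}

print(find_stale_nodes(last_seen, now))
# prints {'srv-b': ['cpu', 'memory']}
```

In other words, a node that is up on ping but has one or more components with no data points for longer than some threshold should appear in the report or dashboard.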
Below is a screenshot of an affected server that did not provide data for a few hours; once we restart the SNMP service, we start getting data points again.

Any help or suggestions would be greatly appreciated.
Thanks,
Rustum