SolarWinds dashboard nodes on maps show as down when they are actually up

We have noticed that when the SolarWinds Job Engine v2 service is started, the SolarWinds server is unable to reach nodes via ICMP and SNMP. As a result, the dashboard shows nodes as down when they are in fact up. Has anyone else run into this, and how was it resolved?

  • I haven't faced this sort of issue. As far as I understand, the data on the dashboard has nothing to do with Job Engine v2 directly; the dashboard pulls its data straight from the SolarWinds database. You should probably do a complete health check of your environment. To the best of my knowledge, when polling is initiated the job engine kicks in and talks to the end devices to get the stats, but it will not overwrite the information already in your database, so the device status (up or down) picked up in the previous poll should still be intact. Hope this helps.

    Small correction to my statement: if the poller the device is managed on (main or additional) cannot ping (ICMP) the device, then yes, the device will definitely be shown as down in SolarWinds even though it is up and running. Make sure connectivity between SolarWinds and the end device is intact and that SolarWinds can reach the device with acceptable latency.

  • Thank you, Vinay. So the relationship here is that once the job engine starts, the main poller loses reachability to many of the nodes. If you stop the service, all nodes become reachable again with very low latency.

  • Honestly, I'm not sure how I would fix this, but let me list a few things:

    1. First, I would check whether the main poller is overloaded. If it is, I would free up some load by moving devices to an additional poller. Ideally I keep fewer devices on my main poller. This should fix the issue if it has anything to do with load or performance.

    2. I think if you raise a case with SolarWinds, they would suggest either cleaning up the sdf/db files on your poller or reinstalling the Job Engine service, but before you do that, check whether point 1 works for you.
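
Before moving nodes or opening a case, it may help to put numbers on the behaviour described in this thread: which nodes answer ICMP from the poller while the job engine service is stopped but stop answering once it is running. Below is a minimal sketch for that, assuming Python is available on the poller. It is not a SolarWinds tool; the function names (`ping_once`, `sweep`, `lost_when_running`) and the node addresses in the comments are hypothetical, and it only shells out to the operating system's own `ping` command.

```python
import platform
import subprocess


def ping_once(host, timeout_s=1):
    """Send one ICMP echo with the OS ping command; True if a reply came back."""
    if platform.system() == "Windows":
        # Windows ping: -n <count>, -w <timeout in milliseconds>
        cmd = ["ping", "-n", "1", "-w", str(int(timeout_s * 1000)), host]
    else:
        # Linux-style ping: -c <count>, -W <timeout in seconds>
        cmd = ["ping", "-c", "1", "-W", str(int(timeout_s)), host]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s + 5
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    # Windows ping can exit with 0 on "Destination host unreachable",
    # so also require a TTL field, which only appears in a real echo reply.
    return result.returncode == 0 and "ttl=" in result.stdout.lower()


def sweep(hosts, timeout_s=1):
    """Ping every host once and return {host: reachable}."""
    return {host: ping_once(host, timeout_s) for host in hosts}


def lost_when_running(stopped, running):
    """Hosts that replied while the service was stopped but not while it ran."""
    return sorted(
        host
        for host, reachable in stopped.items()
        if reachable and not running.get(host, False)
    )


# Intended use (addresses are placeholders, not from this thread):
#   nodes = ["192.0.2.10", "192.0.2.11"]
#   stopped = sweep(nodes)   # run with the job engine service stopped
#   running = sweep(nodes)   # run again after starting the service
#   print(lost_when_running(stopped, running))
```

If the list from `lost_when_running` grows with the number of nodes assigned to the main poller, that would be consistent with the load explanation in point 1; if it is the same fixed set of nodes regardless of load, the cause is probably elsewhere.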
