
False Node Down After Agent Becomes Available

Hello,

I'm having an issue with a node on my system. On most mornings, the agent will become unavailable and then available again within 1 minute, and then the node status will do the same thing a minute later. What could be causing this and how can I fix it?

Many thanks.

  • Assuming that it's just the one agent having this issue, and that your other agents are fine, it sounds as if the problem is on the agent side. If that assumption is wrong, let me know.

    1) It sounds like it might also happen at nearly the same time each day. Not every day, but on the days it does happen, it's within a narrow range, such as the same hour or so. If that's true, then the most likely cause is that something else on the server is interfering. While this is a likely situation, it's also hard to debug, so you might want to come back to it.

    2) The agent is just borked.  Yep, it just happens sometimes.  Uninstall and reinstall the agent.

    3) Something is interrupting communications between the Orion server and the agent. Assuming that you see actual down status in the availability graphs, one thing to try is to switch back to ICMP as your status checker. On the node, click "List Resources". Under "Status & Response Time", change the setting to "ICMP (Ping) - Fastest". Even if that fixes it, #1 might still be true, and you would still need to look into what else is happening at the same time as the brief outage.
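
    For #1, a quick way to check whether the outages really cluster in one hour is to note the times of the "agent unavailable" events and count them per hour of day. Below is a minimal Python sketch of that idea. The timestamps are made-up sample values typed in by hand (not pulled from any Orion API), and the 60% threshold and function names are just assumptions for illustration.

```python
from collections import Counter
from datetime import datetime


def outages_by_hour(timestamps):
    """Count outage events per hour of day (0-23)."""
    hours = (
        datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").hour for ts in timestamps
    )
    return Counter(hours)


def busiest_window(counts, min_share=0.6):
    """Return the hour holding at least min_share of all events, else None."""
    total = sum(counts.values())
    if total == 0:
        return None
    hour, n = counts.most_common(1)[0]
    return hour if n / total >= min_share else None


# Hypothetical sample data: replace with the event times you actually see.
sample = [
    "2023-03-06 06:02:11",
    "2023-03-07 06:04:40",
    "2023-03-09 06:01:58",
    "2023-03-10 14:30:00",
    "2023-03-13 06:03:27",
]

counts = outages_by_hour(sample)
print(counts)                  # events per hour of day
print(busiest_window(counts))  # the dominant hour, or None if there is none
```

    If one hour dominates, that is the window to compare against whatever else runs on that server around then.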