I'm having an issue with a node on my system. On most mornings, the agent will become unavailable and then available again within 1 minute, and then the node status will do the same thing a minute later. What could be causing this and how can I fix it?
Assuming that only this one agent has the issue and your other agents are fine, it sounds like the problem is on the agent side. If that assumption is wrong, let me know.
1) It also sounds like it might happen at nearly the same time each day. Not every day, but on the days it does happen, it falls within a narrow range, such as the same hour or so. If that's true, then the most likely cause is that something else on the server is interfering. While this is a likely situation, it's also hard to debug, so you might want to come back to it after trying the other options.
2) The agent is just borked. Yep, it just happens sometimes. Uninstall and reinstall the agent.
3) Something is interrupting communications between the Orion server and the agent. Assuming that you see actual down status in the availability graphs, one thing to try is switching back to ICMP as your status checker. On the node, go to "List Resources". Under "Status & Response Time", change the setting to "ICMP (Ping) - Fastest". Even if that fixes it, keep in mind that #1 might still be true, and you would still need to look into what else is happening at the same time as the brief outage.
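To check the hunch in #1 before digging into the server, it can help to tally the outage times by hour of day. The snippet below is only a rough sketch: the timestamps are made-up sample values, and it assumes you have copied the "agent unavailable" event times out by hand in `YYYY-MM-DD HH:MM:SS` form. The function names and the 60% threshold are my own choices for illustration, not anything from the product.

```python
from collections import Counter
from datetime import datetime


def outages_by_hour(timestamps):
    """Count outage events per hour of day (0-23)."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


def busiest_window(timestamps, min_share=0.6):
    """Return (hour, share) if a single hour holds at least min_share
    of all events, otherwise None. The threshold is an arbitrary choice."""
    counts = outages_by_hour(timestamps)
    if not counts:
        return None
    hour, n = counts.most_common(1)[0]
    share = n / sum(counts.values())
    return (hour, share) if share >= min_share else None


# Hypothetical sample data; replace with the times from your own events.
sample = [
    "2023-03-01 06:12:40",
    "2023-03-02 06:15:05",
    "2023-03-04 06:09:58",
    "2023-03-07 14:31:22",
]

print(outages_by_hour(sample))
print(busiest_window(sample))
```

If one hour clearly dominates, that points toward something scheduled on the server, and that hour is the window to look at first.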