All of our Windows systems are monitored through the agent. Recently we've noticed an increase in alerts for disk space being fully consumed. When we checked, the Cortex portion of every agent had consumed over 30 GB of disk space, and on many servers that was enough to use up all available disk space. In other words, SolarWinds was crashing our tools and servers because the Cortex service suddenly started misbehaving.
More specifically, the files appear to be related to the cache for volume polling. We've already opened two tickets with support but aren't getting anywhere, so I wanted to ask the community.
The cache files appear to be DB files that balloon out of control. As far as we've been able to tell, the agent doesn't lose its connection, so we can't understand why it isn't flushing this cache and why the problem keeps happening. Support gave us a script to run along with a reboot of the agents, and we followed their recommendations step by step, to no avail. The issue continues.
I feel like we've only scratched the surface instead of attacking the root of this problem.
When we chart disk space, we see a slow, steady increase until one day all disk space is taken. As a temporary workaround we have been deleting the files manually. This seems to make the agents partially unstable: as soon as we delete the files, the agent reports that it cannot flush the files because they don't exist. That doesn't make sense if the files are only ever growing.
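Because the growth is slow and steady, a straight-line fit over the charted values gives a rough idea of how long a server has before it fills up. The following is only a minimal sketch, not anything provided by SolarWinds: the function name `days_until_full` and the sample numbers are hypothetical, and it assumes you can export (day, used GB) pairs from whatever chart you already have.

```python
from typing import Optional, Sequence, Tuple


def days_until_full(
    samples: Sequence[Tuple[float, float]], capacity_gb: float
) -> Optional[float]:
    """Estimate days remaining after the last sample until the disk is full.

    samples: (day, used_gb) pairs, e.g. exported from a disk-usage chart.
    Returns None when usage is flat or shrinking (no projected fill date).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    mean_day = sum(day for day, _ in samples) / n
    mean_used = sum(used for _, used in samples) / n

    # Ordinary least-squares slope: growth in GB per day.
    sxx = sum((day - mean_day) ** 2 for day, _ in samples)
    if sxx == 0:
        raise ValueError("samples must cover more than one point in time")
    sxy = sum((day - mean_day) * (used - mean_used) for day, used in samples)
    slope = sxy / sxx

    if slope <= 0:
        return None

    intercept = mean_used - slope * mean_day
    full_day = (capacity_gb - intercept) / slope
    last_day = max(day for day, _ in samples)
    return max(0.0, full_day - last_day)


if __name__ == "__main__":
    # Hypothetical readings: day number and GB used on that day.
    readings = [(0, 10.0), (1, 12.0), (2, 14.0), (3, 16.0)]
    print(days_until_full(readings, capacity_gb=30.0))
```

This only projects the trend; it says nothing about why the cache is not being flushed.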
I can provide more details if needed.
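For anyone who wants to compare numbers, here is a minimal sketch of how the size of the cache files could be collected on an affected server. It is an assumption-laden example, not the script support provided: the function name `largest_files` is hypothetical, the `.db` suffix is based only on the files looking like DB files, and the directory to scan is passed on the command line because the exact agent path may differ per install.

```python
import os
import sys
from typing import List, Tuple


def largest_files(root: str, suffix: str = ".db", top: int = 10) -> List[Tuple[int, str]]:
    """Return up to `top` (size_in_bytes, path) pairs under `root`, largest first.

    Only files whose name ends with `suffix` (case-insensitive) are counted.
    """
    found: List[Tuple[int, str]] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(suffix.lower()):
                continue
            path = os.path.join(dirpath, name)
            try:
                found.append((os.path.getsize(path), path))
            except OSError:
                # The file may be locked or removed while we scan; skip it.
                continue
    found.sort(reverse=True)
    return found[:top]


if __name__ == "__main__":
    # Usage: python report_cache.py <directory-to-scan>
    scan_root = sys.argv[1]
    for size, path in largest_files(scan_root):
        print(f"{size / 1024 ** 3:8.2f} GB  {path}")
```

The script only reads file sizes and does not delete anything, so running it on a schedule and logging the output would show which files grow and how fast.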