Hi,
Our environment has HA enabled. We have a service that, for some reason, keeps going into a stopped state. When this happens, we can see the HA service attempt to restart that service. When the restart fails because it runs past its threshold time, HA fails over. Is there a way to blacklist the service so HA ignores it? Or is there a different method we can use to keep it from failing over?
Better yet, if anyone knows why this keeps failing, or which log I can look at to find out what's going on, that would be great. We are on 2019.2 hotfix 2 with plans to upgrade to 2019.4 hotfix 3. We have not found any documentation that specifically states whether the upgrade would fix this, and we've been unable to get reassurance. So we are trying to find ways to mitigate this and keep HA from failing over when there is no real justification for a true failover.
I can provide more information if needed. I've been digging into the logs, but nothing has jumped out at me.
Service failing: