Using VMAN (2020.2.5)
I'm trying to get to the bottom of some "VM No Heartbeat" alerts.
I have created a copy of the alert, without the Powered Off check, so any missing heartbeat should alert - the idea being that if someone reboots a server I will know about it.
Eventually I want to give it an elapsed time before firing of, say, 10 minutes, so that if a server is rebooted during Windows patching and doesn't come back, we will get an alert.
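To be clear about the behaviour I'm after, here is a minimal Python sketch of the grace-period logic. The name `should_fire` and its parameters are hypothetical and are not VMAN settings; the 10-minute default is just the figure mentioned above.

```python
from datetime import datetime, timedelta

def should_fire(heartbeat_lost_at: datetime,
                now: datetime,
                grace: timedelta = timedelta(minutes=10)) -> bool:
    """Fire only if the heartbeat has been missing for at least `grace`."""
    return now - heartbeat_lost_at >= grace
```

With this logic, a server that comes back within 10 minutes of a patching reboot never raises the alert, while one that stays down does.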
While testing, I'm finding that I am getting some alerts for servers which are not being rebooted.
The evidence is that:
1. The OS Uptime is longer than the delta between now and the alert
2. I am not getting the VM Rebooted alert (which should fire when uptime of a VM is less than 30 minutes).
3. Support have sent me event logs with no evidence of reboots.
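To make point 1 concrete, the comparison can be sketched in Python as below. This is only an illustration of the reasoning; the function name and the sample values are made up and nothing here comes from VMAN itself.

```python
from datetime import datetime, timedelta

def rebooted_since_alert(alert_time: datetime,
                         now: datetime,
                         os_uptime: timedelta) -> bool:
    """Return True if the OS booted after the alert fired."""
    boot_time = now - os_uptime  # when the OS last started
    return boot_time > alert_time

# Illustrative values: alert two hours ago, OS up for five days.
alert = datetime(2021, 1, 1, 10, 0)
now = datetime(2021, 1, 1, 12, 0)
print(rebooted_since_alert(alert, now, timedelta(days=5)))
```

If the uptime is longer than the time since the alert, the boot happened before the alert, so a reboot cannot explain the missing heartbeat.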
Having looked at the performance analyser graphs, what I do notice is CPU etc. are reported at the time of the alert, but network traffic drops to zero.
So I'm thinking this is something to do with connectivity.
To prove this, I need to know what the VM Heartbeat actually is. Searching the forum and Googling hasn't told me, unless I have managed to miss an article on it.
Could someone help with what it's doing?
Is my best option to manage all my VMs using ICMP and configure a copy of the Node is Down alert for them?