Hyper-V hosts stop polling until manually hitting 'poll now' link

We've been having issues polling Hyper-V hosts with VMAN over the past year. After a couple of SRs with support we were advised to upgrade to 2022.3 (which was still pre-release at that point). We finally did that recently (actually to 2022.4), but I had to manually hit the 'poll now' link for each host before they started working. Since then we have had a similar issue with all the hosts in a cluster that was upgraded, and again polling stopped working until I hit 'poll now' for each host. Has anyone else seen this behaviour?

  • Check whether the NextPoll date is in the past. If so, that's a scheduler bug, and probably a sign of some other poor health in the environment.

    Also worth checking whether the performance poll is set to a stupidly long interval (4 hours or so), as that seems to be common.
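The NextPoll check above can be sketched in a few lines of Python. This is only an illustration: it assumes you have already exported one row per host as a dict with a timezone-aware `NextPoll` value, and the `Caption` key and the `grace` parameter are hypothetical names chosen for the example, not product settings.

```python
from datetime import datetime, timedelta, timezone


def overdue_polls(rows, now=None, grace=timedelta(minutes=0)):
    """Return the rows whose NextPoll is further in the past than `grace`.

    `rows` is an assumed export format: a list of dicts, each with a
    timezone-aware datetime under the key "NextPoll".
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - grace
    return [row for row in rows if row["NextPoll"] < cutoff]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        {"Caption": "hv01", "NextPoll": now - timedelta(hours=3)},   # stuck
        {"Caption": "hv02", "NextPoll": now + timedelta(minutes=5)},  # healthy
    ]
    for row in overdue_polls(sample, now=now, grace=timedelta(minutes=10)):
        print(row["Caption"], "has a NextPoll in the past:", row["NextPoll"])
```

A small grace period avoids flagging hosts whose poll is only a few moments late rather than genuinely stuck.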

  • If the problem continues to present itself, please open a new ticket with support so we can investigate further. That is certainly not expected or by-design behavior. If you need to, feel free to post the ticket number back to this thread and I can check on it internally.

  • The NextPoll date seems to be fine. We have had this issue crop up on yet another server, so it continues despite our best efforts.

    I'm not sure what you are referring to in terms of 'performance poll'. That isn't a setting I'm familiar with; can you elaborate?

  • The issue has cropped up on another host, so I am opening a ticket (Ticket #01263810). 

  • There's a status poll, a performance poll, and a topology poll, depending on whether you're trying to find out what's up, what's hot, or what's where. The performance poll actually does most of the stuff you care about for virtualization (usually; I've not got much Hyper-V in my env at the moment).

    Check these places:

    Table: SELECT TOP 1000 VCenterID, HostID, NodeID, PollingTaskTypeID, PollingInterval, JobTimeout, APITimeout, LastPoll, LastPollStatus, LastPollStatusMessage FROM Orion.VIM.PollingTasks

    Look for large numbers and for any errors. A long topology poll is OK; a long stats/status poll is not.

    PerfStack: Select 5 days or so and add the metric you care about. As you move your cursor over the timeline, if the position doesn't update regularly, you're not getting much data collected.

    Edit [object] page: Check everything with Poll and a number.

    I think there's another good place to look, but it's been a while. These are worth a shot anyway.
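The "look for large numbers and errors" pass over the Orion.VIM.PollingTasks results can be sketched as a small Python filter. This is a rough illustration under stated assumptions: the rows are taken to be dicts keyed by the column names from the query above, the units of `PollingInterval` are not confirmed here (so `max_interval` is whatever threshold makes sense for your data), a non-empty `LastPollStatusMessage` is only assumed to indicate a problem, and `ignore_type_ids` is a hypothetical way to skip task types (such as topology) where a long interval is fine.

```python
def flag_polling_tasks(rows, max_interval, ignore_type_ids=frozenset()):
    """Return (HostID, PollingTaskTypeID, reasons) for rows worth a closer look.

    A row is flagged when its PollingInterval exceeds `max_interval`
    (unless its task type is in `ignore_type_ids`), or when it carries
    a non-empty LastPollStatusMessage.
    """
    flagged = []
    for row in rows:
        reasons = []
        type_id = row["PollingTaskTypeID"]
        if type_id not in ignore_type_ids and row["PollingInterval"] > max_interval:
            reasons.append("long interval")
        if row.get("LastPollStatusMessage"):
            reasons.append("status message present")
        if reasons:
            flagged.append((row["HostID"], type_id, reasons))
    return flagged


if __name__ == "__main__":
    # Made-up sample rows; the type IDs and values are illustrative only.
    sample = [
        {"HostID": 1, "PollingTaskTypeID": 1, "PollingInterval": 240,
         "LastPollStatusMessage": ""},
        {"HostID": 2, "PollingTaskTypeID": 2, "PollingInterval": 720,
         "LastPollStatusMessage": None},
    ]
    for host_id, type_id, reasons in flag_polling_tasks(sample, 60, {2}):
        print(host_id, type_id, ", ".join(reasons))
```

Which `PollingTaskTypeID` corresponds to which poll type is not stated in this thread, so check that mapping in your own environment before choosing what to ignore.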

  • Looping back around to this: I did open a ticket, but we weren't able to find a lot at the time. That said, this behaviour seemed to decrease every time we upgraded the SolarWinds environment. As of now (v2023.3) the behaviour has stopped altogether, so I figure it was related to a bug in the platform.

    I did set up an alert at some point, checking for "Virtual Host status = Could not poll", so it became pretty simple to manage the issue. But that alert hasn't triggered for quite some time now.

    So, all's good.