This discussion has been locked. The information referenced herein may be inaccurate due to age, software updates, or external references.

Cannot remanage node - stuck

I've seen this twice now in our environment: nodes get into a stuck state where we can neither manage nor unmanage them. It seems to happen after they have been put on an unmanage schedule and are then somehow unmanaged or remanaged again by our users.

For example (below), an admin asked why they were paged after they unmanaged a node and did some reboots. I saw the node was showing as managed (green button, not blue X), and it alerted when it went down due to the reboots. The node only offers the option to remanage it, and clicking Remanage does nothing.

[Attached screenshot: unmanged..JPG]

I have been getting around this by bouncing services, via Orion Service Manager, on the poller that is responsible for the node. I don't like this workaround. Has anyone seen this, or does anyone know of a solution?

We are running: Orion Platform 2015.1.0, SAM 6.2.0, QoE 2.0, NCM 7.3.2, NPM 11.5, NTA 4.1.0, IVIM 2.0.0

Thanks

  • We were getting something similar after updating to 11.5; see the thread "11.5 - un/remanage broken?"

    The steps provided by SteveWright in his March 4th post in another thread, "NPM: Audit Trail - missing", fixed the un/remanage problem for us as well as the auditing issue.

  • I've encountered this a few times. It's usually resolved by changing the assigned polling engine or modifying the status through the database. I'd try changing polling engines first.

  • Thanks Jaybone. It seems just step #1 (bounce services) works, but I'd hate to keep doing that if this becomes a weekly event. Sorry it's happening to you as well, but at least this may get fixed quicker if it gets more attention.

    The next time this happens I'll open a case and reference our threads.

  • Dustin, thanks, I'll give that a try next time it happens. Thankfully we have 6 pollers to pick from. It seems to take a few weeks between occurrences.

  • We have also experienced multiple instances of this issue since going to NPM 11.5 and SAM 6.2.0: nodes with unmanage/remanage schedules not changing status correctly; not being able to remanage a node through the GUI consistently; and nodes that were remanaged still showing their SAM monitors as unmanaged, with no way to remanage those monitors.

    This does not seem to occur on our NPM 11.0.1 instance with SAM 6.1.1.

    Needless to say, this is a bug that needs squashing, since it's causing a lot of unwanted alerts during scheduled maintenance and (even worse) SAM monitors NOT alerting when critical problems are detected on managed nodes.

    Orion Platform 2015.1.1, SAM 6.2.0, QoE 2.0, IPAM 4.3, NPM 11.5.1, IVIM 2.0.1

  • Dustin, we are still getting this about once every two weeks. While moving a node to a different poller works for that node, other nodes are still stuck. For now we have resigned ourselves to cycling the SolarWinds services on the affected poller, which fixes it for a week or so.

    Orion Platform 2015.1.2, SAM 6.2.1, DPA 9.2.0, QoE 2.0, NCM 7.4, NPM 11.5.2, NTA 4.1.1, SRM 6.1.11, IVIM 2.1.0, VNQM 4.2
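
For anyone wanting to spot affected nodes before the alerts fire: the symptom reported in this thread is a node whose managed/unmanaged status disagrees with its unmanage schedule. Below is a minimal Python sketch of that consistency check. Everything in it is an assumption for illustration: the `NodeState` record and its field names (`unmanaged`, `unmanage_from`, `unmanage_until`) are hypothetical, it assumes a node should be unmanaged only while inside its scheduled window, and it does not show how to pull the data out of Orion.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class NodeState:
    """Hypothetical snapshot of one node; field names are illustrative only."""
    caption: str
    unmanaged: bool                      # status the console currently reports
    unmanage_from: Optional[datetime]    # start of the unmanage window, if any
    unmanage_until: Optional[datetime]   # end of the unmanage window, if any


def in_unmanage_window(node: NodeState, now: datetime) -> bool:
    """True if 'now' falls inside the node's scheduled unmanage window."""
    if node.unmanage_from is None or node.unmanage_until is None:
        return False
    return node.unmanage_from <= now < node.unmanage_until


def find_inconsistent(nodes: List[NodeState], now: datetime) -> List[NodeState]:
    """Return nodes whose reported status disagrees with their schedule."""
    return [n for n in nodes if n.unmanaged != in_unmanage_window(n, now)]


if __name__ == "__main__":
    now = datetime(2015, 6, 1, 12, 0)
    nodes = [
        # No schedule, managed: consistent.
        NodeState("web01", False, None, None),
        # Window ended a month ago but still unmanaged: flagged.
        NodeState("db01", True, datetime(2015, 5, 1), datetime(2015, 5, 2)),
        # Inside its window but showing as managed: flagged.
        NodeState("app01", False,
                  datetime(2015, 6, 1, 11, 0), datetime(2015, 6, 1, 13, 0)),
    ]
    for node in find_inconsistent(nodes, now):
        print(f"{node.caption}: status does not match unmanage schedule")
```

A check like this only reports the mismatch. It does not fix the stuck state, for which the workarounds in this thread (moving the node to another polling engine, or restarting services on the affected poller) still apply.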