9 Replies Latest reply on Jul 30, 2013 12:17 PM by cahunt

    After reassigning nodes to different pollers my UnDP stops working

    BryanBecker

      I had to load balance my nodes last night, so I moved some of them to different pollers.  After that was complete, all of my custom MIB (UnDP) graphs stopped working.  The poller where these nodes are now located has the same SNMP access as the original poller, so there is no reason it shouldn't work.

      Just wondering if anyone else has seen this.

      BB

        • Re: After reassigning nodes to different pollers my UnDP stops working
          cahunt

          This seems to happen to random nodes. The UnDPs stop polling, but node and interface status are still going strong. I can change the polling engine of a node and sometimes it will pick up again. Other times it just sits, even with a poller change. Even moving the node to a less populated poller does not bring the UnDP back to life. I see no SNMP errors on the switch that indicate an issue, and all other switch stats do show: CPU, memory, errors and discards, syslog, and traps. I have also seen partial polling, either at random or on moving a node to a new poller.

          We built out load balancing, but since this started happening the most problematic poller has about half as many nodes as the other 2 pollers.

          That said, this UnDP issue is not specifically tied to a single polling engine.

           

          With v10 we have not seen services stop. We did with v9.5, but that would affect all nodes on that polling server.

          • Re: After reassigning nodes to different pollers my UnDP stops working
            rgward

            Are you still experiencing this problem?  Any word from SolarWinds on this?  I just upgraded from 10.3.1 to 10.5 yesterday and now I'm seeing this problem with UnDP pollers just stopping. We have 3 polling engines, and UnDP seems to be working well on the primary but is really bad on our two additional polling engines. In many cases where I'm charting the UnDP, the data intermittently stops and then may restart hours later. Many have just stopped altogether. In the polling stats for each polling engine, the UnDP polling rates are 0 or 1%. Maybe this is a clue as to why many aren't working.

              • Re: After reassigning nodes to different pollers my UnDP stops working
                cahunt

                This issue still exists for us. Currently our main poller is tied to the same server as the web engine; we have a license for another poller and are in the process of setting that up to give the web server its own space. SolarWinds says this should help the issue, but it is more than just the main poller that stops querying. Some days some nodes are lucky to get 3 or 4 polls. The only saving grace is that we still get the status of the node and interfaces, and we have built out proper traps and trap alerts for extreme conditions.

                 

                As it is, I just moved two nodes from the main engine/web server to an additional poller to try to lighten the load. The last poll on those two boxes was 19 hours ago. Try moving those nodes to a less populated poller if you have the room. With older versions we had issues with the actual services stopping or hanging. Once we moved up to 10.1 +, things changed: the services stay running, but many nodes are missed in the polling efforts.
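A quick way to spot nodes like the two mentioned above is to compare each node's last UnDP poll time against a threshold. The following is only a rough sketch: the `stale_nodes` helper, the node names, and the one-hour default are all hypothetical, and it assumes you have already pulled the last-poll timestamps from somewhere yourself (the thread does not say where Orion exposes them).

```python
from datetime import datetime, timedelta


def stale_nodes(last_polls, now, max_age=timedelta(hours=1)):
    """Return (node, age) pairs whose last poll is older than max_age, oldest first.

    last_polls is a hypothetical mapping of node name -> datetime of last UnDP poll.
    """
    stale = [
        (node, now - polled_at)
        for node, polled_at in last_polls.items()
        if now - polled_at > max_age
    ]
    # Sort so the nodes that have gone longest without a poll come first.
    return sorted(stale, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    now = datetime(2013, 7, 30, 12, 0)
    # Made-up example data, not taken from any real poller.
    polls = {
        "switch-a": now - timedelta(hours=19),
        "switch-b": now - timedelta(minutes=5),
    }
    for node, age in stale_nodes(polls, now):
        print(node, age)
```

Nodes returned by a check like this would be the first candidates to move to a less populated poller.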

                Will update after the new poller is installed and running.

                • Re: After reassigning nodes to different pollers my UnDP stops working
                  shuth

                  I've experienced this issue before. A UnDP works on a node on the main poller but not on nodes on the additional poller, or if you move the node from the main poller to the additional poller, the UnDP stops working.

                   

                  We were able to resolve it with Step 5 in the following KB article: SolarWinds Knowledge Base :: A Universal Device Poller defined on my Additional Polling Engine is not working. There will be some downtime while your Config Wizard runs.

                   

                    • If the UnDP is still not receiving statistics from your monitored node, the credentials file used to connect the Additional Polling Engine to the Job Engine on your primary server may be corrupt. Complete the following procedure to fix a potentially corrupt Job Engine credentials file:
                      1. Rename the existing Job Engine credentials file, ucdat1.xml, as ucdat1_OLD.xml.
                        Note: For Additional Polling Engines installed on Windows Server 2008, this file is located, by default, in C:\ProgramData\SolarWinds\JobEngine\. For Additional Polling Engines installed on Windows Server 2003, it is located, by default, in C:\Documents and Settings\All Users\Application Data\SolarWinds\JobEngine\.
                      2. Click Start > All Programs > SolarWinds Orion > Configuration and Auto-Discovery > Configuration Wizard.
                      3. Complete the Configuration Wizard using your existing settings. This should reinstall the Job Engine credentials file.
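Step 1 of the KB procedure above (the rename) can be scripted. This is a minimal sketch, not a SolarWinds tool: the `set_aside_credentials_file` helper is a made-up name, the two directory constants are just the defaults quoted in the KB note, and it assumes the SolarWinds services are not holding the file open. Steps 2 and 3 (running the Configuration Wizard) still have to be done by hand.

```python
from pathlib import Path

# Default locations quoted in the KB note above; adjust if your install differs.
JOB_ENGINE_DIR_2008 = Path(r"C:\ProgramData\SolarWinds\JobEngine")
JOB_ENGINE_DIR_2003 = Path(
    r"C:\Documents and Settings\All Users\Application Data\SolarWinds\JobEngine"
)


def set_aside_credentials_file(job_engine_dir):
    """Rename ucdat1.xml to ucdat1_OLD.xml and return the new path.

    Hypothetical helper: it refuses to overwrite an earlier ucdat1_OLD.xml.
    """
    current = Path(job_engine_dir) / "ucdat1.xml"
    backup = current.with_name("ucdat1_OLD.xml")
    if not current.exists():
        raise FileNotFoundError(f"{current} not found; nothing to rename")
    if backup.exists():
        raise FileExistsError(f"{backup} already exists; move it aside first")
    current.rename(backup)
    return backup
```

On a Windows Server 2008 Additional Polling Engine this would be called as `set_aside_credentials_file(JOB_ENGINE_DIR_2008)` before launching the Configuration Wizard.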
                • Re: After reassigning nodes to different pollers my UnDP stops working
                  PavelSuchy

                  Guys,

                   

                  I would suggest opening a ticket for this issue, because this is supposed to work properly and the easiest way to solve it is via a support ticket.

                   

                  Thanks,

                  Pavel

                  • Re: After reassigning nodes to different pollers my UnDP stops working
                    cahunt

                    Okay, so I called support, and the tech was able to "redo" our additional pollers. We have the main one left to perform this set of steps on.

                    As it is, our additional pollers picked right up; if this doesn't work they will escalate it to another tech/engineer with more skillz!

                    But for the most part this should do it.

                     

                    The process follows:

                     

                    You will need to stop the SolarWinds Services on all 3 Servers before performing these steps on the Primary:

                    REPAIR JOB ENGINE
                    - Open C:\Documents and Settings\All Users\Application Data\SolarWinds\Installers.
                    - On a 2008 system, this will be under C:\ProgramData\Solarwinds\Installers.
                    - Run the Jobengine.msi and select Remove.
                    - Run the Jobengine.msi and select Typical install. (Note: the Windows UAC can prevent this installation from occurring. The UAC must be disabled)

                    REPLACE JOB ENGINE V2
                    - Open C:\Documents and Settings\All Users\Application Data\SolarWinds\Installers.
                    - On a 2008 system, this will be under C:\ProgramData\Solarwinds\Installers.
                    - Run the Jobengine.v2.msi and select Remove.
                    - Run the Jobengine.v2.msi and select Typical install.

                    REPLACE INFORMATION SERVICE
                    - Open C:\Documents and Settings\All Users\Application Data\SolarWinds\Installers.
                    - On a 2008 system, this will be under C:\ProgramData\Solarwinds\Installers.
                    - Run the InformationService.msi and select Remove.
                    - Run the InformationService.msi and select Typical install.

                    REPLACE COLLECTOR SERVICE
                    - Open C:\Documents and Settings\All Users\Application Data\SolarWinds\Installers.
                    - On a 2008 system, this will be under C:\ProgramData\Solarwinds\Installers.
                    - Run the CollectorInstaller.msi and select Remove.
                    - Run the CollectorInstaller.msi and select Typical install.
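The four remove/reinstall passes above follow the same pattern, so the order can be written down as a list of commands. This is only a sketch under assumptions: it uses the standard Windows Installer switches `msiexec /x` (uninstall) and `msiexec /i` (install) as stand-ins for the Remove and Typical install choices in the installer UI, which support did not confirm are equivalent, and `build_repair_commands` is a made-up helper name. It only builds the commands; it does not run them or stop any services.

```python
from pathlib import PureWindowsPath

# 2008-style path from the steps above; on 2003 use the Documents and Settings path.
INSTALLERS_DIR_2008 = PureWindowsPath(r"C:\ProgramData\Solarwinds\Installers")

# Order taken from the steps above.
MSI_ORDER = [
    "Jobengine.msi",
    "Jobengine.v2.msi",
    "InformationService.msi",
    "CollectorInstaller.msi",
]


def build_repair_commands(installers_dir=INSTALLERS_DIR_2008):
    """Return msiexec argument lists: remove, then reinstall, for each MSI in order."""
    commands = []
    for msi in MSI_ORDER:
        msi_path = str(PureWindowsPath(installers_dir) / msi)
        commands.append(["msiexec", "/x", msi_path])  # assumed equivalent of "Remove"
        commands.append(["msiexec", "/i", msi_path])  # assumed equivalent of reinstall
    return commands


if __name__ == "__main__":
    # Print the plan for review instead of executing anything.
    for command in build_repair_commands():
        print(command)
```

Printing the list first makes it easy to check the paths and order before running anything by hand, with the SolarWinds services stopped and UAC out of the way as noted above.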

                     

                    Cheers!