6 Replies Latest reply on Aug 19, 2019 4:45 PM by tomiannelli

    Fix agent port on server

    MathieuJM

      Hi,

      We recently ran into an issue that we had not anticipated:

      We are monitoring a couple of SQL clusters with multiple SQL instances. These instances have ports set up by the DB team.

      We have deployed the SolarWinds agent to monitor the servers, using agent-initiated communication. The agent communicates with the SolarWinds server on port 17778 from 2 local ports, which seem to be random high ports.

       

      On one of our SQL clusters, SQL03 was passive and SQL04 active. The DBAs wanted to move the active instance to the SQL03 server so they could perform maintenance on SQL04.

      They were surprised to find that the port used by the SQL instance was already held by the SolarWinds agent...

       

      So the question is: can we either configure the local network ports the agent uses for agent-initiated communication, or exclude certain ports, so that we avoid the issue we recently encountered?

       

      I have already created a SAM template to check whether the Agent is communicating with the SolarWinds server from a reserved port; however, this is only a workaround for the time being.

       

      Any ideas on how to avoid this situation?

       

      Cheers

        • Re: Fix agent port on server
          tomiannelli

          I'm a bit confused, as the agent already uses fixed ports to listen on and to communicate with the Orion server. See: SolarWinds Orion agent requirements

          The random high source ports are not bound. So it would bind 135, 445, and 17790 to listen for incoming connections. Outbound traffic would only use a fixed destination port of 17778 to talk to the server.

           

          So your SQL server wanted to bind to one of those three ports? I am not aware of a way to change those ports. But I will look.

           

          What does a netstat -abn show for those SQL servers?

           

           

          Sample of the listening ports on my machine:

          netstat -bna

          Active Connections

            Proto  Local Address          Foreign Address        State

            TCP    0.0.0.0:135            0.0.0.0:0              LISTENING

            RpcSs

          [svchost.exe]

            TCP    0.0.0.0:445            0.0.0.0:0              LISTENING

          Can not obtain ownership information

            TCP    0.0.0.0:3389          0.0.0.0:0              LISTENING

            TermService

          [svchost.exe]

            TCP    0.0.0.0:17775          0.0.0.0:0              LISTENING

          [SolarWinds.ServiceHost.Process.exe]

            TCP    0.0.0.0:17779          0.0.0.0:0              LISTENING

          Can not obtain ownership information

            TCP    0.0.0.0:17780          0.0.0.0:0              LISTENING

          Can not obtain ownership information

            TCP    0.0.0.0:17797          0.0.0.0:0              LISTENING

          [SolarWinds.ServiceHost.Process.exe]

            TCP    127.0.0.1:17775        0.0.0.0:0              LISTENING

          [SolarWinds.ServiceHost.Process.exe]

            TCP    127.0.0.1:17797        0.0.0.0:0              LISTENING

          [SolarWinds.ServiceHost.Process.exe]

           

          netstat -oa

          Active Connections

            Proto  Local Address          Foreign Address        State           PID

            TCP    0.0.0.0:135            UNITLW-EPM:0           LISTENING       1248

          Get-Process -ID 1248

          Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName                                                         

          -------  ------    -----      -----     ------     --  -- -----------                                                         

             1498      20     9276      16040              1248   0 svchost                                                             

                                                                                                                                                                                                                                                                       

            TCP    0.0.0.0:445            UNITLW-EPM:0           LISTENING       4

          Get-Process -ID 4  

          Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName                                                         

          -------  ------    -----      -----     ------     --  -- -----------                                                         

             7282       0      200       5424                 4   0 System                                                              

            TCP    0.0.0.0:17775          UNITLW-EPM:0           LISTENING       5104

          Get-Process -ID 5104

          Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName                                                         

          -------  ------    -----      -----     ------     --  -- -----------                                                         

             5647     196   122756     150900              5104   0 SolarWinds.ServiceHost.Process                                      

            TCP    0.0.0.0:17779          UNITLW-EPM:0           LISTENING       4

          Get-Process -ID 4  

          Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName                                                         

          -------  ------    -----      -----     ------     --  -- -----------                                                         

             7282       0      200       5424                 4   0 System                                                              

            TCP    0.0.0.0:17780          UNITLW-EPM:0           LISTENING       4

          Get-Process -ID 4  

          Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName                                                         

          -------  ------    -----      -----     ------     --  -- -----------                                                         

             7282       0      200       5424                 4   0 System                                                              

            TCP    0.0.0.0:17797          UNITLW-EPM:0           LISTENING       5104

          Get-Process -ID 5104

          Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName                                                         

          -------  ------    -----      -----     ------     --  -- -----------                                                         

             5647     196   122756     150900              5104   0 SolarWinds.ServiceHost.Process                                      

            • Re: Fix agent port on server
              MathieuJM

              We are using agent-initiated communication, so the destination port on the SolarWinds server is fixed at 17778.

              Here is the netstat output for the currently running Agent:

               

              TCP     local_IP:59256     sw_IP:17778     ESTABLISHED

              [SolarWinds.Agent.Service.exe]

              TCP     local_IP:59257     sw_IP:17778     ESTABLISHED

              [SolarWinds.Agent.Service.exe]

               

              Unfortunately, the SQL ports in use are: 54046 / 57782 / 57650 / 53162 / 53932 / 55822 / 60945 / 50526. As you can see, they are all high TCP ports.

               

              Last week the SolarWinds Agent was bound locally to port 57782. This was not an issue at first, because the SQL instance was not active on this server but on the second member of the cluster.

              The DB team then tried to move the instance and, unfortunately, the port was already bound by the SolarWinds Agent.

               

              That's why I'm looking to either exclude ports from the Agent's local binds or restrict the ports the Agent can use.
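
Until the port range can be restricted, the overlap can at least be detected. Below is a rough sketch of such a check in Python: it scans netstat-style text for TCP lines and reports any local port that appears in the list of SQL ports quoted above. The function names and the sample text are hypothetical, chosen for illustration only; this is not a SolarWinds or SAM template API.

```python
# SQL listener ports quoted earlier in this thread.
SQL_PORTS = {54046, 57782, 57650, 53162, 53932, 55822, 60945, 50526}


def local_tcp_ports(netstat_text):
    """Return the set of local TCP port numbers found in netstat-style output."""
    ports = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: Proto, Local Address, Foreign Address, State
        if len(fields) < 3 or fields[0].upper() != "TCP":
            continue
        # The port is whatever follows the last colon of the local address.
        port = fields[1].rsplit(":", 1)[-1]
        if port.isdigit():
            ports.add(int(port))
    return ports


def colliding_ports(netstat_text, reserved=SQL_PORTS):
    """Return the reserved ports that currently appear as a local TCP port."""
    return sorted(local_tcp_ports(netstat_text) & set(reserved))


# Hypothetical sample: the second line reproduces last week's collision.
sample = """
  TCP     local_IP:59256     sw_IP:17778     ESTABLISHED
  TCP     local_IP:57782     sw_IP:17778     ESTABLISHED
"""
print(colliding_ports(sample))  # prints [57782]
```

The sketch does not look at which process owns the port, so on the active node it would also flag SQL's own listener. With netstat -b the owning process is printed on the line after each connection, which a fuller version could use to filter.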

               

              Cheers

                • Re: Fix agent port on server
                  jrouviere

                  If you were able to change to Server Initiated Communication, Orion would reach out to the agent on port 17790 instead.

                   

                  Agent communication modes

                  • Re: Fix agent port on server
                    tomiannelli

                    This is what I don't get about your question. The source port of the communication is what is known as an ephemeral port [Ephemeral port - Wikipedia]; it is fixed only for the duration of the session/connection. Listening ports are fixed; they have to be. The source ports used vary by operating system, but on a Windows Server you can control the upper limit by modifying the MaxUserPort registry value.

                    To change the maximum value for ephemeral ports on a computer running Windows, do the following:

                    1. Click Start, click Run, type regedit.exe, and then click OK.
                    2. Locate and then click the following registry subkey: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
                    3. On the Edit menu, point to New, and then click DWORD Value.
                    4. Type MaxUserPort and then press ENTER.
                    5. Double-click the MaxUserPort value, and then type the maximum value in decimal or hexadecimal. You must type a number in the range of 5000-65534 (decimal). Setting this parameter to a value outside of the valid range causes the nearest valid value to be used (5000 or 65534).
                    6. Click OK.
                    7. Quit Registry Editor.

                    The source ports tied to the local_ip will change each time a new session is created. The port numbers will eventually get reused but typically only after the entire pool of ports has been used once.
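
To make the listening-port versus source-port distinction concrete, here is a minimal Python sketch. It assumes nothing beyond the standard socket module and a working loopback interface; the helper name is made up for the example. It opens a throwaway listener, connects to it twice, and prints the source port the operating system chose for each connection.

```python
import socket


def source_port_for_new_connection(server_addr):
    """Open a TCP connection and return the local (source) port the OS picked."""
    with socket.create_connection(server_addr) as conn:
        return conn.getsockname()[1]


# Throwaway listener on loopback; port 0 lets the OS pick a free listening port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
server_addr = listener.getsockname()

# Two separate sessions to the same fixed destination port.
first = source_port_for_new_connection(server_addr)
second = source_port_for_new_connection(server_addr)
listener.close()

print("listening (fixed) port:", server_addr[1])
print("source ports chosen by the OS:", first, second)
```

The destination port stays the same for both sessions, while the source ports are picked by the operating system and will usually differ from one connection to the next. The application never asks for a specific one, which is why the agent can land on any free port in the pool.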

                     

                    Are those SQL ports you listed the listening ports they have SQL bound to? If they are, then setting MaxUserPort to 49999 would keep any program using the TCP stack from grabbing those ports for a session. But it would never step on the bound ports SQL is listening on.
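
As a quick sanity check of that suggestion, the arithmetic can be sketched in Python. This assumes the MaxUserPort behaviour described in the steps above (values outside 5000-65534 are clamped to the nearest limit) and, as a further assumption, that the ephemeral pool starts at 1025 as on older Windows versions; newer versions may manage the dynamic range differently. The function names are hypothetical.

```python
MIN_MAX_USER_PORT = 5000    # lowest accepted MaxUserPort value
MAX_MAX_USER_PORT = 65534   # highest accepted MaxUserPort value
FIRST_USER_PORT = 1025      # assumption: start of the legacy ephemeral pool


def effective_max_user_port(value):
    """Clamp a requested MaxUserPort to the valid range, as the steps describe."""
    return max(MIN_MAX_USER_PORT, min(MAX_MAX_USER_PORT, value))


def ports_inside_pool(ports, max_user_port):
    """Return the ports that could still be handed out as ephemeral ports."""
    top = effective_max_user_port(max_user_port)
    return sorted(p for p in ports if FIRST_USER_PORT <= p <= top)


# SQL ports listed earlier in the thread.
sql_ports = [54046, 57782, 57650, 53162, 53932, 55822, 60945, 50526]

print(ports_inside_pool(sql_ports, 49999))  # prints []
print(len(ports_inside_pool(sql_ports, 65534)))  # prints 8
```

With the limit at 49999 none of the listed SQL ports remain in the pool, whereas with the maximum value all eight could be picked for an outbound session.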

                     

                    If they are not, then something else is really wrong. The operating system should be providing the next available port number to any application making the request. So when a remote client initiates communication with SQL on 1433, the client's operating system assigns the next available user port as the source port, while SQL keeps answering from its listening port. Every process on the server that opens an outbound TCP connection gets its source port the same way.

                     

                    In this case it sounds like the passive node of the cluster was not managing the random high ports in parallel with the active node. When it failed over and the session table tried to keep all the sessions open, it failed because the SW Agent had legitimately been using a random high port. If that is the case, then I could see any other process performing TCP communications on the passive node causing the same problem, not just the SW agent.

                     

                    If you already knew all this, I apologize for the long-winded reply.