9 Replies Latest reply on Apr 28, 2011 11:43 AM by topside844

    Orion ICMP - Destination Unreachable - Protocol Unreachable

    rpage2655

      I would like to start off by saying I am not the Orion admin.  I am the network engineer.  I hope I can provide enough information to help solve the problem we are having. 

      We use Orion to monitor our branch offices, which are connected by an MPLS network.  Each site has a T1.  Some have secondary connections.  We monitor routers, switches, and servers at each office.  From what I have been told, we poll the routers more often than other devices so we can detect a router that is down or in trouble, hopefully before the branch office knows there is an issue. 

      My basic understanding is that Orion uses ICMP pings to determine if the router is up, and that it will mark the router in a warning state if we miss a couple of pings. I understand this may seem too quick, but this is what we have to work under.

      The Orion admin has written a script that will do trace routes from the Windows server (to the router, switch, and branch server) if the router enters a warning state.  The problem is we see way too many of these trace files.  I can correlate the time on the trace routes to packet captures.  Each time a trace is generated I see in the packet capture:

      The Orion server sends an ICMP Echo. 

      The router will respond with an ICMP Echo Reply. 

      Almost immediately the Orion server will respond with an ICMP Destination Unreachable, Protocol Unreachable.  This corresponds to an ICMP Type 3, Code 2. 

      Most of the ICMP traffic between Orion and the router is normal:  Echo->Echo Reply.
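      To make the correlation above repeatable, here is a minimal Python sketch that scans already-decoded ICMP events for the pattern described: an Echo Reply arriving at the server, followed shortly by a Type 3, Code 2 from the server back to the same router. The `IcmpEvent` tuple, the `find_rejected_replies` name, the one-second window, and the addresses are all assumptions for illustration; exporting events from the capture tool is left out.

```python
from typing import NamedTuple

ECHO_REPLY = 0
DEST_UNREACHABLE = 3
PROTOCOL_UNREACHABLE = 2


class IcmpEvent(NamedTuple):
    # Hypothetical record of one captured ICMP packet.
    time: float      # capture timestamp in seconds
    src: str         # source IP address
    dst: str         # destination IP address
    icmp_type: int
    icmp_code: int


def find_rejected_replies(events, server, window=1.0):
    """Return (reply, unreachable) pairs where `server` answered an
    Echo Reply with Type 3 / Code 2 within `window` seconds."""
    pairs = []
    last_reply = {}  # router address -> latest Echo Reply sent to server
    for ev in sorted(events, key=lambda e: e.time):
        if ev.icmp_type == ECHO_REPLY and ev.dst == server:
            last_reply[ev.src] = ev
        elif (ev.icmp_type == DEST_UNREACHABLE
              and ev.icmp_code == PROTOCOL_UNREACHABLE
              and ev.src == server):
            # Match each reply at most once.
            reply = last_reply.pop(ev.dst, None)
            if reply is not None and ev.time - reply.time <= window:
                pairs.append((reply, ev))
    return pairs


if __name__ == "__main__":
    # Made-up addresses: 10.0.0.5 is the poller, 10.1.1.1 a branch router.
    capture = [
        IcmpEvent(0.000, "10.0.0.5", "10.1.1.1", 8, 0),
        IcmpEvent(0.020, "10.1.1.1", "10.0.0.5", 0, 0),
        IcmpEvent(0.021, "10.0.0.5", "10.1.1.1", 3, 2),
        IcmpEvent(5.000, "10.0.0.5", "10.1.1.1", 8, 0),
        IcmpEvent(5.020, "10.1.1.1", "10.0.0.5", 0, 0),
    ]
    for reply, unreachable in find_rejected_replies(capture, "10.0.0.5"):
        print(f"reply from {reply.src} at {reply.time:.3f}s rejected "
              f"at {unreachable.time:.3f}s")
```

      Comparing the timestamps of the matched pairs with the timestamps on the trace files would show whether every warning lines up with a rejected reply.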

      There is no apparent problem on the network.  We run voice and many other applications across the network.  No issues.  What do we need to look at?  Why would the Orion server respond with the "protocol unreachable"?  I would appreciate another take on this problem.

        • Re: Orion ICMP - Destination Unreachable - Protocol Unreachable
          neilmborilla


          Why would the Orion server respond with the "protocol unreachable"?  I would appreciate another take on this problem.

          Maybe because the ping timed out?

          • Re: Orion ICMP - Destination Unreachable - Protocol Unreachable
            bobdawonderweasel

            Where is the packet capture happening?  At the Orion server or some point in between?

              • Re: Orion ICMP - Destination Unreachable - Protocol Unreachable
                rpage2655

                We have looked in two different areas. 

                We have a tap installed between our core switch and the MPLS router.  This allows us to capture data flowing between the core and the router.  This is the traffic moving to and from the central office and the branch offices.

                We have also captured the same data packets on the core switch. 

                The Orion server is plugged into the core switch.

                So, to make a long story short, we see the same results at the core and as the packets flow from the core to the MPLS router and out to the sites.

              • Re: Orion ICMP - Destination Unreachable - Protocol Unreachable
                pstewart726

                I would very much like to hear how you make out with this....

                 

                I have opened several tickets with Solarwinds on a similar issue.  They have told me that it's my Windows 2008 OS that's the issue but I'm having a hard time believing that it is.  We have an issue where once every couple of weeks, two certain Juniper switches go into alarm in Solarwinds....

                Each time, we can ping and reach these switches with no issues.  I know this sounds like a network issue on the surface but I can guarantee it's not.  These switches are totally independent of one another and not even in the same geographic region of the world. 

                When I log in to the Solarwinds server, I cannot ping the device from that server, but if I check from any other location in our network the switch is reachable.  Yes, I've opened tickets with Juniper, and a trace on their switches shows the ICMP echo request not even arriving at the switches.

                I'm writing this because I was just paged about this incident.  The usual fix is to reboot the server and the alarm clears (which it just did now).

                I can appreciate what Solarwinds is saying about this being an OS issue but we completely rebuilt the box with a brand new fresh copy of Windows 2008 and it continues. 

                Paul

                • Re: Orion ICMP - Destination Unreachable - Protocol Unreachable
                  topside844

                  It appears this may be an issue with Windows Server 2008 R2.

                  I'm having this exact same problem with Nimsoft's net_connect probe, which sends out 3 pings in quick succession and then receives 3 ping replies back from the remote host.

                  Intermittently, I'm seeing my Nimsoft server send out an ICMP Type 3, Code 2: Protocol Unreachable. When I check the headers of the "offending packet" (the ICMP Echo Reply), it has an IP Protocol Type of 1 (ICMP), which is correct and expected. For some unknown reason, the TCP/IP stack on the server is intermittently unable to process the ICMP Echo Reply (received too fast?) and sends out a Protocol Unreachable message. Based on my knowledge of the TCP/IP stack, this is NOT generated by the application (Solarwinds / Nimsoft). It's the kernel's TCP/IP stack that generates these messages.
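                  For anyone who wants to repeat that header check outside a protocol analyzer: an ICMP Destination Unreachable message carries the IP header of the packet that triggered it, starting right after the 8-byte ICMP header. The Python sketch below pulls the code, the embedded IP protocol number, and the embedded addresses out of the raw ICMP bytes. The function name and the sample addresses are made up for illustration, and it assumes IPv4 and that you already have the ICMP portion of the packet as bytes.

```python
import socket
import struct

ICMP_DEST_UNREACHABLE = 3
ICMP_HEADER_LEN = 8       # type, code, checksum, 4 unused bytes
MIN_IPV4_HEADER_LEN = 20


def inspect_unreachable(icmp_message: bytes):
    """Return (code, protocol, src, dst) for the packet embedded in an
    ICMP Destination Unreachable message.

    `icmp_message` must start at the ICMP type field."""
    if len(icmp_message) < ICMP_HEADER_LEN + MIN_IPV4_HEADER_LEN:
        raise ValueError("message too short to hold an embedded IPv4 header")
    icmp_type, icmp_code = icmp_message[0], icmp_message[1]
    if icmp_type != ICMP_DEST_UNREACHABLE:
        raise ValueError("not a Destination Unreachable message")

    inner = icmp_message[ICMP_HEADER_LEN:]
    if inner[0] >> 4 != 4:
        raise ValueError("embedded packet is not IPv4")
    protocol = inner[9]                       # 1 = ICMP, 6 = TCP, 17 = UDP
    src = socket.inet_ntoa(inner[12:16])
    dst = socket.inet_ntoa(inner[16:20])
    return icmp_code, protocol, src, dst


if __name__ == "__main__":
    # Hand-built sample: Type 3, Code 2 wrapping an ICMP packet from a
    # made-up router (10.1.1.1) to a made-up poller (10.0.0.5).
    embedded_ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 28, 0, 0, 64, 1, 0,
        socket.inet_aton("10.1.1.1"), socket.inet_aton("10.0.0.5"),
    )
    sample = bytes([3, 2, 0, 0, 0, 0, 0, 0]) + embedded_ip + bytes(8)
    print(inspect_unreachable(sample))
```

                  A code of 2 together with an embedded protocol of 1 is the combination described above: the server is rejecting a packet whose protocol (ICMP) it should obviously understand.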

                   

                  Now, once this "ICMP Protocol Unreachable" message is sent to the remote host, it's up to the remote end to decide how to proceed. In most cases, additional pings are still replied to. However, if the customer has some type of security appliance such as an Intrusion Prevention System, many have signatures that block further traffic after an ICMP Protocol Unreachable message has been received. This is what's happening to me for some customers. Once the ICMP Protocol Unreachable is sent to the remote host after processing its ping reply, I receive no further ping replies to my requests. (I can, however, still telnet / http / https to these devices...)

                  We're still investigating what is actually preventing the ICMP replies from being sent (IPS, ASA, or Host OS) after the ICMP Type 3, Code 2 is received. However, I've come up with a simple bandaid for the problem that leaves monitoring unaffected.

                  Just block ICMP protocol unreachable messages from your Orion / Nimsoft server that is sending out pings. You can do this with the Windows Firewall pretty easily. If you have a Cisco router, you can apply the following access-list to your interface/vlan attached to your monitoring server. 

                   

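                  ! Drop ICMP unreachables sourced from the monitoring server (10.10.254.254 here).
                  ! "unreachable" matches every Type 3 code; "protocol-unreachable" would match only Code 2.
                  access-list 101 deny   icmp host 10.10.254.254 any unreachable
                  access-list 101 permit ip any any
                  ! Apply inbound on the interface/VLAN facing the monitoring server.
                  int Fa0/0.254
                   ip access-group 101 in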

                   

                  You can see below how many it's catching after a 24-48hr period.

                  Extended IP access list 198

                      10 deny icmp host 10.255.254.211 any unreachable (27340 matches)

                      20 permit ip any any (5944388 matches)

                  Since I implemented this bandaid, no hosts have stopped responding to pings.
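                  For the Windows Firewall route mentioned earlier, a rough sketch of the equivalent rule is below, written as a small Python helper that builds the `netsh advfirewall firewall add rule` command for an outbound block on ICMPv4 Type 3, Code 2. The rule name and the helper name are made up, and I have not confirmed that an outbound firewall rule catches unreachables generated by the stack itself on every Windows version, so treat it as something to test rather than a known fix. It only prints the command; running it needs an elevated prompt.

```python
def build_block_rule(rule_name="Block-ICMP-Protocol-Unreachable-Out"):
    """Build a netsh command that blocks outbound ICMPv4 Type 3, Code 2.

    The rule name is arbitrary and kept free of spaces to avoid quoting
    problems."""
    return [
        "netsh", "advfirewall", "firewall", "add", "rule",
        f"name={rule_name}",
        "dir=out",                 # traffic leaving the monitoring server
        "action=block",
        "protocol=icmpv4:3,2",     # ICMPv4 type 3 (unreachable), code 2 (protocol)
    ]


if __name__ == "__main__":
    cmd = build_block_rule()
    # Print only; to apply it, pass `cmd` to subprocess.run(cmd, check=True)
    # from an elevated session on the monitoring server.
    print(" ".join(cmd))
```

                  Unlike the access-list above, this matches only Code 2, so other unreachable messages from the server are left alone.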