Initial ping (discovery) works, then fails. Pings from the Orion host OS are OK

Hi all,

Does Orion NPM do anything special with its ICMP polling?  We are monitoring remote locations over MPLS and everything is working except one device.  The device is discovered and the initial ICMP-only poll works, then polling stops working.  Pinging the same IP from the Win2008 R2 server that Orion runs on works fine.  Netflow on the router the device is connected to shows traffic in and out when pinging from the OS.  It shows only traffic going to the device when Orion attempts a poll.

  I initially thought it was IPS on the device we are polling that was denying the echo-reply back to our Orion host.  However, pinging from the OS works, so I am not so sure.  That is, unless Orion doesn't use a standard echo when doing its polling?  We have confirmed the issue is not related to packet size or TTL.  Any ideas why the initial discovery ping and pings from the OS work, while NPM polls get no response from the device?

  We are able to monitor the switch port the device is connected to and it shows as up.  The only anomaly I can see is that the connection to the switch is half-duplex.  Is that a clue at all?

  Any ideas appreciated.  Thanks in advance.

--

Colin

Orion Core 2012.1.1, SAM 5.2.0 SP1, NPM 10.3.1, SEUM 1.5.1, IVIM 1.4.1

  • Did you ever resolve this issue? I have the exact same problem with about 10 TACLANE encryptors. I can ping the devices from my server (2k8 R2), and the initial ping from Orion works, then it times out. I have nothing in my logs indicating an ACL is blocking it. Plus, there's the fact that I can ping it from my server.

  • When you ping manually, what is the response time? It could be a timeout issue; the timeout can be set in the polling settings.

  • It is anywhere between 1-8 ms and it's a 32-byte ping. I have played around with the ping packet size and set it everywhere from 0 to 32 bytes, with polling as aggressive as every 10 seconds and as relaxed as every 4 minutes. The device in question is a TACLANE; however, I can ping it from anywhere in the network, and my firewall guys are pretty cool, so my SolarWinds server is wide open (no external connectivity though).

  • We never did find a solution and simply stopped monitoring the device; instead we watch the port on the switch that the device is connected to.  We didn't hook up Wireshark or another protocol analyzer, but we assume that there is something different in the way the poller packet is structured vs. packets on initial discovery and/or packets from the host OS.

    This issue was over a year ago for us and we have gone through a couple of Orion upgrades, so we will test again with our latest version and update this ticket as to how it goes.  Thanks for the reminder...

    If the issue is still occurring, my underlying question would be: what is different in the ICMP sent by the poller vs. the ICMP that is sent on initial discovery?  As mentioned, when it doesn't work, we see the traffic going to the endpoint device but not returning.  So something in the poller packet seems to confuse the endpoint device to the point where it won't even respond?
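
    For anyone wondering which parts of an echo request can legitimately differ between two senders, here is a minimal Python sketch that builds two ICMP echo requests by hand. It does not reproduce what Orion actually sends (nobody in this thread captured that). The identifier, sequence number, and "poller" payload are hypothetical values chosen for illustration, and the 32-byte alphabet payload is an assumption based on what Windows ping is commonly seen to send.

```python
import struct


def icmp_checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071) over an ICMP message."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes) -> bytes:
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = icmp_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


# Assumption: 32-byte payload in the style commonly seen from Windows ping.
os_style = build_echo_request(1, 1, b"abcdefghijklmnopqrstuvwabcdefghi")

# Hypothetical poller packet: different identifier, sequence, and payload.
poller_style = build_echo_request(0x1234, 7, b"hypothetical poller payload")

print(os_style.hex())
print(poller_style.hex())
```

    Both packets are valid echo requests, yet every field after the type and code differs. A strict endpoint or inline security device could in principle treat them differently, which is why a capture of the real poller packet would be needed to settle the question.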

  • Agreed. I have nothing in my logs showing ICMP being blocked and, like I said, I can ping it from the server itself all day long, so something has to be screwy. I'll keep digging and see if I can pinpoint it. To date I've added a lot of devices that I didn't think I would be able to, so maybe I'll get lucky! Thanks for the heads up.

  • I have retested this morning and the behaviour is the same.  Initial discovery and pings from the host OS are successful, and Netflow shows traffic in both directions for them.  Subsequent polls are unsuccessful and Netflow shows traffic only going to the device, with nothing coming back from the device.

    There must be some difference between the way the ICMP packet is built by the ICMP-only poller and the way initial discovery and the host OS build their ICMP echo packets.  If we ever find the time, we might connect Wireshark to our test environment to see just what that difference is.  In the meantime, we will continue to just monitor the switch port the device is connected to.

    Please update this ticket if you find anything more concrete as to why this behaviour exists.
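
    If someone does get a capture, the comparison could be scripted along these lines. This is a rough Python sketch, not a tested tool: it assumes each packet is available as raw bytes starting at the IPv4 header (no Ethernet header), and the function names `parse_ipv4_icmp` and `diff_packets` are made up for this example.

```python
import struct


def parse_ipv4_icmp(frame: bytes) -> dict:
    """Pull out the IPv4/ICMP fields most likely to differ between senders.

    Assumes `frame` starts at the IPv4 header and carries an ICMP echo.
    """
    ver_ihl, tos, total_len, _ip_id, flags_frag, ttl, _proto = struct.unpack(
        "!BBHHHBB", frame[:10]
    )
    ihl = (ver_ihl & 0x0F) * 4  # IPv4 header length in bytes
    icmp = frame[ihl:total_len]
    icmp_type, code, _csum, ident, seq = struct.unpack("!BBHHH", icmp[:8])
    return {
        "tos": tos,
        "ttl": ttl,
        "dont_fragment": bool(flags_frag & 0x4000),  # DF bit
        "icmp_type": icmp_type,
        "icmp_code": code,
        "icmp_id": ident,
        "icmp_seq": seq,
        "payload_len": len(icmp) - 8,
        "payload": icmp[8:],
    }


def diff_packets(a: bytes, b: bytes) -> dict:
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    fa, fb = parse_ipv4_icmp(a), parse_ipv4_icmp(b)
    return {k: (fa[k], fb[k]) for k in fa if fa[k] != fb[k]}
```

    Feeding it one packet captured from an OS ping and one from an Orion poll would list the differing fields side by side. Packet size and TTL were already ruled out above, so the DF bit, TOS, ICMP identifier, and payload contents would be the fields to look at first.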

  • Oh, and our current Orion versions:

    Orion Platform 2013.2.1, SAM 6.0.2, NPM 10.6.1, WPM 2.0.1, IVIM 1.9.0

  • I talked with VIASAT about the issue about an hour ago; the settings I have on the KG250 are correct, and we even played with the ICMP packet size. One thing I realized, though, is that this connection is tunneled (encryption device) across a Juniper SSG that is tunneled in through an ASA. So I looked on my ASA and sure enough there's an error message there from the exact same times that I was trying it, but I have no idea what it means. I kind of might know, but it's too early to tell. Is your connection going through an ASA at all?

  • We are connected to an ISR 2911 on the border, then from there to a 2960 switch and then to the device.  I see the traffic leave the ISR, but nothing ever comes back.  There are no logs on either the switch or the ISR to indicate it was blocked/denied.  My assumption is that it is getting to the endpoint, but something in the crafting of the ICMP echo is causing the endpoint to discard it.