Talk to me about a router's role in UDT operation, please?

I've had UDT for a year.  It looked quite promising, but it's turned out otherwise and I've had to shut it off due to it not providing timely and accurate information, and due to it causing so many snmp-GETs that it results in a DOS on 6509 Distribution switches.

Challenges:

  • Most of my access switches were Cisco 2960S models.  Their local memory and CPU resources proved drastically underpowered to accommodate our deployment of Cisco ISE and SW UDT.  The switches were so overwhelmed with all the SNMP queries and ISE activity that I had to keep pushing my UDT polling schedule further and further away from the default--which was already a longer interval than tracking wireless users and their devices needs.  After setting the polling cycles so far apart it became obvious UDT wasn't useful for tracking laptops or wireless devices.  I've spent the last year doing forklift upgrades of the 2960S models, replacing them with 3850's and 9348's.  I'm hoping to start up UDT again on those units, along with polling my 2960X's and 2960XR's and 4510's, but I'm not confident the experience will be a positive one.
  • UDT isn't compatible with SNMP-v3.  All of my systems were happily running on snmp-v3 and none of them were providing UDT information because UDT can't work with encrypted v3 data.  I had to reconfigure all my access switches, distribution switches, wireless controllers, layer 3 switches, and routers to use snmp-v2.  Company security policy mandated that ACL's be applied to all of those nodes that restricted snmp-v2 access to my pollers and my team's workstations.  That was loads of work I didn't need, and wouldn't have had to go through if UDT worked with snmp-v3 community strings.
  • Router or L3 node changes.  I regularly replace aging distribution switches and routers with newer ones, and that's proven an Achilles heel for UDT.  I've not yet found a place in its documentation that defines how it interacts with routers & layer three distribution switches, and I've hoped that simply monitoring those nodes with NPM would suffice to provide UDT with the ARP information it requires to track devices on the access switches below the L3 equipment.  But when I replaced a VSS pair of Cisco 6509's with a VSS pair of 6807's, ensuring that both the old AND the new equipment were monitored in NPM, I found UDT data about switches below those devices quickly became stale.  It seems that UDT lost track of where to look for the ARP information when the routing interfaces moved from the 6509's to the 6807's.  I'm not sure why, since both sets of nodes were monitored in NPM.  Should I have also added the 6807's to UDT's monitoring?  I don't think that's necessary since no access devices are using any ports on the 6807's.  Those VSS chassis only serve as the L3 routing hardware for many L2 access switches below them.  Wouldn't UDT automatically know that the 6807's contained the data for the thousands of computers attached to the access switches that uplink to the 6807's?
  • User tracking.  I've had political challenges getting UDT access into the AD world, resulting in an inability to track users by login name.  Yes, it's a local problem, not SolarWinds' issue, but it seems like it would be an intuitive step to let UDT automatically access everything it needs in the Active Directory servers without requiring large amounts of input and actions by the AD administrators.

Since my initial deployment of UDT over a year ago I've replaced my SolarWinds server and database infrastructure by moving everything to 2016 versions.  I've also upgraded to the latest SolarWinds Orion Platform 2018.4 and NAM 2018.4, and I look forward to turning UDT back on with better results due to the upgrades.  But I'm not certain this will be the case.

So let's get back to my original question about routers, which also pertains to L3 switches that do routing or function as distribution switches to the VLANs on the switches below them.

  1. Does UDT require a router be defined for every VLAN?
    1. If so, where does one tell UDT about the routers serving each switch?
    2. If UDT doesn't require a specifically defined router or L3 switch for the VLAN's on access switches, how does UDT know which nodes hold the ARP info for every VLAN on every switch?
    3. Where does UDT go to understand which L3 node should be queried for any PC or workstation being routed by that L3 node?
  2. SNMP-v2 or v3 for routers and L3 Distribution switches:
    1. Does UDT require snmp-v2 access to every router?
    2. Can UDT get its device and user tracking jobs done via snmp-v3 community strings for routers & L3 switches?
  3. What can be done to reduce the strain UDT puts on switches or routers with CPU and/or memory resources that should be sufficient to UDT's needs--but that aren't?
    1. I discovered my 6509 Distribution Switches acting like they were experiencing an snmp-get DOS attack from all the queries they received from UDT and other network node discovery tools (e.g.: like Printuition, which polls all IP addresses on the network to discover printers.  We have ~8,000 network printers, and keeping up with their paper and toner and service needs requires an automated tool like Printuition).  When I replaced the 6509's with 6807's I saw the newer L3 distribution switches could handle the snmp query onslaught better than the 6509's.
    2. How does one ensure that any L3 switch or router isn't overwhelmed with snmp-get requests from UDT while simultaneously ensuring a current and valid UDT database remains available for my staff to query and watch and manage?
  4. What are the results of enabling UDT to manage ports on a switch for which UDT is not specifically set to monitor the upstream router or L3 switch?
    1. Will UDT simply understand the next upstream router from an access switch will hold the ARP info for PC's & printers, and query that upstream router via the snmp-v3 strings used by NPM to monitor that router?
  5. More and more of my remote sites rely on non-traditional routers for WAN connectivity.  In some cases they use Cisco ASA 5506's for BGP WAN routing, and those same ASA's provide local routing for computers at the site.  In other cases 3850 or 9348 switches have L3 services enabled and are functioning as the routers for nodes below them.
    1. Will UDT accurately discover nodes beneath an ASA 5506 that's working as a router?
    2. Will UDT accurately discover nodes below a 3850 or 9348 switch with L3 services enabled to make it a router for the switches and computers below it?
  6. A number of sites rely on L3 routing & distribution to come from Cisco Nexus 5548 switches.  I've had challenges with Nexus playing well with SolarWinds.  Will UDT be able to properly function if routing has to come via Nexus 5548's?

I need to get my investment back from purchasing UDT, but my team is moving to other best of breed solutions that don't stress our network switches or routers, and that do what we expected UDT to do--at a lower price point.

All honest and helpful attempts to assist will be appreciated. I'm betting that if marcoswithanoh​ can't help me, he knows who can, and he'll draw their attention to this query.

Sincerely,

Rick Schroeder

  • Hi Rick,

    I would like to try to answer your questions.

    Maybe one or the other point will help you.

    1.

    The smallest common denominator is the MAC address.

    The following post explains how a MAC address becomes the host name and what information SolarWinds needs to query for the corresponding layers:

    https://support.solarwinds.com/Success_Center/User_Device_Tracker_(UDT)/Knowledgebase_Articles/UDT_gathers_hostname_information

    The prerequisite is always that the corresponding MIBs are enabled on the nodes for the SNMP queries, and that the SolarWinds server performing the queries can resolve the necessary DNS names.

    Whether the MIBs are enabled can be checked with the UDT Compatibility Checker:

    https://support.solarwinds.com/Success_Center/User_Device_Tracker_(UDT)/Knowledgebase_Articles/UDT_Compatibility_Checker

    It is also possible that for one L2 node, multiple L3 nodes will provide the ARP information because VLAN100 is on Router 1 and VLAN200 is on Router 2.

    This can be seen very nicely in the SolarWinds demo infrastructure.

    Here you can look at the IP information about different L3 nodes.

    https://oriondemo.solarwinds.com/Orion/DetachResource.aspx?ResourceID=1727&NetObject=N%3a98&currentUrl=aHR0cHM6Ly9vcmlvbmRlbW8uc29sYXJ3aW5kcy5jb20vT3Jpb24vTmV0UGVyZk1vbi9Ob2RlRGV0YWlscy5hc3B4P05ldE9iamVjdD1OOjk4

    2.

    The answer to question 2 is also somewhat related to the answer to question 1.

    Since the L3 information always comes from an L3 node, that node must also be queried.

    For that you can use snmpv2c as well as snmpv3; both work.

    I only had to consider the following with snmpv3:

    https://support.solarwinds.com/Success_Center/User_Device_Tracker_(UDT)/Knowledgebase_Articles/Layer_2_Layer_3_data_not_visible_when_using_SNMPv3?CMPSource=THW&CMP=DIRECT

    The following must be entered in the snmpv3 configuration on the node to be queried:

    snmp-server group <YourGroupName> v3 priv context vlan- match prefix
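
For background on why the `context vlan- match prefix` part matters: on Cisco IOS switches the per-VLAN forwarding tables are typically exposed through one SNMP context per VLAN, named `vlan-<id>` for SNMPv3 (with SNMPv2c the equivalent is the `community@<id>` form). The Python sketch below only illustrates that naming and the prefix match. The function names, the VLAN IDs, and the community string are made up for illustration, and nothing here talks to a device or reflects how UDT is implemented internally.

```python
def vlan_context(vlan_id: int) -> str:
    """SNMPv3 context name commonly used by Cisco IOS for one VLAN's bridge table."""
    return f"vlan-{vlan_id}"


def v2c_indexed_community(community: str, vlan_id: int) -> str:
    """SNMPv2c counterpart: community string indexing, e.g. 'public@100'."""
    return f"{community}@{vlan_id}"


def group_allows_context(context: str, configured_prefix: str = "vlan-") -> bool:
    """Mimic 'context vlan- match prefix': any context starting with the prefix matches."""
    return context.startswith(configured_prefix)


if __name__ == "__main__":
    for vlan in (100, 200):  # hypothetical VLAN IDs
        ctx = vlan_context(vlan)
        print(ctx, group_allows_context(ctx), v2c_indexed_community("public", vlan))
```

Without the prefix match, the SNMPv3 group would only answer in the default context, which is consistent with the "Layer 2 / Layer 3 data not visible" symptom in the linked article.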

    3.

    I cannot say much about the performance issues.

    Probably our infrastructure is not that big ;)

    Or we already had newer IOS versions or newer hardware by the time memory / CPU issues could have shown up.

    4.

    No. Both nodes, the L2 and the L3 device (e.g. switch and router), need to be monitored both in NPM and in UDT.

    Of course, in NPM you only include the performance monitoring, while in UDT all L2 / L3 relevant ports are included in the monitoring.

    5.

        1. Unfortunately, I do not know, because I have no ASA in use as a router that I need to query with UDT.

        2. Yes, in the SolarWinds demo you can see a 3850 that provides both L2 and L3 information:

            https://oriondemo.solarwinds.com/Orion/NetPerfMon/NodeDetails.aspx?NetObject=N:98

    6.

    Attached is a 5548 node from the SolarWinds Demo Lab, which also provides L3 information in the UDT:

    https://oriondemo.solarwinds.com/Orion/NetPerfMon/NodeDetails.aspx?NetObject=N:221

    Please excuse my English; unfortunately I am not a native speaker ;)

    Kind regards

    Rene

  • Thank you, renek​, for your thoughts on this matter.  I'm reviewing them in sequence and determining whether my environment and your ideas match well.

  • rschroeder, UDT 3.3.2, which is based on Orion Platform 2018.4, contains a bunch of fixes to reduce the number of SNMP requests. Also, UDT supports SNMPv3.

    L3 information from Cisco ASA devices can be retrieved if you provide CLI credentials for the device. Nexus devices are not supported in UDT 3.3.2 and older.

  • Rick, I don't have an answer, but we are a new SW shop - lots of the same Cisco hardware as you - and are about to go into the UDT phase of the SW deployment. Soon we will have the same headaches.

    You mentioned this: " I need to get my investment back from purchasing UDT, but my team is moving to other best of breed solutions that don't stress our network switches or routers, and that do what we expected UDT to do--at a lower price point."

    Can you share some of those best-of-breed UDT solutions? Thanks!

  • rschroeder

    I can't address each of the above, but I can tell you I've worked in environments with and without UDT.
    At first I thought UDT was a must, but I have learned that while it can be quicker to find an asset or user with it, it's not required.
    In my current network we don't have UDT, and we have tons of dissimilar HP/Aruba/H3C gear (all HP, but the command sets are nothing alike).
    For user tracking, the most recent logged in user is added to the AD computer object description. While there are multiple ways to accomplish this, here is a free way to do that: "Updating Domain Computer Objects with User and Machine Information" (PeteNetLive).
    MAC tracking is a little harder.

    In my last environment we did have UDT and it was nice for quickly finding a suspect (potential rogue system), but it normally only reported down to the L3 SVI, not the actual switch the device was on, so it was not as useful as we needed it to be.
    We also had a huge WAN, some of it with really high latency, and we never noticed any issues with DDoS-type SNMP-GETs.
    We used SNMPV3, and I had several 6509's, some in a VSS configuration, some stand alone, and they had no issues at all, and our pollers were across a WAN link on top of everything else.
    If it's really still an issue, you could potentially look at implementing Control Plane Policies on those devices that support them, with specific rules to drop excessive SNMP. But first I would look at the server to see if the UDT settings can be changed (sorry, it's been 6 months since I had UDT, so I can't check whether that's an option).
    However we also implemented port security sticky MAC's; sometimes UDT could not find the MAC, but I could go to NCM and do an advanced search of all the configs for the MAC, not as fast as UDT, but less than a minute to scan all current and prior switch configs to find the MAC. As for the users, I would query Splunk, which received all our windows logs to find a PC a user logged into if needed.
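
The config search described above can be approximated outside NCM as well. A minimal sketch in Python, assuming the switch configs are already available as plain text; the function names and the sample config are hypothetical and this is not how NCM itself searches. The main point is normalizing the MAC first, since Cisco configs use the dotted `aaaa.bbbb.cccc` notation while other tools report colons or dashes.

```python
import re

# Matches Cisco dotted notation (0011.2233.4455) or colon/dash notation (00:11:22:33:44:55).
MAC_PATTERN = re.compile(
    r"\b(?:[0-9a-fA-F]{4}\.){2}[0-9a-fA-F]{4}\b"
    r"|\b(?:[0-9a-fA-F]{2}[:-]){5}[0-9a-fA-F]{2}\b"
)


def normalize_mac(mac: str) -> str:
    """Reduce any common MAC notation to 12 lowercase hex digits."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits


def find_mac_in_configs(mac: str, configs: dict) -> list:
    """Return (config_name, line_number, line) for every config line that mentions the MAC."""
    wanted = normalize_mac(mac)
    hits = []
    for name, text in configs.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(normalize_mac(m) == wanted for m in MAC_PATTERN.findall(line)):
                hits.append((name, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical config snippets keyed by switch name.
    sample = {
        "access-sw1": "interface Gi1/0/1\n switchport port-security mac-address sticky 0011.2233.4455\n",
        "access-sw2": "interface Gi1/0/2\n switchport mode access\n",
    }
    for hit in find_mac_in_configs("00:11:22:33:44:55", sample):
        print(hit)
```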

    Additionally for tracking the MAC if you know the L3 network it is on, you can use traceroute mac.

    I have a network documentation product my company bought before I joined them. I'm not a fan, but since they spent so much money on it, I must use it so they can see a return on investment. I wish they would just lick their wounds and scrap it, which is what it sounds like your organization wants to do with UDT.

    I know the above may not be what you want to hear, and as much as I like SolarWinds, UDT is only an OK product.

    Best of luck to you

    Tom

  • I'm happy to report I have all my answers and UDT is working as desired now.   The short answer is that getting the most recent upgraded version of NAM (which includes UDT), AND also getting the latest Hot Fix for it, AND including all L3 / Routing devices in UDT has solved the issues I was experiencing.  UDT is reliable now.

    Here are the issues and their answers, in order, from above:

    Does UDT require a router be defined for every VLAN?  YES.

    Where does one tell UDT about the routers serving each switch?  Add routers or L3 devices to UDT in the same place that you add switches.  Either Edit the node in NPM and select the options here:

    [Screenshot: pastedImage_3.png]

    or go to Settings > UDT, then use the dropdown to Filter to: UDT Unmonitored Nodes, select the nodes, and click the option to MONITOR NODE WITH UDT

    [Screenshot: pastedImage_8.png]

    If UDT doesn't require a specifically defined router or L3 switch for the VLAN's on access switches, how does UDT know which nodes hold the ARP info for every VLAN on every switch?

    UDT DOES require monitoring all L3 switches & routers that contain ARP info for any switch's end devices.

    Where does UDT go to understand which L3 node should be queried for any PC or workstation being routed by that L3 node?

    UDT polls the L3 switch or router for ARP info and compares it to the MAC addresses learned by UDT from the switches.
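
To make that comparison concrete, here is a rough Python sketch of the general idea: ARP tables from the L3 nodes give IP-to-MAC, MAC address tables from the L2 switches give MAC-to-port, and joining the two on the MAC places an IP on a switch port. This is an illustration only, not UDT's actual implementation. All names, addresses, and table contents are hypothetical, and it assumes the L2 tables contain access ports only (no uplinks).

```python
from typing import Dict, List, NamedTuple


class Endpoint(NamedTuple):
    ip: str
    mac: str
    router: str  # L3 node whose ARP table supplied the IP
    switch: str  # L2 node where the MAC was learned
    port: str


def correlate(arp_by_router: Dict[str, Dict[str, str]],
              fdb_by_switch: Dict[str, Dict[str, str]]) -> List[Endpoint]:
    """Join ARP entries (ip -> mac) with switch MAC tables (mac -> port) on the MAC."""
    # Index the L2 side once: mac -> (switch, port).
    mac_location = {}
    for switch, fdb in fdb_by_switch.items():
        for mac, port in fdb.items():
            mac_location[mac] = (switch, port)

    endpoints = []
    for router, arp in arp_by_router.items():
        for ip, mac in arp.items():
            location = mac_location.get(mac)
            if location is None:
                continue  # MAC not seen on any monitored switch
            endpoints.append(Endpoint(ip, mac, router, *location))
    return sorted(endpoints)


if __name__ == "__main__":
    # Two routers, each serving a different VLAN on the same access switch.
    arp = {
        "router1": {"192.0.2.10": "001122334455"},
        "router2": {"198.51.100.20": "66778899aabb"},
    }
    fdb = {"access-sw1": {"001122334455": "Gi1/0/1", "66778899aabb": "Gi1/0/2"}}
    for endpoint in correlate(arp, fdb):
        print(endpoint)
```

If an L3 node is missing from `arp_by_router`, the MACs behind it never get an IP, which matches the stale or incomplete data described earlier in the thread.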

    SNMP-v2 or v3 for routers and L3 Distribution switches:

         Does UDT require snmp-v2 access to every router?  UDT requires access.  It may not need to be v2 access--I've read that UDT now supports snmp-v3.  It wasn't compatible when I started using it, and I had to create new v2 strings for all devices.  Then InfoSec required me to create ACL's for all devices using snmp-v2 that restricted access to just my APE's.

         Can UDT get its device and user tracking jobs done via snmp-v3 community strings for routers & L3 switches?

         YES

    What can be done to reduce the strain UDT puts on switches or routers with CPU and/or memory resources that should be sufficient to UDT's needs--but that aren't?

    I discovered my 6509 Distribution Switches acting like they were experiencing an snmp-get DOS attack from all the queries they received from UDT and other network node discovery tools (e.g.: like Printuition, which polls all IP addresses on the network to discover printers.  We have ~8,000 network printers, and keeping up with their paper and toner and service needs requires an automated tool like Printuition).  When I replaced the 6509's with 6807's I saw the newer L3 distribution switches could handle the snmp query onslaught better than the 6509's.  This, combined with the newest version of UDT, AND with the latest Hot Fixes, resolved the DDOS-like issues.

    How does one ensure that any L3 switch or router isn't overwhelmed with snmp-get requests from UDT while simultaneously ensuring a current and valid UDT database remains available for my staff to query and watch and manage?  Use ACL's to restrict access, limit the number of devices polling the L3 systems, and ensure the best, most-current apps & hot fixes are used.  When all else fails, upgrade the hardware.  It worked for me.
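
The ACL part of that answer boils down to an allow-list: only known pollers may query SNMP on the device. A small Python sketch of that logic, purely as an illustration of what the ACL expresses; the addresses are documentation-range placeholders and the names are hypothetical, and the real enforcement happens in the switch or router config, not in a script.

```python
import ipaddress

# Hypothetical poller addresses and a small admin workstation subnet.
ALLOWED_POLLERS = [
    ipaddress.ip_network(net)
    for net in ("192.0.2.10/32", "192.0.2.11/32", "198.51.100.0/28")
]


def snmp_permitted(source_ip: str, allowed=ALLOWED_POLLERS) -> bool:
    """Return True if the source address falls inside any allowed network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in allowed)


if __name__ == "__main__":
    for src in ("192.0.2.10", "198.51.100.7", "203.0.113.5"):
        print(src, "permit" if snmp_permitted(src) else "deny")
```

Keeping the list short serves both goals named above: it satisfies the security policy and it caps how many systems can send SNMP requests to the L3 node at all.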

    What are the results of enabling UDT to manage ports on a switch for which UDT is not specifically set to monitor the upstream router or L3 switch?

    Not good ones.  In my experience UDT requires the upstream L3 ARP information to work properly.  Without it, MAC addresses showed up at L2 switches to which they'd never been connected.  The data was incorrect; as a result we stopped believing UDT was a good product--when in fact the problem was we weren't monitoring the right devices.  It's what comes when you don't get proper training to set up things correctly.

    Will UDT simply understand the next upstream router from an access switch will hold the ARP info for PC's & printers, and query that upstream router via the snmp-v3 strings used by NPM to monitor that router? No.  UDT has to be told to monitor routers.  It doesn't seem to take any info from NPM's setup to gather that UDT info.

    More and more of my remote sites rely on non-traditional routers for WAN connectivity.  In some cases they use Cisco ASA 5506's for BGP WAN routing, and those same ASA's provide local routing for computers at the site.  In other cases 3850 or 9348 switches have L3 services enabled and are functioning as the routers for nodes below them.

    Will UDT accurately discover nodes beneath an ASA 5506 that's working as a router?  I haven't tested this yet, but I suspect it will.

    Will UDT accurately discover nodes below a 3850 or 9348 switch with L3 services enabled to make it a router for the switches and computers below it?

    Yes--but you have to select the option to poll that node for both L2 AND L3 information.

    A number of sites rely on L3 routing & distribution to come from Cisco Nexus 5548 switches.  I've had challenges with Nexus playing well with SolarWinds.  Will UDT be able to properly function if routing has to come via Nexus 5548's?  We're retiring our Nexus 5548's; I won't be able to test this.