9 Replies Latest reply on Apr 26, 2016 8:09 AM by rschroeder

    NTA Deployment

    fstroud

      In a hub and spoke deployment, is it necessary to add netflow export at the spoke and the hub or is spoke adequate?

      In a deployment with netflow on at hub and spokes, would the traffic be counted twice by NTA when comparing traffic volumes?

       

      Regards

      Frankie

        • Re: NTA Deployment
          Parker Robinson

          You should put netflow on the hub device since all traffic will have to flow through the hub.  If the hub device is already stressed, with high CPU or memory utilization, you may not want to overwhelm it.  In that case, you could enable netflow on the spoke devices instead.  It would not serve any purpose to enable netflow at both the hub and the spokes, since you would be getting the same data at each end.  Keep in mind that if you enable netflow on a spoke, you will only get traffic statistics for that one spoke's area of the network.
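
To illustrate the double-counting question: if the hub and a spoke both export the same conversation and the byte counts are simply summed across exporters, that conversation shows up twice. Below is a minimal Python sketch of the idea. The flow records, field names, and device names are made up for illustration, and this is not a description of how NTA processes flows internally.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    exporter: str   # device that exported the record (hypothetical names)
    src: str
    dst: str
    src_port: int
    dst_port: int
    proto: int
    octets: int


def naive_total(flows):
    """Sum bytes across every exporter; shared conversations are counted twice."""
    return sum(f.octets for f in flows)


def deduplicated_total(flows):
    """Count each conversation once, keeping the largest per-exporter byte count."""
    per_conversation = defaultdict(lambda: defaultdict(int))
    for f in flows:
        key = (f.src, f.dst, f.src_port, f.dst_port, f.proto)
        per_conversation[key][f.exporter] += f.octets
    return sum(max(by_exporter.values()) for by_exporter in per_conversation.values())


flows = [
    # Same conversation seen by both the hub and the spoke router.
    Flow("hub-rtr", "10.1.1.10", "10.0.0.5", 51000, 443, 6, 1_000_000),
    Flow("spoke1-rtr", "10.1.1.10", "10.0.0.5", 51000, 443, 6, 1_000_000),
    # Conversation from another spoke, seen only at the hub.
    Flow("hub-rtr", "10.2.1.20", "10.0.0.5", 52000, 443, 6, 500_000),
]

print(naive_total(flows))         # 2500000
print(deduplicated_total(flows))  # 1500000
```

If you only ever look at one exporter's interfaces at a time, the overlap does not matter; it only bites when volumes from several exporters are added together.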

           

          Hope this answers your question. 

          • Re: NTA Deployment
            rschroeder

            The correct answer depends on your environment and your Netflow needs.

             

            If all of your traffic passes through a Netflow-compatible device (let's say it's a router), then that's the only place you may need Netflow enabled.  Routers are often positioned ideally to capture the traffic you want in a hub & spoke environment.

             

            However, if you have a lot of traffic between devices on the same Layer 2 VLAN, that traffic won't hit your router.  Unfortunately, many switches don't support Netflow unless they're Layer 3 compatible, essentially able to perform routing.

             

            To put it another way, if any of the traffic you want to view through Netflow stays within a switch on the same VLAN, and if that switch is not able to send Netflow information, then you're out of luck getting NTA to show you those flows.

             

            In that situation you should be able to configure a SPAN or Monitor session between one or more of the switch's ports of interest, and capture the traffic and send it to a mirrored port on which you'd attach a sniffer.  A simple laptop running Wireshark works well as a sniffer, but there are some other options, too:

             

            • In my environment I have an older PC that was originally destined for recycling.  I had my Desktop Support folks put two additional NIC's in it and now I can set it in any network room to be a long-term sniffer for spanned/mirrored/monitored ports.  The extra NIC's are particularly useful for remotely managing the sniffer PC during a monitor session, and let me copy off Wireshark files to share with Support without having to break the mirror.
            • Depending on your setup, if the sniffer PC doesn't have a second NIC for managing the PC, you'll probably lose the ability to control it once you start up the Monitor/Span/Mirror session.  But don't worry--that's not a problem if you do things in the right order.   First, start Wireshark with the appropriate parameters and options, then enable the mirror on the switch ports, and Wireshark will capture the data you're looking for, even though you lose communications with the sniffer.  Once you know that the interesting traffic has passed through the switch ports, disable the mirror/sniffer session on the switch port going to the Wireshark PC and you'll be able to remotely access/manage the sniffer PC or laptop again.
            • For long-term traffic monitoring on switches that don't support Netflow, you can also purchase a permanent network tap to allow you to do off-device monitoring and packet captures.  Network taps are particularly useful in my environment on connections going to firewalls--for additional services like content filtering and IPS/IDS that our Security team manages.  But you'll find taps helpful to the network team, too, if you have need for long-term packet analysis of switches' traffic when the switches can't support Netflow.
              • Taps come in many brands and models and port capacities.  While you might think a tap could be had for something in the $50 range, that's not the case.  I've used Net Scout and Datacom Systems taps.  None are cheap, and the ones I've bought this year, although pretty basic, still run around $1,600 each.  They're 4-port models with dual power supplies and rack mounts, and are pretty flexible in their configuration.  Fortunately I only need to set them up to mirror all traffic in/out of ports 1 and 2 over to port 3.  Port 4 is left for future sniffing needs.
              • Taps are relatively basic, but the more ports they have and the higher their throughput, the more expensive they are.
              • 10 gig taps are easy to buy if you have deep pockets, and I expect 40 G and 100 G taps are out there, too.
              • Taps are the basic tools, and you can upgrade from them to rack mounted appliances that do the sniffer and captures and analysis for you.  I've seen models with extreme power and bells and whistles--they can run into six figures for price.
            • The reason you'd buy/install a tap, instead of running a mirror/span/monitor configuration on your switch, is to prevent the switch's CPU from being taxed by mirroring all that traffic to another port.  Cisco has told me that they support mirrored or spanned ports for packet captures only on a short-term basis--just for the time it takes to troubleshoot an issue.  Once the issue is identified, disable the mirror.  They advised me against leaving a port mirror up long-term, since it has the potential to cause enough CPU demand that other ports' traffic could be negatively impacted.  I'd hope they provide more robust hardware and IOS in the future to prevent this, but if they do, no doubt it will come at a premium to buyers.
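
For the single-NIC case in the second bullet (start the capture first, then enable the mirror), it helps if the capture stops by itself and cannot fill the disk while the sniffer is unreachable. Here is a rough Python sketch that builds a command line for dumpcap, the capture tool that ships with Wireshark. The interface name, output path, and sizes are placeholders, not recommendations.

```python
import shlex


def dumpcap_command(interface, out_file, duration_s, ring_file_kb=102400, ring_files=20):
    """Build a dumpcap command line for an unattended, self-stopping capture."""
    return [
        "dumpcap",
        "-i", interface,                    # NIC attached to the mirror/SPAN port
        "-w", out_file,                     # base name for the capture files
        "-a", f"duration:{duration_s}",     # stop automatically after N seconds
        "-b", f"filesize:{ring_file_kb}",   # switch to a new file every N kB
        "-b", f"files:{ring_files}",        # keep at most N files (ring buffer)
    ]


# Placeholder values: a 4-hour capture on a hypothetical interface "eth1".
cmd = dumpcap_command("eth1", "/var/tmp/span.pcapng", duration_s=4 * 3600)
print(shlex.join(cmd))
```

Start the printed command on the sniffer, confirm it is capturing, and only then enable the mirror on the switch. When the duration expires and the mirror is removed, the files are waiting to be copied off.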

             

            Let us know how your situation ends up working for you.  There are many folks on Thwack with amazing expertise in all the Orion modules, and particularly in NTA.  We all learn from trying and sharing and helping others.

             

            Swift packets!

             

            Rick S.

              • Re: NTA Deployment
                Parker Robinson

                Thank you, Rick, for the detailed and very useful information!  One thing that came to mind when reading about packet sniffing and Wireshark was SolarWinds Quality of Experience (QoE) monitoring.  It is an add-on dashboard included in Orion that uses packet analysis sensors (unlimited and free), which perform deep packet inspection and analysis much like Wireshark does.  It brings back the following metrics in charts: Application Response Time, Network Response Time, Data Volume, and # of Transactions.  You can compare Application Response Time against Network Response Time for a specific application, which makes it a good starting point for troubleshooting application performance.  Wireshark of course returns MUCH more detailed information, but QoE will do the analyzing for you.

                It was clutch for me as a Network Analyst when I had the Dev. team blaming the network (without evidence, of course) for their slow applications and/or databases.  I used an old server with two NICs, which was going to be decommissioned anyway, as my dedicated QoE machine.  One NIC was for management, and the other I would plug into the mirror port (Brocade uses mirror ports, similar to SPAN) that had the traffic I was interested in.  I would let it run for a few hours or days and bring the results to management.  In my case(s) the application response time was high while the network response time was healthy, so we needed to focus on troubleshooting the server and application, not the network.
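
The application-versus-network comparison above can be sketched from packet timestamps. The Python below shows one common way to split the two: the TCP handshake approximates the network round trip, and time to first byte minus that round trip approximates the server's think time. This is an assumption for illustration, not necessarily how QoE computes its metrics, and the threshold factor is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Packet timestamps for one request/response, in milliseconds."""
    syn_ms: int             # client SYN observed
    syn_ack_ms: int         # server SYN-ACK observed
    request_ms: int         # client request observed
    first_response_ms: int  # first byte of the server response observed


def network_rt_ms(t: Transaction) -> int:
    """Approximate network round trip from the TCP handshake."""
    return t.syn_ack_ms - t.syn_ms


def application_rt_ms(t: Transaction) -> int:
    """Server think time: time to first byte minus one network round trip."""
    return (t.first_response_ms - t.request_ms) - network_rt_ms(t)


def likely_bottleneck(t: Transaction, factor: int = 5) -> str:
    """Very rough triage: which side dominates the delay?"""
    if application_rt_ms(t) > factor * network_rt_ms(t):
        return "server/application"
    return "network"


# Made-up numbers: a 2 ms handshake, but 852 ms until the first response byte.
slow_app = Transaction(syn_ms=0, syn_ack_ms=2, request_ms=5, first_response_ms=857)
print(network_rt_ms(slow_app), application_rt_ms(slow_app), likely_bottleneck(slow_app))
# 2 850 server/application
```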

                  • Re: NTA Deployment
                    fstroud

                    Thanks to all for the input and detailed responses.

                    Some things for me to think about.

                    1. SPAN/Mirror ports and potential CPU related issues v Taps

                    2. QoE v Wireshark for Analysis v DPI

                    3. Location of netflow.  Hub v Spoke.

                     

                    For my layer 2 location I am thinking of taps due to high volume of 10G interfaces and will look to recommend sending to an analysis tool such as Steel Central.

                    Most of the traffic (although not all) is hub <-> spoke.  I currently have netflow turned on spoke and hub so will look to turn off at spokes.  I can still get details at the spoke as I have steelheads with top 10 talkers enabled.

                     

                    I am interested to hear opinions on the above approach.

                     

                    Regards,

                    Frankie

                    • Re: NTA Deployment
                      rschroeder

                      I've used QoE for some time now, and I find it particularly satisfying to be able to help eliminate thoughts that "the network" is having problems, when I can look at QoE and see fast network response times to servers, but database or HTML responses showing huge lags.

                       

                      However, with all the good info QoE provides, I continue having problems breaking through the silo into the SA's or DBA's realm to share this info with them.  Getting other groups to adopt and learn QoE (or any Orion modules at all) is a seemingly impossible battle to win.

                       

                      Perhaps the issue is that there have been many monitoring programs adopted over the years--some at high dollar cost--and as each one comes in with its new advocate, it also goes out when that employee leaves.

                       

                      Maybe I'd be better taking my presentation to these groups' Managers, and let it trickle down from on high.

                    • Re: NTA Deployment
                      fudgit2016

                      An alternative to a commercial network tap is to use a SPAN port with an "nprobe" enabled server.

                       

                      With a decent Linux OS (and custom drivers as per the nprobe documentation) you can definitely scale up to 10Gbps interfaces (haven't tried going beyond that - in theory doable, but ensuring no packet drops in the pfring drivers could be interesting).

                       

                      Only drawback to nprobe is the technical Linux knowledge which can be needed to install (depending on the flavour of Linux) - which is where the appliance route can be advantageous.  Need to consider the impact on scaling multiple SPAN ports @ 10Gbps on the switch(es) you're monitoring though - not all switches are up to the job.

                        • Re: NTA Deployment
                          rschroeder

                          You may see, higher up in the thread, I covered SPAN sessions and when they are appropriate or inappropriate.  Cisco provides SPAN for temporary troubleshooting, and recommends a physical tap for long term monitoring, due to the problems they've seen with resource utilization associated with the SPAN session.

                            • Re: NTA Deployment
                              fudgit2016

                              Fully agree that switch performance and capabilities have to be reviewed to ensure support - however it is an alternative to potentially high cost taps (which can be disruptive in their own right).

                               

                              Are there any documented advisories matching the recommendation you received from Cisco?  I haven't seen this before - and over the years have used (sometimes with Cisco's recommendation) SPAN ports on the same chassis regularly (not used RSPAN though).  There are lots of potential constraints both on card and on chassis, so impact has to be determined on a case by case basis.

                                • Re: NTA Deployment
                                  rschroeder

                                  I'm not aware of published Cisco advisories specifically saying how long they deem a SPAN port should be supported.  I've been told by TAC not to do it more than a day, but an official "Don't do this for longer than . . ." statement is probably something Cisco won't publish if they don't want to appear to have limitations.

                                   

                                  One web site says that, depending on the switch you're working with, you should never see a SPAN port cause any impact on the switch's resources.

                                   

                                  Other pages from Cisco, especially when talking about large switches (Nexus 7K's in this case), say they can support up to two SPAN sessions, but no more than that.

                                   

                                  Smaller switches (e.g.: 2960 series) cannot support more than one SPAN session.

                                   

                                  A common Cisco SPAN caveat is port oversubscription, which can result in dropped packets.
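
The oversubscription arithmetic is easy to check by hand: mirroring both directions of a full-duplex port can offer up to twice that port's line rate to the destination port. A small Python sketch of the worst case follows (the function name and example speeds are illustrative only; real traffic is rarely at line rate in both directions):

```python
def span_oversubscription(source_gbps, dest_gbps, both_directions=True):
    """Worst-case ratio of mirrored traffic to the destination port's capacity."""
    factor = 2 if both_directions else 1  # rx + tx of a full-duplex source port
    offered_gbps = sum(source_gbps) * factor
    return offered_gbps / dest_gbps


# Two 10G source ports, mirrored rx+tx, to a single 10G destination port.
ratio = span_oversubscription([10, 10], dest_gbps=10)
print(ratio)  # 4.0
```

Any ratio above 1.0 means the destination port can drop mirrored packets once the sources get busy enough.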

                                   

                                  We've standardized on physical taps for our long-term security needs, and have had great satisfaction with them.  They're flexible enough to allow multiple port configurations, and I've been able to find them in all speeds and port styles (up to 10G, and copper or fiber--SM or MM--in rack mounted dual-power supplied styles with multiple sets of ports).  There's never a worry about dropping packets due to oversubscription, and there's never a concern that someone could accidentally or intentionally reconfigure a SPAN port, which might result in lost data.

                                   

                                  We have budget for physical taps to fulfil the practice of keeping it simple, keeping it reliable, and removing operator (or hacker) errors.