7 Replies Latest reply on Jan 19, 2009 8:09 PM by Debbi

    Netflow can take 3 minutes to display

    Debbi

      I would like to know if this is normal.  I can open a support ticket, but I value the community's experience too, since many of you run into the same things I do.


      It can take over 3 minutes for an interface page to fully display in the NetFlow web console.  This has been true ever since about week 2 of our one-month-old installation.

      Orion 9.1, NetFlow 3 SP4

      Orion app/web server:

      DL360
      Windows Server 2003, 64-bit w/R2
      3 GHz CPU, dual processors, dual core
      4 GB RAM

      Orion SQL server:

      DL360
      Windows Server 2003, 64-bit w/R2
      3.6 GHz CPU, dual processors, dual core
      Standalone Microsoft SQL Server 2005
      4 GB RAM
      RAID 1

      SQL db data .mdf is almost 30 GB
      Log file has 20 GB of space

       

      I have been running on these servers for about a month.  I updated to 9.1 and rebooted both servers this morning thinking that might make a difference.  It did not.

       

      Orion NPM web pages are not super-slow to display, only the pages with NetFlow data.

       

      Listing NPM resources of a node when on the app server is much slower than it used to be.  It can take 30 seconds to list interfaces of a node.



        • Re: Netflow can take 3 minutes to display
          Andy McBride

          Hi Debbi,


          Our next release of NTA is focused on performance. We have been testing using graph loading as our test measurement. It looks a lot better. Also there will be several new options you will be able to set to improve performance. NetFlow is very data intensive compared to most other network management technologies and this can affect performance.


          Andy

            • Re: Netflow can take 3 minutes to display
              Debbi

              This is great news, thanks.  Is the new version in customer testing yet?  If so, what is the feedback?


              I did open a support ticket when no one chimed in here right away, and the tech identified my problem as a disk queue issue that could likely only be fixed by faster/better RAID disks.  I will await the next version and see what happens before forking out more $$$.  It would be helpful, too, to have some guidelines in the "Requirements" section of the NetFlow guide for what hardware a customer would likely need based on how many flows are coming in or whatever.  What would be ideal would be some analysis tool provided by SolarWinds to run before purchase that would listen for NetFlow and simply report how many 'flows per second/gigs of data/whatever is relevant' are coming in and needing to be written to disk, and based on that info, what would likely be required.  I think it would improve customer satisfaction with SolarWinds products and avoid disappointment.


              Thanks again.


              Debbi
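The kind of pre-purchase measurement described above could be approximated with a small listener. This is only a rough sketch, not a SolarWinds tool: it assumes the exporters send NetFlow v5 (whose 24-byte header carries a version field and a record count), it ignores any other version, and the function names and the default port 2055 are assumptions made for illustration. It also cannot bind the port while another collector is already listening on it.

```python
import socket
import struct
import time

# NetFlow v5 header: version, count, sys_uptime, unix_secs, unix_nsecs,
# flow_sequence, engine_type, engine_id, sampling_interval (24 bytes total).
V5_HEADER = struct.Struct("!HHIIIIBBH")


def v5_flow_count(datagram):
    """Return the number of flow records in a NetFlow v5 datagram, else 0."""
    if len(datagram) < V5_HEADER.size:
        return 0  # too short to hold a v5 header
    version, count = struct.unpack_from("!HH", datagram)
    if version != 5:
        return 0  # this sketch only understands v5
    return count


def measure_flow_rate(port=2055, seconds=60.0):
    """Listen on a UDP port and return the average flows per second seen.

    The port number is an assumption; use whatever your devices export to.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    sock.settimeout(1.0)  # wake up regularly so the deadline is honoured
    total = 0
    deadline = time.monotonic() + seconds
    try:
        while time.monotonic() < deadline:
            try:
                datagram, _sender = sock.recvfrom(65535)
            except socket.timeout:
                continue
            total += v5_flow_count(datagram)
    finally:
        sock.close()
    return total / seconds
```

Calling `measure_flow_rate(2055, 300)` on a test machine that the routers export to would give a five-minute average, which could then be compared against whatever capacity figures a vendor publishes.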

                • Re: Netflow can take 3 minutes to display
                  greybirds2

                  Hello, Andy -

                  In addition to the slow graphing, we have issues with high CPU utilization, with NetFlow.exe using 75-90%.  Our server is a quad dual-core processor box with 10GB RAM, and Tech Support said it is fine.  They gave what sounds like the same advice Debbi received regarding SQL storage write access bottlenecks: check for write times, logs, excessive queuing, use a dedicated SQL server, RAID 0, don't use SAN, etc.  Much of this seemed reasonable, but investigation showed this issue is not related to slow disk write performance.  SQL monitoring shows that there is no SQL queuing, and data traces confirm that very little traffic is even being passed from Orion to the SQL db; basically the SQL server is yawning and Orion is flailing away fruitlessly.  The only thing that has helped the CPU load situation is to reduce the number of NetFlow source interfaces, so it seems as though we just have too many NetFlow sources and therefore too many total flows for Orion to absorb, and the CPU is thrashing just trying to process the input.

                  Any idea when the next release of NTA will be available and any chance it will help with this?  If not, like Debbi said, it would be very helpful if Orion would publish some guidelines as to the number of simultaneous flows it can process.  Other vendors do provide such information, and if Orion is not capable of providing good performance for high-density flows without adding additional servers/polling engines/NetFlow add-ons/dedicated SQL/expensive RAID configurations etc., that's fine.  Considering all the other features it is providing, it is understandable that it would not provide the same level of performance as a product dedicated to NetFlow processing, but it would just be very helpful to know approximate capacity in advance to avoid disappointment.

                   

                  Thanks!
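The disk-versus-CPU reasoning in the post above can be written down as a quick triage over performance counter samples. This is a sketch under stated assumptions, not SolarWinds guidance: the function name is hypothetical, the queue threshold of 2 outstanding requests per spindle is only a commonly quoted rule of thumb, and the 75% CPU threshold is simply the low end of the range reported above. Feed it samples of average disk queue length from the SQL server and CPU percentage for the NetFlow process.

```python
from statistics import mean


def likely_bottleneck(disk_queue_samples, cpu_percent_samples,
                      spindles=2, queue_per_spindle=2.0, cpu_busy=75.0):
    """Classify averaged counter samples as 'disk', 'cpu', 'both' or 'neither'.

    All thresholds are illustrative assumptions and should be tuned.
    """
    avg_queue = mean(disk_queue_samples)
    avg_cpu = mean(cpu_percent_samples)
    # A sustained queue longer than the per-spindle allowance suggests
    # that the disks cannot keep up with the write load.
    disk_bound = avg_queue > spindles * queue_per_spindle
    cpu_bound = avg_cpu >= cpu_busy
    if disk_bound and cpu_bound:
        return "both"
    if disk_bound:
        return "disk"
    if cpu_bound:
        return "cpu"
    return "neither"
```

With a near-empty disk queue and the collector process at 75-90% CPU, as described above, this returns `"cpu"`, which matches the conclusion that the storage is not the limiting factor in that setup.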

                    • Re: Netflow can take 3 minutes to display
                      Don331

                      Folks,

                      What settings are you using for Netflow Settings-Global Settings - Uncompressed and Compressed data retention times?

                      We set ours very short (1 hour/30 days) as IMHO long-term netflow data isn't very useful. I've seen other shops go even shorter (1 hour/7 days)...

                      Don
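To see why a short uncompressed retention window matters, here is a back-of-the-envelope calculation. It is only a sketch: it assumes every received flow is stored as one row for the whole uncompressed window, which may not match how NTA actually stores data, and the function name and the 500 flows per second figure are made up for illustration.

```python
def uncompressed_rows(flows_per_second, retention_hours):
    """Rows held at steady state if each flow is kept as one row.

    Assumes a constant flow rate and one stored row per flow; both are
    simplifications made for this estimate.
    """
    return int(flows_per_second * retention_hours * 3600)


# Hypothetical load of 500 flows per second:
one_hour = uncompressed_rows(500, 1)    # 1,800,000 rows
one_day = uncompressed_rows(500, 24)    # 43,200,000 rows
```

Under these assumptions, going from 1 hour to 24 hours of uncompressed retention multiplies the detailed data that every NetFlow page has to query by 24.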

                        • Re: Netflow can take 3 minutes to display
                          Andy McBride

                          Debbi - Greybirds,

                          Thanks for the great feedback. Can't give you dates now but look for an announcement soon.

                          Andy

                            • Re: Netflow can take 3 minutes to display
                              greybirds2

                              Look forward to hearing something soon, and hopefully the next release will be better at handling large numbers of flows, or at least provide some guidelines as to how many it can handle.  Unfortunately Tech Support is adamant that our issue is with the SQL database being on a SAN (horrors) and is not open to looking at the situation until we take our database off the SAN and get it onto something that meets SolarWinds requirements.  While I do understand that a SAN can have slower disk read/write performance than dedicated storage, we have seen that the high CPU utilization by the NetFlow process is not caused by slow SQL access.  The SQL server is not receiving many hits from Orion during the high CPU utilization and each one gets handled immediately.  We have literally hundreds of Oracle and other databases that require instantaneous access running on the same SAN environment, and if they can tolerate SAN storage, it is rather strange that SolarWinds refuses to "support" this.  If we are going to have to invest a large sum in a standalone Enterprise SQL license, server, storage, etc. for a NetFlow solution that handles the volume we want, it would probably make more sense to look at other products that can at least validate up front what they support.

                              Don't get me wrong, I love SolarWinds! It does a great job, even with the NetFlow (in moderation - guess it's like fine wine, eh?  too much is not good?)   

                                • Re: Netflow can take 3 minutes to display
                                  Debbi

                                  Yes, SW told me the same thing.  Moving it off the SAN did clear up the constant problems we were having with locked records, though we also upgraded a few times during that period which probably helped too.  However, now that we are on standalone servers that are up to minimum standards, they say we need faster drives.  It's always something...