15 Replies Latest reply on Feb 1, 2010 3:02 PM by justinh

    Eval over production

    justinh

      I'm out of maintenance for reasons I don't have the time or patience to go into.  I need to be running the latest version of NTA to figure out if my OutOfMemory exceptions will go away or not, but I can't install the eval over a production version.  Can I get this installed without actually uninstalling my current version?  This is seriously the most frustrating software I have ever had the displeasure to work with.

        • Re: Eval over production
          chris.lapoint

          justinh,

          If you're out of maintenance, you won't have access to the latest production version of NTA so it's unclear what you're trying to accomplish by installing the eval of the latest version.  Are you trying to verify an NTA fix before renewing maintenance?

          Not sure if we'll be able to help, but can you clarify exactly what you're trying to accomplish?

          Thanks,

            • Re: Eval over production
              justinh

              That's exactly what I'm trying to do.  I've already paid for a product that never worked for us, so I'm never going to be able to convince anyone to pay for maintenance unless I can prove it's going to work.

                • Re: Eval over production
                  ET

                  Hi,

                  Which version of NTA do you have, please? Have you contacted our support department? OutOfMemory exceptions are pretty rare, and as far as I know we have solved all the OutOfMemory issues reported by customers.... So if your case matches one of those in our KB .....

                  If you want to try the eval, you need to install it on a separate box and then redirect NetFlow traffic to it. (You also need a separate DB, because the eval will upgrade your previous NTA DB, and you would not be able to use it with the older version at all.)

                   

                  thanks

                  ET

                    • Re: Eval over production
                      justinh

                      Hello ET,

                      I'm using NTA 3.1 according to my NTA settings page.

                      I haven't contacted your support department simply because I assumed that the first thing they'd want me to do was upgrade to the latest version, which I couldn't do without installing the eval version.  The OutOfMemory error is what's been plaguing me since I first installed this software.  I'm monitoring maybe 450Mbps of traffic on 4 interfaces.

                      I also didn't contact support because this is a long and ongoing issue I've had, and I was hoping not to have to go over it all again.

                        • Re: Eval over production
                          chris.lapoint

                          Installing an eval over production isn't a supported use-case for a variety of reasons. 

                          Is it possible to install the eval on a separate machine and redirect your NetFlow exports to that server? 

                          If not, we can discuss other options.

                            • Re: Eval over production
                              justinh

                              Not particularly.  Part of the delay was reconfiguring our entire SQL back end based on recommendations from your staff.  Creating a new hardware setup (even virtually) would defeat the purpose of the work we've done over the last couple of months.

                                • Re: Eval over production
                                  chris.lapoint

                                  Understood. 

                                  There may be a way to install the eval over the production version.  Try the following:

                                  1. Back up your Orion database. If you don't do this, there's no going back.

                                  2. Park your current NTA license using License Manager (server must be able to reach the Internet or this won't work)

                                  3. Uninstall NTA 3.1

                                  4. Install latest NTA evaluation version

                                  Please note that I haven't tested these steps myself, so it's critical to back up your database.

                                    • Re: Eval over production
                                      justinh

                                      I don't know what to say.  You've put me in a position where, given my constraints, I can't buy your product even if I wanted to, or even try the current version.

                                      Suffice it to say, after having to justify a 7 1/2 hour downtime a week ago because of my upgrade to 9.5.1, I'm not going to get anyone to sign off on another potential catastrophe.

                                      If you want a suggestion, put "make it easier for lapsed customers to try current versions of our software without gutting their installs" somewhere on your list.

                                        • Re: Eval over production
                                          chris.lapoint

                                          I can appreciate your predicament, but unfortunately, the workarounds I've provided are likely the best we can do. Just in case, I'll reach out to dev to double-check that I haven't missed anything.

                                          It's important to realize that we generally don't have customers in this situation (no maintenance, can't install on separate machine) so there's been no driver to optimize for this use-case. 

                                          Thanks,

                                            • Re: Eval over production
                                              ET

                                              Hello justinh,

                                              I'm back in the office. So you have 3.1, hmmmm. I have to admit that in 3.1 (even with SP2) we really did have an issue with OOM. It was related to DNS resolution, and we also have a hotfix for it.

                                              Here is the description from our KB:

                                              In some customer environments, where Orion NTA is monitoring a wide range of IP addresses, the SolarWinds NetFlow Service may stop with an out of memory exception after an indeterminate period of time.

                                              You may find errors similar to the following listed in the Application view of the Windows Event Viewer:

                                                  Critical error in NetFlow listener. Exception of type 'System.OutOfMemoryException' was thrown.

                                              This error can occur when the ‘NetBIOS resolution of endpoints’ option on the NetFlow Global Settings view is disabled. If this option is disabled when the SolarWinds NetFlow Service successfully resolves a DNS address, the memory allocated for the call is not released, as required.

                                                • Re: Eval over production
                                                  justinh

                                                  So here's an interesting oddity.

                                                  First, I re-enabled that option and I haven't crashed yet, and memory seems to be getting released regularly.  I'm crossing my fingers.

                                                  Second, my problem has always been that, in perfmon, tracking v5 flows received per second and raw packet queue length, eventually the number of flows would crash to 0, and the queue length would climb and climb forever.  Now, I get a situation where the flows will crash to 0, the queue length will plateau at some random number, and if I watch it long enough, it will eventually start processing flows again.  I haven't had enough time to really look at the numbers and see if it's losing data, but at least it's better than before.

                                                  Edit:  I was wrong.  Queue length doesn't plateau at some random number; it plateaus at 25600.  It's random whether or not it'll actually reach that number.  Is this some built-in maximum or is there some other reason for this behavior?

                                                    • Re: Eval over production
                                                      chris.lapoint

                                                      justinh, in NTA 3.5 and later, we added Top Talker Optimization, which has allowed customers to run successfully with up to 60,000 flows per second at peak.

                                                      Unfortunately, given you cannot upgrade and you're not on maintenance, we've pretty much exhausted our options to assist you.    Hopefully, you've seen enough to feel more comfortable that your issues will be resolved as you upgrade to later versions.  

                                                      If you'd like to email me directly, I'm happy to send you a reference implementation (HW/DB config) for the 60,000 flows per second.

                                                      thanks,

                                                      • Re: Eval over production
                                                        ET

                                                        Exactly, 25600 is the maximum. The NTA service has smart data processing, and it uses a lot of caches and memory buffers to handle situations when SQL is overloaded or we are running out of some resource. If all caches/buffers are full (the Raw Packet Queue is the last one), it's possible that we start dropping incoming NetFlow packets, so you can see gaps in your charts.

                                                        To prevent this, we need to identify your bottleneck and do some performance tuning. (I would guess that it's your SQL and wide IP range: your DNS resolving was down for a long time, so all IPs are going to be re-resolved, and that can have an impact on SQL.)
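To make the plateau easier to picture, here is a toy model of a fixed-size queue that drops new packets once it is full. It is only a sketch: the 25600 cap is the one figure taken from this thread, while the class, function, and parameter names are hypothetical and say nothing about how the NTA service is actually implemented.

```python
from collections import deque

# Only the 25600 cap comes from the thread; every name here is hypothetical.
RAW_PACKET_QUEUE_MAX = 25600


class BoundedPacketQueue:
    """Toy fixed-capacity queue that drops new packets when it is full."""

    def __init__(self, capacity=RAW_PACKET_QUEUE_MAX):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def offer(self, packet):
        """Enqueue a packet, or count it as dropped if the queue is full."""
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(packet)
        return True

    def drain(self, n):
        """Simulate the consumer (for example, database writes) taking up to n packets."""
        taken = min(n, len(self.items))
        for _ in range(taken):
            self.items.popleft()
        return taken


def simulate(arrivals_per_tick, drained_per_tick, ticks,
             capacity=RAW_PACKET_QUEUE_MAX):
    """Return (queue length after each tick, total packets dropped)."""
    queue = BoundedPacketQueue(capacity)
    lengths = []
    for _ in range(ticks):
        for packet in range(arrivals_per_tick):
            queue.offer(packet)
        queue.drain(drained_per_tick)
        lengths.append(len(queue.items))
    return lengths, queue.dropped


if __name__ == "__main__":
    # Consumer stalled: the queue climbs, then sits at the cap while packets drop.
    print(simulate(arrivals_per_tick=10000, drained_per_tick=0, ticks=4))
```

With the consumer stalled (`drained_per_tick=0`), the length climbs and then stays at the cap while the dropped count grows, which is the pattern that shows up as a plateau in perfmon and as gaps in the charts. If the consumer keeps up with arrivals, the queue never fills and nothing is dropped.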

                                                        Thanks

                                                          • Re: Eval over production
                                                            justinh

                                                            Chris - Email sent.  Thank you.  I average ~10,000 flows per second among my borders, and tend to see peaks around ~50,000.  I can easily imagine it spiking up to around 60,000 at times.

                                                            ET - Our SQL database software (DB and OS) was just upgraded based on suggestions from SW staff.  I hope it's not that.  We are monitoring our entire IP range which comes out to the equivalent of 9 /19s not counting private address space.  I would hope that the DNS issue would trickle off at some point.
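For a rough sense of scale, the "9 /19s" figure can be turned into an address count. This is only back-of-the-envelope arithmetic: the `10.0.0.0/19` network below is a placeholder used to get the size of a /19, not one of the monitored ranges, and the result is an upper bound on distinct addresses, not a measured number of endpoints.

```python
import ipaddress

# Placeholder network: only the /19 prefix length matters here.
addresses_per_19 = ipaddress.ip_network("10.0.0.0/19").num_addresses

# Same figure computed directly: 2 ** (32 - 19) addresses per IPv4 /19.
assert addresses_per_19 == 2 ** (32 - 19)

# Nine /19-sized blocks, as described in the post above.
total_addresses = 9 * addresses_per_19

if __name__ == "__main__":
    print(addresses_per_19, total_addresses)
```

That works out to 8,192 addresses per /19 and 73,728 in total, so a full re-resolve after name resolution has been off for a long time could involve tens of thousands of lookups before it trickles off.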

                                        • Re: Eval over production
                                          chris.lapoint

                                          "That's exactly what I'm trying to do.  I've already paid for a product that never worked for us, so I'm never going to be able to convince anyone to pay for maintenance unless I can prove it's going to work."

                                          Ok, thanks for the clarification and thanks for giving us an opportunity to continue working with you.   What was the version you weren't able to make work?

                                          If it was 2.x or earlier, then I'd like to have a call with you to discuss your options.

                                          Thanks,