15 Replies Latest reply on Mar 16, 2010 8:44 AM by cgregors

    DB problem

      Hello

      We have separated the DB from the Orion server because the web interface was slow. The DB server's hardware and setup are in accordance with the Solarwinds spec.

      After doing that, performance did not improve. I have found that the disk queue on the DB server hits 100% whenever there is a request to view a page (most noticeable on Node Details pages, see below).

      Since then I have had a professional DBA look into the issue and he found no problems with the DB structure or hardware.

      Please advise on possible solutions. As much as I like the product, it is virtually unusable for me and my team.

        • Re: DB problem
          njoylif

          What are your specs? How many disks are you writing across, how many pollers do you have, and what is your DB server (MS SQL 2k/5/8)? Is your DB server 64-bit or 32-bit?

          One thing that helped me when I was running 32-bit was enabling AWE in the DB server memory properties. How much of the DB server's memory is being used by SQL? How much is free? Another trick is to hard-set the amount of memory SQL can use, ensuring that the OS has 2GB or 4GB of RAM for system and other processes.
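
          A minimal sketch of the memory cap suggested above, assuming the server is configured through `sp_configure` rather than the properties dialog. The helper name `sql_memory_plan` and the 2048 MB OS reserve are illustrative assumptions; it only builds the T-SQL text and does not connect to any server. Note that `awe enabled` applies to 32-bit SQL Server only and takes effect after a service restart.

```python
def sql_memory_plan(total_ram_mb, os_reserve_mb=2048, enable_awe=True):
    """Build T-SQL that caps SQL Server memory, leaving os_reserve_mb for the OS.

    Hypothetical helper: it returns the statements as strings so a DBA can
    review them before running anything.
    """
    if os_reserve_mb >= total_ram_mb:
        raise ValueError("OS reserve must be smaller than total RAM")

    max_server_memory_mb = total_ram_mb - os_reserve_mb
    statements = [
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
    ]
    if enable_awe:
        # Only meaningful on 32-bit SQL Server; requires a service restart.
        statements += ["EXEC sp_configure 'awe enabled', 1;", "RECONFIGURE;"]
    statements += [
        f"EXEC sp_configure 'max server memory (MB)', {max_server_memory_mb};",
        "RECONFIGURE;",
    ]
    return max_server_memory_mb, statements


# Example: a 4 GB server where 2 GB is kept back for the OS (assumed split).
cap_mb, plan = sql_memory_plan(total_ram_mb=4096, os_reserve_mb=2048)
print(cap_mb)
print("\n".join(plan))
```

          The right reserve depends on what else runs on the box, so treat the numbers as a starting point to test, not a recommendation.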

            • Re: DB problem

              Hardware : PDL380G5

              OS : Server 2003 R2 32bit

              DB : MS SQL 2005 32 bit

              OS disks : 2 SFF 10K 72G disks in Mirror for the OS

              Database Disk : 4 SFF 10K 146G disks in Raid 10

              Total Memory : 4Gb

              DB Memory Usage : 1.7Gb

              Pollers : One (663 Nodes, 5995 Interfaces, 138 Volumes)

                • Re: DB problem
                  sonic6t9

                  Well, since it's a 32-bit OS you are maxing out your memory. If the DB alone is using 1.7GB, that leaves the machine maxed out for running the rest of the OS processes.

                  • Re: DB problem
                    ecklerwr1

                    @Eitan

                    I would seriously look into the suggestions of njoylif:

                    When running 32-bit, set AWE in the DB server memory properties. (I have heard of many people seeing great improvement after making this change.)

                    Hard-set the amount of memory SQL can use, ensuring that the OS has 2GB or 4GB of RAM for system and other processes. (This is worth a try. Also check the following Microsoft support article on using RAM beyond 4GB in a 32-bit OS, and maybe add more RAM.)

                    http://support.microsoft.com/kb/283037

                    Here's a summary:

                    This article describes Physical Address Extension (PAE) and Address Windowing Extensions (AWE) and explains how they work together. This article also discusses the limitations of using memory beyond the 4-gigabyte (GB) range that is inherent to 32-bit operating systems.

                • Re: DB problem
                  jtimes

                  Although the suggestions above may be valid, you might want to evaluate the resources on your Node Detail screen. An example of "overdoing it" would be listing all Syslog messages related to the specific node; limit the number and/or the time frame of Syslog messages. Additionally, limit the initial time frame for charts. The more summarized the data, the faster the individual resource will load.

                  I have my charts on the Node Detail resource set up in this manner:
                  Time Period for Chart:  Today
                  Sample Interval:  Every Hour
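
                  To see why these settings matter, here is a rough sketch of how many data points a chart has to plot for a given time period and sample interval. The function name `chart_points` and the 30-day, 5-minute comparison are assumptions chosen for illustration, and the count is an upper bound, since "Today" may only be a partial day.

```python
from datetime import timedelta


def chart_points(period, sample_interval):
    """Upper bound on the number of points plotted for one chart series."""
    if sample_interval <= timedelta(0):
        raise ValueError("sample interval must be positive")
    # Floor division of two timedeltas yields an int.
    return period // sample_interval


# The settings above: one day of data, summarized hourly.
today_hourly = chart_points(timedelta(days=1), timedelta(hours=1))

# A heavier, hypothetical configuration for comparison.
month_5min = chart_points(timedelta(days=30), timedelta(minutes=5))

print(today_hourly, month_5min)
```

                  Each point has to be read and summarized by the DB server, so the gap between the two counts gives a feel for how much less work the summarized chart asks of it.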

                  To find out which resource is "slow" on your Node Detail screen (or any of your Orion pages), this next suggestion may also be an option for you:

                  Download HTTPWatch and run it while loading a Node Detail Screen

                    • Re: DB problem
                      ecklerwr1

                      jtimes is right here... if you can get by with a more limited set of the slow web page resources that is still good enough for your use case, this could solve your problem without having to throw hardware at it.  The HTTPWatch suggestion is a great one to narrow down what specifically is taking so long.  Then you may be able to optimize what is displayed in that resource.

                        • Re: DB problem
                          cgregors

                          I've gone down a similar path in the past. I used Fiddler instead of HTTPWatch. Below is the timeline for a node-display with 20 resource blocks on it.

                          As you can see, all the processing time is hidden inside the aspx file.

                            • Re: DB problem
                              ecklerwr1

                              So I guess unless you've got a tool like: http://www.red-gate.com/products/ants_performance_profiler/index.htm 

                              then maybe the old way of disabling/taking off one resource at a time to determine which one is taking the longest time processing might be the only method unless it's obvious while watching the page update.
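
                              The one-at-a-time approach can be sketched as a small leave-one-out loop. Everything here is hypothetical: `page_load_seconds` stands in for whatever you use to time the page (a stopwatch, Fiddler, HTTPWatch), and the resource names and costs in the demo are made up so the example can run on its own.

```python
def rank_resources_by_cost(resources, page_load_seconds):
    """Rank resources by how much load time is saved when each one is removed.

    page_load_seconds(active) is a caller-supplied (hypothetical) function that
    returns the measured load time with only the `active` resources on the view.
    """
    baseline = page_load_seconds(list(resources))
    savings = {}
    for name in resources:
        remaining = [r for r in resources if r != name]
        savings[name] = baseline - page_load_seconds(remaining)
    # Biggest saving first: that resource is the most expensive one.
    return sorted(savings.items(), key=lambda item: item[1], reverse=True)


# Demo with invented per-resource costs in seconds (not real measurements).
FAKE_COSTS = {"Syslog": 9.0, "Traps": 4.0, "CPU chart": 1.5, "Node info": 0.5}


def fake_page_load_seconds(active):
    return sum(FAKE_COSTS[name] for name in active)


ranking = rank_resources_by_cost(list(FAKE_COSTS), fake_page_load_seconds)
for name, saved in ranking:
    print(f"{name}: {saved:.1f}s saved when removed")
```

                              Real measurements are noisy, so in practice it is worth repeating each load a few times before trusting the ranking.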

                                • Re: DB problem
                                  cgregors

                                  I consider performance debugging the job of developers and Tier 2 support. In this case I would pass the problem back to Solarwinds rather than start taking the aspx pages apart. I've got enough of my own code to debug and maintain without creating more work for myself by taking vendor code apart.

                                  I do remember that I was having problems with page load times > 60 seconds when I had 13,000 elements on the single NPM engine. I then reduced the element count to ~9000 and got down to the 15 second load time I show above.

                                  I was told by Tech Support that the number of resources on the page (and how they are configured) does have an impact on page load time. I would like to try stripping down the "display node" page to one resource and then start rebuilding it one resource at a time to determine if a specific resource is the bottleneck.

                                  The problem with that is there is only ONE "display node" page and if I start hacking on it, my customers will get annoyed. If there was some way of creating a "test display node" page then I'd be all over it.

                                  Chris

                                    • Re: DB problem
                                      ecklerwr1

                                      Couldn't you just use the copy button in Manage Views to make a copy of Node Details and work on that instead of your actual Node Details view, so it doesn't affect your customers?  You're right though, Chris: debugging the .NET container shouldn't be something we have to be concerned with.

                                        • Re: DB problem
                                          cgregors

                                          Thank you! I was wondering how to do this for a long time. This will come in handy and might make the customers happy. I can design a view tailored to the node type. For simple nodes, I don't need to load unused resources.

                                           

                                          I don't know why I didn't notice this before.

                                • Re: DB problem

                                  First of all thank you all for the assistance.

                                  The problem was the Trap table in the DB, which was ~20GB because it retained 180 days of data. I have reduced it to 30 days and the performance is incomparably better than what I got before.
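
                                  As a back-of-the-envelope check on that change, the sketch below estimates the table size after shortening retention. It assumes traps arrive at a roughly constant rate, which may not hold in practice, and the function name is illustrative.

```python
def estimate_table_size_gb(current_size_gb, current_retention_days, new_retention_days):
    """Scale table size linearly with retention, assuming a constant row rate."""
    if current_retention_days <= 0:
        raise ValueError("current retention must be positive")
    if new_retention_days < 0:
        raise ValueError("new retention cannot be negative")
    return current_size_gb * new_retention_days / current_retention_days


# Figures from this thread: ~20GB at 180 days, reduced to 30 days.
estimated_gb = estimate_table_size_gb(20, 180, 30)
print(f"estimated size after cleanup: about {estimated_gb:.1f} GB")
```

                                  The actual size on disk also depends on indexes and on whether the freed space has been reclaimed, so this is only a rough estimate.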

                                  Still, I have some big tables that exceed 4GB. To improve performance even more, I am going to migrate the DB to a 64-bit platform.

                                  Eitan.