17 Replies Latest reply on Oct 8, 2009 2:57 PM by jeff.stewart

    Monitoring Orion for Performance?

    ecornwell

      Hello,

      I opened up a case because I noticed I was missing some data and the web interface felt slow.  I've seen others post on this topic but I've never really found what I needed.  As part of the case, I was told it looks like the problem might be because of the database.  I've seen the system requirements and we currently don't follow a few of them.

      1. The DB Server is a VM.
      2. The DB/VM was on local storage which was RAID 5
      2a. We moved the VM to our SAN. 

      Moving to the SAN seemed to help a bit but not as much as I was hoping.

      I'd like to be able to show proof that we are really running into a problem but I'm not sure what to monitor.  I'm no MSSQL expert and we don't really have anyone on staff that is a true expert in it either. 

      We have APM and I'd like to be able to use it and NPM to really show that there is a problem if there is one.  Can anyone give me a good place to start monitoring and what to look for?  

      We have the server monitored with NPM and we've used the default APM monitor for MSSQL 2005 server but I don't really know what I'm looking at and what values are important.

      Thanks! 

        • Re: Monitoring Orion for Performance?
          bshopp

          I would start with Disk Queue Length for DB performance monitoring, using the APM SQL Server template.

            • Re: Monitoring Orion for Performance?
              jeff.stewart

              Ecornwell, I just added this counter to watch our DBs in APM.  Let me know if you have any questions about it.

                • Re: Monitoring Orion for Performance?
                  ecornwell

                  Thank you both for the response.  Here is what I'm currently using.

                  SELECT AvgDiskQueueLength FROM Win32_PerfFormattedData_PerfDisk_PhysicalDisk WHERE Name="_Total"

                  Is this what I'm looking for, or do I need to break it out more?  Also, what is considered a "bad" value?  I'd like to set the warning and critical states.  The high for today was about 330 and it looks like it was during the DB maintenance.

                    • Re: Monitoring Orion for Performance?
                      jeff.stewart

                      Are you using the Performance Counter within APM?  This is what I'm monitoring:

                      I've always heard that you want it under 2.0.  Brandon is this correct?

                        • Re: Monitoring Orion for Performance?
                          bshopp

                          I think that is correct.  Here is a good article Dev gave me (see here), but less than 2 is a good number for disk queue length.

                           

                            • Re: Monitoring Orion for Performance?
                              ecornwell

                              Thank you for your help.  I've added the performance monitors and I'll keep track of them.

                                • Re: Monitoring Orion for Performance?
                                  ecornwell

                                  If I could get APM to be stable, I might be able to get some data.  I checked before I left and it was collecting data, but when I checked this morning it hadn't collected anything in 12 hours.  Both of the Job services had stopped running.  I restarted them and got one collection, and now it's been over 50 minutes since the last one.

                                  Yes... I opened (another) ticket.... 

                                  Both NPM and APM have seemed extremely unstable recently.  I had our DBAs put in something so it emails me when the NPM service stops putting data into the database.  We took down the servers the other day to move the database server from local storage to the SAN.  It hasn't alerted since then, but the APM service seems to be the one that is giving me fits now...

                                    • Re: Monitoring Orion for Performance?
                                      jeff.stewart

                                      Any ideas as to where to look to see what might be causing this?

                                      • Re: Monitoring Orion for Performance?
                                        warbird

                                        To get an idea of what your avg disk queue length is doing, during a busy part of the day for your network, log into your SQL server, click start, run, type perfmon, and hit enter.  One of the default counters already in place is the Avg Disk Queue Length. 

                                        Click the little light bulb button to highlight the counter you have selected, then click on the Avg. Disk Queue Length counter.  The line on the graph representing disk queue length will be highlighted.  Regardless of the highlighting, clicking on the Avg. Disk Queue Length counter will show you the relevant numbers (last, minimum, maximum, and the average over the last 1:40, i.e. one minute and 40 seconds).  Watch this for a while.  Keep a close eye on that average number.  Per the Orion admin guide, that number should remain below twice the number of spindles that the db is stored on.

                                        This is what I used to determine the disks on my old SQL server were not fast enough.  I was getting insanely high average queue lengths (400 - 800).  We were also running RAID 5, which was half the problem.  The other half was that the disks were simply not fast enough.  I took many screenshots of the perfmon window and a copy of the pertinent section of the Admin guide to my boss.

                                        If the avg disk queue length ends up being your problem, let us know.  It was with the advice of other folks around here that I solved it for my organization.  I ended up having to do a couple of different things.
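
As a rough illustration of the spindle rule mentioned above, here is a minimal Python sketch that turns "average below twice the number of spindles" into warning and critical states. The function name `queue_length_status` and the `critical_factor` multiplier are hypothetical; only the 2 x spindles limit comes from the thread, and the critical level is an assumption you would tune for your own environment.

```python
def queue_length_status(avg_queue_length, spindles, critical_factor=2.0):
    """Classify an average disk queue length reading.

    warning  = 2 x spindles (the limit quoted from the admin guide)
    critical = warning x critical_factor (an assumption, not from the guide)
    """
    if spindles < 1:
        raise ValueError("spindles must be at least 1")
    warning = 2 * spindles
    critical = warning * critical_factor
    if avg_queue_length >= critical:
        return "critical"
    if avg_queue_length >= warning:
        return "warning"
    return "ok"


# Example with a hypothetical 4-spindle array (warning at 8, critical at 16).
for reading in (0.05, 10, 330):
    print(reading, queue_length_status(reading, spindles=4))
```

With four spindles, a reading of 0.05 comes back as "ok", 10 as "warning", and the 330 seen during DB maintenance as "critical". The same two numbers could be entered by hand as the warning and critical thresholds on the APM monitor.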

                                        • Re: Monitoring Orion for Performance?
                                          warbird

                                          BTW: I just saw you said you moved your db to a SAN.  Is your SQL server connected to it via fiber?  How many spindles?  Is your SQL server 32 bit Windows?

                                          Just moving my db to a very high end SAN wasn't good enough.  I also had to implement AWE to allot SQL more memory.

                                            • Re: Monitoring Orion for Performance?
                                              patriot

                                              Sorry, but I have a dumb question.  When I look at the average value for Avg. Disk Queue Length in Perfmon, I see numbers like 0.050.  Is that really 0.05, or should I read that as 50?  When you say a desirable limit is 2, do you mean 0.002 on the counter, or actually 2?

                                                • Re: Monitoring Orion for Performance?
                                                  warbird

                                                  That is actually 0.05, and that is a very good number to see.  If it were "50", it would read something like 50.04 or whatever.  When I first inherited our Orion implementation, I was seeing numbers between 400 and 800.  No joke.  Right now, I am looking at a single spike within the measurement window such that the maximum reads 50.964, but the average (which is the important number) is still very low.

                                                  The admin guide indicates the average number should remain below 2 times the number of actual hard drives in the system that your db resides on.
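
To show why a single spike barely moves the average, here is a small Python sketch. The sample values are made up for illustration (99 quiet readings of 0.05 plus one spike of 50.964), and the helper name `summarize` is hypothetical; it is not something Perfmon or Orion provides.

```python
from statistics import mean


def summarize(samples):
    """Return (average, maximum) for a list of queue-length samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return mean(samples), max(samples)


# Hypothetical window: 99 quiet samples and one spike.
samples = [0.05] * 99 + [50.964]
average, maximum = summarize(samples)
print(f"average={average:.3f} max={maximum:.3f}")
```

Here the maximum is 50.964 but the average works out to roughly 0.559, which is still under the limit of 2 for even a single disk. That is the sense in which the average, not the maximum, is the number to compare against the admin guide's limit.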