1 Reply Latest reply on Oct 24, 2013 8:37 AM by bhopkins

    Virtual Machine Latency Top-N:VM Latency Hour


      Hi Everyone,


      I am new to the forum and to the Virtualization Manager product, so I was wondering if you could offer some advice. One of the things I am using the product for is to understand how our storage is performing (IOPS/latency) and, in turn, whether this is affecting individual virtual machine performance (IOPS/latency).


      Reviewing the Top-N:Datastore I/O Latency Hour widget, none of our datastores show total, read, or write latency above 20ms; the highest value on any datastore is 8.5ms. One of the SolarWinds two-minute tutorial videos stated that, at the datastore level, anything below 20ms is generally OK and anything above needs to be addressed, so on that basis we are in good shape.


      However, looking at the Top-N:VM Latency Hour widget, a number of our virtual machines are reporting Total and Write Latency much higher than 20ms. My questions are:


      1) Does the 20ms rule generally still apply to virtual machine latency (as it does to datastore latency)? If not, what value should we generally benchmark against?

      2) If it does, does anyone have any idea why we would see high latency at the virtual machine level, in some instances, but not at the datastore level? Could there be a bottleneck or contention elsewhere that is causing this, and if so, could you point me toward how to confirm it using the product?


      Thanks a lot

        • Re: Virtual Machine Latency Top-N:VM Latency Hour

          As an ex-NetApp employee, I will try to explain this a bit for you:


          You are right that datastore latency needs to be below 20 ms. There are different kinds of latency, however. One piece is the latency from your storage controller/datastore to VMware, for example if you are using NFS (like us). Once the datastore is in VMware, there is also VM disk latency.


          If you are seeing VM disk latency, it is typically caused by one of the following:

          • Resource contention on the disks: you are doing more IOPS than the disks can handle, which adds latency to reads and writes.
          • A VM running out of control: sometimes one VM does more I/O than it should, which causes latency for the other VMs trying to access the same disks. (I had this issue with our vCenter VM a while back.)
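
          To make the second case concrete, here is a rough Python sketch of how you could flag a runaway VM from per-VM IOPS figures for one datastore. The VM names, the sample numbers, and the 50% share threshold are all made up for illustration; you would paste in values read from your own monitoring and pick a threshold that suits your environment.

```python
def find_bully_vms(vm_iops, share_threshold=0.5):
    """Return (vm_name, share) pairs for VMs whose share of the datastore's
    total IOPS is above share_threshold, largest share first.

    vm_iops: dict mapping VM name -> average IOPS on one datastore.
    share_threshold: hypothetical cut-off, not a vendor recommendation.
    """
    total = sum(vm_iops.values())
    if total == 0:
        return []  # idle or empty datastore, nothing to flag
    shares = ((name, iops / total) for name, iops in vm_iops.items())
    flagged = [(name, share) for name, share in shares if share > share_threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical sample: average IOPS per VM on a single datastore.
sample = {"vcenter": 4200, "web01": 300, "sql01": 900, "file01": 600}

for name, share in find_bully_vms(sample):
    print(f"{name} is doing {share:.0%} of the I/O on this datastore")
```

          A single VM taking most of the I/O is not automatically a problem, but it is the first place I would look when the other VMs on that datastore show high latency.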


          So I would look at the I/O per datastore, compare it to the number of disks in your aggregate, and make sure you are not doing more than your disks can handle. (If you are a NetApp shop, I can help with this.) I would also check the I/O of the individual VMs and make sure you don't have a bully VM on a datastore causing issues.
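
          As a back-of-the-envelope version of that comparison, the sketch below adds up the IOPS of the datastores on one aggregate and divides by a rough budget for the disks behind it. The per-disk IOPS figures are common rules of thumb that I am assuming here, not vendor numbers, and the datastore names and values are invented. It also ignores RAID write penalties and controller cache, so treat the result as a rough indicator only.

```python
# Assumed rule-of-thumb IOPS per spinning disk, keyed by rotational speed.
# These are illustrative assumptions; check your own hardware's figures.
ASSUMED_IOPS_PER_DISK = {"7.2k": 75, "10k": 125, "15k": 175}


def aggregate_iops_budget(data_disks, disk_type):
    """Rough IOPS the aggregate can sustain: data disks x assumed per-disk IOPS."""
    if data_disks <= 0:
        raise ValueError("data_disks must be positive")
    return data_disks * ASSUMED_IOPS_PER_DISK[disk_type]


def utilisation(observed_iops, data_disks, disk_type):
    """Observed IOPS as a fraction of the rough budget (1.0 = at the limit)."""
    return observed_iops / aggregate_iops_budget(data_disks, disk_type)


# Hypothetical datastores that all live on the same aggregate.
datastore_iops = {"ds_prod": 1800, "ds_test": 450}
total_iops = sum(datastore_iops.values())

# Hypothetical aggregate: 14 data disks (parity disks excluded), 15k RPM.
used = utilisation(total_iops, data_disks=14, disk_type="15k")
print(f"Aggregate is at roughly {used:.0%} of its estimated IOPS budget")
```

          If that figure sits near or above 100% during the hours when the VM latency spikes, disk contention is the likely cause.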


          If you are a NetApp shop, other things that can cause issues include long-running dedupe jobs during the day, misaligned VMs, hot disks, etc.


          Hope this helps, and let me know if you need more explanation.