I wrote last week about how performance troubleshooting is a problem with multiple dimensions. From the client, across the network, and into the application server, there are many places where application performance can be impacted. One of the key parts of performance troubleshooting is sorting through all those dimensions, then finding the one that is limiting application performance. Unfortunately, there are many more dimensions when your applications are inside virtual machines. Understanding and considering these dimensions is critical to performance troubleshooting in a virtual environment.
The first place we add more dimensions is between the VM and the hypervisor. VM sizing, as well as virtual HBA and NIC selection, all play a part in application performance. Then there is the hypervisor and its configuration. A lot of the same issues that affect operating systems also affect hypervisors. Are updates applied? Is the storage configuration correct? How about NIC teaming? Even simple things like optimizations recommended by the storage vendor can make a big difference to the performance of applications in the VMs.
The next dimension is that multiple VMs share a single physical server. So your application’s performance may be dependent on the behavior of other VMs. If another VM, or group of VMs, uses most of the physical CPU, then your application may run slowly. If your VM resides on the same storage as a bunch of VDI desktops, then storage performance will fluctuate. We see this noisy neighbor problem most often in cloud environments. But noisy neighbors absolutely can happen in on-premises virtualization. A particular challenge is the invisibility of hypervisor issues to the VM and its applications. The operating system inside a VM will report the clock speed of the underlying physical CPU, even if it is only getting a small fraction of the CPU cycles. A VM that uses 100% of its CPU is not necessarily getting 100% of a CPU; it is just using 100% of what it gets from the hypervisor.
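One partial window into this from inside a Linux guest is the "steal" counter in /proc/stat, which counts time the virtual CPU was ready to run but the hypervisor was running something else. The sketch below is only an illustration: the function names and sample numbers are made up, and it assumes a hypervisor that exposes steal time to the guest. Some do not, in which case the value simply reads as zero and you need hypervisor-side metrics instead.

```python
# Field order of the aggregate "cpu" line in /proc/stat (first eight counters).
# The trailing guest counters are already included in user/nice, so they are skipped.
CPU_FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Turn a /proc/stat 'cpu ...' line into a dict of tick counters."""
    parts = line.split()
    return dict(zip(CPU_FIELDS, (int(v) for v in parts[1:9])))


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate cpu line from a live Linux system (not called below)."""
    with open(path) as f:
        return f.readline()


def steal_percent(before, after):
    """Share of elapsed CPU ticks between two snapshots that the hypervisor withheld."""
    deltas = {name: after[name] - before[name] for name in CPU_FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return 0.0
    return 100.0 * deltas["steal"] / total


# Two hypothetical snapshots taken a few seconds apart.
sample_before = parse_cpu_line("cpu  1000 0 500 8000 100 0 50 350 0 0")
sample_after = parse_cpu_line("cpu  1400 0 700 8300 120 0 80 700 0 0")

print(f"steal: {steal_percent(sample_before, sample_after):.1f}%")
```

On a real guest you would call read_cpu_line() twice with a short sleep in between. A steal figure that stays high while the guest looks busy is a hint that a neighbor, not your application, is the constraint.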
There is a time dimension here, too, one related to the portability and mobility of VMs. Your application server may move from one virtualization host to another over time. So can other VMs. The noisy neighbor VM that caused a performance issue half an hour ago may be nowhere near your VM right now. On most hypervisors, VMs can also move from one piece of storage to another. That performance issue yesterday might have triggered an automated movement of your VM to a better storage location, and the migration made the performance problem disappear. This VM mobility can make performance issues intermittent: they only manifest when a specific other VM is on the same physical host.
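One way to chase an intermittent issue like this is to record which physical host the VM was on alongside each performance measurement, then compare by host. The sketch below assumes you already have such records; the host names, latency values, and function name are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_latency_by_host(samples):
    """Group (host, latency_ms) samples by host and return the median latency for each."""
    by_host = defaultdict(list)
    for host, latency_ms in samples:
        by_host[host].append(latency_ms)
    return {host: median(values) for host, values in by_host.items()}


# Hypothetical measurements: the host the VM was on, and the response time observed.
samples = [
    ("host-a", 12.0),
    ("host-a", 14.0),
    ("host-b", 95.0),
    ("host-b", 110.0),
    ("host-a", 13.0),
    ("host-b", 102.0),
]

result = median_latency_by_host(samples)
for host, latency in sorted(result.items()):
    print(f"{host}: median {latency} ms")
```

If the slow samples cluster on one host, the next question is what else was running there at the time. The same grouping works with a co-resident VM or a datastore as the key instead of the host.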
Another virtualization dimension is when the user’s PC is really a VM. VDI adds more dimensions to performance. There is a whole network between the user and their PC, as well as another device right in front of the user that may bring its own performance issues. Users seldom have the right words to differentiate a slow VDI session from a slow application. All of the noisy neighbor issues we see with server VMs are multiplied with desktops.
Virtualization has enabled many awesome changes in the data center. But virtualization has added serious complexity to performance troubleshooting. There are so many dimensions to understand before you can find and resolve the one that is restricting performance.