Performance in the virtual world is complex. But it's one of the most important topics as well! Every admin I meet puts performance in first (or second) place on his list of priorities. The other priority is disaster recovery. Depending on the environment, admins usually prefer to deliver the best performance, but they care about fast backups (and recovery) too. No one likes users complaining about performance, right?
Virtualization brought another layer of complexity: performance bottlenecks. In fully virtualized or mixed environments, where some shops add further complexity by running two different hypervisors, problems are rarely simple and can be hard to solve.
Usually there isn't just a single bottleneck, as bottlenecks often come together with misconfigurations on the VM side. Some typical misconfigurations repeat again and again. Besides misconfiguration on the VM side, there is also wasted capacity on the infrastructure side, which can lead to loss of performance. Here are a few areas where to look for improvements:
- VMs with multiple vCPUs when only a single vCPU is needed - start with fewer and add more later.
- VMs with network adapter types that are not suited to the guest OS - check the requirements and follow the documentation (RTFM).
- VMs with too much memory allocated - does that VM really need 16 GB of RAM?
- Storage bottlenecks - we all know that storage is the slowest part of the datacenter (except flash). That is changing with server-side flash and various acceleration solutions, and also with hyper-converged solutions like VMware VSAN. But the problem might also be on the storage network or the HBA.
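The first and third points in the list above (too many vCPUs, too much memory) are easy to check with a small script once you have usage numbers exported from your monitoring tool. Here is a minimal sketch in Python. The `VmStats` record, the `rightsizing_hints` function, the sample inventory and both thresholds are assumptions made up for illustration, not output or an API of any particular product, so tune them to your own environment.

```python
from dataclasses import dataclass


@dataclass
class VmStats:
    """Hypothetical per-VM record, e.g. exported from a monitoring tool."""
    name: str
    vcpus: int
    mem_gb: float
    peak_cpu_pct: float      # peak CPU usage over the observed period, 0-100
    peak_mem_used_gb: float  # peak guest memory actually in use


def rightsizing_hints(vms, cpu_pct_threshold=25.0, mem_ratio_threshold=0.5):
    """Return (vm_name, resource) pairs that deserve a closer look.

    Both thresholds are illustrative assumptions, not vendor guidance.
    """
    hints = []
    for vm in vms:
        # Several vCPUs but low peak CPU usage: candidate for fewer vCPUs.
        if vm.vcpus > 1 and vm.peak_cpu_pct < cpu_pct_threshold:
            hints.append((vm.name, "vcpu"))
        # Peak memory use well below the allocation: candidate for less RAM.
        if vm.peak_mem_used_gb < vm.mem_gb * mem_ratio_threshold:
            hints.append((vm.name, "memory"))
    return hints


# Made-up sample data for illustration only.
inventory = [
    VmStats("web01", vcpus=4, mem_gb=16, peak_cpu_pct=10.0, peak_mem_used_gb=3.0),
    VmStats("db01", vcpus=2, mem_gb=8, peak_cpu_pct=80.0, peak_mem_used_gb=7.0),
    VmStats("util01", vcpus=1, mem_gb=2, peak_cpu_pct=5.0, peak_mem_used_gb=1.5),
]

for name, resource in rightsizing_hints(inventory):
    print(f"{name}: review {resource} allocation")
```

The script only produces hints. A VM that is idle most of the month may still need its resources for a weekly batch job, so look at a long enough observation period before shrinking anything.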
Virtual environments also struggle with the fact that in smaller shops there often aren't many rules about who does what. This is the case when several admins, plus a developer team, take the virtual infrastructure for their playground. Everyone creates VMs which then sit there doing nothing, VM snapshots lie around consuming valuable SAN disk space, and suddenly there is not enough space on the datastores, backups stop working, or VMs have performance problems. There is a name for that - VM sprawl!
Here too, some order first, please. Define rules about who does what, which VMs should exist, and in which situations it is important to keep a snapshot (or not). Think of archiving VMs rather than creating snapshots. There are some good backup tools for that.
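A snapshot rule is only useful if someone checks it. As a rough sketch, assuming you can export a list of snapshots with their creation time and size, a few lines of Python are enough to list the ones that break the rule. The record layout, the `stale_snapshots` helper, the sample data and the three-day limit are all hypothetical choices for this example.

```python
from datetime import datetime, timedelta


def stale_snapshots(snapshots, now, max_age_days=3):
    """Return snapshots older than max_age_days (an assumed policy limit)."""
    cutoff = now - timedelta(days=max_age_days)
    return [snap for snap in snapshots if snap["created"] < cutoff]


# Made-up sample data; in practice this would come from an inventory export.
snapshots = [
    {"vm": "web01", "name": "before-patch",
     "created": datetime(2024, 1, 1, 8, 0), "size_gb": 40.0},
    {"vm": "db01", "name": "pre-upgrade",
     "created": datetime(2024, 1, 9, 12, 0), "size_gb": 12.5},
]

# "now" is passed in explicitly so the check is repeatable.
now = datetime(2024, 1, 10, 0, 0)
stale = stale_snapshots(snapshots, now)
reclaimable_gb = sum(snap["size_gb"] for snap in stale)

for snap in stale:
    print(f"{snap['vm']}: snapshot '{snap['name']}' is past the age limit")
print(f"Space held by stale snapshots: {reclaimable_gb} GB")
```

Passing `now` as a parameter instead of calling `datetime.now()` inside the function keeps the check easy to test and to rerun against an old export.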
Modern virtualization management tools can help when performance suddenly drops. Some of the tools can detect configuration changes in the virtual environment, which means that you can see what changed on the day the performance started to degrade. It does not always help, as other factors fall outside their scope: the tools might monitor changes only in the virtual environment itself and not in the outside physical world. But it can narrow down the problem.
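The idea behind such change tracking is simple: take the time when performance started to drop and look at which configuration changes happened shortly before. The Python sketch below illustrates that idea only. The change-log format, the `changes_before` function and the sample entries are assumptions for this example and do not reflect how any specific management tool stores or exposes its data.

```python
from datetime import datetime, timedelta


def changes_before(changes, incident_time, window_hours=24):
    """Return changes logged within window_hours before the incident, newest first."""
    start = incident_time - timedelta(hours=window_hours)
    hits = [c for c in changes if start <= c["time"] <= incident_time]
    return sorted(hits, key=lambda c: c["time"], reverse=True)


# Made-up change log for illustration only.
changes = [
    {"time": datetime(2024, 3, 4, 9, 0), "object": "web01",
     "change": "vCPU count changed from 2 to 8"},
    {"time": datetime(2024, 3, 5, 8, 30), "object": "db01",
     "change": "snapshot created"},
    {"time": datetime(2024, 3, 1, 14, 0), "object": "esx03",
     "change": "host added to cluster"},
]

incident = datetime(2024, 3, 5, 10, 0)  # when users started to complain

for entry in changes_before(changes, incident, window_hours=48):
    print(f"{entry['time']:%Y-%m-%d %H:%M}  {entry['object']}: {entry['change']}")
```

A wider window returns more candidates, a narrower one fewer. Either way, the result is a list of suspects, not a verdict, since a change in the physical world outside the log can still be the real cause.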