Say you get handed a lab full of virtual machines and hosts.  You are now responsible for everything to do with that lab. People will vilify you when they have problems and ignore you when they don't.


What's your first move?  Are you going to ignore it?  Hope no one notices who is now responsible? Dogmatically reassure yourself that it's fine - everything is just fine?


Probably not, or at least not for long.


One of the first things I'd do is figure out how well the lab is currently performing; what changes, if any, need to happen so the lab can keep performing at its current capacity and load; and when that capacity will run out.


What is Capacity Planning?


Capacity planning in virtual infrastructure terms basically means determining the amount of computing power that is available, the amount of computing power that is used, and how that load is distributed.  If you have the historical records, you then plot how consumption of computing power has increased or decreased over time to predict when you will need to expand your resources.
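To make the "plot it and predict" part concrete, here is a minimal sketch that fits a straight line to daily usage samples and estimates how many days are left before usage hits capacity. The function name, the one-sample-per-day spacing, and the assumption that growth is linear are all mine for illustration; real consumption is rarely this tidy.

```python
from typing import Optional, Sequence


def days_until_exhaustion(usage: Sequence[float], capacity: float) -> Optional[float]:
    """Estimate days left before usage reaches capacity.

    Assumes one usage sample per day, oldest first, and linear growth.
    Returns None when usage is flat or shrinking (no exhaustion predicted).
    """
    n = len(usage)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    # Ordinary least-squares fit of usage against day index 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(usage))
    slope = sxy / sxx
    if slope <= 0:
        return None

    intercept = mean_y - slope * mean_x
    day_full = (capacity - intercept) / slope  # day index where the line hits capacity
    return max(0.0, day_full - (n - 1))        # measured from the latest sample


# Hypothetical memory usage in GB over five days, with 200 GB installed.
print(days_until_exhaustion([100, 110, 120, 130, 140], capacity=200))
```

A straight line is the simplest possible model. If your history shows seasonal peaks or step changes, treat the answer as a rough lower bound rather than a date to put in the budget.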


I personally think that a hidden side of capacity planning involves talking to users. The software/machine-driven side of capacity planning doesn't necessarily take into account the human experience. For example, your threshold for an overloaded host might be set higher than the threshold of the people using the VMs on that host. If you talk to them, you can readjust your thresholds before they complain, or you can both adjust your expectations.


So how do you figure out the amount of computing power that is available?


Manual Capacity Planning


If you don't have a tool, this is a painful, thankless process.


First you need to calculate the current capacity of your virtual infrastructure.  This doesn't sound terribly hard, right? You need to record your CPU speeds, number of CPUs, total memory, etc. per host or cluster. Then you have to do the same per VM.
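If you do end up tallying this by hand, a few lines of script beat a spreadsheet. The sketch below sums CPU and memory across the hosts in a cluster. The Host fields and the example host names are hypothetical; you would fill them in from whatever inventory you recorded.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Host:
    name: str
    cpu_count: int      # number of physical cores
    cpu_ghz: float      # clock speed per core
    memory_gb: float    # installed memory


def cluster_capacity(hosts: Iterable[Host]) -> Dict[str, float]:
    """Total raw CPU (GHz) and memory (GB) across all hosts in a cluster."""
    hosts = list(hosts)
    return {
        "cpu_ghz": sum(h.cpu_count * h.cpu_ghz for h in hosts),
        "memory_gb": sum(h.memory_gb for h in hosts),
    }


# Hypothetical two-host lab cluster.
lab = [Host("esx1", 16, 2.5, 256), Host("esx2", 8, 3.0, 128)]
print(cluster_capacity(lab))
```

The same shape works for VMs: swap in the allocated vCPUs and memory per VM and you can compare what you have handed out against what the hosts physically have.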


Then you calculate your used capacity by first determining the peak hours per host. Next you have to record the utilization data per VM, host, and cluster during peak usage. You should also probably discount the observer effect (the load added by the measuring itself), or treat it as minor overhead.
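Finding the peak hour is mostly bookkeeping. Here is one way it might look, assuming you have already collected (hour of day, utilization percent) samples for a host. The sample format and function name are my own invention, not something a particular tool exports.

```python
from collections import defaultdict
from statistics import mean
from typing import Iterable, Tuple


def peak_hour(samples: Iterable[Tuple[int, float]]) -> Tuple[int, float]:
    """Return (hour, average utilization) for the busiest hour of the day.

    samples: (hour_of_day, utilization_pct) pairs, possibly spanning many days.
    """
    by_hour = defaultdict(list)
    for hour, pct in samples:
        by_hour[hour].append(pct)
    if not by_hour:
        raise ValueError("no samples recorded")

    # Average each hour across all days, then pick the highest average.
    averages = {hour: mean(values) for hour, values in by_hour.items()}
    busiest = max(averages, key=averages.get)
    return busiest, averages[busiest]


# Hypothetical CPU samples for one host.
cpu_samples = [(9, 40), (9, 60), (14, 80), (14, 90), (22, 10)]
print(peak_hour(cpu_samples))
```

Averaging per hour smooths out one-off spikes, which is usually what you want for planning. If you care about the worst case instead, swap mean for max.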


By the time your peak usage time is over, you might have recorded half the data you need, if you have a small-ish lab.  Plus, you need to do this frequently so you can accrue metrics towards your future needs and discount statistical outliers.


After you gather that data, then you can determine how well the computing load is distributed and make changes as necessary.
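One simple way to judge distribution is to compare each host's peak utilization against the cluster average and flag the ones that stray too far. This is only a sketch: the 15-point tolerance is an arbitrary number I picked for the example, and the host names are made up.

```python
from typing import Dict


def imbalanced_hosts(peak_pct: Dict[str, float], tolerance: float = 15.0) -> Dict[str, float]:
    """Return hosts whose peak utilization differs from the cluster mean
    by more than `tolerance` percentage points, with the signed difference."""
    if not peak_pct:
        return {}
    average = sum(peak_pct.values()) / len(peak_pct)
    return {
        host: round(pct - average, 1)
        for host, pct in peak_pct.items()
        if abs(pct - average) > tolerance
    }


# Hypothetical peak CPU utilization per host.
print(imbalanced_hosts({"esx1": 90, "esx2": 50, "esx3": 40}))
```

A positive number means the host is carrying more than its share and is a candidate for moving VMs off; a negative number means it has room to take some on.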


What this boils down to is a lot of work, educated guesses, and spreadsheets that are rarely updated.


What Do Tools Do for Me?


If you are the VMware capacity planner, you already understand how much time and effort manual capacity planning takes.  Thankfully, software companies such as SolarWinds have made this process easier with capacity planning tools. While the tools don't take all of the effort out of capacity planning, they do most of the heavy lifting. The tools are able to profile your VMs and hosts and associate the VMs with the correct host.  They can record usage history, CPU utilization, memory utilization, configuration details, and other juicy data that is hard to get at and tally. Tools can also keep track of historical data related to potential capacity woes, such as latency, I/O, and storage usage.


In addition to the automated, heavy data collection, capacity planning tools indicate your current capacity and use historical data to predict when you will run out of capacity with your current infrastructure. They should also revise that prediction when you give them new information, like extra memory or a new host.


For more information, see this SolarWinds video on capacity planning with our Virtualization Manager for VMware monitoring, and become an expert on VMware capacity management.