Converting Physical Database Servers to Virtual Machines

I’m in the middle of a tough project right now: a large client is trying to convert a large number of physical SQL Servers to virtual machines. They’ve done most of the right things: the underlying infrastructure is really strong, the storage is more than adequate, and they aren’t overprovisioning the virtual environment.

The challenge is how to convert from physical to virtual. The classical approach is to build new database VMs, restore backups from the physical servers to the VMs, and ship transaction logs until cutover time. However, in this case there are some application-level challenges preventing that approach (mainly heavily customized application-tier software). Even so, my preferred method here is to virtualize the system drives and then restore the databases using database restore operations.

This ensures the consistency of the databases and rules out corruption introduced by the conversion. Traditional P2V products have trouble handling the rate of change in databases. Many people think read-only database workloads don’t generate many writes, but remember that you are still writing to a cache and frequently using temp space. What challenges have you seen in converting from physical to virtual?
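
For readers who have not run the classical restore-and-ship-logs approach described above, here is a minimal sketch of the statement sequence it implies on the new VM. It only builds the T-SQL text and does not connect to a server. The database name, the backup paths, and the helper name `build_restore_plan` are all hypothetical; a real migration would also need file relocation options and error handling that this sketch leaves out.

```python
def build_restore_plan(database, full_backup, log_backups):
    """Return the T-SQL statements for a restore followed by shipped logs.

    The full backup and every log backup are restored WITH NORECOVERY so the
    database stays in a restoring state and can accept further logs. Only the
    final statement, issued at cutover time, brings the database online.
    """
    if not database or not full_backup:
        raise ValueError("database name and full backup path are required")

    statements = [
        f"RESTORE DATABASE [{database}] FROM DISK = N'{full_backup}' WITH NORECOVERY;"
    ]
    # Log backups must be applied in the order they were taken.
    for log_backup in log_backups:
        statements.append(
            f"RESTORE LOG [{database}] FROM DISK = N'{log_backup}' WITH NORECOVERY;"
        )
    # Cutover: recover the database so it becomes available on the VM.
    statements.append(f"RESTORE DATABASE [{database}] WITH RECOVERY;")
    return statements


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    plan = build_restore_plan(
        "Sales",
        r"\\backupshare\Sales_full.bak",
        [r"\\backupshare\Sales_log_01.trn", r"\\backupshare\Sales_log_02.trn"],
    )
    for statement in plan:
        print(statement)
```

Keeping every step before cutover in NORECOVERY is what lets the physical server stay in production while the VM catches up.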

Comment
  • We encountered significant problems when attempting to P2V database servers.  We use VMware and have a host of MS SQL and Oracle databases.  The major issues we encountered with P2V were the problems with booting Windows after the P2V (usually driver freak-out problems) and LUN mappings.  After about 5 failed attempts with some of our bigger databases, we switched to a different method.

    Our usual process was to stand up a new virtual machine and backup/copy/restore the data. While a bit slower because of the additional backup and restore time (copy time was pretty close to the P2V time, and the new VM was built while still using the physical hardware), we had the opportunity to upgrade the operating system and/or the database software during this process. We actually performed a cutover from the old server to the new server; that is, we swapped the names and IP addresses of the old and new servers. Thus, we did not have to modify the applications that were using the database.

    An additional catch-point we encountered was that the VMs needed to be configured with a second NIC. Sure, the VM infrastructure provides network redundancy, but having only one NIC on a VM still leaves it open to bandwidth saturation. We ran into this quickly once the backups began to run. To solve this, we added a second NIC to each VM database server, put the NIC on its own non-routable network for backup traffic, and adjusted the routing tables in the OS so only the backup server could be accessed through that second NIC. After that, we no longer had bandwidth saturation on the main interface when backups ran.
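
The routing change in the comment above can be illustrated with a small model of how an OS picks an interface: the most specific matching route wins. This is a sketch, not the commenter's actual configuration. The interface names, addresses, and the function `pick_interface` are hypothetical, and the single host route on the second NIC is an assumption about how "only the backup server" was enforced.

```python
import ipaddress


def pick_interface(routes, destination):
    """Return the interface chosen by longest-prefix match, or None."""
    dest = ipaddress.ip_address(destination)
    best_network = None
    best_interface = None
    for network_text, interface in routes:
        network = ipaddress.ip_network(network_text)
        if dest not in network:
            continue
        # A longer prefix is a more specific route, so it takes priority.
        if best_network is None or network.prefixlen > best_network.prefixlen:
            best_network = network
            best_interface = interface
    return best_interface


# Hypothetical routing table for one database VM.
ROUTES = [
    ("0.0.0.0/0", "nic0"),         # default route on the main interface
    ("10.0.0.0/24", "nic0"),       # application network
    ("192.168.50.10/32", "nic1"),  # host route: backup server only
]

if __name__ == "__main__":
    for target in ("192.168.50.10", "192.168.50.11", "10.0.0.5"):
        print(target, "->", pick_interface(ROUTES, target))
```

With a /32 host route, only traffic to the backup server leaves through the second NIC, and everything else keeps using the main interface.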
