Every week, it seems, system administrators ask me how to roll back, or uninstall, patches that have gone wrong.

 

I tell them that if they’re even thinking about rollback before installing a patch, they’re asking the wrong question. Patch rollback is so difficult, risky and time-consuming that you should avoid it at almost any cost. Instead, put in your time and effort up front to understand which patches are safe enough, and important enough, to deploy before pulling the trigger. In the long run, this means less work, less trouble and less risk.

 

Don’t believe me? Keep reading.

 

Rollback: Mission Impossible

 

For one thing, Microsoft doesn’t even provide a way to automate the mass rollback of operating system patches. That means writing scripts, which is time and effort you can’t afford. Even worse, it might mean tracking down which devices you patched and rolling them back manually. Then there’s the challenge of restoring any data lost to the resulting system crashes, and of making sure you’ve found a “known good state” to restore the system to. Let’s not forget the time, and personal pain, involved in explaining to your users why this happened and how you’ll prevent it in the future.
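
To make that scripting burden concrete, here is a rough sketch, in Python, of the kind of per-machine rollback logic you would end up writing and maintaining yourself. The KB number is a made-up placeholder; wmic and wusa.exe are standard Windows utilities, but treat the exact switches and output parsing as assumptions to verify in your own environment. Multiply this by inventory tracking, remote execution, reboots and error handling, and you can see why I call it time you can’t afford.

import subprocess

KB_TO_REMOVE = "5031356"  # placeholder KB number for the problem patch

def patch_is_reported_installed(kb: str) -> bool:
    # Check the local hotfix (QFE) list; note that a partially installed patch
    # may not show up here even if it has already broken something.
    result = subprocess.run(["wmic", "qfe", "get", "HotFixID"],
                            capture_output=True, text=True, check=False)
    return f"KB{kb}" in result.stdout

def uninstall_patch(kb: str) -> int:
    # Ask wusa.exe to remove the update quietly and defer the reboot.
    result = subprocess.run(["wusa.exe", "/uninstall", f"/kb:{kb}",
                             "/quiet", "/norestart"], check=False)
    return result.returncode

if __name__ == "__main__":
    if patch_is_reported_installed(KB_TO_REMOVE):
        code = uninstall_patch(KB_TO_REMOVE)
        print(f"wusa exit code: {code}")  # reboots, logging and retries still unsolved
    else:
        print("Patch not reported as installed -- which may or may not be true.")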

 

Don’t get me wrong: Users have every right to be concerned about the quality of patches, which unfortunately is falling (at least in the Microsoft world). But all the complications that come with packaging and deploying patches, such as discovering affected systems and tailoring the package for each variety of system configuration, also hold true for rolling back those patches. In fact, discovering and “reverse-engineering” the patch process is even more difficult, since the failed patch may itself give inaccurate information about which systems have and haven’t been patched.

 

One scenario is that the patch installed successfully and simply broke something. That would seem to be a simple fix: uninstall the patch, assuming the uninstall actually restores all of the previous configuration. A more onerous scenario is that the patch installation was only partially successful. In that case, surgically extracting the patch becomes even more tedious, and other things may get further damaged in the process. Depending on how far the installation got, the Windows Update Agent (WUAgent) may report the patch as “Not Installed,” even though enough of it was installed to break something else.
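
If you want to see how shaky that reporting can be, the sketch below cross-checks the hotfix list against the Windows Update history on a single machine. It assumes pywin32 is available and uses the Microsoft.Update.Session COM interface; the KB number is a placeholder, and the property names and result codes should be verified against the Windows Update Agent documentation before you rely on them.

import subprocess
import win32com.client  # pywin32

KB = "5031356"  # placeholder for the patch under suspicion

# 1. What the installed-hotfix (QFE) list claims.
qfe = subprocess.run(["wmic", "qfe", "get", "HotFixID"],
                     capture_output=True, text=True, check=False)
print(f"QFE list shows KB{KB} installed: {f'KB{KB}' in qfe.stdout}")

# 2. What the Windows Update history claims about the same patch.
session = win32com.client.Dispatch("Microsoft.Update.Session")
searcher = session.CreateUpdateSearcher()
count = searcher.GetTotalHistoryCount()
if count:
    history = searcher.QueryHistory(0, count)
    for i in range(history.Count):
        entry = history.Item(i)
        if f"KB{KB}" in entry.Title:
            # ResultCode 2 means succeeded, 4 means failed; the two views can disagree.
            print(f"Update history: {entry.Title!r} -> ResultCode {entry.ResultCode}")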

 

The answer to the patch quality issue is not to try to build an escape hatch for yourself. It is to properly test patches before deployment, to leverage patch tests done by others (shared in forums such as PatchManagement.org), or simply to skip patches for less critical systems or less common threats.

 

Test and Prioritize

 

I can hear you now: Who has time to test the dozens of patches I get every month? You can get some level of assurance from commercial packages such as Patch Manager, which test patches to ensure they correctly detect the systems on which they should be installed and that the installer launches properly where it should.

 

You can also use virtualization snapshots to test patches. Take a snapshot of each virtual machine before applying a patch or group of patches. If any issues appear after patching, simply revert to the snapshot to restore the machine to its pre-patch state, then decide whether the patch is important enough to be worth finding another way to install it. This can be an effective way to identify a single problematic patch out of a larger group with no risk to production systems.
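
Here is a minimal sketch of that snapshot-and-revert loop, assuming a Hyper-V test host driven from PowerShell. The VM name, checkpoint name and the smoke test are placeholders, and the Checkpoint-VM and Restore-VMSnapshot cmdlets should be confirmed against your own hypervisor and module versions; the same pattern applies to VMware or any other platform with snapshot support.

import subprocess

VM_NAME = "PatchTest-Win2022"   # placeholder test VM
CHECKPOINT = "pre-patch"

def ps(command: str) -> None:
    # Run a single PowerShell command on the Hyper-V host.
    subprocess.run(["powershell", "-NoProfile", "-Command", command], check=True)

def take_checkpoint() -> None:
    ps(f"Checkpoint-VM -Name '{VM_NAME}' -SnapshotName '{CHECKPOINT}'")

def revert_to_checkpoint() -> None:
    ps(f"Restore-VMSnapshot -VMName '{VM_NAME}' -Name '{CHECKPOINT}' -Confirm:$false")

def patch_broke_something() -> bool:
    # Placeholder for whatever smoke test you run against the guest after patching.
    return False

if __name__ == "__main__":
    take_checkpoint()
    # ... deploy the patch, or batch of patches, to the test VM here ...
    if patch_broke_something():
        revert_to_checkpoint()
        print("Reverted to pre-patch state; bisect the batch to find the bad patch.")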

 

Another common-sense technique is to deploy patches in a pre-production environment before trusting them on your production systems. This is usually a much more limited, and thus manageable, environment in which to track variations among systems and the “before” and “after” state of patched machines. The downside is that, because it isn’t a true production environment, you may not catch all the problems a bad patch can cause.
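
One simple way to track that “before” and “after” state in a pre-production lab is to capture a machine’s configuration on both sides of the patch and diff the two. The sketch below records only the installed-hotfix list and the service list, using the standard wmic and sc utilities; on real systems you would widen the capture to whatever matters to you (file versions, registry keys, application health checks) and verify the output parsing on your own builds.

import json
import subprocess

def capture_state() -> dict:
    # Snapshot two easy-to-collect pieces of system state.
    hotfixes = subprocess.run(["wmic", "qfe", "get", "HotFixID"],
                              capture_output=True, text=True, check=False).stdout
    services = subprocess.run(["sc", "query", "state=", "all"],
                              capture_output=True, text=True, check=False).stdout
    return {
        "hotfixes": sorted({line.strip() for line in hotfixes.splitlines()
                            if line.strip().startswith("KB")}),
        "services": sorted({line.split(":", 1)[1].strip()
                            for line in services.splitlines()
                            if line.strip().startswith("SERVICE_NAME")}),
    }

def diff_states(before: dict, after: dict) -> dict:
    return {key: {"added": sorted(set(after[key]) - set(before[key])),
                  "removed": sorted(set(before[key]) - set(after[key]))}
            for key in before}

if __name__ == "__main__":
    before = capture_state()
    input("Apply the patch to this machine, then press Enter... ")
    after = capture_state()
    print(json.dumps(diff_states(before, after), indent=2))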

 

Another easy way to avoid, or at least reduce, patch problems is to prioritize which patches you deploy. This doesn’t have to be a long and expensive process. Many admins have rough rules such as deploying “only critical security patches, and only on mission-critical systems.” Avoiding “functional” (rather than security) patches may not be elegant, but it’s a good way to beat back the majority of serious threats without wasting your time on optional patches.
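
As a toy illustration of that kind of rule, the sketch below filters a patch list and a system list down to “critical security patches on mission-critical systems only.” The patch records and criticality tags are invented for the example; in practice this data would come from your patch-management tool’s reports or your own inventory.

# Invented sample data; replace with an export from your patching tool.
patches = [
    {"kb": "KB5031356", "type": "security", "severity": "critical"},
    {"kb": "KB5031999", "type": "functional", "severity": "low"},
    {"kb": "KB5032001", "type": "security", "severity": "important"},
]
systems = [
    {"host": "sql-prod-01", "mission_critical": True},
    {"host": "kiosk-lobby", "mission_critical": False},
]

def deployment_plan(patches, systems):
    # Keep only critical security patches, and only mission-critical targets.
    to_deploy = [p for p in patches
                 if p["type"] == "security" and p["severity"] == "critical"]
    targets = [s for s in systems if s["mission_critical"]]
    return [(p["kb"], s["host"]) for p in to_deploy for s in targets]

print(deployment_plan(patches, systems))  # [('KB5031356', 'sql-prod-01')]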

 

Bottom Line: Don’t waste time chasing a rollback option that really isn’t an option. Instead, invest some effort up front in testing and prioritizing patches so they never need to be undone.