In this post, I want to offer some guidelines on how to manage and schedule patching effectively in your organization.
(Editor’s note: PatchZone welcomes Augusto Alvarez. Augusto is no stranger to blogging - he has served as a thwack ambassador and has his own blog. He is now celebrating the publication of his second book, "Microsoft Application Virtualization Advanced Guide.")
Having a detailed and effective system patching strategy is something that several organizations don’t pay much attention to. Why is that? Most people think that it’s too expensive to have a plan and an entire platform for something as simple as clicking “Install updates” in Windows Update. However, most people only realize that they should have a plan when something goes wrong: getting a blue screen when rebooting a server; services not starting; unexpected application downtime; or even worse, a security breach on an out-of-date system.
What do I Need to Know about Microsoft Schedules?
Microsoft has an unofficial strategy for when it releases updates. Security patches are released on the second Tuesday of each month; critical updates are the exception, because those are released as soon as the patch is ready. Tuesday is the selected day because it allows for the following approach:
• Tuesday: Updates are released (around 17:00 – 18:00 UTC).
• Wednesday: Apply updates in test environment.
• Thursday: Run use cases in test environment with new updates installed.
• Friday: Install updates in production.
• Saturday: Reboot your production servers.
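If you want to put that weekly routine on a calendar, the dates can be derived from the release day. The following is only a minimal Python sketch, not an official tool: it assumes the release falls on the second Tuesday of the month and that each step above happens on consecutive days. The function names `second_tuesday` and `patch_week` are made up for this illustration.

```python
from datetime import date, timedelta


def second_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, Tuesday is 1, ..., Sunday is 6.
    days_until_first_tuesday = (1 - first.weekday()) % 7
    return first + timedelta(days=days_until_first_tuesday + 7)


def patch_week(year: int, month: int) -> dict:
    """Map each step of the weekly routine to a calendar date.

    Assumption: the steps run on consecutive days, starting on release day.
    """
    steps = [
        "Updates are released",
        "Apply updates in test environment",
        "Run use cases in test environment",
        "Install updates in production",
        "Reboot production servers",
    ]
    release_day = second_tuesday(year, month)
    return {step: release_day + timedelta(days=offset)
            for offset, step in enumerate(steps)}
```

For January 2024, for example, `second_tuesday(2024, 1)` returns January 9, and the reboot step lands on Saturday, January 13.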
Do I Really Need to Have a Test Environment?
The short answer: yes, you do need a test environment. But let’s elaborate on this for those who always try to avoid the matter. If you cannot afford to replicate your full production environment (servers and workstations), you can still use some reference machines currently in production, but only those with low or no impact if something goes wrong.
For example, to test workstations you can use the IT department’s reference machines, or the machine of any “friendly” user who won’t mind having to troubleshoot if an update does something unexpected. For mission-critical services running on servers, most organizations should have a high-availability scenario (for example, a SQL Server cluster), and you can start patching on the “stand-by” node.
And of course we always have the virtualization alternative. With one server (or even a desktop computer) that has some resources available, we can virtualize at least our main services and turn on those VMs when we need to test a new update; we can even use VMware Converter or System Center Virtual Machine Manager to convert physical machines to virtual machines and place them in an isolated environment.
What do I Need to Know about my Environment?
Understand your environment and the applications and services that need to be patched. There’s no point in having a good plan, a schedule and a test environment if you don’t know the right use cases for each platform you are updating.
If there’s a homemade application, ask a developer to script or give you a simple test to run in order to validate that the application is working properly. The same applies to other platforms, such as a database server, a messaging server like Exchange, SharePoint or any other; you must have a few tests to run when you update your platform, and if those are automated, even better.
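As an idea of what such an automated test could look like, here is a small Python sketch of a post-patch smoke test. It only checks that a service answers on its TCP port, which is far from a full use case, so treat it as a starting point. The helper names `tcp_port_open`, `run_checks` and `example_checks`, as well as the host names, are hypothetical; replace them with whatever fits your environment.

```python
import socket
from typing import Callable, Dict


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def run_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run every named check; a check that raises counts as a failure."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def example_checks() -> Dict[str, Callable[[], bool]]:
    """Example check list. Host names are hypothetical placeholders."""
    return {
        "SQL Server port (1433)": lambda: tcp_port_open("sql-test.example.local", 1433),
        "SMTP port (25)": lambda: tcp_port_open("mail-test.example.local", 25),
        "Intranet HTTPS (443)": lambda: tcp_port_open("intranet-test.example.local", 443),
    }
```

Calling `run_checks(example_checks())` after patching the test machines returns a dictionary of check names and pass/fail values that you can print or log. A developer can add application-specific checks to the same dictionary as plain functions that return `True` or `False`.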
Should I Change my Backup Plan if I have a Good Test Environment?
If you are thinking of having a more relaxed backup plan, the answer is no. There are some obvious reasons why: hardware failures and user errors can still occur in production. But even if we don’t consider those, having a test environment is no “silver bullet”. There will be scenarios where the behavior in testing is slightly different from production, and that “slight difference” can have a huge impact if we don’t have a way to recover.
What’s more, review your backup plan and ensure that those backups are being tested periodically.
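One simple way to test a backup is to restore a file to a scratch location and compare it with the original. The Python sketch below does that with a SHA-256 checksum. It assumes a file-level backup and that the original file has not changed since the backup was taken; the function names `sha256_of_file` and `restore_matches` are invented for this example.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """Return True if the restored copy is byte-identical to the original."""
    return sha256_of_file(original) == sha256_of_file(restored)
```

A matching checksum only proves that this one file was restored intact. For databases or complete systems you still need a real restore test, but a check like this is cheap enough to run on a schedule.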
Do I Need to Review my Current Service Level Agreement (SLA)?
Yes, of course. This is an important matter: if we have a defined SLA, we know the downtime windows we can have in our environment, and therefore we understand the schedule we need for applying updates in our organization.
The SLA also tells us the priority of each service, which helps us decide whether to implement a test environment for that service. For example, an SLA that requires high availability for the messaging platform will need replicated servers to properly test new updates.
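To make the idea of a downtime window concrete, an availability percentage can be translated into minutes. The Python sketch below is just arithmetic. It assumes a 30-day period by default and that patching downtime counts against the SLA; some agreements exclude planned maintenance, so check yours. The function names are hypothetical.

```python
def allowed_downtime_minutes(availability_percent: float,
                             period_days: int = 30) -> float:
    """Return the downtime budget, in minutes, for the given period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


def fits_in_budget(planned_minutes: float,
                   availability_percent: float,
                   period_days: int = 30) -> bool:
    """Return True if the planned patch downtime stays within the budget."""
    budget = allowed_downtime_minutes(availability_percent, period_days)
    return planned_minutes <= budget
```

For instance, a 99.9% SLA over 30 days leaves roughly 43 minutes of downtime, so a 30-minute reboot window fits while a 60-minute one does not. A 99% SLA leaves 432 minutes over the same period.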
Do I Really Need to Document the Patch Management Processes?
Please do. This is not just another boring process for IT. For example, properly documented steps for testing give us a way to guarantee repeatable and predictable steps in production.
As I always say, “there’s no golden rule” in the IT world, but you can find general guidance and best practices. The solution best suited to your organization may not apply in the next company; we must always assess and understand our environment, taking into account several key factors such as budget and costs, internal policies, legal compliance, defined SLAs and so on.