10 Replies · Latest reply on Jul 19, 2013 12:55 PM by Lawrence Garvin

    July Month patch for 600 systems ran more than 12 hr

    rnaren6

      Can someone help me figure out why this month's Microsoft patching for 600 systems ran for more than 12 hours instead of the 2-hour window that was scheduled in the task?

        • Re: July Month patch for 600 systems ran more than 12 hr
          Lawrence Garvin

          There are actually many possible reasons why this might have happened.

           

          First, 12 hours is a lot more realistic for patching 600 systems than 2 hours is, so I think it might help to have some additional background on how you normally patch 600 systems in 2 hours.

           

          Second, the mix of updates in a patch cycle can significantly impact the amount of time it takes any given machine to install patches. Of notable impact, .NET updates can significantly increase the total installation time.

           

          If you review the Details tab of the Task History item for this scheduled update deployment, you'll find there is a column named "Completion Time". Sorting on this value might provide some insight as to what was running when: within that 12-hour window, how many systems were patched in which time frames. For example, it's entirely possible (although quite unlikely) that a single system (or a few systems) could have hung the task and caused the task to "run" for 12 hours, but in reality, 99% of the systems were actually patched in the first few hours. Another possibility is that a network link was down and an entire subnet or site might have been inaccessible, contributing to a delay. It's also not impossible that the "12 hour" Completion Time is a completely false indication -- all of the clients may well have been patched in the first few hours, but the *task* did not properly terminate.
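
          If it's easier to slice that data outside the console, a quick sketch like the one below can bucket the completions by hour once you've saved the export as a CSV. The file name, the "Computer" / "Completion Time" column headers, and the timestamp format are just my assumptions about the export; adjust them to match what you actually see.

```python
# Rough sketch: bucket Task History completions by hour from an exported CSV.
# The file name, column headers, and timestamp format below are assumptions --
# adjust them to match your actual export.
import csv
from collections import Counter
from datetime import datetime

completions = Counter()
with open("task_history_details.csv", newline="") as f:
    for row in csv.DictReader(f):
        ts = row.get("Completion Time", "").strip()
        if not ts:
            continue  # skip rows with no completion time recorded
        when = datetime.strptime(ts, "%m/%d/%Y %I:%M:%S %p")
        completions[when.strftime("%Y-%m-%d %H:00")] += 1

for hour, count in sorted(completions.items()):
    print(f"{hour}  {count} completions")
```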

           

          Also inspect the "Server Executed On" column. If you're patching 600 systems in 2 hours, then it's quite likely that you're using multiple Automation Role servers. Check to see that all of your Automation Role servers were actually in use during the execution of the installation task.

           

          It's also possible that resources on one or more Automation Servers were not available at that time, which could result in the Automation Role server working the task at less than 100% efficiency. It's also possible that another task was running concurrently, using up the resources that would have been used by the update installation task.

           

          Another variable that can impact total task execution time is the quantity of targeted systems that are inaccessible or non-existent. The task execution engine has a notable amount of retry effort built into establishing a WMI connection to a target system, and even on a perfectly working client, building a WMI connection is a fairly expensive (read: time-consuming) task. Waiting on a thread to timeout trying to connect to a system that's not available keeps that thread from connecting to a system that is available.
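
          As an illustration only (this is not a Patch Manager feature), a cheap reachability pre-check against the target list before the window opens can flag the machines that would otherwise just burn a worker thread on connection retries. The sketch below probes TCP 135, the RPC endpoint mapper that WMI depends on; it is only a reachability hint, not a real WMI handshake, and the target names are placeholders.

```python
# Hypothetical pre-check, not a Patch Manager feature: flag targets unlikely to
# answer a WMI connection so they can be pulled from the deployment target list.
# Probing TCP 135 (the RPC endpoint mapper) is only a cheap reachability hint.
import socket

targets = ["ooo311bh.com", "000313bh.com"]  # replace with your own target names

def reachable(host, port=135, timeout=3):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for host in targets:
    if reachable(host):
        print(f"{host}: reachable")
    else:
        print(f"{host}: unreachable -- will tie up a thread in connection retries")
```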

           

          The Task History\Details is definitely the place to start in order to get a more detailed perspective of what happened during the time the task was executed.

           

          Finally, you can export the Task History\Details to an Excel workbook, and I'd be happy to take a look at the task execution and offer my thoughts.

            • Re: July Month patch for 600 systems ran more than 12 hr
              rnaren6

              hey Garvin,

               

               Appreciate your time and valuable feedback on this. Below are the answers to your questions:

               

              • As per my understanding, the patching happens in parallel, so I expected the 600 systems to be completed within a 4-hour window. I only have approval for a 4-hour maintenance window for patching, and I cannot exceed it at any cost. Is there a best practice I can follow so that I will be able to accomplish the task?
              • As I approve the patches in PM, they get downloaded to WSUS, and it pushes the patches to the clients at the scheduled time. Is my understanding correct?
              • There was no network issue.
              • There is only one Automation server in the environment.
              • I have attached the logs for your reference; kindly go through them and feel free to share your insights.
                • Re: July Month patch for 600 systems ran more than 12 hr
                  Lawrence Garvin

                  Okay, perhaps I can help clarify how the task actually executes, and then that may put this in better perspective.

                   

                  The patching happens in what we might call quasi-parallel mode. It does not do all 600 systems simultaneously; in fact, a single Patch Manager server can do about 10-12 systems simultaneously. Assuming 10-15 minutes to install updates on a single target, it's possible with a single Patch Manager server in its default configuration to patch a few dozen systems in an hour, more if the installation cycle runs faster on some or all of the systems. This scenario also presumes that the updates have been downloaded to the client systems prior to the installation task; if the updates have not been downloaded, cut that number by half, maybe more. That is to say, if the task includes downloading updates, you'll likely see no more than a couple dozen systems patched in an hour, as the majority of the time will be consumed doing file transfers from the WSUS server.
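
                  Purely as back-of-the-envelope arithmetic (the concurrency and per-system figures are only the rough approximations above, not measured values), that works out to roughly a 12-hour run for 600 systems on a single server in its default configuration, which lines up with what you saw:

```python
# Back-of-envelope estimate only: how long one server needs to work through a
# target list, given an assumed effective concurrency and per-system install
# time. The defaults below are rough figures from the discussion, not measurements.
def estimated_hours(total_systems, concurrent=12, minutes_per_system=15):
    waves = -(-total_systems // concurrent)   # ceiling division: number of batches
    return waves * minutes_per_system / 60.0

print(estimated_hours(600))                          # ~12.5 hours
print(estimated_hours(600, minutes_per_system=30))   # ~25 hours if downloads happen during the task
```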

                   

                  There are a number of ways in which we can increase this parallelism. If the Patch Manager server is installed on a multi-core/multi-socket system, you can increase the number of worker processes and/or threads that are used to execute a task. By default the Patch Manager server is configured to run two worker processes with a thread pool size of 16. The thread pool can be increased to 256 per process, and the server can be configured with up to 8 worker processes. The objective is to increase the thread pool and number of worker processes up to the point that you achieve maximum CPU utilization without running out of process memory space. (Paging processes/threads to disk will destroy any benefits achieved from launching more connections to client systems.)

                   

                  Another option is to deploy additional Patch Manager Automation Role servers. The Automation Role is the service that initiates and monitors the task execution. If you're patching systems on remote sites, there will be significant benefit in having an Automation Role server on the local network. If you're patching a large number of systems in a single site, a pool of Automation Role servers can exponentially increase the parallelism of the patch deployment. In one case study, a data center with over 700 servers is being patched in four one-hour cycles of approx 200 servers each cycle, using a pool of four Automation Role servers. Each Automation Role server patches about 50-60 systems in an hour. The important note here is that to successfully patch several hundred systems within a specified time frame will require an architected solution with baselining and performance management, as well as a strong awareness of the work to be done during that installation cycle.
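
                  The same rough arithmetic shows why a pool helps; the per-server hourly rate below is only the approximate figure from that case study, not a guarantee:

```python
# Rough pooled-throughput arithmetic using an assumed per-server hourly rate.
def hours_needed(total_systems, servers, systems_per_server_per_hour=55):
    return total_systems / (servers * systems_per_server_per_hour)

print(f"{hours_needed(600, servers=1):.1f} hours with one Automation Role server")
print(f"{hours_needed(600, servers=4):.1f} hours with a pool of four")
```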

                   

                  Regarding the relationship of the file transfers from Microsoft to WSUS to the clients: when you approve an update, the WSUS server queues that update's file(s) for download from Microsoft. Depending on the number of updates approved and the available Internet bandwidth, this download task may last from a few minutes to several hours. Once an update's file(s) have been successfully downloaded to the WSUS server, that update is then available for download by the WSUS client systems. However, this download event does not occur immediately. It occurs in a staggered fashion, as each client executes its regularly scheduled detection event looking for new updates. The default detection interval is 22 hours (which functionally is something between 17.6 and 22.0 hours), so from a practical perspective approximately 5% of your systems will launch download tasks for an update during each hour for the 24 hours after you approve the updates. As noted above, the objective is merely to ensure that the clients have completed those downloads prior to launching the installation task, or else accept that the download will occur as part of the installation task and adversely affect the number of clients that can be patched in a given time frame.
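
                  If it helps to visualize that staggering, here is a small simulation sketch. It simply assumes each client's next detection falls uniformly somewhere in the next 22 hours, which is a simplification of the real randomized interval:

```python
# Rough model of the staggered client downloads described above. Assumes each
# client's next detection falls uniformly somewhere in the next 22 hours -- a
# simplification of the real 17.6-22.0 hour randomized interval.
import random

random.seed(1)
clients = 600
next_detection_hours = [random.uniform(0, 22) for _ in range(clients)]

for hour in range(0, 25, 4):
    checked_in = sum(1 for t in next_detection_hours if t <= hour)
    print(f"{hour:2d} hours after approval: ~{checked_in} of {clients} clients have detected and started downloading")
```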

                   

                  Looking at the Task History Details you've provided I also see another significant impact of execution time, and that's the launch of the pre-installation reboot. For every system performing a pre-installation reboot, add another few minutes to the per-system execution tally. With pre-installation reboots, I would expect per-system execution times in the 15-20 minute range, and therefore no more than a couple dozen completions per hour. We see that 12 systems were issued reboot commands at task launch (5:12am). If we look at a couple of these examples we can get some empirical indications of expected performance. Machine 'ooo311bh.com' executed a pre-installation reboot at 5:12am, and a post-installation reboot at 5:41am, so in this case the installation itself took approximately 25 minutes. A more positive example, '000313bh.com' has a pre-reboot at 5:12am and a post-reboot at 5:19am, completing the installation of updates in only a few minutes. Using these "Pre-Reboot" and "Post-Reboot" events alone actually gives you a very good trace through the task history to see how many machines were being processed during each hour and how long the installations (on a per-machine basis) actually executed.
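
                  If you want that trace for all 600 machines rather than spot-checking a few, a sketch along these lines pairs the pre- and post-reboot events per machine from the exported data saved as CSV. As before, the file name, the "Computer" / "Detail" / "Completion Time" headers, the event wording, and the timestamp format are my assumptions; match them to your export.

```python
# Sketch: pair pre- and post-reboot events per machine from an exported Task
# History CSV to estimate how long each installation actually ran. Column names,
# event wording, and timestamp format are assumptions -- adjust to your export.
import csv
from datetime import datetime

FMT = "%m/%d/%Y %I:%M:%S %p"   # adjust to the timestamp format in your export
pre, post = {}, {}

with open("task_history_details.csv", newline="") as f:
    for row in csv.DictReader(f):
        machine = row.get("Computer", "")
        detail = row.get("Detail", "").lower()
        ts = row.get("Completion Time", "").strip()
        if not machine or not ts:
            continue
        when = datetime.strptime(ts, FMT)
        if "pre-reboot" in detail:
            pre[machine] = when
        elif "post-reboot" in detail:
            post[machine] = when

for machine in sorted(pre.keys() & post.keys()):
    minutes = (post[machine] - pre[machine]).total_seconds() / 60
    print(f"{machine}: ~{minutes:.0f} minutes from pre-reboot to post-reboot")
```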

                   

                  For a more per-machine chronological look, you can sort by [1] JobID and [2] Completion Time to see the impact on each machine individually.

                   

                  The other observation I'll make is that many of these machines are installing a large collection of Microsoft .NET Framework updates ... there were seven in total available for installation ... and as noted in my previous reply, .NET Framework updates are notoriously time consuming and I don't find it surprising at all that installing a half-dozen .NET updates consumed 20-30 minutes on any given machine.

                    • Re: July Month patch for 600 systems ran more than 12 hr
                      rnaren6

                      Garvin,

                       

                      Again, much appreciated for your time and reply. Thanks for explaining the back-end algorithm; I understand it now.

                       

                      "When you approve an update, the WSUS server queues that update's file(s) for download from Microsoft. Depending on the number of updates approved, and the available Internet bandwidth, this download task may last from a few minutes to several hours. Once an update's file(s) have been successfully downloaded to the WSUS server, that update is then available for download by the WSUS client systems"

                       

                      Do the clients automatically download the required patch once it is available at WSUS, and install it when it is scheduled by a task? Or should we create a separate task to move patches from WSUS to the clients?

                       

                      I am planning to turn off automatic updates on the clients. Is it possible to push that setting across all the clients using PM, something similar to PowerShell? If yes, can you please share the SolarWinds documentation?

                       

                      I am new and have taken on this patch management task for 1800 clients, so I need to gain knowledge of this tool. Apart from the administration guide, which mostly covers installation, is there a document that explains how to work with the tool, the different functionality it provides, and how to add a new system to an already existing group?

                        • Re: July Month patch for 600 systems ran more than 12 hr
                          Lawrence Garvin

                          The client will automatically download the file from the WSUS server, but not immediately upon availability at the WSUS server; rather, that event is triggered when the client performs its next scheduled detection and thus recognizes that the update is needed, approved, and that the file is available for download.

                           

                          You can explicitly trigger this download using the Update Management task with the "Download Only" option once you know the file is on the WSUS server.

                           

                          If you launch an installation task and the file is not yet downloaded, but it is available, the client will immediately download the update file in order to perform the installation.

                           

                          I'm not quite understanding your last question (next-to-last paragraph). As for additional material regarding the operation of Patch Manager, some of this information is posted as blog articles in the Geek Speak blog as it relates to Patch Manager directly, and PatchZone.org has volumes of information about the operations of WSUS. Also, if you've not already been there, the TechNet WSUS forum is a great place to get in-depth information about the behaviors of WSUS and the Windows Update Agent.

                  • Re: July Month patch for 600 systems ran more than 12 hr
                    antwesor

                    I sometimes have tasks run for long stretches when one or more PCs are not connected and the task continually tries to look for the PCs that are not turned on. Just a thought.