We deployed Patch Manager a few months ago and tested it on several machines set up for that purpose. The tests seemed to go well, with the patches getting installed as they should and the machines rebooting as expected.
After this testing, in October, we ramped it up to patch most of our environment. That didn't go quite so well.
I have servers broken out into groups (Domain Controllers, File Servers, SQL Servers, Terminal Servers and Utility Servers) so that I can control when each type reboots. I prefer DCs to go first, followed by File, then SQL, and after that any order.
The main issue we had that cycle was that a lot of machines seemed to get patched but didn't reboot. There were also a lot of patches that didn't install for one reason or another. We attributed it to the fact that, for some reason, when our main DC/DNS server rebooted, DNS didn't come back up properly, causing DNS lookup issues. Since that was the only DNS server our machines were using, lookups failed whenever it wasn't working. Since then, all servers have been set to use two DNS servers, so if one isn't available, they should still be able to resolve against the second.
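For anyone wanting to confirm that both DNS servers actually answer on their own before the next patch window, here is a rough sketch of a check. It sends a plain A-record query straight to one server over UDP, bypassing the machine's normal resolver order. The function names are my own, and it assumes UDP port 53 is reachable from wherever it runs; it is not anything Patch Manager provides.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary transaction ID used to match the reply


def build_query(hostname: str, txid: int = QUERY_ID) -> bytes:
    """Build a minimal DNS query packet for the A record of hostname."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack(">HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN.
    return header + qname + struct.pack(">HH", 1, 1)


def dns_server_answers(server_ip: str, hostname: str, timeout: float = 2.0) -> bool:
    """Return True if server_ip returns at least one answer for hostname."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(hostname), (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts and unreachable or refused connections.
            return False
    if len(reply) < 12:
        return False
    reply_id, flags, _questions, answers = struct.unpack(">HHHH", reply[:8])
    rcode = flags & 0x000F  # 0 means "no error"
    return reply_id == QUERY_ID and rcode == 0 and answers > 0
```

Calling `dns_server_answers` once per DNS server, with the name of a host that should always resolve, would show whether either server is silently failing after a reboot.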
This month we had similar issues. The two smaller patch tasks (DCs and File Servers), which only have two and four machines respectively at the moment, both returned 100% success.
According to the task history for the jobs, for the Utility servers group, only 28 of the 42 servers in the group had any tasks performed on them - successful or otherwise. The Terminal Servers group had similar results. Only 24 out of 33 servers in the group had any action performed on them by Patch Manager.
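To pin down exactly which servers were skipped, the group membership can be compared against the task history. This is only a sketch: it assumes both can be exported to simple lists, and the `server` field name and the machine names in the usage note are made up for illustration.

```python
def servers_without_actions(group_members, task_history_rows):
    """Return group members that never appear in the task history.

    group_members: list of server names in the Patch Manager group.
    task_history_rows: list of dicts with a 'server' key (hypothetical
    export format), one per action recorded in the task history.
    """
    # Normalise names so case or stray whitespace does not hide a match.
    acted_on = {row["server"].strip().lower() for row in task_history_rows}
    return sorted(
        name for name in group_members
        if name.strip().lower() not in acted_on
    )
```

For example, with a group of `UTIL01`, `UTIL02` and `UTIL03` and a history that only mentions `UTIL01` and `UTIL03`, this returns `["UTIL02"]`. Having the actual list of skipped machines makes it easier to look for something they have in common.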
Of the servers that did have actions performed on them, there were some failures. I believe some of them may be just 'normal' Windows Update failures, and some seem to be related to a pre-install reboot (the server may have taken more than five minutes to come back to a state where it could process PM commands), so right now I don't think those need much attention. One thing that did happen, though: some of the machines that were acted upon installed their patches successfully, and PM said the post-install reboot was initiated successfully, but the machines never actually rebooted. When users logged in, they were prompted with 'you need to reboot to finish installing updates'.
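One way to catch the missed reboots without waiting for users to report them is to compare each machine's last boot time with the time its install finished. The sketch below assumes both timestamps have already been collected into dictionaries keyed by machine name; on Windows the last boot time is available from the `LastBootUpTime` property of the `Win32_OperatingSystem` WMI class. The function and parameter names are my own.

```python
from datetime import datetime


def machines_missing_reboot(install_finished, last_boot):
    """Return machines that have not booted since their install finished.

    install_finished: dict of machine name -> datetime the install completed.
    last_boot: dict of machine name -> datetime of the machine's last boot.
    A machine with no boot time on record is listed too, since its
    state is unknown and worth checking by hand.
    """
    missing = []
    for name, finished in install_finished.items():
        booted = last_boot.get(name)
        if booted is None or booted < finished:
            missing.append(name)
    return sorted(missing)
```

Any machine this returns either never received the reboot or received it and ignored it, which narrows down where to look in the PM task details.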
The fact that a number of machines in the groups were not acted upon in any way is concerning, as is the fact that machines PM reported as having been successfully sent a post-install reboot command did not actually reboot.
I saw a strange issue back when I was setting up groups, where exactly half of the machines I selected to add to a group actually got added (every other server), so I had to keep going back and adding machines over and over until they were finally all added. I don't know if this lack of action on servers in the group is in any way related to that.
I figured I'd post here to see if anyone has any thoughts before I call into support.
Does anyone have any thoughts or suggestions?