The thing about software is that it is written by imperfect humans. That was the last thought I had at 4 p.m. on a Friday, after most of the team had gone home for the weekend. I usually don't script, but a coworker had been tasked with remotely rebooting a few servers on one segment. A week prior we had used a system to reboot all workstations with a particular piece of software installed, so he asked me to take care of it.


I logged in and created the job, but before I could hit save I heard someone on the help desk ask if I was rebooting his machine. I had not even saved at this point, and figured it was someone pulling a prank. Just to be sure, I checked the deployment logs, and sure enough, the previous job I had set up had started rebooting all workstations, even though it had not been touched and was not scheduled. Not understanding what had happened, I canceled the job and got out of the system.


About five minutes later the workstations started to reboot again. So I called the VM system admin and had him disconnect the network on the server that was causing everything to reboot. The reboots led to a lot of phone calls, but interestingly enough, no one important knew until I called it in. Looking back, a lot of updates finished installing properly after this mass reboot.


The following Monday I called the software company, and they identified a bug: simply visiting a page would kick off past software deployment packages. It took them about a month to fix it. This is why I looked into SolarWinds Patch Manager.


No servers were harmed or rebooted in this incident.


