GIVING BACK TO THWACK - 2 (Automated Snapshots and Unmanaging)

osborne_graham over 6 years ago

After Patch Manager was set up with a daily schedule of blocks of servers, I was tasked with automating both the unmanaging and the snapshotting of these servers before Patch Manager kicks off each night. I figured that others may find it useful to know how I accomplished this:

Unmanaging

This part was straightforward. I used the Solarwinds Unmanage Scheduler installed on the poller to create a script for each day containing the list of nodes. I then created a Windows task schedule template which I used to create one schedule per day to kick off unmanaging of the servers at 1am each morning and remanaging at 6am.
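
For anyone who would rather generate the daily windows than maintain them by hand, here is a minimal Python sketch of the timing only. It assumes the 1am unmanage and 6am remanage times described above; the node names and the `unmanage_window` helper are made up for illustration and are not part of the Unmanage Scheduler.

```python
from datetime import date, datetime, time
from typing import Tuple


def unmanage_window(patch_day: date,
                    unmanage_at: time = time(1, 0),
                    remanage_at: time = time(6, 0)) -> Tuple[datetime, datetime]:
    """Return (unmanage_start, remanage_time) for one patch day."""
    start = datetime.combine(patch_day, unmanage_at)
    end = datetime.combine(patch_day, remanage_at)
    if end <= start:
        raise ValueError("remanage time must be after unmanage time")
    return start, end


# Hypothetical node list for one day's patch block.
nodes = ["SERVER-A", "SERVER-B"]
start, end = unmanage_window(date(2024, 1, 15))
for node in nodes:
    print(f"{node}: unmanage {start:%Y-%m-%d %H:%M}, remanage {end:%Y-%m-%d %H:%M}")
```

The output is just a list of node/window pairs; how those get fed into a scheduled task is left to whatever tooling you already use.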

Snapshots

This request was a bit more of a head scratcher, as our VM team stated that they were unable to script such a thing. I also wanted to do this in a way that any of our sysadmins could change if necessary, so as we are running VMAN I settled on doing it via Solarwinds. The VM team gave me the additional instruction that only one snapshot can be requested at any one time, otherwise the ESX hosts would get overloaded and cause the entire VM infrastructure to go into meltdown.

I accomplished this in the following way:

I created a new alert and gave it a generic name followed by the day of the month.
Properties Tab: I set it to evaluate the trigger every 2 minutes and made the severity ‘informational’.
Trigger Condition Tab: For the trigger condition I just wanted something that would always be true, so I set the trigger to Node -> Node Name -> is equal to, and selected the Solarwinds poller.
Reset Condition Tab: I’d calculated that taking all the snapshots for any one day would need less than 90 minutes, so I set the alert to automatically clear itself after 90 minutes.
Time of Day Tab: I chose to specify a time of day schedule and created a schedule name specific to the task. Patch Manager has been set to begin at 1am each day, and 90 minutes prior to that is 11:30pm of the day before. So I ticked all months in the schedule, selected the day before the patch day, and specified that the alert should run between 11:30pm and 1am of the following day.
Trigger Actions Tab: For the first action I entered a display message of “Pre-Patching Snapshots have begun” and added a netperfmon entry stating the same. For the second action I selected ‘Manage VM - Take Snapshot’ and selected the first server on my list.
As I could only take one snapshot at a time, I calculated that we could safely run one every 4 minutes (my testing showed that snapshots taken this way are not instant and can take a couple of minutes to begin). So I added an escalation level with a 4 minute wait. I then copied the previously created action, placed it in escalation level 2, edited it to point at the second server on my list and (this is VERY important) changed the action title (see notes below). I repeated this to create as many escalation levels as I had servers.
Reset Action Tab: I added a log entry of ‘Snapshots Complete’.
Once saved I duplicated this alert for each of the remaining days and edited as appropriate.
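
The timing arithmetic behind the steps above can be sketched in Python. This is only an illustration of the numbers in the post (a 90 minute window ending at the 1am patch start, one snapshot every 4 minutes); the `snapshot_schedule` function and the server names are hypothetical, and the real alert engine adds some lag because the trigger is only evaluated every 2 minutes.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def snapshot_schedule(servers: List[str],
                      patch_start: datetime,
                      window: timedelta = timedelta(minutes=90),
                      spacing: timedelta = timedelta(minutes=4)
                      ) -> List[Tuple[str, datetime]]:
    """Assign each server a nominal snapshot start time, one per escalation level."""
    window_start = patch_start - window
    schedule = []
    for level, server in enumerate(servers):
        fire_at = window_start + level * spacing
        if fire_at >= patch_start:
            # Too many servers for the window: this snapshot would start
            # at or after the moment patching begins.
            raise ValueError(f"{server} does not fit before {patch_start}")
        schedule.append((server, fire_at))
    return schedule


# Hypothetical block of servers patched at 1am on the 15th.
patch_start = datetime(2024, 1, 15, 1, 0)
for server, fire_at in snapshot_schedule(["APP01", "APP02", "DB01"], patch_start):
    print(f"{fire_at:%d %b %H:%M}  {server}")
```

With these defaults, 23 servers fit in one night (the last one starting 88 minutes into the window), so a larger block would need a longer window or a different approach.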

Snags: I initially titled each snapshot action with the time of day it would begin. After configuring several days, however, I discovered that action titles are reused across alerts, so I’d been changing the actions in ALL of the alerts rather than just the one I was creating. To make the titles unique, I added the date as well as the time. You could also use the server name, but I wanted these actions to stay reusable when servers are replaced in future, hence the use of time/date.
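
The title clash is easy to demonstrate with a small Python sketch. The title formats below are invented for illustration (the post does not give the exact wording used); the point is only that time-only titles repeat from one day's alert to the next, while day-plus-time titles do not.

```python
from datetime import time
from itertools import product
from typing import Iterable, Set


def time_only_title(slot: time) -> str:
    """Title built from the start time alone (the original, clashing scheme)."""
    return f"Take Snapshot {slot:%H:%M}"


def dated_title(day_of_month: int, slot: time) -> str:
    """Title built from the day of the month plus the start time."""
    return f"Take Snapshot day {day_of_month:02d} {slot:%H:%M}"


def duplicates(titles: Iterable[str]) -> Set[str]:
    """Return every title that appears more than once."""
    seen, dupes = set(), set()
    for title in titles:
        if title in seen:
            dupes.add(title)
        seen.add(title)
    return dupes


days = [1, 2]                          # two daily alerts
slots = [time(23, 30), time(23, 34)]   # first two escalation levels

print(sorted(duplicates(time_only_title(s) for _, s in product(days, slots))))
print(sorted(duplicates(dated_title(d, s) for d, s in product(days, slots))))
```

The first line lists the clashing titles and the second list is empty, which is the same check you would want to pass before duplicating an alert for another day.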

furpho over 1 year ago

Hi. I realise this post is 5 years old but here goes. We don't currently own patch manager so the instructions above aren't familiar. We have around 250 servers on Vsphere to patch. Do you think your solution would work? Patch manager obviously doesn't have an integrated tool for patching and the above does sound a little convoluted and perhaps difficult to manage but only you could be the judge of that. We need a solution that has minimal admin and max reliability.

osborne_graham over 1 year ago in reply to furpho

I guess it would, but TBH this was a bit of a workaround solution. Although I was advised that snapshots couldn't be automated through our Vcentre, I later discovered that they could indeed be scripted, and the solution I outlined above was discontinued in our environment some time ago in favour of native vmware scripts + solarwinds scheduled unmanagement. We have actually discontinued all snapshotting and unmanaging now as it was deemed unnecessary. We rely on full backups, which (touch wood) have not needed to be used, and stats now record the time as outage rather than unmanaged.