
Master of Your Virtual IT Universe: Trust but Verify at Any Scale - Automation

Level 13


A Never Ending IT Journey around Optimizing, Automating and Reporting on Your Virtual Data Center



Automation is a skill that requires detailed knowledge, including comprehensive experience around a specific task. This is because you need that task to be fully encapsulated in a workflow script, template, or blueprint. Automation, much like optimization, focuses on understanding the interactions of the IT ecosystem, the behavior of the application stack, and the interdependencies of systems to deliver the benefits of economies of scale and efficiency to the overall business objectives. And it embraces the do-more-with-less edict that IT professionals have to abide by.

Automation is the culmination of a series of brain dumps covering the steps that an IT professional takes to complete a single task. These are steps that the IT pro is expected to complete multiple times with regularity and consistency. The singularity of regularity is a common thread in deciding to automate an IT process.

Excerpted from Skillz To Master Your Virtual Universe SOAR Framework

Automation in the virtual data center spans workflows. These workflows can encompass management actions such as provisioning or reclaiming virtual resources, setting up profiles and configurations in a one-to-many manner, and reflecting best practices in policies across the virtual data center in a consistent and scalable way.

Embodiment of automation

Scripts, templates, and blueprints embody IT automation. They are created from an IT professional’s best practice methodology - tried and true IT methods and processes. Unfortunately, automation itself cannot differentiate between good and bad. Therefore, automating bad IT practice will lead to unbelievable pain at scale across your virtual data centers.

To keep that from happening, keep automation stupid simple. First, automate at a controlled scale following the mantra, “Do no harm to your production data center environment.” Next, monitor the automation process from start to finish to ensure that every step executes as expected. Finally, analyze the results and use your findings to adjust and optimize the automation process.
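The three steps above (controlled scale, monitoring, analysis) can be sketched roughly as follows. This is only an illustration, not a prescribed method: the function name `staged_rollout`, the batch sizes, and the failure limit are all assumptions made for the example, and `action` stands in for whatever task you are automating.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("rollout")


def staged_rollout(targets, action, batch_sizes=(1, 5, 25), max_failures=0):
    """Run `action` on targets in growing batches (hypothetical helper).

    Each batch is completed and logged, then the failure count is checked.
    If it exceeds `max_failures`, the remaining targets are left untouched.
    """
    results = {"ok": [], "failed": [], "skipped": []}
    remaining = list(targets)
    sizes = iter(batch_sizes)
    size = next(sizes)
    while remaining:
        batch, remaining = remaining[:size], remaining[size:]
        for target in batch:
            try:
                action(target)
                results["ok"].append(target)
                log.info("ok: %s", target)
            except Exception as exc:  # record every failure for later analysis
                results["failed"].append(target)
                log.error("failed: %s (%s)", target, exc)
        if len(results["failed"]) > max_failures:
            results["skipped"] = remaining
            log.warning("stopping early, %d targets untouched", len(remaining))
            break
        # Grow to the next batch size; keep the last size once the list runs out.
        size = next(sizes, size)
    return results
```

The returned dictionary is the raw material for the "analyze the results" step: what succeeded, what failed, and what was never touched because the rollout stopped early.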

Automate with purpose

Start with an end goal in mind. What problems are you solving with your automation work? If you can’t answer this question, then you’re not ready to automate any solution.

This post is a shortened version of the eventual eBook chapter. Stay tuned for the extended version in the eBook. Next week, I will cover reporting in the virtual data center.

Level 14

Always have an end goal in mind.  Just like writing code, have an end product in mind.


Reminds me of the early days of Automated Operations back around 1990/91.  We drew on half a decade of operations experience across multiple platforms (mainframe, Tandem, VAX [VMS & Unix], various other flavors of Unix including AIX, HP-UX, SGI IRIX, SCO, BSD, Solaris, etc., AS/400, System/36, LAN/WAN gear, and so forth).  We automated the complex startup and shutdown process of the mainframe, the processes to properly stop and start CICS, the monitoring of everything from batch job abends to communication links between the mainframe and Tandem, as well as most repetitive operator duties.

One of the biggest challenges we dealt with was printer operation on the mainframe.  The print queue could have literally a thousand jobs waiting to print, but certain jobs (by name) and print classes had priority over all others.  The print operator had to spend time during the shift watching for those jobs, change the printer to the proper job class to print them, pull them off the printer, and get them sent out ASAP.  During month-end processing it took 1 person the entire shift to watch the print queue.  Using the tools we had and a text-based dataset, I was able to automate the entire process for jobs that used standard paper and electronic flashes on the 3800 laser printers...the operator only had to look for a held message saying that a hot job had printed, then pull the stack and burst it before sending it out.  That cut the need from 2 printer operators per shift to 1, reduced errors, and got the priority print jobs out faster.

The points kong.yang made about best practice, and about knowing inside and out what you are trying to do with a clear picture of what you are trying to accomplish, are so on point.  TEST, TEST, TEST, and while you keep it simple, you MUST have checks in your code to make sure you are working with valid data...and you MUST log steps for troubleshooting, especially when you suddenly don't get the valid input you expected.
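A minimal sketch of what "checks in your code" plus step logging could look like. The record fields `job_name` and `job_class` and the helper names are made up for illustration; they are not taken from any real tool.

```python
import logging

log = logging.getLogger("automation")

# Hypothetical required fields for one unit of work.
REQUIRED_FIELDS = ("job_name", "job_class")


def validate_record(record):
    """Return a list of problems; an empty list means the record is usable."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    return problems


def process(records, handler):
    """Validate each record, log every decision, and hand valid ones to `handler`."""
    handled, rejected = [], []
    for record in records:
        problems = validate_record(record)
        if problems:
            # Log what was rejected and why, so troubleshooting has a trail.
            log.warning("rejected %r: %s", record, "; ".join(problems))
            rejected.append(record)
            continue
        log.info("handling %s", record["job_name"])
        handler(record)
        handled.append(record)
    return handled, rejected
```

Rejected records are returned rather than silently dropped, so unexpected input shows up both in the log and in the result.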

It is just an evolution of IT from before IT became cool.

Level 10

Nicely written. Will look for the next one about virtual DC.

Level 20

Not breaking things in the main production environment is pretty high up on the list for sure!  It's great if you actually have a decent dev environment... unfortunately that's not always the case.  The alternative is, rather than running something on everything, to find a way to start small and limit the damage in case things go wrong... and they will sometimes.

Level 13

I agree with network defender: have an end goal.  If you don't, you are just throwing away good money.

I am curious to hear what manual tasks people, who are not in the Managed Services space, have taken the time and effort to automate in their virtual datacenters. More so, I am really interested in automation that is not focused on Incident response.

Level 21

Before you have automation you need to have standardization.  I see a lot of people immediately try to get on the automation train without taking the time to first standardize their environment.


Standardize the environment = normalized data

For example if you have 8 different ways to name a server, then automation becomes more challenging and the risk of issues increases.

Level 21

Ugh, the age-old naming convention conversation.  Every time it comes up it seems to become a holy war.  I have never understood why people in this industry are so passionate about having specific naming conventions.  As long as it works, that's all that matters.  Am I missing something there?


When it comes to automation, it is easier to code for one known naming convention than for 6 or 8 or more different naming conventions.

Then, by using regular expressions, you can easily get the required subset based on server name.  That matters, for example, when there are 4 generations of different Windows naming conventions in place, as well as 2 different Unix ones and yet another convention for international servers...
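As a rough illustration of the regular-expression approach, here is a sketch that picks a subset of servers by name. The three patterns below are invented conventions for the example; a real environment would substitute its own, and every extra convention means another pattern to write and maintain.

```python
import re

# Hypothetical naming conventions, one pattern per convention.
PATTERNS = {
    "windows_gen1": re.compile(r"^WIN\d{3}$", re.IGNORECASE),
    "windows_gen2": re.compile(r"^[a-z]{3}-win-\d{2}$", re.IGNORECASE),
    "unix": re.compile(r"^ux[a-z]+\d+$", re.IGNORECASE),
}


def classify(name):
    """Return the label of the first convention the name matches, else None."""
    for label, pattern in PATTERNS.items():
        if pattern.match(name):
            return label
    return None


def select(names, *labels):
    """Return the names whose convention is one of the requested labels."""
    return [name for name in names if classify(name) in labels]


servers = ["WIN001", "nyc-win-12", "uxdb01", "printer7"]
windows_servers = select(servers, "windows_gen1", "windows_gen2")
```

Names that match no pattern come back as `None` from `classify`, which is a useful signal in itself: those are the machines the automation cannot safely reason about.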

Level 21

Jfrazier​ I like your automation approach to looking at naming conventions.

I recently proposed here that we move away from our standard naming convention wherein each different device type begins with 3 letters representing that device type and instead move to something more like a serial number.  The argument was that then we can't look at a name and know what it is; however, in every situation we need to look the device up for more information anyway so having the name provide info about the device doesn't really seem relevant.  We manage so many different systems that at this point the only value the name provides is something to use to look up more details on the device.  If we moved to a serial number then automation becomes a much easier thing and scales across all device types.

Level 13

I hate naming conventions...but they are necessary.


byrona, I understand your point and agree on looking up device information.  Depending on what sort of automation you are building, that may or may not be relevant.  I think being able to look at the name and know the following is more relevant:

server, network device, F5, etc

prod, dev, dr, SIT, model, qa, test, whatever the flavor-of-the-month environment is

application, db, web server, etc.

That gets you a long way at first glance (for a human) and allows for better alerting rules in some cases and environments.


I'd add one more item to the naming convention...a criticality indicator.

1 for critical path

2 for important

3 for supporting role

We have a lot of this built into custom properties which is great for the Orion environment but that data is not there for other automation tools that run outside of Orion.
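For tools that run outside of Orion, a name that encodes those fields can be parsed directly. The sketch below assumes a made-up format, `<type>-<environment>-<role>-<criticality>-<sequence>` (for example `srv-prod-db-1-007`); the format, the allowed values, and the `should_page` rule are all hypothetical.

```python
import re
from typing import NamedTuple, Optional


class DeviceName(NamedTuple):
    device_type: str
    environment: str
    role: str
    criticality: int  # 1 = critical path, 2 = important, 3 = supporting role
    sequence: str


# Hypothetical convention: type-environment-role-criticality-sequence
NAME_RE = re.compile(
    r"^(?P<device_type>srv|net|f5)-"
    r"(?P<environment>prod|dev|dr|sit|qa|test)-"
    r"(?P<role>app|db|web)-"
    r"(?P<criticality>[123])-"
    r"(?P<sequence>\d{3})$"
)


def parse_name(name: str) -> Optional[DeviceName]:
    """Split a device name into its fields, or return None if it does not conform."""
    match = NAME_RE.match(name.lower())
    if match is None:
        return None
    fields = match.groupdict()
    return DeviceName(
        fields["device_type"],
        fields["environment"],
        fields["role"],
        int(fields["criticality"]),
        fields["sequence"],
    )


def should_page(name: str) -> bool:
    """Example alerting rule: page only for critical-path production devices."""
    parsed = parse_name(name)
    return parsed is not None and parsed.environment == "prod" and parsed.criticality == 1
```

With this in place, `should_page("srv-prod-db-1-007")` is true while `should_page("srv-dev-web-3-012")` is false, and a non-conforming name simply fails to parse instead of being guessed at.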

There's only one graphic that properly shows the benefit of automation:


About the Author
Mo Bacon Mo Shakin' Mo Money Makin'! vHead Geek. Inventor. So Say SMEs. vExpert. Cisco Champion. Child please. The separation is in the preparation.