
Extending IT Experience

Posted by kpe Jan 31, 2018

Working in IT? Feeling lost given everything that’s going on in the industry? Are you confused about what to focus on and how to best apply your knowledge?


If you can relate to any of the above, this blog post is for you!


These days, the lines between job functions are growing increasingly blurry. New topics and technologies are constantly evolving, so how do you stay ahead, or at least on top of your game?


First, understand that staying ahead in the industry is a never-ending journey, which means you will never be “done.” You will reach some career milestones, but there is no end to reach.


Second, pick one area of interest in which you would like to improve. For me, that’s network automation, or learning to develop software to assist with daily tasks. Don’t start with a huge topic; keep it manageable. For example, I would not start out with “data center” and try and understand everything under that huge umbrella. Instead, you could start out with: “Data center to WAN connectivity,” which narrows it down quite effectively. The former would take several months and a reading list of 5-10 books, whereas the latter boils it down to maybe three to four weeks and a reading list of one or two books.


After you have picked an area, be very single-minded in your focus. Don’t jump back and forth between different tracks that you would like to improve on. This will only make you feel discouraged because you’ll likely feel you aren’t making any progress. This discipline has helped me focus and improve certain skills much faster than if I had done a bit of this and a bit of that at random.


This approach also has the side effect of helping to sort out your mind, so to speak. You will feel more in control, which will help reduce the nagging sensation you get when you are feeling lost.


In my experience in IT, there are two truly effective ways of choosing a topic to study. First, pick a topic that is related to something that you already do on a regular basis. The advantage of this is that you will actually get to use your improved skill for something practical. At some point, it might even free up some time to start on my next suggestion.


Once you feel you’ve improved enough in one area of interest, pick a topic that you would like to extend yourself into. This will keep your skill set sharp so that you will be prepared for the next big thing coming on the horizon. It also has the added benefit of being something you are really passionate about, which will make it easier to fully focus on it.


Now that you have picked out something to improve on, how do you actually go about it? Well, as mentioned earlier, single-tasking can be quite helpful in IT. But before you get to that, I would suggest you map out some time slots during the day that are specifically dedicated to study. If you don’t have time during your normal work day, try to do it before you leave for work in the morning.


If you have issues with procrastination, try and start really small. Try reading for just 5-10 minutes. It might not be much, but it’s better for your career than spending that time sleeping.


Keep a journal or a list of your progress. I note what I have studied and for how long in my calendar. This has the benefit of providing me with more motivation when I look back and see how I’ve performed during the week.


Also, try and mix things up if you get stuck doing only one thing (reading, for example). Shake it up by watching some videos on the topic instead.


Finally, to gain IT experience, I would advise you to read popular blogs, news sites, Twitter®, etc. Just dive into the ones you find relevant to your chosen topic. This will help create a mental picture of what’s going on in the industry and keep current developments fresh in your mind. You have to be very careful not to take in too much information that is irrelevant to your current topic. By all means, be curious about topics that relate, but be mindful of your mental bandwidth.


I hope this information will give you a sense of purpose and rejuvenation in your professional life. I know it’s a system that has worked for me, so I trust others will be able to use it as well.


If you have any questions, feel free to reach out and I will do my best to help!


Thanks for reading!

For a lot of organizations, moving to Office 365 might be one of the first, or possibly biggest, migrations to a public cloud. There can be a lot of stress and work that goes into the process, with the hope that the organization will reap the benefits. But how can you be sure that you are, in fact, getting the most out of Office 365?




We typically don't carry out IT projects just for the heck of it. Enterprise IT departments almost always have a backlog to deal with, so deploying new technology just for the sake of it isn't very high on the list. Rather, most priorities are ordered by the problems that they solve or the value that they bring. Moving to Office 365 is no different. If you find yourself looking at making the move, hopefully, you have a list of perceived value it will bring.


Sitting down with business leaders is a great starting point. Ask them what pain points, if any, they have. Maybe it is a lack of availability. Do they always need to be using a corporate-issued laptop to access email or work documents? Would SharePoint Online solve that? Another common complaint that I have seen is older software. Sure, 90% of the features from Word 2003 are the same in Word 2016, but that doesn't mean everyone wants to use 13-year-old technology. In some cases, it can even be about perception. If a salesperson shows up to close a big deal and they are running Office 2003, how would that look? They certainly would not come off as a cutting-edge company. Subscription-based licenses from Office 365 can solve this and ease the burden of managing spreadsheets full of license info for IT departments.




You've decided that the move makes sense. Great! What challenges do you foresee? This step is critical, as there is almost always some cost associated with it. It might be soft costs, such as time from your salaried IT department. Or it might be hard costs; maybe you are planning a hybrid installation and will need to pay for additional bandwidth.


How about regulations? Do you need to make sure some data stays on-premises (e.g., financial data), or is it all safe to move to SharePoint? If the former, how do you plan to track and enforce compliance? There are tools built into Office 365 for compliance and security, but will it be a challenge to get IT staff trained on them?


Another common challenge is user training. Lots of options exist for this, ranging from Microsoft-provided materials to possibly doing lunch and learn sessions with groups of employees over time. As most folks who have help desk experience in IT know, sometimes a small number of users can account for the majority of support time.




Now that you know what value you will be gaining, and the potential challenges, you need to do some math. Ideally, you can account for everything (monthly costs, infrastructure upgrades, lost productivity, etc.). Even better if you can assign dollar figures to it. Once you have that, the decision should become easier. Are you saving money? If so, how long will it take to reach your ROI? Are you going to end up spending more on a monthly basis now? Is the value worth it? Maybe your sales staff will be more agile and close more deals, increasing revenue and profit.
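To make the math concrete, here is a minimal back-of-the-envelope sketch in Python. Every figure in it is an invented placeholder rather than a real quote; swap in your own one-time costs, subscription fees, and estimated savings.

```python
# Break-even sketch for a cloud migration. All numbers are placeholders.
one_time_costs = 25_000.00        # migration labor, bandwidth upgrades, training
monthly_subscription = 3_500.00   # Office 365 licensing per month
monthly_savings = 5_200.00        # retired servers, reduced license tracking, downtime avoided

net_monthly_benefit = monthly_savings - monthly_subscription

if net_monthly_benefit <= 0:
    print("No break-even point: recurring costs exceed recurring savings.")
else:
    months_to_roi = one_time_costs / net_monthly_benefit
    print(f"Estimated break-even: {months_to_roi:.1f} months")
```

If the recurring numbers come out negative, that is your cue to weigh the softer benefits (agility, perception, user satisfaction) explicitly rather than assume they cover the gap.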


This is by no means a comprehensive list for such a big project, but it should be a good starting point. Do you have any tips to share? Maybe you've run into some unexpected issues along the way and can share your experiences. Feel free to leave comments below!


This past Sunday was Data Privacy Day. I'm guessing most of you didn't know such a day existed. It would seem that the first rule of Data Privacy Day is we don't talk about Data Privacy Day. Here's hoping we can get those rules changed before next year.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


Hating Gerrymandering Is Easy. Fixing It Is Harder.

Because I enjoy when data is used to help explain the problem and to explain why there is no perfect solution.


Thieves Jackpot ATMs With ‘Black Box’ Attack

ATMs aren’t known for being up to date. I’m surprised that this exploit isn’t more widespread.


Windows 10 can now show you all the data it’s sending back to Microsoft

Just in time for Data Privacy Day, Microsoft is letting users know details about the telemetry they collect.


Tracking app Strava reveals highly sensitive information about U.S. soldiers’ location

Also just in time for Data Privacy Day, Strava shows that neither they nor their users understand the dangers involved with something as innocent as tracking a jogging route.


Cybersecurity should be a boardroom topic, so why isn’t it?

Money, mostly.


Personal Data Representatives: An Idea

I think if the author mentioned the word “blockchain” in this article he would already have funding for this wonderful idea that I do hope happens in my lifetime.


Burger King Trolled Customers to Perfectly Explain Net Neutrality

In case you haven’t seen this video yet, enjoy.


This is how I celebrate Data Privacy Day, by drilling holes into my old hard drives:

By Paul Parker, SolarWinds Federal & National Government Chief Technologist


Security is always an important topic with our government customers. Here's an applicable article from my colleague, Joe Kim, in which he offers some tips on compliance.


Ensuring that an agency complies with all of the various standards can be a job in itself. The best strategy is to attack the challenge on three fronts. First, proactively and continuously monitor and assess network configurations to help ensure that they remain in compliance with government standards. Second, be able to report on compliance status at any given time. And third, beef up the network with rock-solid security and be prepared to quickly remediate potential issues as they arise.


Automate network configurations


One of the things agencies should do to remain in compliance with the RMF, DISA STIGs, and FISMA is monitor and manage their network configuration status. Automating network configuration management processes can make it much easier to comply with key government mandates. Device configurations should be backed up and restored automatically, and alerts should be set up to advise administrators whenever an unauthorized change occurs.
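As a rough illustration of what that automation can look like, here is a short Python sketch that backs up a running configuration and flags any difference from the previous backup. It assumes the open-source Netmiko library and a Cisco IOS device; the hostname, credentials, and file paths are placeholders, and a production version would pull credentials from a vault and raise alerts through your monitoring or compliance tooling rather than printing.

```python
import difflib
from pathlib import Path

from netmiko import ConnectHandler  # pip install netmiko

DEVICE = {
    "device_type": "cisco_ios",
    "host": "10.0.0.1",          # placeholder management address
    "username": "backup-svc",    # placeholder service account
    "password": "changeme",      # placeholder; use a secrets store in practice
}
BACKUP = Path("backups/10.0.0.1.cfg")


def fetch_running_config() -> str:
    conn = ConnectHandler(**DEVICE)
    try:
        return conn.send_command("show running-config")
    finally:
        conn.disconnect()


def check_for_changes() -> None:
    current = fetch_running_config()
    if BACKUP.exists():
        diff = list(difflib.unified_diff(
            BACKUP.read_text().splitlines(),
            current.splitlines(),
            fromfile="last-backup", tofile="current", lineterm=""))
        if diff:
            # In production, send an alert (email, syslog, ticket) instead of printing.
            print("Configuration change detected since last authorized backup:")
            print("\n".join(diff))
    BACKUP.parent.mkdir(parents=True, exist_ok=True)
    BACKUP.write_text(current)


if __name__ == "__main__":
    check_for_changes()
```

Scheduled to run after each change window, a job like this gives you both the automatic backup and the unauthorized-change alert described above.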


Be on top of reporting


Maintaining compliance involves a great deal of tracking and reporting. For example, one of the steps in the RMF focuses on monitoring the security state of the system and continually tracking changes that may impact security controls. Likewise, FISMA calls for extensive documentation and reporting at regular intervals, along with occasional onsite audits. Thus, it is important that agencies have easily consumable and verifiable information at the ready.


The reporting process should incorporate industry standards that document virtually every phase of network management that could impact an agency’s good standing. These reports should include details on configuration changes, policy compliance, security, and more. They should be easily readable, shareable, and exportable, and include all relevant details to show that an agency remains in compliance with government standards.


Catch suspicious activity and automate patches


Agency IT administrators should also incorporate security information and event management (SIEM) to strengthen their security postures. Like a watchdog, a SIEM watches for suspicious activity and alerts when a potentially malicious threat is detected. The system can automatically respond to the threat in an appropriate manner, whether that is by blocking an IP address or a specific user, or stopping services. Remediation can be instantaneous and performed in real time, thereby inhibiting potential hazards before they can inflict damage.
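To show the shape of such an automated response, here is a small, purely illustrative Python sketch that blocks a suspicious source address with a Windows Firewall rule. The alert format and severity threshold are invented for the example; in practice the SIEM (or its orchestration layer) would supply the alert, enforce approval workflows, and log the action.

```python
import ipaddress
import subprocess


def block_ip(source_ip: str) -> None:
    ipaddress.ip_address(source_ip)  # validate before shelling out
    subprocess.run(
        ["netsh", "advfirewall", "firewall", "add", "rule",
         f"name=SIEM block {source_ip}",
         "dir=in", "action=block", f"remoteip={source_ip}"],
        check=True)


def handle_alert(alert: dict) -> None:
    # Hypothetical alert shape: {"rule": ..., "source_ip": ..., "severity": 0-10}
    if alert.get("severity", 0) >= 8 and "source_ip" in alert:
        block_ip(alert["source_ip"])


if __name__ == "__main__":
    handle_alert({"rule": "brute-force-logon", "source_ip": "203.0.113.45", "severity": 9})
```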


Implementing automated patch management is another great way to make sure that network technologies remain available, safe, and up to date. Agencies must stay on top of their patch management to combat threats and help maintain compliance. The best way to do this is to manage patches from a centralized dashboard that shows potential vulnerabilities and allows fixes to be quickly applied across the network.


Following the guidelines set forth by DISA®, NIST®, and other government acronyms can be a tricky and complicated process, but it does not have to be that way. By implementing and adhering to these recommended procedures, government IT professionals can wade through the alphabet soup while staying within these guidelines and upping their security game.


Find the full article on our partner DLT’s blog Technically Speaking.

Let's first establish what GOAT stands for. In this instance, it stands for the greatest of all time. The GOAT Doctrine states that IT professionals, regardless of experience and expertise, should always be moving forward. Such a concise, vague, and all-encompassing statement, yet so profound in its meaning for a true professional.


The only constant in IT and technology is change. That change sometimes seems circular as old technology morphs into new technology that transforms back into a reincarnation of old technology. Other times, this change is reflected as people flow in and out of our professional purview. This change flows in with more velocity, volume, and variety. What can you do to survive and thrive in your IT career?


Simple. Aim to be the GOAT you. GOATs continue to adapt, evolve, and most importantly, move forward. They move forward to gain new experience and expertise while leveraging the best of what they already have. To be a GOAT, you have to beat the GOAT. That GOAT is the past you. So take up that challenge in this New Year and become the better GOAT version of yourself.


In the comment section below, share your GOAT stories about how you power-leveled yourself and became a better GOAT you.

Flood of Kobolds vs. Epic Boss Fight

I had a group where I was running a dragon-themed adventure.  Before you gasp and say mockingly, “In Dungeons & Dragons, there are actually dragons?” let me tell you that that was just thematic (as far as my players knew).


Part of this was whipping the Kobolds into a frenzy because they were acting on the orders of their dragon overlord.  For those unaware, Kobolds are the cannon fodder of the D&D world.  The party was repeatedly beleaguered by hordes of them in staged battles (archers, then infantry, then mages).  Long story short, each battle felt like a big fight, with the difficulty ramping up each time.  Unfortunately for my players, these were just the opening volleys.

The big battle was against a young dragon.  Don’t let the word “young” distract you from the word “dragon.”  This wasn’t an easy fight.  Sadly, it was nearly a total party kill (TPK) for the group because they got overconfident based on their previous encounters.  Confidence is great – overconfidence less so.

I’ve seen the same thing in IT.  Technicians get cocky thinking that they know everything and bite off more than they can chew.  This is typically something that green IT people do, but they don’t have exclusive rights to this failing.  The thought that “in a perfect world it will go just like this” falls on its face when you realize that there’s no such thing as a perfect world.  New IT people sometimes don’t have the good sense to think, “maybe this isn’t just one-degree harder than the previous thing I did.”  If there’s a lesson to be learned here, it’s that you cannot win every fight with the same skills you already have.  Learn new ones and evaluate your scenarios before rushing in.


We get to the final question here, “Has running a D&D game actually provided me with any life skills?”  The answer is a resounding “Yes!”  So much so, that I would have no issue putting a few things on my professional resume.

  • Met with peers for scheduled creativity and conflict resolution exercises
  • Assisted multiple people with gathering experience for both character and skill development
  • Learned to quickly assess situations and collaborate on the best solutions

Can you have too much of a good thing? Maybe not, but you can certainly have too much of the wrong thing. In my first blog, I introduced the idea that Microsoft event logging from workstations can be a simple first step to building a security policy that looks beyond the perimeter. The simplicity comes from the fact that event logging is part of a workstation's OS, so there's no need to acquire additional applications or agents.


As many of you commented, the more difficult part is filtering out the stuff you don’t need. Fortunately, this is also the fun part.


By creating focused filters and sets of important events, the admin is required to understand what different event IDs mean and how they relate to the overall security of the organization. Being proactive and reactive to threats requires knowledge and at times creativity. You also need to monitor and tune your workstation event log policies to ensure they are providing the appropriate level of coverage whilst not overwhelming a system from a performance standpoint.


So what are the must-haves when it comes to event logs?  Here are some of the critical ones for a workstation (non-server) policy.


  • NEW PROCESS STARTING: 4688 - logs when a process or executable starts.
  • USER LOGON SUCCESS: 4624 - tracks successful user logons to the system.
  • LOGON FAILURES: 4625 - can be used to detect possible brute force attacks.
  • SHARE ACCESSED: 5140 - helps track user access to file shares.
  • NEW SERVICE INSTALLED: 7045 - captures when a new service is installed.
  • SERVICE STATE CHANGES: 7040 - can indicate a service has been modified by a hacker.
  • NETWORK CONNECTION MADE: 5156 - works with Windows Firewall to log the state of a network connection (source, destination, ports, and process used).
  • FILE AUDITING: 4663 - logs an event when a file is added, modified, or deleted.
  • REGISTRY AUDITING: 4657 - captures when a registry item is added, modified, or deleted.
  • WINDOWS POWERSHELL COMMAND LINE EXECUTION: 500 - captures when PowerShell is executed and logs the command line.
  • WINDOWS FIREWALL CHANGES: 2004 and 2005 - capture when new firewall rules are added or modified.
  • SCHEDULED TASK ADDED: 106 - captures when a new scheduled task is added.
  • USER RIGHT ASSIGNMENT: 4704 - documents a change to user right assignments on the computer and will show privilege escalations.
  • USER ACCOUNT CREATED: 4720 - tracks accounts created on the local machine.
  • USER ACCOUNT DELETED: 4726 - alerts if a local account is removed.
  • USER ACCOUNT LOCKOUT: 4740 - reports accounts locked due to failed password attempts.
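As a quick way to see whether these events are actually being generated, here is a minimal Python sketch that spot-checks a workstation using the built-in wevtutil tool. It needs to run locally with administrative rights, and the event IDs and result count are just examples to adjust to your own policy.

```python
import subprocess

SECURITY_IDS = [4688, 4624, 4625, 4720, 4726, 4740]  # a subset of the list above


def query_security_events(event_ids, count=10) -> str:
    xpath = " or ".join(f"EventID={i}" for i in event_ids)
    result = subprocess.run(
        ["wevtutil", "qe", "Security",
         f"/q:*[System[({xpath})]]",
         f"/c:{count}", "/rd:true", "/f:text"],
        capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    print(query_security_events(SECURITY_IDS))
```

If a query like this comes back empty for events you expect (process creation, for example), that usually means the corresponding audit policy has not been enabled on the workstation.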


The threat landscape is always changing. Attacks have a lifecycle, and in the end they are either remediated by our antivirus and anti-malware products, or they morph into the next big thing.

Once you have an event policy in place, it’s important to monitor the latest security issues in order to assess the applicability of your policy.


A few points to consider are:

  • Does your event set still help detect and remediate attacks?
  • Has there been a change to endpoint configuration (a new agent for example) that may be creating duplicate event logs?
  • As the OS changes over time, are new events available that should be introduced?
  • Should I tune my parameters or deprecate event IDs that are no longer relevant to avoid taking up space?
  • Are there compliance mandates that dictate a change in policy?


The moral of the story is, when it comes to security, you are never really done.


In my next post, we’ll take a look at building an overall strategy for creating filters and deploying your event log policies. We'll also identify some events that are worth tracking relating to specific applications and services.


One great reference for Windows Event IDs:

Short-Circuiting your Adventure

I’ve noted that preparation and planning are some of the hallmarks of both a good DM and IT professional, but sometimes planning gets short-circuited.  In the past, I’ve spent hours working on a campaign with intricate details, a slowly building storyline, interesting character interactions, and a clear path from point A to B to C and so on.

Then we sit down to play this epic tale, and the players choose to follow the white rabbit instead of the obvious path in front of them and jump immediately to point J.  That’s it.  A small change and they’ve completely gone off course.  In the past, I’ve had this go one of two ways: the group takes a petty diversion and makes it more than it’s supposed to be or they’ve jumped ahead in the story so much that I don’t have anything else planned.  As the DM, you can either throw up your hands and walk away, force them back on the “right” path, or see where the adventure leads.

One of those avenues reminds me of scope creep in IT Projects.  You’ve outlined everything in a beautiful waterfall plan and the team starts taking side-trips and tacking on new requirements.  The adventure of discovery is upon you!  Keep going!  Let’s see where this will end up.  Will planning to only swap out a power supply lead to the adventure of replacing all the UPS’s in a data center?  Who knows?  Mystery abounds.

The clear answer here is “within reason.”  You’ve planned one change and now people are adding more and more to that change request.  Where do you draw the line?  I can’t answer that for you, because it’s different for every scenario, but you should be open to change.  Embrace it where you can and push it off where you cannot.

The flip side of this scope creep is having your entire plan thrown out and being asked to “skip to the end.”  Just like in D&D, this leaves you asking, “what’s next?” and not having a clue.  Maybe you’ve only planned for five steps and you need to jump right to step six.  Can you plan for this?  Probably not.  About the only thing you can do is plan for the unexpected.

Looking back, the same solution presents itself for each problem: plan for the unexpected.  This applies to the players, the DM, and the IT Professional.  There’s always going to be another tree in the forest, and there might be a goblin archer behind one.  Plan for these diversions, but don’t let them pull you from your goal.

Business services and infrastructure services have divergent interests and requirements: business services do not focus on IT. They may leverage IT, but their role is to be a core enabler for the organization to execute on its business strategy, i.e., delivering tangible business outcomes to internal or external customers that help the organization move forward. One business service could focus on the timely analysis and delivery of market data within an organization to drive its strategy; another could allow external customers to make online purchases.


Infrastructure services will instead focus on providing and managing a stable and resilient infrastructure platform to run workloads. It will not necessarily matter to the organization whether these are running on-premises or off-premises. What the organization's leadership expects from infrastructure services (i.e., IT) is to ensure business services can leverage the infrastructure to execute whatever is needed without any performance or stability impact.

Considering that the audience is very familiar with infrastructure services, we will focus the discussion here on what business services are and what makes them so sensitive to any IT outages or performance degradation.


Business services, while seemingly independent, are very often interconnected with other organization IT systems, and sometimes even with third-party interfaces.  A business service can thus be seen (from an IT perspective and at an abstract level) as a collection of systems expecting inputs from either humans or other sources of information, performing processing activities and delivering outputs (once again either to humans or to other sources of information).


One of the challenges with business services lies in the partitioning of their software components: not everybody may know the “big picture” of what components are required to make the entire service or process work. Within the business service, there will usually be a handful of individuals who’ve been around long enough to know the big picture, but this may not always be properly documented. Failing to document the upstream and downstream dependencies of an entire business service, whether through impossibility, inability, or simple lack of awareness, is often the culprit behind extended downtime and laborious investigation and recovery activities.


In the author’s view, one of the ways to properly map the dependencies of a given business service is to perform a Business Impact Analysis (BIA) exercise. The BIA is interesting because it covers the business service from a strictly business perspective: what is the financial and reputational impact, how much money would be lost, what happens to employees, and will the organization be fined or, even worse, have its business license revoked?


Beyond these questions, it also drills down to identify all of the dependencies that are required to make the business service run. These might be the availability of infrastructure services and qualified service personnel, but also the availability of upstream sources, such as data streams, that are necessary for the business service to execute its processes. Finally, the BIA also looks at the broader picture: if a location is lost because of a major disaster, it may no longer make sense to "care" about a given business service or process, because priorities have now shifted elsewhere.


Depending on the size of the organization, its business focus and the variety of business services it delivers, the ability to map dependencies will greatly vary. Smaller organizations that operate in a single industry vertical might have a simplified business services structure and thus a simpler underlying services map, coupled with easier processes. Larger organizations, and especially regulated ones (think of the financial or pharmaceutical sectors, for example), will have much more complex structures which impact business services.


Keeping in mind the focus is on business services in the context of upstream/downstream dependencies, complexities can be induced by the following:

  • organizational structure (local sites vs. headquarters)
  • regulatory requirements (business processes must account for the obligation to provide outputs to a regulatory body)
  • environmental requirements - production processes depend on external factors (temperature/humidity, quality grade of raw materials, etc.)
  • availability of upstream data sources and dependency on other processes (inability to invest if market data is not available, inability to manufacture drugs if environmental info is missing, inability to reconcile transaction settlements, etc.)


In these complex environments, the cause of a disruption to a business service may not be immediately evident, so an adequate service map will help, especially in the context of a BIA. Needless to say, it may not always be a walk in the park to get this done, especially if the key people in the organization who were the only ones to understand the full context are gone. It can be even worse in the case of a disaster or an unfortunate life incident (the author has experienced this in at least two organizations).
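To make the idea of a service map a little more tangible, here is a deliberately simple Python sketch. The service names and dependencies are invented, and a real BIA would capture far more (owners, recovery objectives, non-IT inputs), but it shows that once upstream dependencies are written down, finding everything affected by an outage becomes a mechanical graph walk rather than tribal knowledge.

```python
from collections import deque

# Each service mapped to the upstream services it depends on (invented example).
DEPENDS_ON = {
    "online-purchases":   ["web-frontend", "payment-gateway", "inventory-db"],
    "market-data-report": ["market-data-feed", "analytics-cluster"],
    "web-frontend":       ["load-balancer"],
    "inventory-db":       ["storage-array"],
    "analytics-cluster":  ["storage-array"],
}


def impacted_by(failed_service):
    """Return every service that directly or indirectly depends on failed_service."""
    consumers = {}
    for svc, upstreams in DEPENDS_ON.items():
        for up in upstreams:
            consumers.setdefault(up, []).append(svc)

    impacted, queue = set(), deque([failed_service])
    while queue:
        for svc in consumers.get(queue.popleft(), []):
            if svc not in impacted:
                impacted.add(svc)
                queue.append(svc)
    return impacted


if __name__ == "__main__":
    # Losing the storage array ripples up to both business services.
    print(impacted_by("storage-array"))
```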


What about IT / infrastructure services, and how can they help with the challenges of business services? It would be wrong to assume that IT is the panacea to all problems and the all-seeing-eye of an organization. There is however a tendency to assume that because business services execute on top of infrastructure services, IT has an all-encompassing view of which application servers are interacting with which databases, and this leads organizations to believe that only IT can fully map a business service.


The belief holds partially true: IT organizations that leverage advanced monitoring solutions are able to map the majority of infrastructure/application dependencies and view traffic flows between systems. In our view, these solutions should always be leveraged because they drastically improve the MTTR (Mean Time To Resolution) of an incident. Nevertheless, in the context of a BIA and of the business view of services, we believe that while IT should definitely be a contributor to business service mapping, it should not be the owner of such plans. The full view of business services requires the organization not only to incorporate IT’s inputs, but also to gather the entire process flow for any given business process, to understand which inputs are required and which outputs are provided, as those may not always end in a handshake with an IT infrastructure service process.

Had a great time in Austin last week despite the ice storm. It's been a while since I've had to scrape ice from a windshield with my bare hands. I'm glad that my New England survival training skills are still intact. 


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


Google Memory Loss

Just in case you were worried about the loss of net neutrality leading to censorship, here's a little piece to remind you that censorship comes in many forms. Google doesn't care about helping you find data; they only care about getting the right ads delivered to your browser.


Google Cloud is Adding 5 New Data Centers, Rolling Out 3 New Subsea Cables

Am I the only one that read the title and thought "Google has a cloud"? OK, I *know* they have a cloud, and I like seeing that they are investing in expanding their reach. But this article makes it clear that Google is far behind AWS and Azure, and that it is not likely we are going to see anyone make the necessary investment to compete with those three (no, not even Oracle).


Internal documents reveal that Whole Foods is leaving some shelves empty on purpose

I guess this is what happens when someone that runs a bookstore thinks they can run a grocery store, too.


Jason's Deli: Hackers Dine Out on 2 Million Payment Cards

Because it's been a while since I reminded you that everything is awful, and that corned beef and swiss cost you more than you thought.


Intel Confirms Fresh Spectre, Meltdown Patch Problems

Everything Is Awful™


Throne AI is the Kaggle of Sports Predictions

OK, maybe it's just me, but I think this idea is fantastic. It reminds me of how my interest in Fantasy Football 20+ years ago paved the way for me to take a deep interest in data, databases, web development, and applications. The gamification of predictive models is brilliant, because right now there is someone out there just starting out with machine learning who will use this as the starting point for their career. And I think that's awesome.


Social media is making you miserable. Here's how to delete your accounts.

Here's a list to help remove yourself from the major social media websites. In case, you know, you've thought about doing that once or twice already.


Dear Austin Hotel: I appreciate the effort with the sand last week, but this isn't how you spread sand, and beach sand isn't as helpful as you might think:


This is the second post in a series that started with Part the First.

CRIT! Unplanned Good/Bad

Randomness is a part of life.  This has widespread areas of effect: from human interactions to chaos theory.  To make Dungeons & Dragons more lifelike, it needs to inherit this randomness from real life.  This is where the dice come in.  The one that’s used most often is the 20-sided die (referred to as a d20).

On any roll of a d20, you have a 5% chance of getting any particular number.  The best and worst rolls are a 20 and a 1, respectively.  Anyone has a 5% chance of hitting either result (thanks, math!).  In D&D these get special connotations: they are called Critical Rolls.  So, a 20 is a Critical Hit and a 1 is a Critical Fail.

Critical Hits in IT

Sometimes you or your team just hit something out of the park.  Yes, I’m mixing metaphors here, but you get the point.  You didn’t plan for this task to go as well as it did.  You’ve moved 1,000 mailboxes during the night with no downtime.  Your team executed a zero-downtime upgrade of a SQL cluster.  Your coworker applied configurations to 200 WAN routers with no blips.  This is a Critical Hit!

On the chance that you’ve encountered one of these rare events, be sure to celebrate.  Do a little dance, take everyone out for a lunch, let your management team know about it, whatever you do to mark accomplishments.

Then again, there’s the other side of the coin... er, die.

Critical Fails in IT

I hate to say it, but in Information Technology, it always feels like there’s more probability of getting a 1 on a d20.  I stated earlier that in the game, you have a 5% chance to get a critical failure, but real-life IT probability feels skewed towards failure instead of success.

In the game, when you have a critical failure, whatever you are trying to do fails… epically.  You go to push a troll off a bridge and instead you lightly caress his shoulder.  You both feel awkward.  This is an epic failure, and the scope of it is bound to DM discretion.

In IT, epic failures take different forms.  They can be something as simple as turning off the wrong port on a switch or as great as crashing a mail server.  An IT department doesn’t have a DM to choose what happens when things go wrong.  Instead we rely on our own knowledge and the experience of others to help guide us on how to proceed.  Quick thinking and decisive action are key parts to following up after a failure, but the best thing you can do is communicate.

Communicate with the department and the affected parties.  Clean, clear communication of the issue and your plans for recovery is the first, best thing you can do after an epic failure.  Here’s where every member of IT gets to be their own DM.  It’s up to you to decide the next move.  Make it a good one with transparency.

Hybrid IT can create issues for IT operations. Here's an article from my colleague, Mav Turner, that suggests ways to keep things running smoothly.


Trying to find the root cause of IT problems can often feel like looking for a needle in a haystack. Worse, there are often multiple haystacks, and sometimes the haystack you need to search is located on a completely different farm.


Issues may exist on-premises in complex application stacks, or they could exist far away, somewhere in the cloud. Without visibility into all aspects of the network, it can be very difficult to tell where the problem lies, and finding those proverbial needles can be nearly impossible.


Today, many federal network managers do not have the ability to continuously monitor both on- and off-site environments. Tools used to monitor what happens on-premises will not necessarily pick up all of the interactions throughout hybrid infrastructures.


A hybrid IT canvas


Today’s federal IT managers need a network view that is broad and expansive and offers visibility into resources and applications, including virtualization, storage, applications, servers, cloud and internet providers, users, and more. Managers must be able to see, correlate, and understand the data being collected from all of these resources and share it with their colleagues.


Network managers must also deploy methods that allow them to compare data types side by side to more easily identify the cause of potential issues. Timelines can be laid on top of this information to further identify the cause of slowdowns or outages. For instance, a manager who is alerted to a non-responding application at 11:15 a.m. can review the disparate data streams and look for warning signs in those streams around the time that the issue first occurred. Managers can share these dashboards with their teams to get everyone on the same page and verify that the problems are resolved quickly.


Dependency mapping can be critical in complex environments where one application depends on another. Unlike traditional IT, dependencies are highly dynamic in a cloud environment. Databases can move around, and containers can pop up and disappear. Being able to quickly and automatically identify dependencies and the impact that events can have on connected resources—whether on-premises or hosted—can save precious problem-solving time.


A window into the future


Reacting to an incident is usually more time- and resource-intensive than preventing the problem in the first place. It’s far better to use predictive analytics to avoid the issues altogether. By collecting and analyzing all of the aforementioned network and systems data, federal IT managers can better predict when capacity problems or failures may happen and take steps to mitigate issues before they occur. Based on trends, anomalous patterns, and other algorithms, managers can be alerted prior to an event, receive insight into its potential impact, and get advice on how best to react.


For example, at some point in the past, an agency may have experienced an issue with CPU and memory being oversubscribed on a set of virtual machines. If the events that led up to that issue recur, the manager can receive recommendations on how to address the problem before it becomes a real concern. Those recommendations could include relieving memory problems or high CPU usage by moving a VM from one host to another, allowing IT managers to optimize workloads and avert problems.
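As a toy illustration of that kind of trend-based alerting, the following Python sketch fits a straight line to recent CPU samples and estimates when the host would cross a threshold. The samples and threshold are made up, and a real monitoring platform would use far richer models, but the principle of projecting forward from observed trends is the same.

```python
import numpy as np


def hours_until_threshold(samples, threshold=90.0):
    """samples: hourly CPU utilization (%) readings, oldest first."""
    hours = np.arange(len(samples))
    slope, intercept = np.polyfit(hours, samples, 1)  # simple linear trend
    if slope <= 0:
        return None  # flat or improving; no breach projected
    crossing = (threshold - intercept) / slope
    return max(crossing - (len(samples) - 1), 0.0)


if __name__ == "__main__":
    recent_cpu = [52, 55, 61, 63, 68, 71, 75]  # hypothetical hourly averages
    eta = hours_until_threshold(recent_cpu)
    if eta is not None:
        print(f"Projected to exceed 90% CPU in roughly {eta:.1f} hours")
```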


A network that runs smoothly


One of the primary jobs of any federal IT manager is to keep their network running smoothly so the user experience does not degrade. Sometimes that involves sorting through increasingly complex hybrid IT environments to find that one little needle. Managers must discover and implement new ways to gain complete network and system visibility and continuously monitor all of their resources.


Find the full article on Government Computer News.

So, here’s the first confession: I’m an über nerd.  I’ve been playing Dungeons & Dragons (D&D) since I was about 16 years old.  It was also about that time (possibly coincidentally) that I became engrossed with computers.  Not just watching my fake family die from dysentery, cholera, snake bites, or drowning, but actually how they worked and why this was important.

In High School, I had a job with a few good friends and still had some free time.  What’s a teenager to do with some free time, friends, evenings off, while still maintaining a clean criminal record?  We decided to give this Dungeons & Dragons thing a try.

Defining Roles

Part of the deal with a new D&D group is punishing someone... er, deciding who will be the Dungeon Master (the “DM”).  Where everyone else gets to read parts of one book, the DM gets to read that whole book and two others.  Your job as DM is to provide the game bounds and guide the characters on their adventures. Done well, it’s like collective storytelling with some randomness added.

Looking back, the parallels between working in Information Technology and running a D&D campaign are striking.

Party Balance

Confidence and a little bit of showmanship are key attributes of a DM.  You are the center of attention most of the time as you weave the story and outline possible paths.  Understanding and enforcing the rules is important, but even more vital is keeping the players working together.  This can be especially difficult based on their experience with the game, the varying personality types, and the jobs or alignments they’ve chosen for their characters.

Thinking back, this isn’t any different than working with an IT team.  There are going to be people who know more than others, each will have different skills, and their personalities can be just as varied.  Leading a team like that can be frustrating, but also incredibly rewarding.  The different skills create a great “party balance” and the different levels of experience offer multiple perspectives when solving a problem.

Come to think of it, creative problem solving is also something that crosses the boundaries of Dungeons & Dragons and Information Technology.  Given a situation (dragon in a tower or storage array tray failure) you need to work together to try and find a solution.


Some would argue that being a Dungeon Master is only as difficult as taking the time to do the reading and planning.  Sure, you can pick up a pre-written module, but you still need to read through it and prepare.  Planning is key as it keeps the process (or story) moving forward.  Keeping tabs on various players, settings, and key items is just as important as getting updates on teammates, inventory, and project status.

My experiences in both IT and D&D have influenced each other in more ways than I can count.  In a game a few years ago, my wife was playing as an elven bard.  Basically, her role in the party was to poke fun at the other players and keep things moving forward.  If the players were sitting on their hands debating opening a door, she would kick it in and keep the game moving forward.  Whether she knew it or not, she was acting as my Project Manager – keeping the process moving.

Check back for part 2 soon.


It's time for another edition of Leon's Log, where either I preview a trip I'm about to take, or summarize one I've just been on. My goal is to help those whose budgets don't allow them to attend these conventions to get at least a few insights into what was shown; and for those who are considering going, get a window into the value of the event.


Taking a break from Berlin (where it's been held the last two years), CiscoLive Europe (or #CLEUR, as you'll see it mentioned on Twitter and elsewhere) will be in Barcelona, Spain this year, running from Sunday (yes Sunday) 1/28 through Friday 2/2.


Barcelona in February is hardly an oasis, but staring out my office window at the snow-covered streets of Cleveland (Jenne Barbour now insists I live on Hoth), the average Spanish weather of 5°-14°C  (about 41° - 57° F) is definitely a step in the right direction!


I mentioned the convention opens on Sunday this year, with a dedicated set of DevNetExpress classes. My travel plans don't allow me to make it for that, and I'm definitely going to miss it. Despite that, I am hoping to hit at least one programming session because everyone looks like they're having so much fun every time I go into the DevNet Zone.


One of the things I've been lax about keeping up to date with is Cisco's SDA and SD-WAN strategies. I feel like this is the year I should hit some of the sessions on this. I've even been invited to swing by the dCloud booth, crash on the dCloud couch, and get a demo of DNA Center and Viptela.


Another thing I'm looking at is the keynotes and showcases, to see if I can suss out any major themes.

  • The keynote will be given by Rowan Trollope, SVP and GM, IoT and Applications
  • Innovation Showcase titles include:
    • Unlock the Power of Data
    • Reinvent Networking
    • Changing the Security Equation
    • Delivering Intent for Data Center Networking
    • Enabling a Multicloud World
    • Emerging Technologies are Game-Changers for Technology Services
    • Rise of the Network APIs
    • The Rise of the Team: Speeding up Work in the Disruptive Economy
    • Transformation Through Innovation
    • Unlock the Value of IoT Data


I'm pretty sure there are some underlying messages, right?


While I know I said this last time, I'm looking forward to finally FINALLY getting NetVet status this year. I've gotten my email asking to confirm my past attendance, so I'm keeping my fingers crossed that this is my year to get the coveted red ribbon.


I'm going to miss seeing my long-time convention buddy Roddie Hasan (@eiddor), but he had to skip out on this event. Not to worry, we already have plans to catch up at CiscoLive US in Orlando in June.


AND OF COURSE, going to Spain means I have a chance to sample the culture and cuisine. While traditional paella and tapas may not be on my #kosher menu, there are a few restaurants in the city that have options, and I'm planning to share pictures of everything I can sink my teeth into.


Finally, I'm doing something fairly unique for these kinds of trips: You see, my wife was born in Seville while my father-in-law served in the air force. So this is a chance for me to bring her "home" and visit the first house where she laid her head at night. While we're at it, we'll try to take in as much of the country as time will permit.


¡Y también podré practicar mi español!


If you are planning to attend CLEUR, please drop me a line and plan to stop by booth WEP 1A to say hi, talk monitoring, and of course grab some of the usual slate of convention goodies.


On modern enterprise networks, 100% uptime has become table stakes. Most organizations can no longer rely on a single circuit for internet connectivity. We look to carrier circuits for the redundancy and guaranteed uptime that our organizations need. When carrier outages occur, network engineers find themselves in a hot seat they can do little about. However, if we do our homework, we can improve our organizations' uptime by taking care as we provision connectivity.


Causes of carrier outages

Most network engineers have experienced the rapid-fire text messages and flurry of questions when the Internet stops working.  It’s important to understand the upstream causes of these outages so we can work with our carriers to mitigate them. The first, and most common, is the dreaded fiber cut. Regardless of the cause, a fiber cut in the wrong location can have widespread impacts. Second, an upstream provider issue can interrupt service. While less frequent than a fiber cut, these outages can be frustrating because, although your circuits and peerings are healthy, traffic does not flow properly. Third, DDoS attacks, whether directly targeting your organization or another customer on your provider’s network, can have a crippling impact on service availability. 


Managing Around a Fiber Cut

A few different approaches can help mitigate the impacts of a fiber cut.  Your organization can purchase circuit diversity from a single carrier. In this scenario, your carrier will engineer multiple circuits into your facility.  As part of this service, they will evaluate the physical path each circuit follows and ensure the circuits do not ride the same cable or poles. For true diversity, you’ll need to be certain that circuits take different paths into your facility. And, if circuits terminate into a powered cabinet, you must verify the reliability of the power source for that gear. Ask lots of questions and hold your carrier accountable. Be certain that they are contractually obligated to provide diversity and that there are penalties if they fail to do so.  Work with an engineer from your carrier; don’t take the sales rep’s word for it.  A single provider should have a complete view of the physical path for your circuits and be able to guarantee physical diversity.  Unfortunately, however, using a single carrier puts you at higher risk of an upstream configuration or routing failure with that provider.


The Multiple Carrier Route

Instead of ordering path diversity from a single carrier, you can order two circuits from different providers. This option reduces your reliance on a single carrier, but makes it more difficult to ensure full path diversity. You will need to talk to your carriers about sharing the physical path information for the circuits with you or with one another. You’ll still want to be certain the circuits enter the building via a different conduit and terminate into properly powered equipment. If you use different carriers, you will need to pay special attention to your BGP configuration to verify that the path in and out of your network is what you expect.


An Important Note about Grooming

Even if you do everything right — you validate proper path diversity when you order a circuit, you pay special attention to the entrances into your building, and you verify that all vendor equipment is properly powered — things can change. Carriers will periodically groom circuits to change the path they follow through their network.  An industrious provider engineer may see that a circuit follows a less-than-optimal path through their network and then diligently re-engineer it to be more efficient. You will not be notified when the grooming takes place; it will be transparent to you, the customer. The only way to prevent grooming is to communicate clearly with your carrier and ask that they mark circuits that have been carefully engineered for path diversity to prevent them from being groomed.


As with most topics in networking, there are many factors to consider and tradeoffs to be made when ordering connectivity for your organization. You cannot have complete control over carrier-provided connectivity, but you can be diligent throughout the process, communicate the challenges clearly with your leadership, and be clear with your service provider about your expectations and the level of service being provided.

We’ve all heard the saying, "What you see is what you get." Life isn’t quite so simple for those focused on security, as what you don’t see is more likely to be what you get. Luckily, there are ways to gain visibility in places that are often overlooked.


Security policies have always included the protection of key assets such as servers, network infrastructure, and data center and perimeter devices. This approach will always be the first line of defense. And for those who are new to the security space, this is the best place to start.


More recently, security policies have been extended to the user level. The number of endpoint protection solutions has grown markedly over the last few years as security administrators have understood that protection from attacks initiated from inside an organization is critical. These attacks are able to leverage users and their devices because, like it or not, people do download things they shouldn’t, they visit websites they shouldn’t, they share files, they let their kids use their company assets, and they often fall prey to social engineering.


Endpoint Protection (EPP) has existed since the 1980s in the form of virus-scanning clients. Over the years, the EPP landscape has become a battle of the Advanced Endpoint Protection (AEP) products. AEPs are next-gen technology, combining EPP functions, like anti-virus, with endpoint detection and response (EDR) technology providing detection, blocking, and forensic analysis capabilities. In addition, operating systems like Windows provide a selection of endpoint tools that can be enabled out of the box.


In the Microsoft world, implementing an endpoint protection strategy can start with an often overlooked feature: Windows Event Logging. Event logging provides visibility into the activities performed on the workstation by grouping application, security, and system events into a single view. The workstation event console may then be configured to forward a customized set of these events to a log aggregator, such as a domain controller, allowing the administrator to have a consolidated view of the activities on the workstations in the domain. These consolidated events can then be further forwarded to a SIEM and used as an alert trigger (detection of an APT) or to provide contextual value (workstation state for a specific user on a device that attempted a brute force attack on a key server). More on this in a later blog.


To decide if Workstation Event Logs have a place in your overall security strategy, consider these use cases:


  • Access: How secure are the local authentication policies of individual workstations? If an attempt is made to log in to a device using a local credential rather than a domain-controlled account, it will be logged in the workstation event log only.
  • Persistence: Registry changes made by an attacker to provide a foothold into the system that persists across reboots must be tracked.
  • Discovery: Indicators of compromise (IoCs) can be recognized by anomalous actions, for example, events reporting misspelled service names, uncommon service paths, or atypical application crashes due to buffer overflows.
  • Reconnaissance: The running of tools that suggest scanning, recon, or brute force attacks may have been attempted can be logged.
  • Forensics: In the case of a breach, building an event timeline from initial compromise to detection is critical to understanding the extent of the compromise across multiple machines and how to remediate those systems.
  • Behavioral Analysis: Changes in user behavior or inappropriate use of company assets can have both security and legal implications. If certain event types, like failed logins or privilege escalation attempts, begin to occur, or known exploitation tools are installed on a system, this could be a sign of a compromise or a potential issue with an employee.


As with any logging tool, the trick is to create a configuration and deployment strategy. One of the downsides to event collection is that a poorly tuned system can generate far too many events to be useful or even viable.  Admins must identify critical events to collect based on how they impact their environment and have an action plan defined for addressing issues. This ensures an understanding of the context and implications of an event; the rule of thumb is that proactive beats reactive.


If this post has you thinking about workstation logging, future blogs will provide more information about defining your security policy, configuring endpoints, forwarding events to an aggregation device, and making use of logs in SIEMs. Stay tuned.

In Austin this week, filming an episode of SolarWinds Lab. I heard there may be snow in the forecast there. I’m starting to get the sense that winter hates me.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


Google Technique Offers Spectre Vulnerability Fix with No Performance Loss

Considering they spent six months working on this, I’m not surprised. What does surprise me is the comment about how they shared their fix with other companies, including competitors. Maybe we are starting to see the beginnings of cooperation that will result in better security for us all.


Timing of $24 million stock sale by Intel CEO draws fire

Move along, nothing to see here.


Game of Drones – Researchers devised a technique to detect drone surveillance

We aren’t far away from robot armies.


Cortana had a crappy CES

For all of the money that Microsoft spends on marketing the various products and services they have to offer, I am surprised that they didn’t jump at the chance to have Cortana featured at CES.


Power restored to CES show floor after 2-hour blackout

Well, at least Cortana wasn’t to blame. And I think this shows that we are only one 24-hour blackout away from descending into total chaos as a nation.


Meltdown-Spectre: Four things every Windows admin needs to do now

Good checklist to consider, especially the DON’T PANIC part.


The puzzle: Why do scientists typically respond to legitimate scientific criticism in an angry, defensive, closed, non-scientific way? The answer: We’re trained to do this during the process of responding to peer review.

Easily the longest title ever for an Actuator link. Have a read and think about how scientists are very, very human. We are all trained to be defensive; I find this especially true in the field of IT. I’ve certainly seen this happen in meetings and in online forums.


The struggle is real:

By Paul Parker, SolarWinds Federal & National Government Chief Technologist


It’s the time of year when we look toward the future. Here's an interesting article from my colleague, Joe Kim, where he provides a few predictions.


Want a good idea of what’s coming next in federal IT? Look no further than the financial services industry.


Consider the similarities between financial firms and government agencies. Both are highly regulated and strive for greater agility, efficiency, and control of their networks and data. Also, cybersecurity remains a core necessity for organizations in both industries.


Technologies that have become popular in the financial services industry are now making inroads in federal IT. Let’s focus on three of these—blockchain, software-defined networking (SDN), and containers—and explore what they mean for agencies’ network management and security initiatives.




A blockchain is a digital ledger of transactions and ordered records, or blocks. It’s an easily verifiable, distributed database that can be used to keep permanent records of all the transactions that take place over a network.
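
To make the "easily verifiable" part concrete, here is a minimal sketch in Python of a hash-chained ledger. It illustrates only the chaining idea, with made-up transaction data, and leaves out everything a real blockchain adds (distribution, consensus, proof of work).

    import hashlib
    import json
    import time

    def block_hash(block):
        # Hash the block's contents (everything except the stored hash itself).
        payload = json.dumps(
            {k: block[k] for k in ("timestamp", "data", "previous_hash")},
            sort_keys=True,
        ).encode()
        return hashlib.sha256(payload).hexdigest()

    def make_block(data, previous_hash):
        block = {"timestamp": time.time(), "data": data, "previous_hash": previous_hash}
        block["hash"] = block_hash(block)
        return block

    def verify(chain):
        # Every block after the first must reference the previous block's hash,
        # and its stored hash must match a recomputation of its contents.
        for prev, curr in zip(chain, chain[1:]):
            if curr["previous_hash"] != prev["hash"] or curr["hash"] != block_hash(curr):
                return False
        return True

    # Build a tiny three-block ledger with made-up transactions.
    genesis = make_block("ledger opened", previous_hash="0" * 64)
    block_1 = make_block("agency A pays vendor B $100", genesis["hash"])
    block_2 = make_block("vendor B ships 10 units", block_1["hash"])

    print(verify([genesis, block_1, block_2]))   # True
    block_1["data"] = "agency A pays vendor B $1,000,000"
    print(verify([genesis, block_1, block_2]))   # False -- tampering is detectable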


While originally invented to record bitcoin transactions, blockchain can be a powerful tool for better data security. For example, governments are using blockchain to provide services to citizens. There’s even a Congressional Blockchain Caucus dedicated to educating government officials on its benefits.


Blockchain is far from the only solution that agencies should consider, however. Traditional network monitoring, which allows for automated threat detection across the network, and user device monitoring are still the bread and butter of network and data security.




SDN is another technology that many financial services firms and agencies have explored as a means of solidifying network security. SDNs are more pliable and readily adaptable in responding to evolving threat vectors. They also provide network managers with central control of the entire network infrastructure, allowing them to respond more quickly to suspicious activity.


But an SDN is still only as good as its network management protocols, which must be equipped to adequately handle virtual networks. Managers must be able to monitor end-to-end network analytics and performance statistics across the network, which, with SDN, are likely to be very abstract and distributed. Special care must be taken, and the appropriate tools deployed, to help ensure that managers maintain the same amount of network visibility in an SDN as they would have with a traditional network.
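
As a sketch of what end-to-end analytics might look like in practice, the Python below polls a controller's REST API for link statistics and flags hot links. The controller address, endpoint path, response shape, and utilization threshold are all hypothetical placeholders; real SDN controllers expose different APIs and schemas.

    import requests

    # Hypothetical controller address and endpoint -- placeholders only.
    CONTROLLER = "https://sdn-controller.example.local:8443"
    STATS_ENDPOINT = f"{CONTROLLER}/api/v1/links/stats"
    UTILIZATION_ALERT = 0.85  # flag links above 85% utilization

    def poll_link_stats(session):
        resp = session.get(STATS_ENDPOINT, timeout=10)
        resp.raise_for_status()
        # Assumed response shape: [{"link": "leaf1-spine2", "utilization": 0.42}, ...]
        return resp.json()

    def find_hot_links(stats):
        return [s for s in stats if s.get("utilization", 0) >= UTILIZATION_ALERT]

    if __name__ == "__main__":
        with requests.Session() as session:
            session.headers["Authorization"] = "Bearer <token>"  # placeholder credential
            for link in find_hot_links(poll_link_stats(session)):
                print(f"High utilization on {link['link']}: {link['utilization']:.0%}")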




For organizations seeking a more streamlined approach to application development, Linux® containers are like nirvana. Extremely lightweight and highly portable application development environments, containers offer the promise of much shorter development times and substantial cost savings. Because of these benefits, banking giants like Goldman Sachs® and Bank of America® are using containers, and there is growing federal government interest as well.


However, there have been concerns around container security. Because there are many different container platforms available, it is tricky to design a standard security tool that works well with all of them. Containers comprise multiple stacks and layers, each of which must be secured individually. There’s also the inherent character of containers, which, on the surface, seems to run counter to security: they are ephemeral and easily transportable.


Federal developers who are considering using containers need to be aware of these security implications and risks. Although container security has gotten a lot better over the years, agencies should still consider taking steps to secure their containers or use enterprise-hardened container solutions that comply with federal guidelines and recommendations, such as those laid out in the NIST® Application Container Security Guide.
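
As one small illustration of the kind of checks a container security review might automate, here is a sketch that uses the Docker SDK for Python to flag a couple of risky settings. It assumes the docker package is installed and a local Docker daemon is running; the specific checks shown are examples only, not a substitute for the NIST guidance.

    import docker  # pip install docker

    def audit_containers():
        """Flag running containers with settings a security review would question."""
        client = docker.from_env()
        findings = []
        for container in client.containers.list():
            host_config = container.attrs.get("HostConfig", {})
            if host_config.get("Privileged"):
                findings.append((container.name, "running in privileged mode"))
            if host_config.get("NetworkMode") == "host":
                findings.append((container.name, "using the host network namespace"))
        return findings

    if __name__ == "__main__":
        for name, issue in audit_containers():
            print(f"{name}: {issue}")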


We clearly are in the midst of a technological revolution. While financial services and other non-government industries have thus far been the primary torchbearers for this movement, the federal government is now ready to take the lead. With blockchain, SDN, and containers, federal IT professionals have three innovative technologies to use—along with traditional network management practices—to strengthen security and innovation.


Find the full article on our partner DLT’s blog Technically Speaking.

When organizations first take on the challenge of setting up a disaster recovery plan, it’s almost always based on the premise that a complete failure will occur. With that in mind, we take the approach of planning for a complete recovery. We replicate our services and VMs to some sort of secondary site and go through the processes of documenting how to bring them all up again. While this may be the basis of the technical recovery portion of a DR plan, it’s important to take a step back before jumping right into the assumption of having to recover from a complete failure. Disasters come in all shapes, forms, and sizes, and a great DR plan will accommodate as many types of disasters as possible. For example, we wouldn’t use the same “runbook” to recover from simple data loss that we would use to recover from the total devastation of a hurricane. This just wouldn’t make sense. So even before beginning the recovery portions of our disaster recovery plans, we really should focus on the disaster portion.


Classifying Disasters


As mentioned above, the human mind always seems to jump to planning for the worst-case scenario when hearing the words disaster recovery: a building burning down, flooding, etc. What we fail to plan for are the other, less significant disasters, such as a temporary loss of power or loss of building access due to quarantine. So, with that said, let’s begin to classify disasters. For the most part, we can lump disasters into two main categories:


Natural Disasters – These are the most recognized types of disasters. Think of events such as a hurricane, flooding, fire, earthquake, lightning, water damage, etc. When planning for a natural disaster, we can normally operate under the assumption that we will be performing a complete recovery or avoidance scenario at a secondary location.


Man-made Disasters – These are the types of disasters that are less well known to organizations when looking at DR. Think about things such as temporary loss of power, cyberattacks, ransomware, protests, etc. While these intentional and unintentional acts are not as commonly planned for, a good disaster recovery plan will address some of them, as recovering from them is often much different from recovering from a natural disaster.


Once we have classified our disaster into one of these two categories, we can then move on by further drilling down on the disasters. Performing a risk and impact assessment of the disaster scenarios themselves is a great next step. Answers to questions like the ones listed below should be considered when performing our risk assessment, because they allow us to further classify our disasters and, in turn, define expectations and appropriate responses accordingly.


  • Do we still have access to our main premises?
  • Have we lost any data?
  • Has any IT function been depleted or lost?
  • Do we have loss of skill set?


How these questions are answered as they pertain to a disaster can completely change our recovery scenarios. For example, if we have had a fire in the data center and lost data, we would most likely be failing over to another building in a designated amount of time. However, if we had also lost employees in that fire, more specifically IT employees, then the time to recover will certainly be extended, as we would most likely have lost the skill sets and talent needed to execute the DR plan. Another great example comes in the form of ransomware. While we would still have physical access to our main premises, the data loss could be much greater due to widespread encryption from the ransomware itself. If our backups were not air-gapped or separate from our infrastructure, then we may also have encrypted backups, meaning we have lost an IT function, thus provoking a possible failover scenario even with physical access to the building. On the flip side, our risks may not even be technical in nature. What is the impact of losing physical access to our building as the result of protests or chemical spills? Some disasters like this may not even require a recovery process at all, but still pose a threat due to the loss of access to the hardware.
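
To show how those answers might drive a decision, here is a rough Python sketch that maps a scenario's risk-assessment answers to a recovery approach. The scenario data and the mapping rules are purely illustrative assumptions; a real plan would be far more nuanced.

    # Hypothetical assessment of a single disaster scenario; the answers mirror
    # the questions listed above.
    scenario = {
        "name": "ransomware outbreak",
        "category": "man-made",
        "premises_accessible": True,
        "data_lost": True,
        "it_function_lost": True,   # backups encrypted along with production
        "skill_set_lost": False,
    }

    def pick_runbook(s):
        """Very rough mapping from assessment answers to a recovery runbook."""
        if not s["premises_accessible"] and s["data_lost"]:
            return "full failover to secondary site"
        if s["data_lost"] and s["it_function_lost"]:
            return "restore from offline/air-gapped backups, possible partial failover"
        if s["data_lost"]:
            return "point-in-time restore of affected data"
        if not s["premises_accessible"]:
            return "remote operations plan -- no technical recovery required"
        return "monitor and document only"

    print(f"{scenario['name']}: {pick_runbook(scenario)}")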


Disaster recovery is a major undertaking, no matter the size of the company or IT infrastructure, and it can take copious amounts of time and resources to get off the ground. With that said, don’t make the mistake of only planning for those big natural disasters. While they may be a great starting point, it’s best to also list out some of the more common, more probable types of disasters, and document the risks and recovery steps for each in turn. In the end, you are more likely to be battling cyberattacks, power loss, and data corruption than you are to be fighting off a hurricane. The key takeaway is this: classify many different disaster types, document them, and in the end you will have a more robust, more holistic plan you can use when the time comes. I would love to hear from you in regards to your journeys with DR. How do you begin to classify disasters or construct a DR plan? Have you experienced any "uncommon" scenarios which your DR plan has or hasn't addressed? Leave some comments below and let's keep this conversation going.

Back in the saddle this week, feeling rested and ready to get 2018 started. We had quite a few interesting stories last week, too. Never a dull moment in the field of technology.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


A Simple Explanation of the Differences Between Meltdown and Spectre

In case you didn’t hear, our CPUs have been hacked. Well, they could be hacked. We should all be panicking. Or not worried at all. It’s hard to say, really, because there is a lot of misinformation going around right now about Meltdown and Spectre. This article helps clear up a few things. Also, it shows that we’ve now hit a point in time where we create logos for security vulnerabilities. What a time to be alive.


The No Good, Terrible Processor Flaw and SQL Server Deployments – Nearly Everything You Need To Know

Here’s a great summary of how Meltdown and Spectre may affect SQL Server workloads. There’s a lot of FUD being spread about the performance hit from the patches; this article will help you focus on the important details.


All the cool new friends you'll meet when you drink raw water

Here’s the upside to the diseases that raw water can carry: only hipsters inside of Silicon Valley will be affected at first.


The Intolerable Speech Rule: the Paradox of Tolerance for tech companies

Someone needs to get this article in front of Jack Dorsey at Twitter. It’s a simple enough rule that would remove so many of the jerks using their service.


You’re Descended from Royalty and So Is Everybody Else

In case you got one of those DNA kits as a gift this year, I‘m here to ruin the surprise for you. We’re all related to royalty because math.


A practical guide to microchip implants

I don’t think I’m ready for this future yet.


DHS Says 246,000 Employees' Personal Details Were Exposed

The word ‘Security’ is literally in their name; you would think they could get that part right.


I found Heaven here on Earth:


The use of cloud technology and services--especially public cloud--has become nearly ubiquitous; it has made its way into even the most conservative organizations. Some find it challenging to support these services after adoption, but the assumption often goes that supportability resides with the public cloud provider, and that the business unit that decided to leverage public cloud is otherwise on its own. (And, while we’re at it, well done for them, because they didn’t want to use our internal infrastructure or private cloud, if we’re a more advanced organization.)


Sometimes It Isn't Up to IT

But to what extent does this binary (and somewhat logical) vision of things hold true? The old adage that says, "If it has knobs, it’s supported by our internal IT departments" is once again proving to be correct. Even with public cloud, an infrastructure that is (hopefully) managed by a third-party provider, there is very little chance that our organization will exonerate us from the burden of supporting any applications that run in the cloud. The chances are even slimmer that IT can push back on management decisions: they may seem ill-advised from an IT perspective, but they make sense (for better or worse) from a business perspective.


Challenges Ahead

With business units’ entitlement to leverage cloud services comes the question of which public clouds will be used, or rather the probability that multiple cloud providers will be used without any consideration of IT’s ability to support the service. This makes it very difficult for IT to support and monitor the availability of services without having IT operations jump from the monitoring console of cloud provider A to their on-premises solution, and then to cloud provider B’s own pane of glass.


With that comes the question of onboarding IT personnel into each of the public cloud providers' IAM (Identity and Access Management) platforms, and managing different sets of permissions for each of the applications and each of the platforms. This adds heavy and unnecessary management overhead on top of IT’s existing responsibilities.


And finally comes the question of monitoring the off-premises infrastructure with off-premises tools, such as those provided by the public cloud operators. One potential issue, although unlikely, is the unavailability of the off-premises monitoring platform, or a major outage at the public cloud provider. Another issue arises when an internal process relies on an externally hosted application: the off-premises application may report as up and running at the public cloud provider, yet be unreachable from the internal network.


The option of running an off-premises monitoring function exists, but it presents several risks. Beyond the operational risk of being oblivious to what is going on during a network outage or dysfunction (either because access to the off-premises platform is unavailable, or because the off-premises solution cannot see the on-premises infrastructure), there is the more serious and insidious threat of exposing an organization’s entire network and systems topology to a third party. While this may be a minor problem for smaller companies, larger organizations operating in regulated markets may think twice about exposing their assets and will generally favor on-premises solutions.


Getting Cloud Monitoring Right

Cloud monitoring doesn’t differ from traditional on-premises infrastructure monitoring, and shouldn’t constitute a separate discipline. In the context of hybrid IT, where boundaries between on-premises and off-premises infrastructures dissolve to place applications at the crossroads of business and IT interests, there is intrinsic value to be found with on-premises monitoring of cloud-based assets.


A platform-agnostic approach to monitoring on-premises and cloud assets via a unified interface, backed by the consistent naming of metrics and attributes across platforms, will help IT operators instantly understand what is happening, regardless of the infrastructure in which the issue is occurring, and without necessarily having to understand or learn the taxonomy imposed by a given cloud provider.
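
A minimal sketch of what that consistent naming could look like in practice: the Python below maps a few provider-specific metric names onto one canonical scheme. The metric names, the canonical scheme, and the sample data are illustrative assumptions, not a prescribed taxonomy.

    # Illustrative mapping only -- the source metric names are examples of
    # provider-specific naming, not an exhaustive or authoritative list.
    CANONICAL_NAMES = {
        ("aws", "CPUUtilization"): "cpu.utilization.percent",
        ("azure", "Percentage CPU"): "cpu.utilization.percent",
        ("on_prem", "cpu_load_pct"): "cpu.utilization.percent",
    }

    def normalize(sample):
        """Translate a provider-specific sample into the unified naming scheme."""
        key = (sample["platform"], sample["metric"])
        return {
            "metric": CANONICAL_NAMES.get(key, f"{sample['platform']}.{sample['metric']}"),
            "value": sample["value"],
            "resource": sample["resource"],
        }

    samples = [
        {"platform": "aws", "metric": "CPUUtilization", "value": 71.0, "resource": "web-01"},
        {"platform": "on_prem", "metric": "cpu_load_pct", "value": 64.5, "resource": "db-03"},
    ]

    for s in samples:
        print(normalize(s))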


IT departments can thus attain a holistic view that goes beyond infrastructure silos or the inherent differences between clouds, and focus on delivering the value that the business expects from them: guaranteeing the availability and performance of business systems regardless of their location, and ensuring the monitoring function is not impacted by external events, all while respecting SLAs and maintaining control over their infrastructure.

By Paul Parker, SolarWinds Federal and National Government Chief Technologist


I'm the new Chief Technologist for our Federal and National Government team, and I’m glad to be joining the conversation on THWACK® with all of you. Here's an interesting article from my colleague, Joe Kim, in which he argues that military IT professionals can and should adopt a more proactive approach to combatting cyberattacks.


Today’s cyberattackers are part of a large, intelligent, and perhaps most dangerously, incredibly profitable industry. These attacks can come in all shapes and sizes and impact every type of government organization. In 2015, attackers breached the DoD network and gained access to approximately 5.6 million fingerprint records, impacting several years' worth of security clearance archives. This level of threat isn't new, but has grown noticeably more sophisticated—and regular—in recent years.


So why are defense organizations so vulnerable?


Brave new world


Military organizations, just like any other organizations, are susceptible to the changing tides of technology, and the Warfighter Information Network-Tactical (WIN-T) offers an example of the challenges they face. WIN-T is the backbone of the U.S. Army’s common tactical communications network, and is relied upon to enable mission command and secure, reliable voice, video, and data communications at all times, regardless of location.


To help ensure “always on” communications, network connectivity must be maintained to allow WIN-T units to exchange information with each other and carry out their mission objectives. WIN-T was facing bandwidth delay and latency issues, resulting in outages and sporadic communications. They needed a solution that was powerful and easy to use. This is an important lesson for IT professionals tasked with adopting new and unfamiliar technology.


WIN-T also required detailed records of their VoIP calls to comply with regulatory requirements. Available solutions were expensive and cumbersome, so WIN-T worked with its solution provider, SolarWinds, to develop a low-cost VoIP tool that met their technical mission requirements.


The WIN-T use case demonstrates that defense departments are looking to expand and diversify their networks and tools. This has created a new challenge for military IT professionals who must seamlessly incorporate complex new technologies that could potentially expose the organization to new vulnerabilities.


Impact of a breach


Military organizations are responsible for incredibly sensitive information, from national security details to personnel information. When the military suffers a cyberattack, there are far greater implications for it and for society as a whole.


If a military organization were breached, for example, and sensitive data fell into the wrong hands, the issue would become a matter of national security, and lives could be put at risk. The value of military data is astronomical, which is why attackers are growing more focused on waging cyberwarfare against military organizations. The higher the prize, the greater the ransom.


However, it's not all doom and gloom, and military IT professionals do have defenses to help turn the tide in the fight against cyberattackers. The trick is to be proactive.


Be proactive


Far too many organizations rely on reactive techniques to deal with cyberattacks. Wouldn't it be far less damaging to be proactive, rather than reactive? Of course, this is easier said than done, but there are ways in which military IT professionals can take a proactive approach to cybercrime.


First, they should apply cutting-edge technology. Outdated technologies essentially open doors for well-equipped attackers to walk through. IT professionals should be given the support needed to implement this technology, if military organizations are serious about safeguarding against cyberattacks.


By procuring the latest tools, and ensuring internally that departments are carrying out system updates when prompted, military organizations can help protect themselves against the sophisticated techniques of cyberattackers.


Second, automation should be employed by military organizations as a security tool. By automating processes—from patch management to reporting—they can help ensure an instantaneous reaction to potential threats and vulnerabilities. Automation can also help safeguard against the same type of breach in the future, providing an automated response should the same issue occur.


Third, all devices should be tracked within a military organization. This may sound paranoid, but many breaches are a result of insider threats, whether it's something as innocent as an end user plugging in a USB drive, or something altogether more sinister.


Automation can be used to detect unauthorized network access from a device within the organization, enabling the system administrators to track and locate where the device is, and who may be using it.
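
Here is a deliberately simple sketch of the tracking idea: compare what is observed on the network against an approved inventory and flag anything unknown. The inventory, the observed data, and the field names are hypothetical; in practice this information would come from an asset database and from switch, NAC, or monitoring logs.

    # Hypothetical approved-device inventory, keyed by MAC address.
    APPROVED_DEVICES = {
        "00:1a:2b:3c:4d:5e": "laptop-ops-01",
        "00:1a:2b:3c:4d:5f": "printer-hq-02",
    }

    # Hypothetical observations pulled from the access layer.
    observed = [
        {"mac": "00:1a:2b:3c:4d:5e", "switch_port": "gi1/0/12"},
        {"mac": "aa:bb:cc:dd:ee:ff", "switch_port": "gi1/0/24"},  # unknown device
    ]

    for device in observed:
        if device["mac"] not in APPROVED_DEVICES:
            print(
                f"Unauthorized device {device['mac']} seen on port "
                f"{device['switch_port']} -- locate and investigate"
            )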


Despite the fear surrounding data breaches, military organizations are capable of standing firm against the next wave of innovative, ingenious cyberattacks.


Find the full article on Government Computing.



(This is the third part of a series. You can find Part One here and Part Two here.)


It behooves me to remind you that there are many spoilers beyond this point. If you haven't seen the movie yet, and don't want to know what's coming, bookmark this page to enjoy later.


Having tools without understanding history or context is usually bad.


On the flipside of using tools creatively, which I will discuss in the next part of the series, is using tools without understanding their context or history.


There are two analogs for this in the movie. First is how Charles can't remember the Westchester Incident. He continues to operate under the assumption that Logan is tormenting him for some reason, forcing him to live in a toppled-over water tank, and then dragging him cross-country when they are discovered. In reality, they'd been hiding from the repercussions of Charles' psychic outburst. But lacking that knowledge, Charles is ineffectual in helping their cause.


The second example is "X24,” an adult clone of Logan and something of a mindless killing machine. X24 is Logan without context, without history, without a frame of reference. And therefore, he is without remorse.


Both of these cases exemplify the harm that can come when a tool is operated by a user who doesn't fully understand why the tool exists or everything it is designed to do. It is nmap in the hands of a script kiddie.


As "experienced" IT professionals (that's code for "old farts"), one of our key goals should be sharing history and context with the younger set. As I wrote in "Respect Your Elders" (, everything in IT has a reason and a history. Forgetting that history can not only make you less effective, it can be downright dangerous. But newcomers to our field aren't going to learn that history from books. They're going to learn it from us if we are open and willing to share.


Lynchpin team members become force-multipliers, even if their specific contribution wasn't the most impactful.


In the movie, Logan shows up at the final battle. He doesn't defeat everyone, and technically all the kids should have been able to hold their own. But when he appears, it galvanizes them into working together.


A little earlier I mentioned that the mutant kids are able to hold their own against an army of reavers, robotically enhanced mercenaries intent on capturing and/or killing the children before they reach the Canadian border.


I should have mentioned that they are just barely holding their own. Before long, most are captured. It is only due to the timely arrival of Logan that they are able to regain the upper hand. And even then, Logan is the one who has to take on X24, their most powerful adversary.


Granted, it is Laura who ultimately ends the conflict with X24. Granted, it is the kids who disarm, disable, or kill the bulk of the soldiers.


But Logan's appearance changes the tide of the battle. Before he arrives, the kids are being picked off one by one. The reavers control the situation, understand each kid, and are able to neutralize their abilities with precision. After Logan appears on the scene, the reavers are fighting on two fronts, which disrupts their efforts, causes them to make careless mistakes, and ultimately costs them the fight.


In this moment, Logan is what's known as a "force multiplier": a tool, technique, or individual that dramatically increases the efficacy of the team. In effect, a force multiplier makes a group work as if it has more members, or members with a greater range of skills, than it actually possesses. While the concept is most commonly understood in military contexts, the fact is that many areas of work benefit from the presence of force multipliers.


In IT, we need to learn to acknowledge when a technology, technique, or even an individual (regardless of age or experience) is a force multiplier. We need to also understand that a force multiplier isn't a universal panacea. Something (or someone) who is a force multiplier in one context (day-to-day operations) isn't necessarily going to have the same effect in a different situation (rapid deployment of a new architecture).


It's okay to lie as long as you're telling the truth.


There are times in your IT career when you're going to need to lie. Not a little white "because the birthday cake is in the kitchen and we're not ready for you to come in yet" lie. Not a bending of the truth. I’m talking about a full-on, bald-faced lie.


You're going to get the email instructing you to disable someone's account at 2:00 p.m. because they're being let go. And then you're going to see that person in the hall and exchange pleasantries.


A co-worker will confide to you that they just got an amazing job offer, but they're not planning on giving notice for another two weeks. After that, you're going to be in a meeting with management offering staffing projections for the coming quarter, and you are going to feign acceptance that your co-worker is part of that equation.


Going back to the dinner scene on the farm with the Munroe family, the exchange about the school goes something like this:

Logan: “Careful, you're speaking to a man who ran a school… for a lot of years.”

Charles: “Well, that's correct. It was a… it was a kind of special needs school.”

Logan: “That's a good description.”

Charles: (indicating Logan) “He was there, too.”

Logan: “Yeah, I was in it, too. I got expelled out three times.”

Charles: “I wish I could say that you were a good pupil, but the words would choke me.”


From the Munroes’ point of view, this is a father and son reminiscing about their past. And you know what? It IS a father and son reminiscing about their past. All of the things they say have an emotional truth to them, even if they are a complete fabrication.


IT pros have access to so many systems and sources of insight that our non-IT co-workers can’t "enjoy." Therefore, we must endeavor to maintain the emotional truth of each situation, even when we have to mask the details.


But that isn't all I learned! Stay tuned for future installments of this series. And until then, Excelsior!


1 “Logan” (2017), Marvel Entertainment, distributed by 20th Century Fox


Welcome to 2018!


Just three days into the new year, Spectre and Meltdown made the news. These flaws affect system security, and the fixes for them degrade CPU performance significantly. Previously, we saw prominent companies use software updates to throttle older-generation devices. And everyone seemed to be launching ICOs and adding blockchain or bitcoin to their company portfolio to ride the cryptocurrency bubble.


The year ahead promises to be an exciting (for lack of a better descriptor) one for IT pros, developers, DevOps practitioners, and every other role you choose to claim for yourself. Check out the teaser video below:




And, don't forget to check out the complete list of pro-dictions from adatole, patrick.hubbard, Dez, sqlrockstar, and myself; just click the banner at the top of this post. We cover IoT, blockchain, data security, compliance, and more. Will our predictions turn out to be prophetic, or will they fail to come true? Let us know what you think in the comments section below.


The Legacy IT Pro

Posted by kpe Jan 4, 2018

In the fast-paced world of IT, can you afford to be a legacy IT pro? This is a concern for many, which makes it worth examining.


IT functions have been clearly separated since the early days of mainframes. You had your storage team, your server team, your networking team, and so on, but is that really the way we should continue moving forward? Do we as IT pros gain anything by maintaining this status quo? If you and your organization stay on this path, how long do you think you can keep it up?


The best way to define a legacy pro is to share a few examples. Let’s say you were hired to be on the server team in a given enterprise environment around 2008. If you have not developed your skill set beyond Microsoft® Windows Server® 2008 or any related area since then, that’s legacy. A lot has happened in nine years, especially in cloud and security sectors. That means that if you haven’t kept up with the latest technologies, you’ll likely end up being one of those legacy guys.


In networking, my specialty, the same definition applies. If you are a data center networking engineer and you are still doing three-tier design with spanning tree and all that good stuff, you are clearly missing out on the most recent trends.


So, the key takeaway here is, don’t be afraid to rejuvenate yourself AND the tools of your trade. Going back to our first example, ask yourself if you are really living up to your job title. Gone are the days of updating to a new software release every second year, or whatever your company policy used to be. You really need to tell your vendor of choice to go with update cycles that match the trends of the market.


Now that you have progressed from a legacy IT pro to the next level, how do you take this even further? My suggestion is that you evolve from being a great IT pro to being an individual who has knowledge beyond your own area of expertise. It’s probably time you started envisioning yourself as a solution engineer.


A recurring theme these days is for clients to want a complete solution. In other words, organizations really do not want to deal with a collection of IT silos; they’d prefer to treat IT as a whole. This means that your success as an engineer on the networking/server/storage team is not only dependent on your own performance, but also that of your fellow engineers.


To deliver on this promise of a solution, you really need to start getting comfortable dealing with engineers and support staff from different parts of your organization. It doesn’t matter if you work in a consultancy role or in enterprise IT: this is something you need to start gradually incorporating into your workflow.


I suggest you start by establishing communication lines across your organization. Be open about your own job domains and tasks. Buy that co-worker from servers a cup of coffee and be genuinely interested in his/her area of expertise. Ask questions and show appreciation for his or her work.


Don’t be afraid to bring this level of cooperation to the attention of management to gain some traction across multiple business units. More often than not, you will get this level of support if you offer solutions that provide value.


Start sharing software tools and features across silos to spark further interest and energy into this new way of thinking. PerfStack now allows you to customize panes of glass for individual teams and groups. Why not use this to create a specific view for the storage team that gives them visibility into your NetFlow data?


I am not advocating a complete abandonment of your current role. I am suggesting instead that you transform your specialization into a new multi-level sphere of expertise. If you are on the networking team, go full speed ahead with that, but also pay attention to what is happening in the world of compute and maybe storage. Read about the topic, or even get some training on it. That way you are not completely oblivious to what’s going on around you, which makes communicating across the organization even easier. Doing these things will make you a better engineer and confirm that you are a true asset to your company. In the end, isn’t that what it’s all about?


To summarize, I do think it’s very important to evolve in this industry. If we are to meet future demands, we need to start thinking and acting differently. By gaining new skill sets and breaking down the silos we have built up over the years, we are on a clear path of evolution. Instead of being afraid of this evolution, look at it with a positive attitude and see all the possible opportunities that arise because of it.


With that in mind, I wish you the very best. Take care and go forth into this new era of IT!



I’m still on holiday, but that won’t stop me from getting the Actuator done this week. I hope everyone had a safe and happy holiday season with family and friends. Let’s grab 2018 by the tail, together.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


A Message to Our Customers about iPhone Batteries and Performance

I’m stunned by this response from Apple. I don’t ever recall a company standing up like this. It’s clear they know they have a bit of an image problem right now, and are taking every step possible to earn back consumer confidence.


The Galaxy Note 8 reportedly has a battery problem of its own

But the good news is that they aren’t catching fire...yet.


Computer latency: 1977-2017

Ever wonder if your old computer from childhood was faster than the one you have today? Well, wonder no more! Read on to find out how the Apple IIe is the fastest machine ever built.


Net Promoter Score Considered Harmful (and What UX Professionals Can Do About It)

Including this because it uses math and data to prove a point about a metric that I believe is widely misunderstood to be a good thing.


Crime in New York City Plunges to a Level Not Seen Since the 1950s

Another link because math. It’s important to understand that having all the data doesn’t mean you can have all the answers. Nobody really knows why crime is dropping. Which means they don’t know why it will begin to rise, or when.


Ten years in, nobody has come up with a use for blockchain

Look for this article again next year, when the title is updated to “Eleven years”.


17 Things We Should Have Learned in 2017, but Probably Didn't

Wonderful list of mistakes that are likely to be mistakes again in 2018 (and beyond).


By Joe Kim, SolarWinds EVP, Engineering and Global CTO


Social media has given us many things, from the mass circulation of hilarious cat videos, to the proliferation of memes. However, social media is not commonly thought of as a tool for cybercriminals, or a possible aid in combatting cybercrime.


Yet as government IT pros frantically spend valuable time and money bringing in complex threat-management software, one of the methods most easily used by hackers is right in front of you—assuming you’ve got your favorite social media page open.


Social skills

Social media can be a tool to both protect and disrupt, and attackers are eagerly screening social media profiles for any information that may present a vulnerability. Any status providing seemingly innocuous information may be of use, revealing details that could be weaponized by hackers.


Take LinkedIn®, for example. LinkedIn provides hackers with a resource that can be used nefariously: by viewing the profiles of system administrators, attackers can learn what systems they are working on. This is a very easy way for a cybercriminal to gain valuable information.


As mentioned, however, social media can also be a protective tool. By helping ensure that information is correctly shared within an organization, IT pros can more easily identify and tag attackers.


Cybercrime is organized within a community structure, with tools and tactics doled out among cybercriminals, making attacks faster and more effective.


This is a method that government IT pros need to mimic by turning to threat feeds, in which attack information is quickly shared to enable an enhanced threat response. Whether it’s through an IP address or more complex behavioral analysis and analytics, a threat feed can help better combat cybercrime, and it shares traits with social media.
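
As a toy illustration of the IP-address case, here is a sketch that checks a connection log against a set of known-bad addresses. The "feed" and the log entries are placeholder data (documentation-range IPs); real feeds come in richer formats such as STIX/TAXII or plain-text blocklists.

    # Hypothetical, tiny threat feed and connection log.
    THREAT_FEED = {
        "203.0.113.45",
        "198.51.100.17",
    }

    connection_log = [
        {"src": "10.0.4.21", "dst": "198.51.100.17", "port": 443},
        {"src": "10.0.4.33", "dst": "93.184.216.34", "port": 80},
    ]

    matches = [c for c in connection_log if c["dst"] in THREAT_FEED]
    for c in matches:
        print(f"Connection from {c['src']} to known-bad host {c['dst']}:{c['port']}")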


For government IT pros, the most important part of this similarity is the ability to share information with many people quickly, and in a consumable format. Then, by making this information actionable, threats can be tackled more effectively.


Internal affairs

The internal sharing of information is also key, but not always a priority within government. This is a real problem, especially when the rewards of more effective internal information sharing are so significant. However, unified tools or dashboards that display data about the ongoing status of agency networks and systems can help solve this problem by illuminating issues in a more effective way.


Take performance data, for example, which can tell you when a sudden surge in outbound traffic occurs, indicating someone may be exfiltrating data. Identifying these security incidents and ensuring that reports are more inclusive will allow the entire team to understand and appreciate how threats are discovered. This means you can be confident that your organization is vigilant, and better equipped to deal with threats.
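
A minimal sketch of that kind of check, assuming you already collect per-host outbound traffic counters: compare the latest sample against a simple statistical baseline. The numbers and the three-sigma threshold are illustrative only and would need tuning against real data.

    import statistics

    # Hypothetical hourly outbound traffic samples for one host, in megabytes.
    baseline = [120, 135, 110, 128, 140, 125, 133, 118]  # normal hours
    latest = 910  # most recent hour

    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    threshold = mean + 3 * stdev  # simple three-sigma rule; tune for your environment

    if latest > threshold:
        print(
            f"Outbound traffic spike: {latest} MB vs. baseline {mean:.0f} MB "
            f"(threshold {threshold:.0f} MB) -- possible exfiltration"
        )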


Essentially, government IT professionals should think carefully about what to post on social media. This doesn’t mean, however, that they should delete their accounts or start posting under some poorly thought-out pseudonym.


When used correctly, social media can provide public service IT professionals with more protection and a better understanding of potential threats. In a world where cyberattacks are getting ever more devastating, any additional help is surely worthy of a like.


Find the full article on PublicNet.

While the Word-A-Day Challenge has only completed its second year, it is already a labor of love for me. Last year the idea struck (as they so often do) in an unanticipated "a-ha!" moment, and with barely enough time to see it realized. As I explained at the time, the words were recycled from another word-a-day challenge I take part in yearly.


This year was different. I had time to think and plan, and that was especially true of the list of words I wanted to present to the THWACK community. I knew they had to be special. Important. Meaningful not just as words can be in their own right, but meaningful to us in the IT world.


As I selected the words for the word-a-day challenge, I looked for ones with a particular feel and heft:

  1. They had to be clearly identifiable as technology words
  2. More than that, they needed to be words which have an enduring place in the IT lexicon
  3. And they needed to also be words which have a significant meaning outside of the IT context


In addition to hoping that words with those attributes would inspire discussion and offer each writer a variety of options for inspiration, I was also curious to see which way the arc of the conversations in the comments would bend for each. Would the community focus solely on the technical aspect? Would they avoid the tech and go for the alternate meanings? Would there be representation from both sides?


To put it in more concrete terms, would people choose to write about backbone as an aspect of biology, technology, or character? Would Bootstrap appeal to folks more as a method or a metaphor?


To say that the THWACK community exceeded my wildest imaginings would actually be an understatement (a crime I've rarely been accused of). Here at the end of 31 days of the challenge, the answer to my question is a resounding "all of the above". In writing, images, poems, and haiku, you left no intellectual stone unturned.


More than that, however, was how so many of us took a technical idea and suggested ways we could use the same concepts to improve ourselves; or conversely, how we could take the non-technical meaning of a word and apply THAT to our technical lives. And through it all was a constant message of "we can do better. we can be better. we have so much more to learn. we have so much more to do."


And even more fundamentally, the message I read time and time again was "we can get there together. as a community. we can help each other be better."


For me, it brought to mind a quote by Michael Walzer:

"We still believe, or many of us do, what the Exodus first taught...

- first, that wherever you live, it is probably Egypt;

- second, that there is a better place, a world more attractive, a promised land;

- and third, that 'the way to the land is through the wilderness'.

There is no way to get from here to there except by joining together and marching."



I would like to thank everyone who took time out of their hectic end-of-year schedules - sometimes in their personal time over evenings and weekends - to comment so thoughtfully. And in that same vein, I'm deeply grateful to the 22 writers who generated the 31 "lead" articles - 12 of whom this year came from the ranks of our incredible, inimitable, indefatigable THWACK MVPs. If you missed any of the days, I'm listing each post below to give you yet another chance to catch up.


Finally, I want to give a shout-out to the dedicated THWACK community team for helping manage all the behind-the-scenes work that allowed the challenge to go off without a hitch this year.


I am humbled to have had a chance to be part of this, and I'm already thinking about the words, ideas, and stories I hope we can share in the coming year.


Leon Adato
Eric CourtesyIT
Peter Monaghan, CBCP, SCP, ITIL ver.3
Joshua Biggley
Craig Norborg
Ben Garves
Kamil Nepsinsky
Richard Letts
Kevin Sparenberg
Jeremy Mayfield
Patrick Hubbard
Rob Mandeville
Karla Palma
Ann Guidry
Matt R
Jenne Barbour
Thomas Iannelli
Allie Eby
Richard Schroeder
Jenne Barbour
Abigail Norman
Mark Roberts
Zack Mutchler
Rainy Schermerhorn
Shelly Crossland
Jez Marsh
Michael Probus
Jenne Barbour
Jenne Barbour
Erik Eff
Leon Adato
