Geek Speak

IT organizations manage security in different ways. Some companies have formalized security teams with board-level interest. In these companies, the security team will have firm policies and procedures that apply to network gear. Some organizations appoint a manager or director to be responsible for security with less high-level accountability. Smaller IT shops have less formal security organizations with little security-related accountability. The security guidance a network engineer receives from within their IT organization can vary widely across the industry. Regardless of the direction a network engineer receives from internal security teams, there are reasonable steps he or she can take to protect and secure the network.


Focus on the Basics


Many failures in network security happen due to a lack of basic security hygiene. While this problem extends up the entire IT stack, there are basic steps every network engineer should follow. Network gear should have a consistent, templated configuration across your organization. Ad-hoc configurations, varying password schemes, and a disorganized infrastructure open the door for mistakes, inconsistencies, and vulnerabilities. A well-organized, rigorously implemented network is much more likely to be a secure network.


As part of the standard configuration for your network, pay special attention to default passwords, SNMP strings, and unencrypted access methods. Many devices ship with the standard SNMP public and private communities. Change these immediately. Turn off any unencrypted access methods like telnet or unsecured web management (HTTP). If your organization doesn't have a corporate password vault system, use a free password vault like KeePass to store enable passwords and other sensitive access information. Don't leave a password list lying around, stored on SharePoint, or unencrypted on a file share. Encrypt the disk on any computer that stores network configurations, especially engineers' laptops, which can be stolen or accidentally left behind.
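If you manage your gear over the CLI, even a small script can keep this hardening consistent across devices. Below is a minimal sketch assuming the third-party netmiko library and Cisco IOS-style switches; the hostnames, credentials, community string, and command list are placeholders you would replace with your own standard template.

    # Minimal sketch: push a standard hardening template to a list of switches.
    # Assumes the third-party netmiko library and Cisco IOS-style devices; the
    # hostnames, credentials, and community string are placeholders.
    from netmiko import ConnectHandler

    HARDENING_TEMPLATE = [
        "no snmp-server community public",        # drop default read-only community
        "no snmp-server community private",       # drop default read-write community
        "snmp-server community YourVaultedString RO",
        "no ip http server",                      # disable unencrypted web management
        "line vty 0 4",
        " transport input ssh",                   # SSH only, no telnet
    ]

    def harden(host, username, password):
        device = {
            "device_type": "cisco_ios",
            "host": host,
            "username": username,
            "password": password,
        }
        with ConnectHandler(**device) as conn:
            output = conn.send_config_set(HARDENING_TEMPLATE)
            conn.save_config()
            print(f"{host}:\n{output}")

    for switch in ["sw-access-01.example.com", "sw-access-02.example.com"]:
        harden(switch, "netadmin", "pulled-from-your-password-vault")

The same approach works with whatever configuration-management tooling you already use; the point is that the template lives in one place instead of in each engineer's head.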


To Firewall or Not to Firewall


While many hyperscalers don't use firewalls to protect their services, the average enterprise still uses firewalls for traffic flowing through its corporate network. It's important to move beyond the legacy layer 4 firewall to a next-generation, application-aware firewall. For outbound internet traffic, organizations need to build policy based on more than the 5-tuple. Building policies based on username and application will make the security posture more dynamic without compromising functionality.


Beyond the firewall, middle boxes like load balancers and reverse proxies have an important role in your network infrastructure. Vulnerabilities, weak ciphers, and misconfigurations can leave applications and services wide open to exploitation. There are many free web-based tools that can scan internet-facing hosts and report on weak ciphers and easy-to-spot vulnerabilities. Make use of these tools, and then plan to remediate the findings.
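For a quick first pass from your own workstation, you can at least confirm which TLS version and cipher a host negotiates. This is a minimal sketch using only the Python standard library; the hostnames are placeholders, and it is no substitute for a full scanner that checks weak ciphers and known vulnerabilities.

    # Minimal sketch: report the TLS protocol version and cipher negotiated by a
    # few internet-facing hosts, using only the Python standard library. The
    # hostnames are placeholders; this is a quick check, not a vulnerability scan.
    import socket
    import ssl

    def tls_summary(host, port=443):
        context = ssl.create_default_context()
        with socket.create_connection((host, port), timeout=5) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                name, _, bits = tls.cipher()
                print(f"{host}: protocol={tls.version()} cipher={name} ({bits} bits)")

    for host in ["www.example.com", "portal.example.com"]:
        tls_summary(host)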


Keep a Lookout for Vulnerabilities


When we think of patch cycles and vulnerability management, servers and workstations are top of mind. However, vulnerabilities exist in our networking gear too. Most vendors have mailing lists, blogs, and social media feeds where they post vulnerabilities. Subscribe to the relevant notification streams and tune your feed for information that's relevant to your organization. Make note of vulnerabilities and plan upgrades accordingly.
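If your vendors publish advisories over RSS, a small script can turn those feeds into a filtered daily digest. This is a minimal sketch assuming the third-party feedparser package; the feed URL and product keywords are placeholders for your own vendors and platforms.

    # Minimal sketch: poll vendor advisory feeds and print entries that mention
    # platforms you actually run. Assumes the third-party feedparser package; the
    # feed URL and keywords are placeholders for your own vendors and products.
    import feedparser

    FEEDS = [
        "https://vendor.example.com/security/advisories.rss",  # replace with real vendor feeds
    ]
    KEYWORDS = ["ios xe", "nx-os", "junos"]  # platforms deployed in your environment

    for url in FEEDS:
        for entry in feedparser.parse(url).entries:
            text = f"{entry.get('title', '')} {entry.get('summary', '')}".lower()
            if any(keyword in text for keyword in KEYWORDS):
                print(entry.get("title"), "-", entry.get("link"))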


IT security is a broad topic that must be addressed throughout the entire stack. Most network engineers can't control the security posture of the endpoints or servers at their company, but they do control networking gear and middle boxes, which have a profound impact on IT security. In most instances, you can take practical, common-sense steps that will dramatically improve your network security posture.

By Paul Parker, SolarWinds Federal & National Government Chief Technologist


Here's an interesting article from my colleague Leon Adato, in which he suggests that honesty is the best policy.


IT professionals have a tough job. They face the conundrum of managing increasingly complex and hybrid IT platforms. They must protect their networks from continually evolving threats and bad actors. Budgets are restrictive and resources slim. And there are political agendas.


Given all of these factors, it’s understandable if we might feel compelled to tell some little white lies to ourselves on occasion. “Everything’s fine,” we might say, even if we’re not entirely sure that it is true. We might also be willing to engage in some little excuses and statements of overconfidence.


However, it’s important we acknowledge we may not have all the answers. We must continue to be honest with ourselves to avoid living in a world of gray.


You Don’t Know What You Don’t Know


Sometimes it’s more difficult to truly know how your infrastructure operates. That’s especially true in hybrid IT models. It’s very difficult to gain a complete view of our entire operation without the proper monitoring tools.


As pessimistic as that may seem, sometimes users aren’t honest, particularly in agency environments with very strict rules.


If an agency has a policy against using USB devices, for example, what happens if an employee breaks that rule and introduces the potential for unnecessary risk? From the confines of IT, it is sometimes difficult to assess what might be going on in other sections of the agency, which could pose some problems.


Unearthing the Truth = No More Little White Lies


Keeping everyone honest is essential to maintaining network integrity. The best way to do that is to adopt monitoring solutions and strategies that allow our IT teams to maintain visibility and control over every aspect of our infrastructure, from applications hosted off-site to the mobile devices used over networks.


We should adopt monitoring tools that are comprehensive and encompass the full range of networked entities. These solutions should also be able to provide insight into network activity regardless of whether the infrastructure and applications are on-site or hosted. We must be able to monitor activity at the hosting site and as data passes from the hosting provider to the agency.


After all, a true monitoring solution must monitor and provide a true view of what’s going on within the network. Shouldn’t it offer the ability to probe? To drill down? Those capabilities are essential if we are to truly unearth the root cause of whatever issues we may be trying to address or avert. And with the ability to monitor connections to external sources, we’ll be able to better identify break points when an outage occurs.


Let’s not forget everyone else in the agency. It’s important to keep tabs on network traffic to identify red flags and shine a light on employees who may be using unauthorized applications, again, as a means to keep everyone honest.


Being left in the dark may lead us to rely on half-truths simply because we lack the full picture. Instead of fooling ourselves, we should seek out solutions that provide us with true clarity into our networks, rather than shades of gray. This will result in more effective and secure network operations.


Find the full article on Nextgov.

If you have done any work in enterprise networks, you are likely familiar with the idea of a chassis switch. They have been the de facto standard for campus and data center cores and the standard top tier in a three-tier architecture for quite some time, with the venerable and perennial Cisco 6500 having a role in just about every network that I’ve ever worked on. They’re big and expensive, but they’re also resilient and bulletproof. (I mean this in the figurative and literal sense. I doubt you can get a bullet through most chassis switches cleanly.) That being said, there are some downsides to buying chassis switches that don’t often get discussed. In this post, I’m going to make a case against chassis switching. Not because chassis switching is inherently bad, but because I find that a lot of enterprises just default to the chassis as a core because that’s what they’re used to. To do this, I’m going to look at some of the key benefits touted by chassis switch vendors and discuss how alternative architectures can provide these features, potentially in a more effective way.


High Availability


One of the key selling features of chassis switching is high availability. Within a chassis, every component should be deployed in N+1 redundancy. This means you don’t just buy one fancy and expensive supervisor, you buy two. If you’re really serious, you buy two chassis, because the chassis itself is an unlikely, but potential, single point of failure. The reality is that most chassis switches live up to the hype here. I’ve seen many chassis boxes that have been online for entirely too long without a reboot (patching apparently is overrated). The problem here isn’t a reliability question, but rather a blast radius question. What do I mean by blast radius? It’s the number of devices that are impacted if the switch has an issue. Chassis boxes tend to be densely populated, with many devices either directly connected or dependent upon the operation of that physical device.


What happens when something goes wrong? All hardware eventually fails, so what’s the impact of a big centralized switch completely failing? Or more importantly, what’s the impact if it’s misbehaving, but hasn’t failed completely? (Gray-outs are the worst.) Your blast radius is significant and usually comprises most or all of the environment behind that switch. Redundancy is great, but it usually assumes total failure. Things don’t always fail that cleanly.


So, what’s the alternative? We can learn vicariously from our friends in Server Infrastructure groups and deploy distributed systems instead of highly centralized ones. Leaf-spine, a derivative of Clos networks, provides a mechanism for creating a distributed switching fabric that allows for up to half of the switching devices in the network to be offline with the only impact to the network being reduced redundancy and throughput. I don’t have the ability to dive into the details on leaf-spine architectures in this post, but you can check out this Packet Pushers Podcast if you would like a deeper understanding of how they work. A distributed architecture gives you the same level of high availability found in chassis switches but with a much more manageable scalability curve. See that section below for more details on scalability.




Complexity


Complexity can be measured in many ways. There’s management complexity, technical complexity, operational complexity, etc. Fundamentally though, complexity increases with the introduction and addition of interaction surfaces. Most networking technologies are relatively simple when operated in a bubble (some exceptions do apply), but real complexity starts showing up when those technologies are intermixed and running on top of each other. There are unintended consequences to your routing architecture when your spanning-tree architecture doesn’t act in a coordinated way, for example. This is one of the reasons why systems design has favored virtualization, and now micro-services, over large boxes that run many services. Operation and troubleshooting become far more complex when many things are being done on one system.


Networking is no different. Chassis switches are complicated. There are lots of moving pieces and things that need to go right, all residing under a single control plane. The ability to manage many devices under one management plane may feel like reducing complexity, but the reality is that it’s just an exchange of one type of complexity for another. Generally speaking, it’s easier to troubleshoot a single-purpose device than a multi-purpose device, but operationally it’s easier to manage one or two devices rather than tens or hundreds of devices.




Scalability


You may not know this, but most chassis switches rely on Clos networking techniques for scalability within the chassis. Therefore, it isn’t a stretch to consider moving that same methodology out of the box and into a distributed switching fabric. With the combination of high-speed backplanes/fabrics and multiple line card slots, chassis switches do have a fair amount of flexibility. The challenge is that you have to buy a large enough switch to handle anticipated and unanticipated growth over the life of the switch. For some companies, the life of a chassis switch can be expected to be upwards of 7-10 years. That’s quite a long time. You either need to be clairvoyant and understand your business needs the better part of a decade into the future, or do what most people do: significantly oversize the initial purchase to help ensure that you don’t run out of capacity too quickly.


On the other hand, distributed switching fabrics grow with you. If you need more access ports, you add more leafs. If you need more fabric capacity, you add more spines. There’s also much greater flexibility to adjust to changing capacity trends in the industry. Over the past five years, we’ve been seeing the commoditization of 10Gb, 25Gb, 40Gb, and 100Gb links in the data center. Speeds of 400Gbps are on the not-too-distant horizon, as well. In a chassis switch, you would have had to anticipate this dramatic upswing in individual link speed and purchase a switch that could handle it before the technologies became commonplace.
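The sizing math is simple enough to sanity-check in a few lines. Here is a minimal sketch; the port counts and speeds are illustrative, so plug in the numbers for the hardware you are actually evaluating.

    # Minimal sketch: back-of-the-envelope leaf-spine sizing. The port counts and
    # speeds below are illustrative; plug in the numbers for the hardware you're
    # actually evaluating.
    def leaf_oversubscription(access_ports, access_gbps, uplinks, uplink_gbps):
        """Ratio of access bandwidth to fabric (uplink) bandwidth on one leaf."""
        return (access_ports * access_gbps) / (uplinks * uplink_gbps)

    # A 48 x 25Gb leaf with 6 x 100Gb uplinks (one per spine) comes out at 2:1.
    print(f"{leaf_oversubscription(48, 25, 6, 100):.1f}:1")

    # Need more fabric capacity later? Add spines, which adds uplinks per leaf,
    # and the ratio drops without replacing a chassis.
    print(f"{leaf_oversubscription(48, 25, 8, 100):.1f}:1")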




Upgrades


When talking about upgrading, there really are two types of upgrades that need to be addressed: hardware and software. We’re going to focus on software here, though, because we briefly addressed the hardware component above. Going back to our complexity discussion, what happens “under the hood” on chassis switches can often be quite complicated. With so many services so tightly packed into one control plane, upgrading can be a very complicated task. To handle this, switch vendors have created an abstraction for the process and typically offer some form of “In Service Software Upgrade” automation. When it works, it feels miraculous. When it doesn’t, those are bad, bad days. I know few engineers who haven’t had ISSU burn them in one way or another. When everything in your environment is dependent upon one or two control planes always being operational, upgrading becomes a much riskier proposition.


Distributed architectures don’t have this challenge. Since services are distributed across many devices, losing any one device has little impact on the network. Also, since there is only loose coupling between devices in the fabric, not all devices have to be at the same software levels, like chassis switches do. This means you can upgrade a small section of your fabric and test the waters for a bit. If it doesn’t work well, roll it back. If it does, distribute the upgrade across the fabric.


Final Thoughts


I want to reiterate that I’m not making the case that chassis switches shouldn’t ever be used. In fact, I could easily write another post pointing out all the challenges inherent in distributed switching fabrics. The point of the post is to hopefully get people thinking about the choices they have when planning, designing, and deploying the networks they run. No single architecture should be the “go-to” architecture. Rather, you should weigh the trade-offs and make the decision that makes the most sense. Some people need chassis switching. Some networks work better in distributed fabrics. You’ll never know which group you belong to unless you consider factors like those above and the things that matter most to you and your organization.

I am in Germany this week, presenting sessions on database migrations and upgrades at SQL Konferenz. It’s always fun to talk data, and help people understand how to plan and execute data migration projects.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


Intel partners with Microsoft, Dell, HP, and Lenovo to make 5G laptops

Stop me if you’ve heard this one before, but we are being told this next-gen network will solve all our problems.


Apple devices are butt dialing 9-1-1 from its refurbishing facility — 20 times per day

Hey, at least the calls are going through. Most days I can’t get any reception with mine.


Your orange juice exists because of climate change in the Himalayas millions of years ago

Forget bringing back dinosaurs, or fruits, I want to use DNA to bring back ancient bacon.


The FCC’s Net Neutrality Order Was Just Published, Now the Fight Really Begins

We are officially on the clock now, folks. And we still don’t know if this is a good or bad thing.


How to protect your browser from Unicode domain phishing attacks

I like the idea of browser extensions for safety, especially if they are part of a domain policy.


That microchipped e-passport you've got? US border cops still can't verify the data in it

Ten years. Embarrassing. On the upside, think of all the bottles of water they prevented from flying in that time.


The quantum internet has arrived (and it hasn’t)

I find myself fascinated by this concept. I’m also scared to think about my first quantum cable bill.


Made it to Darmstadt and it took only five minutes to find the best worscht in town:

So far in this series, we've covered setting expectations as well as migrating to Office 365. Now that your organization is up and running on the new platform, how do you measure your organization's health? Are you running as efficiently as you could be? Are there areas where you're being wasteful? In this post, we'll cover some steps you can take to give your organization a health check.




One of the great things about Office 365 is that there is no shortage of packages to choose from. Whether you are looking to host a single email account or your entire communications platform--including phones--there are packages that will fit. But how can you tell if you have "right-sized" your licenses?


Office 365 has an easy-to-understand activity report. Pulling up this report will let you see statistics on a lot of the services being offered. For example, you can see who is using OneDrive and how much data they are storing. You can also see how popular Skype for Business is amongst your users.


At a high level, you can take this list and see who is or isn't using these features. Depending on the needs and the features, users might be able to be shifted to a lower-tiered plan. Given the range of prices for the various plans, this can yield fairly significant savings.
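These activity reports can be exported as CSV from the admin center, which makes this kind of review easy to script. Below is a minimal sketch for flagging users with no recorded activity; the file name and column names are assumptions based on a typical export and may differ for your tenant and report.

    # Minimal sketch: flag users with no recorded activity in an exported Office 365
    # activity report. The file name and column names are assumptions based on a
    # typical "Active users" CSV export and may differ for your tenant or report.
    import csv

    inactive = []
    with open("office365_active_users.csv", newline="", encoding="utf-8") as report:
        for row in csv.DictReader(report):
            # Treat the user as inactive if no product shows a last-activity date.
            has_activity = any(value.strip() for column, value in row.items()
                               if "Last Activity Date" in column)
            if not has_activity:
                inactive.append(row.get("User Principal Name", "unknown"))

    print(f"{len(inactive)} users with no recorded activity:")
    for upn in inactive:
        print(" -", upn)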




Taking the same data above, you can find a list of folks who aren't using particular products or features. This is a perfect opportunity to find out why they aren't taking advantage of their license's full potential. Is it because they don't need the product/service? Maybe they aren't aware that they can access it. Or, maybe they don't know how to use it.


Taking this approach can be a great way to figure out what users need training and in what areas. Given how frequently Microsoft is adding new features to Office 365, it is also a great way to see adoption rates. Using this data, you can start to develop training schedules. Maybe once a month you can offer training sessions on some of the lesser-used areas. The great thing is, you will be able to easily tell if your training is useful by looking at the usage metrics again in the future.




One of the key points I highlighted back in the first post of this series was the value that this migration can bring to an enterprise. When planning out projects, we do so anticipating that we will get value out of them. Actually measuring the value after a migration is just as important. If reports and studies come back showing that your organization is, in fact, utilizing the new platform to its full potential, then great! If not, then you need to identify why, and take the opportunity to fix it.


If you have been part of a team who migrated an enterprise environment to Office 365, how did you perform a health check? Did you uncover any surprises in your findings? Feel free to leave a comment below.


By Paul Parker, SolarWinds Federal & National Government Chief Technologist


It’s always good to have a periodic reminder to consider what we’re monitoring and why. Here's an applicable article from my colleague Joe Kim, in which he offers some tips on avoiding alert overload.


If you’re receiving so much monitoring information that you don’t see the bigger-picture implications, then you’re missing the value that information can provide. Federal IT pros have a seeming overabundance of tools available for network monitoring. Today, they can monitor everything from bandwidth to security systems to implementation data to high-level operational metrics.


Many federal IT pros are tempted to use them all to get as much information as possible, to ensure that they don’t miss even a bit of data that can help optimize network performance.


That is not the best idea.


First, getting too much monitoring information can cause monitoring overload. Why is this bad? Monitoring overload can lead to overly complex systems that, in turn, may create conflicting data. Conflicting data can then lead to management conflicts, which are counter-productive on multiple levels.


Second, many of these tools do not work together, providing a larger possibility for conflicting data, a greater chance that something important will be missed, and an even greater challenge seeing the bigger picture.


The solution is simpler than it may seem: get back to basics. Start by asking these three simple questions:


  1. For whom am I collecting this data?
  2. What metrics do I really need?
  3. What is my monitoring goal?


Federal IT pros should start by looking specifically at the audience for the data being collected. Which group is using the metrics—the operations team, the project manager, or agency management? Understand that the operations team will have its own wide audience and equally wide array of needs, so be as specific as possible in gathering “audience” information.


Once the federal IT pro has determined the audience, it will be much easier to determine exactly which metrics the audience requires to ensure optimal network performance—without drowning in alerts and data. Identify the most valuable metrics and focus on ensuring those get the highest priority.


The third question is the kicker, and should bring everything together.


Remember, monitoring is a means to an end. The point of monitoring is to inform and enhance operational decisions based on collected data. If a federal IT pro has a series of disconnected monitoring products, there is no way to understand the bigger picture; one cannot enhance operational decisions based on collected data if there is no consolidation. Opt for an aggregation solution, something that brings together information from multiple tools through a single interface that provides a single view.


Network monitoring and network optimization are getting more and more complex. Couple this with an increasing demand for a more digital government, and it becomes clear that gathering succinct insight into the infrastructure and application level of the IT operations within the agency is critical.


The most effective course of action is to get back to the basics. Focus on the audience and the agency’s specific needs. This will ensure a more streamlined monitoring solution that will help more effectively drive mission success.


Find the full article on Federal Technology Insider.



(This is the fourth and final part of a series. You can find Part One here, Part Two here and Part Three here.)


It behooves me to remind you that there are many spoilers beyond this point. If you haven't seen the movie yet, and don't want to know what's coming, bookmark this page to enjoy later.


New IT pros may take your tools and techniques and use them differently. Don't judge.


One of the interesting differences between Logan and Laura is that she has two claws that come from her hands (versus Logan's three), and one that comes out of her foot. Charles speculates that females of a species develop different weapons for protection versus hunting. Logan seems unimpressed even though he just witnessed Laura taking out at least three soldiers with her foot-claws alone.


The lesson for us is to remember that tools are there to be used. If it achieves the desired result and avoids downstream complications, then it doesn't matter if the usage diverges from "the way we did it in my day.” Thinking outside the box (something my fellow Head Geek, Destiny Bertucci, talks about all the time) is a sign of creativity and engagement, two things that should never be downplayed.


Your ability to think will always trump the capability of your tools.


Yes, Logan is stab-y and can heal. But Charles, at the end of his life, can still flatten a city block.


And it is here where we descend into the realm of "who would win in a fight between Superman® and God?" This is, admittedly, a realm that the SolarWinds THWACK® March Madness bracket battle has been willing to take on for several years in a row, but I'm going to go there anyway. Logan/Wolverine® is one of the darlings of the X-Men® (and Marvel®) franchise. He's captured imaginations since his first appearance in 1974, and has appeared in countless comics, both with the X-Men and solo. But even within the context of the X-Men movie franchise, he's far from the most powerful.


Magneto: “You must be Wolverine. That remarkable metal doesn't run through your entire body, does it?”


No, it's pretty clear that the most powerful being, certainly in Logan, but also in the mutant-verse, is Charles. Again, the ability to contact every human mind on the planet is nothing to sneeze at, and it puts healing ability and metal claws to shame.


Here’s what I want you to take from this: your ideas, thoughts, and ability to reason are the things that make you an IT powerhouse. It doesn’t matter that your PC has a quad-core processor and 128GB of RAM. Nobody cares that your environment is running the latest container technology, or that your network has fiber-to-the-desktop. You have a veritable encyclopedia of CLI commands or programming verbs in your head? So what.


You are valued for the things that you do with your tools. Choose wisely. Think actively. Engage passionately.


It's never about what you do (or what you have achieved, fixed, etc.). The story of your IT career has always been and will always be about who you met, who you helped, and who you built a connection with.


The movie Logan is not, at its heart, about stabbing people in the head with metal claws, or car chases, or mutant abilities. While there is plenty of that, the core of the movie is about two men coming to terms with themselves and their legacy, and how that legacy will affect the world after they are gone.


It is a movie about the very real father-son relationship between Logan and Charles - how they love each other but wish the other could be "better" in some way. They understand that they cannot change the other person, but have to learn to live with them.


It is also about caring for another person: about whether we choose to care or not, about how we express that care, about how those feelings are received by the other person and reciprocated (or not).


Once again, I am invoking the blog post by fellow Head Geek Thomas LaRock, "Relationships Matter More Than Money":


"When you use the phrase, 'It's not personal, it's just business,' you are telling the other person that money is more important than your relationship. Let that sink in for a minute. You are telling someone, perhaps a (current, maybe soon-to-be-former) friend of yours, that you would rather have money than their friendship. And while some jerk is now getting ready to leave the comment 'everything has a price,' my answer is 'not my friends.' If you can put a price on your friendships, maybe you need better ones."


Why are you in IT? Odds are very good it's not for the money. Okay, the money isn't bad, but no matter what the payout is, ultimately it’s probably not enough to keep you coming back into the office day after day. You are in IT for something else. Maybe you like the rush of finding a solution nobody else ever thought of. Or the pure beauty of the logic involved in the work. Or the chance to build something that someone else wanted but couldn't figure out how to make for themselves.


But underneath it all, you are probably in IT because you want to help people in some meaningful way.


That's the IT lesson we can take from Logan. The climax of the movie isn't when Laura shoots X24 in the head with an adamantium bullet.


It's when she clutches Logan's hand as he's dying and cries out, "Daddy!" in her loss and grief, and he accepts both her name and love for him, even if he doesn't feel he's worthy of either.


We are here - on this planet, in this community, at this company, on this team, on this project, doing this job - to forge connections with the people that we meet. To learn, mentor, befriend, lead, help, teach, follow, grow, foster, and so much more. The rest are just technical details.


1 “Logan” (2017), Marvel Entertainment, distributed by 20th Century Fox

Over the last three posts, we’ve looked at Microsoft event logging use cases and identified a set of must-have event IDs. Now we’re ready to put our security policy in place. This blog will walk you through configuring event logging on client workstations, and creating a subscription on a central log collection device.

Centralizing log collection removes the burden of having to log in to individual workstations during investigations. It also provides a way to archive log data for incident response or compliance requirements. Remember: being able to easily correlate activities across multiple hosts is a powerful threat detection and mitigation tool.


Configuring computers in a domain to forward and collect events

All source devices and the collector should be registered in the domain.


1. Enable the Windows Remote Management service on each source computer by typing the following at an administrator command prompt (select Run as Administrator from the Start menu or use the Runas command at a command prompt):

     winrm quickconfig


    Note:  It is a best practice to use a domain account with administrative privileges.




     Note:  Winrm 2.x uses default HTTP port 5985 and default HTTPS port 5986. If you already have a listener but you want to change the port, run this command:

    Winrm set winrm/config/listener?Address=*+Transport=HTTP @{Port="5985"}

     Then change your Windows firewall policy accordingly.


2. Enable the Windows Event Collector service on the collector computer by typing the following at an administrative command prompt (select Run as Administrator from the Start menu or use the Runas command at a command prompt):

     wecutil qc



3. Configure the Event Log Readers Group
Once the commands have run successfully, go back to the event source computer and open the Computer Management applet (click Start, right-click, and select Manage).


Expand the Local Users and Groups option in the navigation pane and select the Groups folder. Select the "Event Log Readers" group, right-click it, and select Add.




In the “Select Users, Computers, Service Accounts or Groups” dialog box, click the “Object Types” button, select the checkbox for “Computers,” and click OK.



Type in the name of the collector computer and click the “Check Names” button. If the computer account is found, it will be confirmed with an underline.

The computers are now configured to forward and collect events.


4. Create a Subscription

A subscription will allow you to specify the events you want to have forwarded to the collector.

In the Event Viewer on the collector server, select Subscriptions. From the Action menu in the right pane, choose “Create Subscription…”.



In the Subscription Properties dialog box:

a.  Provide a name and description for the subscription.

b. Leave the “Destination log” field set to default value of Forwarded Events.

c. Choose the first option (“Collector initiated”) for subscription type and then click on Select Computers.

d. Click on the “Add Domain Computers…” in the pop-up dialogue box.

e. Type the names of the source computers and verify them. Click OK twice to come back to the Subscription Properties main dialog box.

f. In the Events to Collect section, click on the “Select Events…” button to bring up the Query Filter window.

g. Select a time period from the “Logged” drop-down list. For client workstations, events may be collected on a daily basis; for critical servers, a more frequent schedule should be used.

h. Select the types of events (Warning, Error, Critical, Information, and Verbose) and filter by event ID, or pick the event sources you require. Remember to be selective to avoid losing visibility into important events due to excessive “noise.” (A sketch for testing an event ID filter appears after this list.)

i.  Click OK to come back to the Subscription Properties main dialog box again.

j. Click on the “Advanced…” button and then in the Advanced Subscription Settings dialog box select the option for “Machine Account” if it’s not already selected.


k. Change the “Event Delivery Optimization” option to “Minimize Latency.”

l. Verify the Protocol and Port values - ideally keep the defaults of HTTP and 5985.

m. Click OK to go back to the Subscription Properties dialog box and then click OK to close it.
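Before committing to a filter in step h, it can help to dry-run the same event ID query against a local log. This minimal sketch uses the built-in wevtutil command; the event IDs are examples of the must-have IDs discussed earlier in this series, so substitute your own list.

    # Minimal sketch: dry-run an event ID filter against the local Security log
    # before using it in the subscription's query filter. Uses the built-in
    # wevtutil command (run from an elevated prompt); the event IDs are examples,
    # so substitute the must-have IDs identified earlier in this series.
    import subprocess

    event_ids = [4624, 4625, 4688, 1102]
    conditions = " or ".join(f"EventID={i}" for i in event_ids)
    xpath = f"*[System[({conditions})]]"

    # Show the five most recent matching events, newest first, as readable text.
    subprocess.run(
        ["wevtutil", "qe", "Security", f"/q:{xpath}", "/c:5", "/rd:true", "/f:text"],
        check=True,
    )
    print("Filter used:", xpath)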


The Subscriptions option in the event viewer should now show the subscription we just created.


5. Verify Events on Collector Computer

Select the Forwarded Events option under Windows Logs in the Event Viewer.



Notes for Workgroups

If you want to set up log forwarding within a workgroup rather than a domain you will need to perform the following tasks in addition to those defined for domains:

  • Add an account with administrator privileges to the Event Log Readers group on each source computer. You must specify this account in the dialog when creating a subscription on the collector computer. Select Specific User instead of Machine Account (see step 4j). You must also ensure the account is a member of the local Administrators group on each of the source computers.


  • On the collector computer, type winrm set winrm/config/client @{TrustedHosts="<sources>"}, where <sources> is a list of the source computer names, for example: winrm set winrm/config/client @{TrustedHosts="msft*"}. To learn more about this command, type winrm help config.


Hopefully you have now built a working security policy using Windows Events. In the last blog of this series we will look at combining these events with other telemetry sources in a network by forwarding them to a syslog server or SIEM tool.

I'm getting ready for my trip to Germany and SQL Konferenz next week. If you are near Darmstadt, I hope to see you there. I'm going to be talking about data and database migrations. I'll also make an effort to eat my weight in speck.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


Amazon overtakes Microsoft in market capitalisation, thanks to booming AWS revenue

This post had me thinking how overvalued Apple is right now. It's as if the stock price of a company has no relation to the actual value of a company, its people, or products.


Apple takes more than half of all smartphone revenues in Q4 2017

Then again, maybe Wall Street does know a few things about Apple. Despite a declining market share in total phones sold (only 18% now), Apple drives huge revenues because their phones cost so much. And Wall Street loves revenues. As long as the iPhoneX doesn't catch fire, Apple stock will continue to be a safe bet.


Intel facing 32 lawsuits over Meltdown and Spectre CPU security flaws

You knew the sharks would come out for this one. Between the lawsuits and the insider trading, this could be the end of Intel.


US Senator demands review of loot box policies, citing potential harm

This is a topic of discussion in our home, as my children ask for money to purchase "loot boxes" so they can win a prize that allows them to compete in MMO games such as Overwatch. They don't understand the gambling aspect of a loot box, and I suspect we now have a generation of kids that won't think twice about spending extra cash for an instant reward. It won't be long before we pay an extra $3 at Starbucks for a chance that our drink is made correctly.


This electric jet can take off vertically and travel almost 190 miles per hour - and it's already being prototyped

Finally! I was promised flying cars decades ago! This would make commuting to work so much more fun.


Hacker Group Makes $3 Million by Installing Monero Miners on Jenkins Servers

This is why we can't have nice things.


Americans used to eat pigeon all the time-and it could be making a comeback

Wrap it in bacon and we'll never know.


Ah, Germany. I came for the speck, but I stayed for the schweinshaxe:

All too often, especially if disaster recovery (DR) is driven and pushed by the IT department, organizations can fall into the common mistake of assuming that they are “good to go” in the event disaster hits. While IT departments can certainly handle the technical side of things, ensuring services are up and running if production goes down, they are not necessarily the key stakeholder in ensuring that business processes and services can also be maintained. These business processes and activities can really be summed up in one key term that goes hand in hand with DR - business continuity (BC). Essentially, business continuity oversees the processes and procedures that are carried out in the event of a disaster to help ensure that business functions continue to operate as normal – the key here being business functions. Sure, following the procedures in our disaster recovery plan is a very big piece of our business continuity plan (BCP), but true BCPs will encompass much more in terms of dealing with a disaster.


BCP: Just a bunch of little DR plans!


When organizations embark on tackling business continuity, it's sometimes easier to break it all down into a bunch of little disaster recovery plans – think DR for IT, DR for accounting, DR for human resources, DR for payroll, etc. The whole point of business continuity is to keep the business running. Sometimes, if it is IT pushing for this, we fall into the trap of just looking at the technical aspects, when really it needs to involve the whole organization! So, with that said, what should really be included in a BCP? Below, we will look at what I feel are four major components that a solid BCP should consider.


Where to go?


Our DR plan does a great job of ensuring that our data and services are up and running in the event disaster hits. However, what we often don’t consider is how employees will access that data. Our employees are used to coming in, sitting down, and logging into a secure internal network. Now that we have restored operations, does a secondary location offer the same benefit to our end-users? Are there enough seats, DHCP leases, and switches to handle all of this? Or, if we have utilized some sort of DRaaS, do they offer seats or labs in the event we need them? Furthermore, depending on the type of disaster incurred (say it was a flood), will our employees even be able to travel to alternate locations at all?


Essential Equipment


We know we need to get our servers back up and running. That’s a no-brainer! But what about everything else our organization uses to carry out its day-to-day business? It’s the items we take for granted that tend to be forgotten. Photocopiers, fax machines, desks, chairs, etc. Can ALL essential departments maintain their “business as usual” at our secondary site, either fully or in some sort of limited fashion? And aside from equipment, do we need to think of the infrastructure within our secondary site, as well? Are there phone lines installed? And can that be expanded in the event of long-term use of the facility? Even if these items are not readily available, having a plan on how to obtain them will save valuable time in the restoration process. Have a look around you at all the things on your desk and ask yourself if the same is available at your designated DR facility.




Communication


Here’s the reality: your building is gone, along with everything that was inside of it! Do you have plans for how to keep in touch with key stakeholders during this time? A good BCP will have lists upon lists of key employees with their contact information, both current and emergency. Even if it is as simple as having employees’ home/cell phone numbers listed and, if you host your own email servers, alternate email addresses that are checked on a regular basis. The last thing you want is a delay in executing your BCP because you can’t get the go-ahead from someone you are simply unable to contact.


Updated Organizational Charts


While having an updated org chart is great to include within a BCP, it is equally, or perhaps even more, important to have alternate versions of these charts in the event that someone is not available. We may not want to think about it, but the possibility of losing someone in the disaster itself is not far-fetched. And since the key function of the BCP is to maintain business processes, we will need to know exactly who to contact if someone else is unavailable. The last thing we need at times like these is staff arguing, or worse, not knowing who will make certain key decisions. Having alternate org charts prepared and ready is critical to ensuring that recovery personnel have the information they need to proceed.


These four items are just the tip of the iceberg when it comes to properly crafting a BCP, and there is much more out there that needs to be considered. Paper records, backup locations, insurance contacts, emergency contacts, vendor contacts, payroll, banking; essentially every single aspect of our business needs to have a Plan B to ensure that you have an effective, holistic, and, more importantly, successful Business Continuity Plan in place. While we as IT professionals might not find these things as “sexy” as implementing SAN replication and metro clusters, the fact of the matter is that we are often called upon when businesses begin their planning around BC and DR. That’s not to say that BC is an IT-related function, because it most certainly is not. But due to our major role in the technical portion of it, we really need to be able to push BC back onto other departments and organizations to ensure that the lights aren’t just on, but that there are people working below them as well.


I’d love to hear from some of you that do have a successful BCP in place. Was it driven by IT to begin with, or was IT just called upon as a portion of it? How detailed (or not) is your plan? Is it simply, “Employees shall report to a certain location,” or does it go as far as prioritizing the employees who gain access? What else might you have inside your plan that isn’t covered here? If you don’t have a plan, why not? Budget? Time? Resources?


Thank you so much for all of the recent comments on the first two articles. Let's keep this conversation going!

No, it’s not the latest culinary invention from a famous Italian chef: spaghetti cabling (a nice wording for cabling inferno) is a sour dish we’d rather not eat. Beyond this unsavory term hides the complexity of many environments that have grown organically, where “quick fixes” have crystallized into permanent solutions, and where data center racks are entangled in cables, as if they had become a modern version of Shelob’s Lair.


These cabling horrors are not a work of art. Instead, they prosaically connect systems together to form the backbone of infrastructures that support many organizations. Having had experience in the past with spaghetti cabling, I can very vividly remember the endless back-and-forth discussions with my colleagues. This usually happened when one of us was trying to identify the switch port to patch panel connectivity while the other was checking whether the system’s network interface was up or down. That then turned into trying to figure out whether the patch panel ports were correctly mapped to the wall outlet plug identification. All of this to troubleshoot a problem that would have been trivial if it weren’t for careless and unprofessional cabling.


The analogy with other infrastructure assets is very similar: it can be very difficult for administrators to find a needle in the haystack, especially when the asset is not physical and the infrastructure is large. Multi-tiered architectures, or daisy-chained business processes relying on multiple sources of data, increase potential failure points in the data processing stream. This sometimes makes troubleshooting a far more complex endeavor than it used to be due to upstream or downstream dependencies.


One would expect that upstream dependencies would impact a system in such a way that it is no longer able to process data, and thus comes to a halt without impact to downstream systems. While this can be a safe assumption, there are also cases where the issue isn’t a hard stop. Instead, the issue becomes data corruption, either by handing over incorrect data or by handing over only fragments of usable data. In such occurrences, it is also necessary to identify the downstream systems and stop them to avoid further damage until the core issue has been investigated and fixed.
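That identification step is essentially a graph walk over your dependency map. Here is a minimal sketch, assuming a hand-built map of which systems consume which outputs; in practice the map would come from your CMDB or a monitoring tool with relationship discovery, and the system names are purely illustrative.

    # Minimal sketch: walk a dependency map to list every downstream system that
    # should be paused when an upstream feed starts handing over bad data. The map
    # and system names are illustrative; in practice this would come from your CMDB
    # or a monitoring tool with relationship discovery.
    from collections import deque

    DOWNSTREAM = {
        "erp-db":         ["billing-batch", "reporting-etl"],
        "billing-batch":  ["invoice-portal"],
        "reporting-etl":  ["bi-dashboards"],
        "invoice-portal": [],
        "bi-dashboards":  [],
    }

    def impacted(source):
        """Breadth-first walk of everything downstream of the failing system."""
        seen, queue, order = {source}, deque([source]), []
        while queue:
            for child in DOWNSTREAM.get(queue.popleft(), []):
                if child not in seen:
                    seen.add(child)
                    order.append(child)
                    queue.append(child)
        return order

    print("Stop these before the bad data spreads:", impacted("erp-db"))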


Thus, there is a real need for mapping the upstream and downstream dependencies of an application. There are cases in which it’s preferable to bring an entire system to a halt rather than risk financial losses (and eventually litigation, not to mention sanctions), if incorrect data makes its way into production systems. In that case, it would ultimately impact the quality of a manufactured product (think critical products, such as medicines, food, etc.) or a data batch meant for further consumption by a third party (financial reconciliation data, credit ratings, etc.).


Beyond troubleshooting, it’s crucial for organizations to have an end-to-end vision of their systems and assets, preferably captured in a System of Record. This could be for inventory purposes or for management processes, whether based on ITIL or not. The IT view is not always the same as the business view; however, both are bound by the same common goal: to help the organization deliver on its business objectives. The service owner will focus on the business and process outcomes, while the IT organization will usually focus on uptime and quality of service. Understanding how assets are grouped and interact together is key to maintaining fast reaction capabilities, if not acting proactively to avoid outages.


There is no magical recipe to untangle the webs of spaghetti cabling. However, advanced detection and mapping capabilities, combined with information that already exists in the organization, should help IT and the business obtain a precise map of existing systems and understand how data flows in and out of them, with a little detective work.


In our view, the following activities are key enablers to obtain full-view clarity on the infrastructure:

  • Business service view: the business service view is essential in understanding the dependencies between assets, systems, and processes. Existing service maps and documentation, such as business impact assessments, should ideally contain enough information to capture the process view and system dependencies.


  • Infrastructure view: it is advisable to rely on infrastructure monitoring tools with advanced mapping / relationship / traffic-flow analysis capabilities. These can be used to complement/validate existing business service views listed above (for lucky administrators / IT departments), or as a starting point to map traffic flows first, then reach out to business stakeholders to formalize the views and system relationships.


  • Impact conditions and parent-child relationships: these usually would be captured in a System of Record, such as a CMDB, but might eventually be also available on a monitoring system. An event impacting a parent asset would usually cascade down to child assets.


  • Finally, regular service mapping review sessions between IT and business stakeholders are advised to assert any changes.


Taken in its tightest interpretation, the inner circle of handling “spaghetti cabling” problems should remain within the sphere of IT Operations Management. However, professional and conscientious system administrators will always be looking at how to improve things, and will likely expand into the other activities described above.


In our view, it is an excellent way to further develop one’s skills. First, by going above and beyond one’s scope of activities, it can help us build a track record of dependability and reliability. Second, engaging with the business can help us foster our communication skills and move from a sometimes tense and frail relationship to building bridges of trust. And finally, the ability to understand how IT can contribute to the resolution of business challenges can help us move our vision from a purely IT-centric view to a more holistic understanding of how organizations work, and how our prioritization of certain actions can help better grease the wheels.

By Paul Parker, SolarWinds Federal & National Government Chief Technologist


I like the idea of taking a holistic view of the user experience. Here's an interesting article from my colleague Joe Kim, where he introduces and discusses Digital Experience Monitoring (DEM).


Agencies are moving quickly from paper processes to digital services, providing critical information more efficiently online rather than through paper-based forms and physical distribution methods. As a result, about 30% of global enterprises will implement DEM technologies or services by 2020—up from fewer than 5% today, according to market research firm Gartner®.


What, exactly, is Digital Experience Monitoring? In a nutshell, it’s understanding and maximizing each individual user’s online experience.


DEM looks at the entire user experience: how fast did the home page load? Once it loaded, how much time did the user spend on the site? Where did they go? What did they do? Taking DEM even further, many agencies will gather information about the user’s device to help further understand the user experience: was the user on a smartphone or on a laptop? What browser?


Maximizing the user experience requires an incredible amount of data. This brings its own challenge: all that data can make relevant information difficult to find. Additionally, federal IT pros must be able to understand how the IT infrastructure impacts service delivery and the citizen experience.


Luckily, there are an increasing number of new tools available that help give context to the data and help the federal IT pro make highly informed decisions to maximize each citizen’s digital experience.


DEM tool benefits


DEM-specific tools provide a range of benefits that other tools do not. Specifically, because DEM inherently works with lots of data, these DEM tools are designed to help solve what have historically been thought of as big-data challenges.


For example, DEM tools have the ability to recognize patterns within large amounts of data. Let’s say a specific cluster of users is having a sub-optimal experience. Automatic pattern recognition will help the federal IT pro understand if, say, all these users are taking a particular route that is having bandwidth issues. Or, perhaps all these users are trying to access a particular page, form, or application on the site. Without the ability to recognize patterns among users, it would be far more difficult to find the root of the problem and provide a quick solution.


A DEM-specific tool can also identify anomalies, a historically difficult challenge to find and fix.

First, the federal IT pro must create a baseline to understand ordinary network behavior. With that in place, an anomaly is easier to identify. Add in the ability to apply pattern recognition—what happens before the anomaly each time it appears—and the problem and solution are far easier to find and implement.
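The baseline-plus-anomaly idea is easy to illustrate. Below is a minimal sketch using only the Python standard library; the sample load times and the three-sigma threshold are illustrative and not tied to any particular DEM product.

    # Minimal sketch: flag anomalies against a rolling baseline of page-load times.
    # Pure standard library; the sample values and the 3-sigma threshold are
    # illustrative, not tied to any particular DEM product.
    from statistics import mean, stdev

    def anomalies(samples, window=20, threshold=3.0):
        """Yield (index, value) pairs that deviate from the trailing baseline."""
        for i in range(window, len(samples)):
            baseline = samples[i - window:i]
            mu, sigma = mean(baseline), stdev(baseline)
            if sigma and abs(samples[i] - mu) > threshold * sigma:
                yield i, samples[i]

    # Page-load times in milliseconds, with one obvious spike at the end.
    load_times = [420, 410, 435, 425, 418, 430, 422, 415, 428, 419,
                  433, 421, 417, 426, 424, 429, 416, 423, 427, 420, 1850]

    for index, value in anomalies(load_times):
        print(f"sample {index}: {value} ms deviates from the recent baseline")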


And finally, because they can provide a historic perspective, DEM-specific tools can help the federal IT pro forecast infrastructure changes before implementation. Let’s say an agency is undergoing a modernization effort. Many DEM tools provide the ability to forecast based on the baseline and historic information already collected. A solid DEM tool will allow the federal IT pro to mimic changes, and understand the result of those changes throughout the infrastructure, in advance. The reality is, any infrastructure change can impact user experience, so being able to understand the impact in advance is critical.




Federal IT pros have been using performance monitoring tools for years. That said, the landscape is changing. Using ordinary tools—or, ordinary tools alone—may no longer be an option. It is important to understand the role DEM plays within agency IT departments. In turn, this allows you to recognize the value in bringing in the right tools to help perform this new, critical role.


Find the full article on our partner DLT’s blog Technically Speaking.

It was a very full week at CiscoLive--not to mention an additional full week in Spain, which I'll get to in a minute--and I have a lot to share.


First and foremost, and this is not meant to be a slam on Munich, I had an amazing time just BEING in Barcelona. Sure it was a little warmer. Sure, I speak a little Spanish as opposed to zero German. And sure, there were three kosher restaurants instead of the one in Munich. But even beyond that, the pace, the layout, and even the FEEL of the place was different for me in a very enjoyable way. I was incredibly happy to hear that CLEUR will be in Barcelona again next year, and hope that I get to be part of the "away team" again.


The Big Ideas

At every convention, I try to suss out the big themes, ideas, and even products that make a splash at the show. Here's what I found this time:


DevNet! DevNet! DevNet!
I think I talk about DevNet after every CiscoLive, but gosh darn if it's not noteworthy each time. This year, my fellow Head Geek Patrick Hubbard rightly called out the announcement about IBN. No, read it again: NOT big blue. Intent-Based Networking: The upshot of this announcement is that the network is about to get smarter than ever, using data, modeling, and (of course) built-in tools to understand and then ensure the "intent" of the networking you have in place. And how will you interact with this brave new intent-based world? Code.

This leads me to my second big observation:
The time for SDN has come

Every year (since 2014) I've been trying to figure out how SDN fits into the enterprise. Usually when I talk to a group, I give it a shot:

    • "How many of you are thinking about SDN" (usually, most of the hands go up)
    • "How many are using SDN in the lab?" (in most cases, one-half to two-thirds of the hands go down)
    • "How many are using it in prod?" (typically all but three hands go down, leaving just the folks who work for ISPs)


This time I had a ton of people--enterprise folks--coming and asking about SDN and Cisco ACI support, which tells me that we have hit a tipping point. I have a theory why (grist for another article), but it boils down to two main things. First, Cisco has done a kick-ass job pushing "DevNet" and teaching network folks of all stripes not to fear the code. People came to the booth asking "does this support python scripting?" Scripting wasn't an afterthought; it was a key feature they needed. Second, SDN experience has filtered down from networking engineers at ISPs to mid-level technicians, and companies are now able to enumerate the value of this technology both on a technical and business level. Thus, the great corporate adoption of SDN is now starting.


Being a NetVet is every bit as cool as I thought it would be
Besides causing vendors to stare at your badge for an extra two seconds, the biggest benefit of being a NetVet is the lounge. It is quiet. It has comfy couches. It has its own coffee machine. It. Has. My. Name. On. It.


The View from the Booth

So that sums up the major things I saw at the show. But what about the interactions in the SolarWinds booth? SO MUCH was packed into the three days that it's hard to pick just a few, but here goes.


TNG, and I don't mean Star Trek
One of the fun things about a show like CiscoLive is getting to show off new features and even whole new solutions. Three years ago I got to stand on stage with Chris O'Brien and show off "something we've been playing with in the lab," which turned out to be NetPath. This time, we had a chance to get initial reactions to a new command line tool that would perform traceroute-like functions, but without ICMP's annoying habit of being blocked by... well, just about everything. While we're still putting on the final coat of paint, the forthcoming free "Traceroute NG" tool will perform route analysis via TCP or traditional ICMP, show route changes if the path changes during scanning, support both IPv4 and IPv6 networks, and more. Attendees who saw it were blown away.
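To give a feel for the general idea behind TCP-based path discovery (a minimal sketch assuming the scapy library, not the actual Traceroute NG implementation; the target, port, and hop limit are arbitrary), the probe loop raises the TTL one hop at a time and records whoever answers:

```python
# Rough illustration of TCP-based path discovery, NOT the Traceroute NG
# implementation. Requires scapy (pip install scapy) and root/administrator
# privileges to send raw packets.
from scapy.all import IP, TCP, sr1

def tcp_trace(target, dport=443, max_hops=30):
    """Send TCP SYN probes with increasing TTL and print each responding hop."""
    for ttl in range(1, max_hops + 1):
        probe = IP(dst=target, ttl=ttl) / TCP(dport=dport, flags="S")
        reply = sr1(probe, timeout=2, verbose=0)
        if reply is None:
            print(f"{ttl:2d}  *")                      # no answer at this TTL
        elif reply.haslayer(TCP):
            print(f"{ttl:2d}  {reply.src}  (destination reached)")
            break
        else:
            print(f"{ttl:2d}  {reply.src}")            # ICMP time-exceeded hop

if __name__ == "__main__":
    tcp_trace("example.com")
```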


Hands Up for BackUp!

We also got to take the lid off an entirely new offering: cloud-based backup for your important systems. This isn't some "xcopy my files to the cloud" kludge. It offers block-based backup techniques for screaming-fast (and bandwidth-friendly) results; a simple deployment strategy that supports Windows and Linux-based systems; granular permissions; and a dashboard that lets you know the disposition of every system, regardless of the size of your deployment.


Survey Says?
A great part of booth conversations is comparing experiences and discovering how frequently they match up. This often comes out as a kind of IT version of Mad Libs.

  • I was discussing alerts and alert actions with an attendee who was clearly part of "Team Linux." After pointing out that alerts should extend far beyond emails or opening tickets, I threw out, "If your IIS-based website is having problems, what's the first thing you do?" Without even a pause, they said, "You restart the app pool." That's when I showed SAM's built-in alert actions; a rough sketch of the kind of script such an action might run follows this list. (Afterward, we both agreed that "install Apache" was an equally viable answer.)
  • When Patrick asked a group of four longtime SolarWinds users to guess the most downloaded SolarWinds product, the response was immediate and emphatic: "TFTP Server." I could only laugh at how well our customers know us.
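For illustration only, here's a minimal sketch of the kind of external script an alert action could invoke to recycle an IIS application pool; the appcmd path and pool name are placeholders, and this isn't how SAM implements its built-in actions:

```python
# Hypothetical alert-action script: recycle an IIS application pool when a
# website alert fires. The appcmd path and pool name are placeholders; an
# alerting tool would typically run something like this on the web server.
import subprocess

APPCMD = r"C:\Windows\System32\inetsrv\appcmd.exe"  # standard IIS location
POOL_NAME = "DefaultAppPool"                         # placeholder pool name

def recycle_app_pool(pool=POOL_NAME):
    """Recycle the named IIS app pool and report success or failure."""
    result = subprocess.run(
        [APPCMD, "recycle", "apppool", f"/apppool.name:{pool}"],
        capture_output=True, text=True,
    )
    if result.returncode == 0:
        print(f"Recycled app pool '{pool}'")
    else:
        print(f"Failed to recycle '{pool}': {result.stderr.strip()}")

if __name__ == "__main__":
    recycle_app_pool()
```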


"I'm here to ask question and chew bubblegum (and it doesn't look like you're giving out bubblegum)"
As I have noted in the past, CiscoLive Europe may be smaller (14k attendees versus ~27k in the United States), but the demos go longer and the questions are far more intense. There is a much stronger sense of purpose when someone comes to our booth. They have things they need to find out, design choices they want to confirm, and they don't need another T-shirt, thank you very much. Which isn't to say we had swag left at the end. It was all gone. But it took until the last day.


More Parselmouths than at a Slytherin Convention
This year I was surprised by how often someone opened their questions with, "Do these solutions support Python?" (For the record, the answer is yes.) Not that I was surprised to be asked about language support in general. What got me was how often this happened to be the opening question. As I said earlier, Cisco's DevNet has done an incredible job of encouraging the leap to code, and it is now framing many networking professionals' design choices and world view. I see this as a good thing.
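For anyone wondering what "yes" looks like in practice, here's a minimal sketch that pulls down nodes from the SolarWinds Information Service, assuming the open-source orionsdk Python client; the hostname, credentials, and SWQL query are placeholders rather than a definitive recipe:

```python
# Minimal sketch of querying the SolarWinds Information Service (SWIS) from
# Python, assuming the open-source orionsdk client (pip install orionsdk).
# Hostname, credentials, and the SWQL query are placeholders.
from orionsdk import SwisClient

def list_down_nodes(server="orion.example.com", user="admin", password="secret"):
    """Return the captions of nodes currently reporting a 'down' status."""
    swis = SwisClient(server, user, password)
    results = swis.query(
        "SELECT Caption, IPAddress FROM Orion.Nodes WHERE Status = 2"
    )
    return [row["Caption"] for row in results["results"]]

if __name__ == "__main__":
    for caption in list_down_nodes():
        print(caption)
```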


La Vida Barcelona

Outside of the hustle and bustle of the convention center, a whole world awaited us. As a polyglot wannabe, the blend of languages was multicultural music to my ears. But there wasn't much time to really see the sights or soak up the Spanish culture because the convention was demanding so much of my day.


Which is why I decided to spend an extra week in-country. My wife and I traveled from Barcelona to Madrid, and even spent a day in Seville to visit the apartment where she was born and spent the first few months of her life.


We saw some amazing sights:


Including some views that GoT fans like jennebarbour will find familiar:




Ate some incredible food:


And generally just enjoyed all that Spain had to offer. The only hiccough was the weather. It was kind of like this.


For Clevelanders like us, it's pretty normal. But I'm pretty sure the locals suspected we brought our weather with us, and were glad to see the back of me when we finally packed up and headed back home.


Until next year (which will be in Barcelona again), and until the next trip.

(pictured: patrick.hubbard, ding, andre.domingues, and the inimitable Silvia Siva.)

Most network engineers enter the profession because we enjoy fixing things.  We like to understand how technology works.  We thrive when digging into a subject matter with focus and intensity.  We love protocols, acronyms, features, and esoteric details that make sense only to a small group of peers.  Within the ranks of fellow geeks, our technical vernacular becomes a badge of honor.  However, outside of our technical peers, we struggle to communicate effectively.


Our organizations rely on technology to make the business run.  But the deeper we get technically, the wider the communication gap between IT and business leadership becomes.  We must learn to bridge the gap between technology and business to deliver the right solutions.


I was reminded of this communication disparity when working on a circuit outage recently.  While combing through logs and reviewing interface statistics, a senior director asked for a status update.  I told him, “We lost BGP on our Internet circuits.”  He responded with a blank stare.  I had failed to shift my communication style and provided zero helpful information to my leadership.  I changed my approach and summarized that we lost the logical peering with our provider.  Although the physical circuit appeared to be up, we could not send Internet traffic because we were no longer receiving routing information from them.  My second response, though less precise, provided an understandable picture to my senior leadership and satisfied his question.  He had more confidence that I knew where the problem was, and it helped him understand what the escalation point should be.


When communicating with leadership about technical projects and problems, remember these things.


  1. Leadership doesn’t understand your jargon and they don’t need to.  I’ve heard many network engineers decry the intelligence of their leadership because management doesn't know the difference between an ARP table and a MAC address table.  This line of thinking is silly.  Management’s role is to understand the business, build a healthy organization, manage teams effectively, and provide resources to accomplish business goals.  Some degree of technical knowledge is helpful for front-line management, but the higher in the organization an individual is, the less detail they will need to know about each technical arena.  This is as it should be.  It’s your job to know the difference between an ARP table and a MAC address table and to summarize technical detail into actionable information.
  2. Management doesn’t always know the exact right question to ask.  I once had a manager describe a colleague as an individual who would provide only data, never analysis.  My boss felt as though he had to ask 30 questions to get a handle on the technical situation.  My colleague thought his sole responsibility was to answer the question precisely as asked, regardless of the value of that answer.  Don’t be that guy or gal.  Listen carefully and try to understand what your manager wants to know instead of parsing their words precisely.  Answer their question, then offer insight that you believe will help them do their job better.  Be brief, summarize, and don’t include so much technical detail that they check out before you get to the punchline.
  3. Effective communication is an art, more than a science.  At the end of the day, great communication happens in the context of strong professional relationships.  You don’t have to be best friends with your manager and you don’t need to spend time with them outside of the office.  However, you should work hard — as much as it depends on you — to build trust and direct, respectful communication channels with your leadership.  Don’t dig in your heels unnecessarily.  Give when you can and hold firm when you must.  If you develop a reputation as a team player, your objections will be taken more seriously when you must voice them.


Strong communication skills are the secret weapon of truly effective network engineers.  If you want to grow in influence within your organization, and you want to truly effect change, you’ll need to sharpen your soft skills along with your technical chops.

It’s a common story. Your team has many times more work than you have man-hours to accomplish it. Complexity is increasing, demands are rising, acceptable delivery times are dropping, and your team isn’t getting money for more people. What are you supposed to do? Traditionally, the management answer to this question is outsourcing, but that word comes with many connotations and many definitions. It’s a tricky word that often instills unfounded fear in the hearts of operations staff, unfounded hope in IT management, and sometimes (often?) works out far better for the company providing the outsourcing than the company receiving the services. If you’ve been in technology for any amount of time, you’re likely nodding your head right now. Like I said, it’s a common story.


I want to take a practical look at outsourcing and, more specifically, what outsourcing will never solve for you. We’ll get to that in a second though. All the old forms of outsourcing are still there and we should do our best to define and understand them.


Professional outsourcing is when your company pays someone else to execute services for you, usually because you have too many tasks to complete and too few people to accomplish them. This type of outsourcing solves the problem of staffing size and scaling. We often see it used for help desks, admin, and operational tasks. Sometimes it’s augmentative and sometimes it’s a means to replace a whole team. Either way, I’ve rarely seen it work all that well. My theory is that monetary motivation will never instill the same sense of ownership found in a native employee. That being said, teams don’t usually use this to augment technical capacity; rather, they use it to increase or replace the technical staff they currently have.


Outside of the staff augmentation style of outsourcing, and a form that usually finds more success, is process-specific outsourcing. This is where you hire experts to provide an application that doesn’t make sense for you to build, or to perform a specific service that is beyond reasonable expectation to handle yourself. This has had many forms and names over the years, but some examples might be credit card processing, application service providers, electronic health record software, and so on. Common modern names for this type of outsourcing are SaaS (Software-as-a-Service) and PaaS (Platform-as-a-Service). I say this works better because its purpose is augmenting your staff’s technical capacity, leaving your internal staff available to manage the product or service.


The final and newest iteration of outsourcing I want to quickly define is IaaS (Infrastructure-as-a-Service), or public cloud. The running joke is that the cloud is simply your software on someone else’s server, and there is a lot of truth in that. Where the joke falls short is that cloud providers have mastered automation, orchestration, and scaling in the deployment of their servers. This makes IaaS a form of outsourcing that is less about staffing or niche expertise, and more about solving the complexity and flexibility requirements facing modern business. You are essentially outsourcing complexity rather than tackling it yourself.
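To make "outsourcing complexity" concrete, here's a minimal sketch of API-driven provisioning, assuming AWS and the boto3 library; the AMI ID, region, and instance type are placeholders, not recommendations:

```python
# Minimal sketch of API-driven provisioning on an IaaS provider, assuming
# AWS and the boto3 library (pip install boto3). The AMI ID, region, and
# instance type are placeholders; credentials come from your AWS config.
import boto3

def launch_web_server():
    """Provision a single small instance; the provider handles the rest."""
    ec2 = boto3.resource("ec2", region_name="us-east-1")
    instances = ec2.create_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder AMI
        InstanceType="t3.micro",
        MinCount=1,
        MaxCount=1,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "Name", "Value": "web-demo"}],
        }],
    )
    print(f"Launched {instances[0].id}")

if __name__ == "__main__":
    launch_web_server()
```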


If you’ve noticed, in identifying the forms of outsourcing above, I’ve also identified what each truly provides from a value perspective. There is one key piece missing, though, and that brings me to the point of this post. It doesn’t matter how much you outsource, what type of outsourcing you use, or how you outsource it: the one thing you can’t outsource is responsibility.


There is no easy button when it comes to designing infrastructure, and none of these services provides you with a get-out-of-jail-free card if their service fails. None of these providers knows your network, requirements, outage tolerance, or user needs as well as you do. They are simply tools in your toolbox, and whether you’re augmenting staff to meet project demands or building cloud infrastructure to outsource your complexity, you still need people inside your organization making sure your requirements are being met and your business is covered if anything goes wrong. Design, resiliency, disaster recovery, and business continuity, regardless of how difficult they are, will always be something a company is responsible for itself.


100% uptime is a fallacy, even for highly redundant infrastructures run by competent engineering staffs, so you need to plan for such failures. This might mean multiple outsourcing strategies or a hybrid approach to what is outsourced and what you keep in house. It might mean using multiple providers, or multiple regions within a single provider, to provide as much redundancy as possible.


I’ll say it again, because I don’t believe it can be said enough. You can outsource many things, but you cannot outsource responsibility. That ultimately is yours to own.
