The title of this post says it all. In IT, we’ve said for years you need to take and test backups regularly to ensure your systems are being backed up correctly. There’s an adage in IT: “The only good backup is a tested backup.”


Why Do You Need Backups?


I bring this up because I have people coming to our consulting firm after some catastrophic event has occurred, and the first thing I ask for is the backups. Then the conversation usually gets awkward as they try to explain why there are no backups available. The reasons usually run the gamut from “we forgot to set up backups” to “the backup system ran out of space” to “the backup system failed months ago, and no one fixed it.”


Whatever the reason, the result is the same: there are no backups, there’s a critical failure, and no one wants to explain to the boss why they can’t recover the system. The system is down, and the normal recovery options aren’t available for one reason or another. In these cases, when there’s no backup available, the question becomes, “How critical is this to your business staying operational?” If it’s a truly critical part of your infrastructure, then the backups should be just as critical to the infrastructure as the system is. Those backups need to be taken and then tested to ensure the backup solution meets the needs of the company (and that the backups are being taken).


When planning the backups for a key system, the business owners need to be involved in setting the backup and recovery policies; after all, they’re responsible for the data. In IT terms, these policies are expressed as the Recovery Point Objective (RPO) and the Recovery Time Objective (RTO). In layman’s terms, the RPO is how much data the organization can afford to lose, and the RTO is how long the organization can afford to wait for the system to come back online. These numbers can be anything they need to be, but the smaller they are, the larger the financial cost of meeting them will be. If the business wants an RPO and RTO of 0, and the business is willing to pay for it, then it’s IT’s job to complete the request even if we don’t agree with it. And that means running test restores of those backups frequently, perhaps very frequently, as we need to ensure the backups we’re taking of the system are working.
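To make RPO and RTO concrete, here is a minimal Python sketch that checks one failure scenario against both objectives. The function name, the timestamps, and the 4-hour/8-hour targets are all made up for illustration; your business owners set the real numbers.

```python
from datetime import datetime, timedelta


def meets_objectives(last_backup, failure_time, restore_duration, rpo, rto):
    """Return (rpo_ok, rto_ok) for a single failure scenario."""
    # Everything written after the last good backup is lost.
    data_loss_window = failure_time - last_backup
    return data_loss_window <= rpo, restore_duration <= rto


# Hypothetical scenario: nightly backup at 2:00 a.m., failure at 2:30 p.m.,
# and a restore that took three hours in the last test.
rpo_ok, rto_ok = meets_objectives(
    last_backup=datetime(2019, 6, 1, 2, 0),
    failure_time=datetime(2019, 6, 1, 14, 30),
    restore_duration=timedelta(hours=3),
    rpo=timedelta(hours=4),   # assumed target: lose at most 4 hours of data
    rto=timedelta(hours=8),   # assumed target: be back online within 8 hours
)
print(rpo_ok, rto_ok)  # False True
```

In this made-up case the restore is fast enough, but a nightly backup can’t satisfy a 4-hour RPO. Note the restore duration only means something if it comes from an actual test restore.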


Why Is It Important to Test Backups?


Testing backups should be done no matter what kind of backups are being taken. If you think you can skip test restores because the backup platform reports the backup was successful, then you’re failing at taking backups. One of the mantras of IT is “trust but verify.” We trust that the backup software package did the backup it said it was going to do, but we verify it did the backup by testing it. If there’s a problem where the backup can’t be restored, it’s much better to find out when doing a test restore of the backup than when you need to restore the production system. If you wait until the production system needs to be restored to find out the backup failed, there’s going to be a lot of explaining to do as to why the system can’t be restored and what sort of impact that might have on the business, including, potentially, the company going out of business.
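As one way to picture the “verify” half of “trust but verify,” the sketch below compares every file in a source directory against its restored copy using SHA-256 checksums. It’s an illustration under assumptions, not how any particular backup product works: the function names and the file names are hypothetical, and in practice you’d more likely compare against checksums recorded at backup time, since the live system keeps changing.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Return relative paths that are missing or differ in the restored copy."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(str(rel))
    return problems


# Demo with throwaway directories: simulate a restore where one file came back damaged.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source"
    restored = Path(tmp) / "restored"
    source.mkdir()
    (source / "orders.db").write_bytes(b"order data")
    (source / "users.db").write_bytes(b"user data")
    shutil.copytree(source, restored)                  # stand-in for the restore job
    (restored / "users.db").write_bytes(b"truncated")  # simulated corruption
    problems = verify_restore(source, restored)

print(problems)  # ['users.db']
```

A backup job that reported “success” would look fine here right up until the checksum comparison flags the damaged file.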

(Disclaimer: This post was co-written by Leon and Kevin and presents a point/counterpoint style view of Cisco Live US 2019. But without either calling the other “ignorant” or insulting their choice in quantity of romantic liaisons.)


Kevin M. Sparenberg

This year, once again, it was my absolute pleasure to be part of the staff that attended Cisco Live! US. Part of my time was spent in the booth and the rest of my time was spent in various sessions to catch up on everything from the last year. If there’s one thing that really resonated with me this year, even more than in previous years, it was how quickly IT is moving.

This year, in a flurry of last-minute activity, I discovered I would be attending Cisco Live. Because of its proximity to the Jewish holiday of Shavuot (which is most notable for the commandment to consume vast quantities of cheesecake. I regret nothing.), I didn’t think I’d be attending. But I did, and I’m extremely happy the powers-that-be at SolarWinds helped make that happen. Because let me tell you, times, they are a-changing!

Leon Adato
Kevin M. Sparenberg

At the SolarWinds® booth, we talked about all the greatest additions and updates to the portfolio of products. In the last year, SolarWinds has released 32 updates to existing products or completely new products. To say it’s been a busy year may just be a tad bit of an understatement. For booth visitors, we got to show off some of the new enhancements to our network management portfolio and retread some of the past favorites (looking at you here, NetPath).

It really should come as no surprise that “it’s been a busy year.” They’re all busy. You folks keep telling us about persistent problems and challenges, and that’s what we set out to fix. If you’ve been following these Cisco Live reports for even a couple of years, you know that this is the time when we release those solutions into the wild.

Leon Adato
Kevin M. Sparenberg

This year, more than most, I spoke with people whose title didn’t include the word “network.” On the surface this would seem odd, but more and more organizations are recognizing that networks aren’t what their end users interact with; they’re the highway on which those interactions happen. Having a blazing fast network is great but doesn’t help if you are connecting to a server which takes 1,500 milliseconds to even send back the first bit of data. If that happens, you’re still going to have a horrible experience. As a former network engineer, I know the network is always blamed first, but how often is it really a network problem?

From MS Ignite to VMWorld to Cisco Live, I’ve been noticing that the same people are at every show. I don’t mean the same EXACT people, like some weird convention groupies or something. I mean that rather than having strictly systems folks coming to Ignite, virtualization admins at VMWorld, and network engineers at Cisco Live, the people who visit are concerned with multiple (or all) areas of their environment. They’re as likely to talk about virtual database instances on vSANs as they are about load balancing or traditional route/switch.

Leon Adato

Day One


Kevin M. Sparenberg

But those who know us know that we like to have fun at these events. One example was #KiltedMonday, a tradition in which, on the Monday of Cisco Live! US, attendees are encouraged to wear kilts instead of the blasé pants. I added a kilt to my ensemble a few years ago, and I kept it strong this year. As good as it was to be able to show off my shins, it was good to speak with everyone who came up to the booth. But the best part, the absolute best part, was introducing some new SolarWinds people to the booth experience.


We had a handful of “raw” recruits this year. This wasn’t just their first Cisco Live, this was their first ever event with SolarWinds. Monday is always a mad dash for swag, new feature demos, and saying “hi” to old friends. All I can say is that the new recruits did a smashing job at handling the flow, pivoting on topics, and responding to questions. Day one is typically the most hectic and a good trial-by-fire. Not too much unlike starting a new job in IT.

Monday was still a holiday for me.

Leon Adato
Kevin M. Sparenberg

Status update: Voice is already scratchy. Must come up with a regimen so I don’t “talk” myself hoarse on the first day.

Day Two


This was my first day on the show floor, and I have to tell you there’s no convention like Cisco Live. It’s hard to describe the intensity of the folks who come to the booth, the level of depth we get to go into in our conversations and demos, and the sheer love of the product, of the philosophy of monitoring, and of the team. I’ve taken to asking visitors to the booth what THEY think the “big story” of Cisco Live is. On Day 2, all anyone could talk about was the changes to the Cisco certification program. Immediately after the keynote, there were a lot of emotions… ahem… “passionately held and strongly worded opinions” across the convention floor, but by the end of the day the prevailing wisdom could be summarized as:

  1. It’s not changing immediately
  2. Most certifications aren’t going away
  3. This makes it easier for certain folks (especially CCIEs) to re-up their cert

…and thus, calm was restored to the masses.

Leon Adato
Kevin M. Sparenberg

On day 2, we held a SolarWinds User Group™ at Roy’s Restaurant right next to the San Diego Convention Center. The venue was gorgeous with vistas overlooking the marina and fantastic Polynesian fusion food. The signature mai tais were not to be missed. At that event, Chris O’Brien, Group Product Manager for the SolarWinds Network Management portfolio, talked about all the goodness we delivered the week before. Then it was up to me to talk about everything else that SolarWinds released in the last year. Needless to say, I didn’t cover everything, or I’d still be there talking. Part of SWUG™ is the presentations, but most of SWUG is the conversations between customers. The best, most important part of our community is you, the people who work with our solutions every day. I’m constantly amazed at how gracious and courteous you are with any newcomers to the THWACK® community. You share your expertise, tips, tricks, and more with anyone willing to ask. If I had the opportunity to shake hands with every member who has helped me over the years, I’d do it in a heartbeat and be humbled by the experience. This is just one reason that I feel this is the greatest community in IT. With day two wrapping, I was both exhausted and energized for the rest of the week.

Day Three


Kevin M. Sparenberg

The post-SWUG hangover is always a rough day. It’s not to say that it’s a true hangover, but the exhilaration of speaking with so many people can make the follow-up day feel just a little bit longer. Thankfully, there was a new wave of swag in the booth, which makes the customer-in-me always happy. Today’s “word” of the day was API and I don’t care how you pronounce it (cue Leon groaning at me). I find it very profound that technology is so cyclical. When we started, everything was command line interface only, then we moved onto a GUI, and now we’re back to command line. It’s just one of those oddities that I’ve picked up being in the industry for so many years. As new technology comes out, there’s a push, nearly a requirement, to make sure that you can attach to it via a programmable interface.

The big story from the floor was the way Cisco is adopting a “Meraki style” approach to integrating new features into their product. What I mean by that is this: When you have Meraki wireless controllers and a new feature comes out, the software updates in the cloud and BOOM! you’re in business. And honestly, the lack of this has been one of the big complaints of CLUS attendees for several years. They come to the show, get all hyped up on the latest tech (including DNA and intent-based networking), and then go back to their office, only to realize that the molecules they have on site (i.e., the physical gear) can’t handle it, and won’t be retired for several years. While the bean counters may complain about subscription-based models, the fact that it will get the new code into our shops is going to change the speed at which new technologies, protocols, and techniques are adopted.

Leon Adato

Day Four


Kevin M. Sparenberg

The last day on the show floor for the booth staff brought both sighs of relief and sadness at parting. I love talking to people, so the final day always makes me a little sad. I love hearing how everyone is handling their own IT infrastructure. No one has infinite time, so you’ll never get to work hands-on with every technology. Being able to hear what various attendees got from the event is always a high point. Yes, there were product announcements, new technologies, and enhancements aplenty, but the real thing for me is the community that happens at events. Part of that is connecting with other people. Those people could be coworkers from another office, other IT professionals you’ve gone to for advice, or new people you meet for the first time. One of the ways that I leveraged this event to reconnect was to go to lunch with my friend and co-worker, Leon, where we discussed our personal findings on the event. That, in part, was the impetus of this post.

One attendee told me he thought that “the ubiquity of the story around ‘umbrella’” was the big story for the show. Which is awesome (because #SECURITY!) but it’s also amusing, because you can find “Introducing Umbrella” as a session at every Cisco Live (US, EUR, LATAM, and APAC) back to 2016. Even at SolarWinds, we know that sometimes you have to keep talking about an idea for a bit before people start to notice it (looking at you, AppStack™). The other story worth noting was just how busy the expo floor remained. We had folks coming into the booth asking serious questions, and looking for in-depth demonstrations, up to 30 minutes AFTER they closed the show down. Now THOSE are some dedicated IT professionals!

Leon Adato

The Completely Un-Necessary Summary


Kevin M. Sparenberg

At the end of the week, the biggest takeaway for me was the number of non-network people walking around Cisco Live. This concerned a handful of the booth staff, some of the other vendors, and many of the attendees, but not me. For me, this was the year when topics started shifting from an us vs. them to more of an us with them focus. Functional silos are great for training and specialization, but horrible for troubleshooting real-world issues. The number of systems administrators, security operations professionals, and monitoring engineers this year was up and that made me smile. Attendees inquiring about how complete stack monitoring can help them get to root causes (or at least to stop spinning their wheels chasing red herrings) was an absolute pleasure to hear. Even if the problem isn’t in your wheelhouse, ignoring it doesn’t make it go away. Good tools can help pinpoint the problem and get you back to happy.

Next up for me is the New York SWUG and I’m looking forward to connecting with people again at that event. Hopefully by that time, my voice will have returned to its pre-event levels. Cisco Live may be over for this year, but that doesn’t mean the lessons learned or the connections made will go away.

As Kevin and I both noted, Cisco Live is unlike any other show we attend. At the same time, this year was different in one notable way—while every other year the visitors to our booth were both steady (the booth really never got empty) and relentless (we’d usually be doing a demo and have one or more people queueing up for a demo after we were done); this year we saw distinctive ebbs and flows. At first I worried that SolarWinds had somehow lost its mojo, but then we saw that the rest of the expo floor was equally barren. By the third day, I was asking around for reasons. It turns out that there was a rumor going around, that if you badged into a session and left early, the RFID sensors would pick it up and you wouldn’t get credit for attendance. That kept butts in chairs when they might otherwise have used the time to talk to your favorite monitoring enthusiasts. We’ll have to wait until next year to see if that rumor continues to hold sway.

Speaking of next year, mark your calendar and start saving your pennies. Cisco Live returns to Las Vegas, running from May 30 – June 4. For those of you with a Jewish holiday calendar handy, that means it starts the day AFTER Shavuot. I’ll make sure I have plenty of leftover cheesecake on hand.

Leon Adato

Those are our thoughts. What was your experience?

The hybrid cloud makes everything better. It's helping drive down costs, it's making the software we use to run our data centers easier, and it's pushing innovation. The public cloud has stolen the headlines for years, and for good reasons. A lot of exciting innovations in the public cloud have been happening and will continue to happen. Even though innovation excels in the public cloud, we find ourselves drawn back to our private data centers because companies need access to the data in those private data centers. This data fuels the business. Traditional technology companies that have focused on private data centers have listened and are starting to adopt a hybrid methodology.


1. Hybrid Cloud Can Reduce Cost

Competition is a good thing. It pushes everyone to do their best, to listen to the customer's request, and provides choice for consumers. Choice lets consumers decide between two or more products. It allows consumers to evaluate features of multiple products and decide if the cost is worth the investment. Private data center customers have more choices today than ever before. Customers can save money by reducing their hardware footprint. Servers in a private data center are paid upfront and are provisioned based on a company's peak utilization. The cost is the same whether workloads are idle or at maximum capacity. Additional server utilization may be seasonal or is only needed during a specific timeframe. Hybrid cloud allows us to stop over-provisioning workloads on-premises and move that extra compute power to the cloud, thus helping reduce costs because of the new options the hybrid cloud provides.
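To see why provisioning for peak can cost more than a baseline plus cloud burst, here is a back-of-the-envelope sketch in Python. Every number in it is invented for illustration (server counts, per-server cost, cloud hourly rate), and it ignores real-world line items like data egress, licensing, and staff time, so treat it as the shape of the calculation rather than a pricing model.

```python
def annual_cost_on_prem(peak_servers, cost_per_server_year):
    """On-prem capacity is sized for peak and costs the same whether busy or idle."""
    return peak_servers * cost_per_server_year


def annual_cost_hybrid(baseline_servers, cost_per_server_year,
                       burst_servers, burst_hours, cloud_rate_per_server_hour):
    """Keep the steady-state load on-prem and rent the seasonal peak by the hour."""
    on_prem = baseline_servers * cost_per_server_year
    cloud = burst_servers * burst_hours * cloud_rate_per_server_hour
    return on_prem + cloud


# Hypothetical shop: peak needs 100 servers, but 60 cover the normal load.
# The extra 40 are only needed for about 30 days (720 hours) a year.
on_prem = annual_cost_on_prem(peak_servers=100, cost_per_server_year=3000)
hybrid = annual_cost_hybrid(
    baseline_servers=60,
    cost_per_server_year=3000,
    burst_servers=40,
    burst_hours=720,
    cloud_rate_per_server_hour=0.50,
)
print(on_prem, hybrid)  # 300000 194400.0
```

With these made-up inputs the hybrid option comes out cheaper, but the result flips quickly if the burst period is long or the cloud rate is high, which is exactly why it's worth running the numbers for your own workloads.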


2. Reduction in Complexity

Every IT shop wants a well-maintained infrastructure. We don't want servers to go down, switches to burn out, or applications to stop. There's a lot of moving parts when you own the whole stack. You have to worry about cooling the servers, powering equipment, protecting equipment from theft or damage—and this is just layer one. The public cloud manages everything at the physical layer, so that's one less thing IT groups have to worry about. The real advantage of reducing complexity is when you can offload everything below layer seven, and only worry about the application.


Public cloud and Software-as-a-Service companies provided this new way of allowing customers to access their applications and data without the need to maintain the underlying infrastructure. Traditional data center companies have taken note and are providing the same level of service. These vendors are now providing their services as a cloud-based offering where the control plane and infrastructure hardware are updated and maintained by them. This offloads software upgrades from the on-premises systems administrators and frees the admins from worrying about 2 a.m. service calls because of a hardware failure. Isn't this what we're after: reducing the complexity of our environments?


3. A Platform for Business Innovations

Speed is the name of the game in today's business world. Getting new products and services out and into the customer's hands is everyone's goal today. Public cloud enables companies to test new services faster than ever before. Services in the cloud are ready to be consumed and can be tested with the swipe of a credit card. This allows companies to test new services, fail, and move on to test the next service faster. No more waiting for hardware to arrive and be provisioned for a single test that could last weeks, if not months. Instead, this time is used to test and develop new services.


Innovation doesn't always come in the form of technology. From a financial perspective, the cloud provides the ability to purchase services as an operating expense. On-prem equipment is usually purchased as a capital expense. Capital expenses (CapEx) are usually paid upfront and depreciate over multiple years. Operating expenses (OpEx) are used to run the business and are paid monthly or annually. Items classified as CapEx take longer to get approvals from management or require C-level approval because all the money is needed upfront for those purchases. When IT requests a purchase for hardware, the business needs all the money now. Public cloud is an OpEx consumption model, and data center vendors are helping customers move away from the CapEx model and into the OpEx model. Data center vendors are providing consumption-based models to on-prem data center customers and offering them a pay-as-you-go model. This closely resembles the cloud purchasing model of paying for only what you use, and it gives the flexibility to consume additional resources during peak times, but for private data centers.



The rise of public cloud has given companies the ability to quickly test out new services to see if they meet desired requirements with little financial investment from the organization. Shorter test cycles and the ability to fail fast allow the company to find the right choice and get their own services to market faster. Linking the public cloud to the private data center so cloud services can access company data is the ideal architecture. This hybrid cloud model is helping people get excited about IT again. It's allowed for new services to be explored and offered to customers. The customers are getting better products, sometimes at a reduced price. With flexibility comes complexity, but moving our daily tasks to a more managed service offering reduces that complexity because there's less to worry about. These new options are forcing old technology companies to innovate and bring new offerings to their customers. These new offerings don't have to be technical; they can be new payment offerings. No matter how you look at it, the hybrid cloud just makes everything better.

Home after a couple weeks on the road between Cisco Live! and Data Grillen. My next event is Microsoft Inspire, and if you're attending, please stop by the booth so we can talk data.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


Meds prescriptions for 78,000 patients left in a database with no password

This is the second recent breach involving a MongoDB and underscores the need for consequences to those who continue to practice poor security methods. Until we see stiffer penalties to the individuals involved, you can expect those rockstar MongoDB dev teams to get new jobs and repeat all the same mistakes.


Florida City Pays $600,000 Ransom to Save Computer Records

Never, ever pay the ransom. There's no guarantee you get your files, and you become a target for others (because now they know you will pay). Also? Time to evaluate your security response plan regarding ransomware, especially if you're running older software. It's just a matter of time before Anton in Accounting clicks on that phishing link.


AMCA Files for Bankruptcy Following Data Breach

Nice reminder for everyone that the result of a breach is your company goes out of business. Life comes at you fast.


Machine Learning Doesn’t Introduce Unfairness—It Reveals It

Great post. Machine learning algorithms are not fair, because the data they use has inherent bias. And the machines are good at uncovering that bias. In some ways, we humans have built these code machines, and the result is we are looking at ourselves in the mirror.


Microsoft bans Slack and discourages AWS and Google Docs use internally

Because the free version of Slack doesn't meet Microsoft security standards. Maybe that should have been the headline instead of the clickbait trying to portray Microsoft as evil.


Cyberattack on Border Patrol subcontractor worse than previously reported

Your security is only as strong as your weakest vendor partner. Your security protocols could be the best in the world but it won't matter if you allow a partner access and they cause the breach.


Nashville is banning electric scooters after a man was killed

This is absurd. The scooters didn't do anything wrong. They should not be penalized for the actions of a drunk person making bad choices. I look forward to the mayor banning cars the next time a drunk driver kills someone in downtown Nashville.


Words can not describe the glory that is Data Grillen:


By Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering


Here’s an interesting article by my colleague Jim Hansen. Jim offers three suggestions to help with security, including using automation and baselines. I agree with Jim that one of the keys is to try to keep security in the background, and I’d add it’s also important to be vigilant and stick to it.


Government networks will continue to be an appealing target for both foreign and domestic adversaries. Network administrators must find ways to keep the wolves at bay while still providing uninterrupted and seamless access to those who need it. Here are three things they can do to help maintain this delicate balance.


1. Gain Visibility and Establish a Baseline


Agency network admins must assess how many devices are connected to their networks and who’s using those devices. This information can help establish visibility into the scope of activity taking place, allow teams to expose shadow IT resources, and root out unauthorized devices and users. Administrators may also wish to consider whether or not to allow a number of those devices to continue to operate.


Then, teams can gain a baseline understanding of what’s considered normal and monitor from there. They can set up alerts to notify them of unauthorized devices or suspicious network activity outside the realm of normal behavior.
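The baseline-then-alert idea can be sketched in a few lines of Python. This is a simplified illustration, not a description of any specific monitoring product: the function names are hypothetical, the device counts are made up, and the "more than three standard deviations from the mean" rule is just one common way to define "outside the realm of normal behavior."

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical observations as (average, spread)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the average."""
    average, spread = baseline
    if spread == 0:
        return value != average
    return abs(value - average) / spread > threshold


# Hypothetical history: number of devices seen on a network segment each day.
history = [40, 42, 38, 41, 39, 43, 40, 41]
baseline = build_baseline(history)

print(is_anomalous(42, baseline))  # False: within the normal range
print(is_anomalous(75, baseline))  # True: worth an alert
```

A real deployment would track many metrics per device and account for time-of-day and seasonal patterns, but the principle is the same: learn what normal looks like first, then alert on the exceptions.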


All this monitoring can be done in the background, without interrupting user workflows. The only time users might get notified is if their device or activity is raising a red flag.


2. Automate Security Processes


Many network vulnerabilities are caused by human error or malicious insiders. Government networks comprise many different users, devices, and locations, and it can be difficult for administrators to detect when something as simple as a network configuration error occurs, particularly if they’re relying on manual network monitoring processes.


Administrators should create policies outlining approval levels and change management processes so network configuration changes aren’t made without approval and supporting documentation.


They can also employ an automated system running in the background to support these policies and track unauthorized or erroneous configuration changes. The system can scan for unauthorized or inconsistent configuration changes falling outside the norm. It can also look for non-compliant devices, failed backups, and even policy violations.
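As a rough illustration of how a system might spot an unauthorized configuration change, the Python sketch below diffs a device's running configuration against the last approved version using the standard library's `difflib`. The function name and the configuration lines are hypothetical examples, and a real tool would also pull the configs from devices, tie changes to approvals, and raise alerts.

```python
import difflib


def config_drift(approved, running):
    """Return only the lines that were removed from or added to the approved config."""
    diff = difflib.unified_diff(
        approved.splitlines(),
        running.splitlines(),
        fromfile="approved",
        tofile="running",
        lineterm="",
    )
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Hypothetical configs: someone changed the SNMP community string without approval.
approved = "hostname edge-01\nsnmp-server community private RO\nno ip http server\n"
running = "hostname edge-01\nsnmp-server community public RW\nno ip http server\n"

changed = config_drift(approved, running)
for line in changed:
    print(line)
# -snmp-server community private RO
# +snmp-server community public RW
```

An empty result means the running config matches what was approved; anything else is a change that should have a ticket behind it.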


When a problem arises, the system can automatically correct the issue while the IT administrator surgically targets the problem. There’s no need to perform a large-scale network shutdown.


Automated and continuous monitoring for government IT can go well beyond configuration management, of course. Agencies can use automated systems to monitor user logs and events for compliance with agency security policies. They can also track user devices and automatically enforce device policies to help ensure no rogue devices are using the network.


Forensic data captured by the automated system can help trace the incident back to the source and directly address the problem. Through artificial intelligence and machine learning, the system can then use the data to learn about what happened and apply that knowledge to better mitigate future incidents.


3. Lock Down Security Without Compromising Productivity


The systems and strategies outlined above can maintain network security for government agencies without interfering with workers’ productivity. Only when and if something comes up is a user affected, and even then, the response will likely be as unobtrusive as simply denying network access to that person.


In the past, that kind of environment has come with a cost. IT professionals have had to make a choice between providing users with unfettered access to the tools and information they need to work or tightening security to the point of restriction.


Fortunately, that approach is no longer necessary. Today, federal IT administrators can put security at the forefront by making it work for them in the background. They can let the workers work—and keep the hackers at bay.


Find the full article on GCN.


The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

Are you excited for this post? I certainly know I am!


If this is the first article you're seeing of mine and you haven't read my prior article, "Why Businesses Don't Want Machine Learning or Artificial Intelligence," I advise you go back and read it as a primer for some of the things I'm about to share here. You might get “edumicated,” or you might laugh. Either way, welcome.


Well, officially hello returning readers. And welcome to those who found their way here due to the power of SEO and the media jumping all over any of the applicable buzzwords in here.


The future is now. We’re literally living in the future.


Tesla Self-Driving, look ma no hands!

Image: Tesla/YouTube


That's the word from the press rags if you've seen the video of Tesla running on full autopilot and doing a complicated series of commute/drives cited in the article, "New Video Shows Tesla's Self-Driving Technology at Work." And you would be right. Indeed, the future IS now. Well, kind of. I mean it's already a few seconds from when I wrote the last few words, so I'm clearly traveling through time...


But technology and novelty like driving in traffic are a little more nuanced than this.


"But I want this technology right now. Look, it works. Stop your naysaying. You hate Tesla, blah blah blah, and so on."


That feels very much like Apple/Android fanboy or fan-hate when someone says anything negative about a thing they want/like/love. Nobody wants this more than me. (Well, except for the robots ready to kill us.)


Are There Really Self-Driving Teslas?


You might be surprised to know that Tesla has advanced significantly in the past few years. I know, imagine that—so much evolution! But just as we reward our robot overlords for stopping at a stop sign, they're only as good as the information we feed them.


We can put the mistakes of 2016 behind us with tragedies like this: "Tesla self-driving car fails to detect truck in a fatal crash."


Fortunately, Tesla continues to improve and get better and we'll be ready to be fully autonomous with self-driving vehicles roaming the roads without flaw or problem by 2019. (Swirly lines, and flashback to just a few months into 2019: Tesla didn't fix an Autopilot problem for three years, and now another person is dead.)


Is the Tesla Autopilot Safe?


Around this time, as I was continuing my study, research, and analysis of this and problems like it, I came across the findings of @greentheonly.


Trucks are misunderstood and prefer to be identified as Overpasses



And rightly so, we can call this an anomaly. This doesn't happen that frequently. It's not a big deal, except for when it does happen. Not only just when, but the fact that it does happen… whether it's seeing the undercarriage of a truck and interpreting it as an overpass that you can "safely pass" under, shearing the top off of the cab, or seeing a truck to the side of you and interpreting the space beneath the truck as a safe “lane” to change into.


But hey, that's Autopilot. We're not going to use that anymore until it's solid, refined, and safe. Then the AI and ML can't kill me. I'll be making all the decisions.



ELDA Taking Control!

ELDA may be the death of someone!



If you recall in the last article, I mentioned the correlation of robots, Jamba Juice, and plasma pumps. Do you ever wonder why Boston Dynamics doesn't have robot police officers like the ED-209 working on major metro streets, providing additional support akin to RoboCop? (I mean, other than the fact that they're barely allowed to use machine learning and computer vision. But I digress.)


It’s because they're not ready yet. They don't have things fully baked. They need a better handle on the number of “faults” that can occur, not to mention liability.


Is Autonomous Driving Safe?


Does this mean, though, that we should stop where we are and no one should use any kind of autonomous driving function? (Well, partially...) The truth is, there are, on average, 1.35 million road traffic deaths each year. Yes, that figure is worldwide, but that figure is also insanely staggering. If we had autonomous vehicles, we could greatly diminish the number of accidents we experience on the roads, which could bring those death tolls down significantly.


And someday, we will get there. The vehicles’ intelligence is getting better every day. They make mistakes, sometimes not so bad—"Oh, I didn't detect a goose on the road as an object/living thing and ended up running it over." Or, "The road was damaged in an area, so we didn't detect that was a changing lane/crossing lane/fill-in-the-blank of something else."


The greatest strength of autonomous vehicles like Tesla, Waymo, and others is their telemetry. But their greatest weakness is their reliance solely on some points of telemetry.


Can Self-Driving Cars Ever Be Safe?


In present-day 2019, we rely on vehicles with eight cameras (hey, that's more eyes than us humans have!), some LIDAR data, and a wealth of knowledge of what we should do in conditions on roadways. Some complaints I've shared with various engineers of some of these firms are the limitations of these characteristics, mainly that the cameras are fixed, unlike our eyes.





So, if we should encounter a rockslide, a landslide, something falling from above (tree, plane, meteorite, car-sized piece of mountain, etc.) we'll be at the will of the vehicle and its ability to identify and detect this. This won't be that big of a deal, though. We'll encounter a few deaths here or there, it'll make the press, and they'll quietly cover it up or claim to fix it in the next bugfix released over the air to the vehicles (unlike the aforementioned problem that went unsolved for three years).


The second-biggest problem we face is, just like us, these vehicles are (for the most part) working in a vacuum. A good and proper future of self-driving autonomy will involve the vehicles communicating with each other, street lights, traffic cameras, environmental data, and telemetry from towers, roads, and other sensors. Rerouting traffic around issues will become commonplace. When an ambulance is rushing someone to a hospital, it can clear the roadways in favor of emergency vehicles. Imagine if buses ran on the roads efficiently. The same could be true of vehicles.


That's not a 2020 vision. It’s maybe a 2035 or 2050 vision in some cities. But this is a future that can be well seen ahead of us.


The Future of Tesla and Self-Driving Vehicles


It may seem like I’m critical of Tesla and their Autopilot programs. That’s because I am. I see them jumping before they crawl. I've seen deaths rack up, and I've seen many VERY close calls. It's all in the effort of better learning and training. But it's on the backs of consumers and on the graves of the end users who've been made to believe these vehicles are tomorrow's self-drivers. In reality, they’re in an Alpha state with the sheer limited amount of telemetry available.


Will I use Autopilot? Yeah, probably... and definitely in an effort of discovering and troubleshooting problems because I'm the kind of geek who likes to understand things. I don't have a Tesla yet, but that's only a matter of time.


So, I obviously cannot tell you what to do, with your space-age vehicle driving you fast forward into the future, but I will advise you to be cautious. I've had to turn ELDA off on my Chevy Bolt as it has been steering me into traffic, and that effectively has little to nothing I would consider "intelligent" in the grand scheme of things.


At the start of this article, I asked if you were as excited as I was. I'm not going to ask if you're as terrified as I am! I will ask you to be cautious. Cautious as a driver, cautious as a road-warrior. The future is close, so let's see you live to see it with the rest of us. Because I look forward to a day where the number 10 cause of death worldwide is a thing of the past.


Thank you and be safe out there!

We put in all this work to create a fully customized toolchain tailored to your situation. Great! I bet you're happy you can change small bits and pieces of your toolchain when circumstances change, and the Periodic Table of DevOps tools has become a staple when looking for alternatives. You're probably still fighting the big vendors that want to offer you a one-size-fits-all solution.


But there's a high chance you're drowning in tool number 9,001. The tool sprawl, like VM sprawl after we started using virtualization 10-15 years ago, is real. In this post, we'll look at managing the tools managing your infrastructure.


A toolchain is supposed to make it easier for you to manage changes, automate workflows, and monitor your infrastructure, but isn't it just shifting the problem from one system to another? How do we prevent spending disproportionate amounts of time managing the toolchain, keeping us from more important value-add work?


The key here isn’t just to look at your infrastructure and workflows with a pair of LEAN glasses, but also to look at the toolchain with the same perspective. Properly integrating tools and selecting the right foundation makes all the difference.



One key aspect of managing tool sprawl is properly integrating all the tools. With each handover between tools, there's a chance of manual work, errors, and friction. But what do we mean by properly integrating? Simply put: no manual work between steps, and as little custom code or scripting as possible.


In other words, tools integrating natively with each other take less custom work. The integration comes ready out of the box and is an integral, vendor-supported part of the product. This means you don't have to do (much) work to integrate it with different tools. Selecting tools that natively integrate is a great way to reduce the negative side effects of tool sprawl.


Fit for Purpose to Prevent DIY Solutions

Some automation solutions are very flexible and customizable, like Microsoft PowerShell. It's widely adopted, very flexible, highly customizable, and one of the most powerful tools in an admin's tool chest, but getting started with it can be daunting. This flexibility leads to some complexity, and often there's no single way of accomplishing things. This means you need to put in the work to make PowerShell do what you want, instead of a vendor providing workflow automation for you. Sometimes, it's worthwhile to use a fit-for-purpose automation tool (like Jenkins, SonarQube, or Terraform) that has automated the workflows for you instead of a great task automation scripting language. Developing and maintaining your own scripts for task automation and configuration management can easily take up a large chunk of your workweek, and some of that work has little to do with the automation effort you set out to accomplish in the first place. Outsourcing that responsibility to the company (or open-source project) that created the high-level automation logic (Terraform by HashiCorp is a great example of this) makes sense, saving your time for job-specific work adding actual value to your organization.


If you set out to use a generic scripting language, choose one that fits your technical stack (like PowerShell if you're running a Microsoft stack) and technical pillar (such as Terraform for infrastructure-as-code automation).

Crisis Averted. Here's Another.

So, your sprawl crisis has been averted. Awesome; now we have the right balance of tools doing what they're supposed to do. But there's a new potential hazard around the corner: lock-in.


In the next and final post in this series, we'll have a look at the different forms of lock-in: vendor lock-in, competition lock-in, and commercial viability. 


Home from San Diego and Cisco Live! for a few days, just enough time to spend Father's Day with friends and family. Now I'm back on the road, heading to Lingen, Germany, for Data Grillen. That's right, an event dedicated to data, bratwurst, and beer. I may never come home.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


This Austrian company is betting you’ll take a flying drone taxi. Really.

I *might* be willing to take one, but I’d like to know more about how they plan to control air traffic flow over city streets.


How Self-Driving Cars Could Disrupt the Airline Industry

Looks like everyone is going after airline passengers. If airlines don’t up their level of customer service, autonomous cars and flying taxis could put a dent in airline revenue.


Salesforce to Buy Tableau for $15.3 Billion in Analytics Push

First, Google bought Looker, then Salesforce buys Tableau. That leaves AWS looking for a partner as everyone tries to keep pace with Microsoft and PowerBI.


Genius Claims Google Stole Lyrics Embedded With Secret Morse Code

After decades of users stealing content from websites found with a Google search, Google decides to start stealing content for themselves.


Your Smartphone Is Spying (If You’re a Spanish Soccer Fan)

Something tells me this is likely not the only app that is spying on users. And while La Liga was doing so to catch pirated broadcasts, this is a situation where two wrongs don’t make a right.


Domino’s will start robot pizza deliveries in Houston this year

How much do you tip a robot?


An Amazing Job Interview

I love this approach, and I want you to consider using it as well, no matter which side of the interview you are sitting.


Sorry to all the other dads out there, but it's game over, and I've won:


By Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering


Here’s an interesting article by my colleague Brandon Shopp about choosing a database tool to help improve application performance and support advanced troubleshooting. Given the importance of databases to application performance, I’m surprised this topic doesn’t get more attention.


The database is at the heart of every application; when applications are having performance issues, there’s a good chance the database is somehow involved. The greater the depth of insight a federal IT pro has into its performance, the more opportunity to enhance application performance.


There’s a dramatic difference between simply collecting data and correlating and analyzing that data to get actionable results. For example, often, federal IT teams collect highly granular information that’s difficult to analyze; others may collect a vast amount of information, making correlation and analysis too time-consuming.


The key is to unlock a depth of information in the right context, so you can enhance database performance and, in turn, optimize application performance.


The following are examples of “non-negotiables” when collecting and analyzing database performance information.


Insight Across the Entire Environment


One of the most important factors in ensuring you’re collecting all necessary data is to choose a toolset providing visibility across all environments, from on-premises, to virtualized, to the cloud, and any combination thereof. No federal IT pro can completely understand or optimize database performance with only a subset of information.


Tuning and Indexing Data


Enhancing database performance often requires significant manual effort. That’s why it’s critical to find a tool that can tell you where to focus the federal IT team’s tuning and indexing efforts to optimize performance and simultaneously reduce manual processes.


Let’s take database query speed as an example. Slow SQL queries can easily result in slow application performance. A quality database performance analyzer presents federal IT pros with a single-pane-of-glass view of detailed query profile data across databases. This guides the team toward critical tuning data while reducing the time spent correlating information across systems. What about indexing? A good tool will identify and validate index opportunities by monitoring workload usage patterns, and then make recommendations regarding where a new index can help optimize inefficiencies.


Historical Baselining


Federal IT pros must be able to compare performance levels (for example, expected performance with abnormal performance) by establishing historic baselines of database performance, looking at performance at the same time on the same day last week, the week before, and so on.


This is the key to database anomaly detection. With this information, it’s easier to identify a slight variation—the smallest anomaly—before it becomes a larger problem.
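As a rough illustration of baselining, the sketch below compares a current reading against samples taken at the same time on the same weekday in earlier weeks. The z-score test, the threshold of 3, and the sample wait times are all assumptions made for this example; a real database performance tool may use a different method entirely.

```python
from statistics import mean, stdev


def is_anomalous(current, history, threshold=3.0):
    """Flag `current` if it deviates from the baseline by more than
    `threshold` standard deviations.

    `history` holds the same metric sampled at the same time on the
    same weekday in previous weeks.
    """
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # No variation in the history: any different value stands out.
        return current != baseline
    return abs(current - baseline) / spread > threshold


# Hypothetical query wait times (ms) for Tuesdays at 09:00, last five weeks.
tuesday_9am = [120, 118, 125, 122, 119]

print(is_anomalous(123, tuesday_9am))  # close to the usual Tuesday value
print(is_anomalous(180, tuesday_9am))  # far outside the usual Tuesday range
```

The point of keying the history to the same hour and weekday is that a value that is normal on Monday night may be unusual on Tuesday morning.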


Remember, every department, group, or function within your agency relies on a database in some way or another, particularly to drive application performance. Having a complete database performance tool enables agencies to stop finger-pointing and pivot from being reactive to being proactive; from having high-level information to having deeper, more efficient insight.


This level of insight can help optimize database performance and make users happier across the board.


Find the full article on our partner DLT’s blog Technically Speaking.


  The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

This is a continuation of one of my previous blog posts, Which Infrastructure Monitoring Tools Are Right for You? On-Premises vs. SaaS, where I talked about the considerations and benefits of running your monitoring platform in either location. I touched on the two typical licensing models geared either towards a CapEx or an OpEx purchase. CapEx is something you buy and own, or a perpetual license. OpEx is usually a monthly spend that can vary month-to-month, like a rental license. Think of it like buying Microsoft Office off-the-shelf (who does that anymore?) where you own the product, or you buy Office 365 and pay month-by-month for the services that you consume.


Let’s Break This Down

As mentioned, there are the two purchasing options. But within those, there are different ways monitoring products are licensed, which could ultimately sway your decision on which product to choose. If your purchasing decisions are ruled by the powers that be, then cost is likely one of the top considerations when choosing a monitoring product.


So what license models are there?


Per Device Licensing

This is one of the most common operating models for a monitoring solution. What you need to look at closely, though, is the definition of a device. I have seen tools classify a network port on a switch as an individual device, and I have seen other tools classify the entire switch as a device. You can win big with this model if something like a VMware vCenter server or an IPMI management tool is classed as one device. You can effectively monitor all the devices managed by the management tool for the cost of a single device license.
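To see how much the definition of a device can matter, here is a minimal sketch. The switch inventory, the price, and the function name are hypothetical and not taken from any vendor's price list.

```python
# Hypothetical inventory: each switch and its number of monitored ports.
switches = {"core-sw-01": 48, "core-sw-02": 48, "edge-sw-01": 24}

PRICE_PER_DEVICE = 50  # assumed license price per "device", any currency


def license_cost(switch_ports, count_ports_as_devices):
    """Return the total license cost under one definition of 'device'."""
    if count_ports_as_devices:
        devices = sum(switch_ports.values())  # every port is a device
    else:
        devices = len(switch_ports)  # the whole switch is one device
    return devices * PRICE_PER_DEVICE


per_port_cost = license_cost(switches, count_ports_as_devices=True)
per_switch_cost = license_cost(switches, count_ports_as_devices=False)
print(per_port_cost, per_switch_cost)
```

With these made-up numbers, the same three switches cost 40 times more when every port counts as a device.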


Amount of Data Ingested

This model seems to apply more to SaaS-based security information and event management (SIEM) solutions, but may apply to other products. Essentially, you’re billed per GB of data ingested into the solution. The headline cost for 1GB of data may look great, but once you start pumping hundreds of GBs of data into the solution daily, the costs add up very quickly. If you can, evaluate log churn in terms of data generated before looking at a solution licensed like this.
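A quick back-of-the-envelope calculation shows how fast an ingest-based bill can grow. The price per GB and the daily volumes below are assumptions for illustration only.

```python
def monthly_ingest_cost(gb_per_day, price_per_gb, days=30):
    """Estimate a month's bill for a solution licensed per GB ingested."""
    return gb_per_day * price_per_gb * days


# Assumed figures: the same per-GB price at two very different log volumes.
small = monthly_ingest_cost(gb_per_day=1, price_per_gb=2.0)
large = monthly_ingest_cost(gb_per_day=300, price_per_gb=2.0)
print(small, large)
```

Plugging in your own measured log churn instead of these placeholder volumes gives a first estimate to compare against other license models.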


Pay Your Money, Monitor Whatever You Like

This model is predominantly used for on-premises monitoring solution deployments: you pay a flat fee and you can monitor as much or as little as you like. The initial buy-in price may be high, but if you compare it to the other licensing models, it may work out cheaper in the long run. I typically find established players in the monitoring solution game have on-premises and SaaS versions of their products. There will be a point where paying the upfront cost for the on-premises perpetual license is cheaper than paying for the monthly rental SaaS equivalent. It’s all about doing your homework on that one.
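The break-even point between the two options can be estimated with a few lines of arithmetic. This is only a sketch: the prices are invented, and the annual maintenance fee on the perpetual license is an assumption added here that you can set to zero.

```python
import math


def break_even_month(perpetual_price, saas_monthly, annual_maintenance=0):
    """Return the first month in which cumulative SaaS spend exceeds the
    cumulative cost of the perpetual license, or None if it never does."""
    monthly_maintenance = annual_maintenance / 12
    if saas_monthly <= monthly_maintenance:
        return None  # SaaS never becomes the more expensive option
    months = perpetual_price / (saas_monthly - monthly_maintenance)
    return math.floor(months) + 1


# Assumed figures: 24,000 upfront plus 2,400 a year, versus 1,200 a month.
print(break_even_month(24000, 1200, annual_maintenance=2400))
```

If you expect to keep the tool longer than the month this returns, the upfront purchase is likely the cheaper route under these assumptions.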



I may be preaching to the choir here, but I hope there are some useful snippets of info you can take away and apply to your next monitoring solution evaluation. As with any project, planning is key. The more info you have available to help with that planning, the more successful you will be.

Information Security artwork, modified from Pete Linforth at Pixabay.

In post #3 of this information security series, let's cover one of the essential components in an organization's defense strategy: their approach to patching systems.

Everywhere an Attack

When was the last time you did NOT see a ransomware attack or security breach story in the news? And when was weak patching not cited as one of the reasons making the attack possible? If only the organization had applied those security patches from a couple of years ago…


Baltimore City is one of the most recent ransomware attacks in the news. For the past month, RobbinHood ransomware, a variant of NSA's EternalBlue, crippled many of the city's online processes like email, paying water bills, and paying parking tickets. More than four weeks later, city IT workers are still working diligently to restore operations and services.


How could Baltimore have protected itself? Could it have applied those 2017 WannaCry exploit patches from Microsoft? Sure, but it's never just one series of patches that weren’t applied.


Important Questions

How often do you patch? How do you find out about vulnerabilities? What processes do you have in place for testing patches?


How and when an organization patches systems tells you a lot about how much they value security and their commitment to designing systems and processes for those routine maintenance tasks critical to an organization's overall security posture. Are they proactive or reactive? Patching (or lack thereof) can shine a light on systemic problems.


If an organization has taken the time to implement systems and processes for security patch management and applying security updates, the higher-visibility areas of their information security posture (like identity management, auditing, and change management) are likely also in order. If two-year-old patches weren't applied to public-facing web servers, you can only guess what other information security best practices have been neglected.


If you've ever worked in Ops, you know that a reprieve from patching systems and devices will probably never come. Applying patches and security updates is a lot like doing laundry. It's hard work and no matter how much you try, you’ll never catch up or finish. When you let it slide for a while, the catch-up process is more time-consuming and arduous than if you had stayed vigilant in the first place. However, unlike the risk of having to wear dirty clothes, the threat of not patching your systems is a serious security problem with real-world consequences.


Accountability-Driven Compliance

We all do better jobs when we’re accountable to someone, whether it be an outside organization or even ourselves. If your mother-in-law continuously checked in to make sure you never fell behind on your laundry, you would be far more dutiful about keeping up with washing your clothes. If laundry is a constant challenge for you (like me), you would probably design a system to make sure you could keep up with laundry duties.


In the federal space, continuous monitoring is a tenet of government security initiatives like the Federal Risk and Authorization Management Program (FedRAMP) and Risk Management Framework (RMF). These initiatives centralize accountability and consequences while driving continuous patching in organizations. FedRAMP and RMF assess an organization's risk. Because unpatched systems are inherently risky, failure to comply can result in revoking an organization's Authority to Operate (ATO) or shutting down their network.


How can you tell if systems are vulnerable? Most likely, you run vulnerability scans that leverage Common Vulnerabilities and Exposures (CVE) entries. This CVE list, an "international cybersecurity community effort," drives most security vulnerability assessment tools in on-premises and off-premises products like Nessus and Amazon Inspector. In addition, software vendors leverage the CVE lists to develop security patches for their products. Vulnerability assessment and continuous scanning end up driving an organization's continuous patching stance.


Vendor Questions

FedRAMP also assesses security and accredits secure cloud offerings and services. This initiative allows federal organizations to tap into the "as a Service" world and let outside organizations and third-party vendors assume the security, operations, and maintenance of applications, of which patching and applying updates are an important component.

When selecting vendors or "as a service" providers, you could assess part of their commitment to security by inquiring about software component versions. How often do they issue security updates? Can you apply minor software updates out-of-cycle without affecting support?


If a software vendor's latest software release deployed a two-year-old version of Tomcat Web Server or a version is close to end-of-support with no planned upgrades, it may be wise to question their commitment to creating secure software.



The odds are that some entity will assess your organization's risk, whether it's a group of hackers looking for an easy target, your organization's security officer, or an insurance company deciding whether to issue a cyber-liability insurance policy. Here’s one key metric that will interest these entities: how many unpatched systems and vulnerabilities are lying around on your network and the story your organization's patching tells.
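One simple way to put a number on that story is to count the hosts whose oldest missing patch has been outstanding for too long. The sketch below assumes a hypothetical inventory (hostnames, dates, and the 30-day limit are all made up); in practice the data would come from your vulnerability scanner or patch management tool.

```python
from datetime import date

# Hypothetical inventory: hostname -> release date of the oldest missing
# patch (None means the host is fully patched).
oldest_missing_patch = {
    "web-01": date(2017, 3, 14),
    "db-01": None,
    "mail-01": date(2019, 6, 11),
}


def overdue_hosts(inventory, today, max_age_days=30):
    """Return hosts whose oldest missing patch is older than max_age_days."""
    return sorted(
        host
        for host, released in inventory.items()
        if released is not None and (today - released).days > max_age_days
    )


print(overdue_hosts(oldest_missing_patch, today=date(2019, 6, 17)))
```

Tracking how this list grows or shrinks over time says more about a patching process than any single scan result.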

Did you come here hoping to read a summary of the past 5+ years of my research on self-driving, autonomous vehicles, Tesla, and TNC businesses?

Well, you’re in luck… that’s my next post.


This post helps make that post far more comprehensible. So here we lay the foundation, with fun, humor, and excitement. (Mild disclaimer for those who may have heard me speak on this topic before or listened to this story shared in talks and presentations. I hope you have fun all the same.)


I trust you’re all relatively smart and capable humans with a wide knowledge, depth, and breadth of the world, so for the following image, I want you to take a step back from your machine. In fact, I want you to look at it from as far as you possibly can, without putting a wall between you and the following image.




OK, have you looked at the image? Do you know what it is? If you're not sure, and perhaps have a child or aging grandparents, I encourage you to ask them.


Did you get hot dog? OK, cool. We’re all on the same page here.


Now as a mild disclaimer to my prior disclaimer—I’m familiar with the television series Silicon Valley. I don’t watch it, but I know they had a similar use-case in that environment. This is not that story, but just as when I present on computer vision being used for active facial recognition by the Chinese government to profile the Muslim minority in China and people say, "Oh, you mean like Black Mirror..." I mean “like that,” but I mean it's real and not just "TV magic."


Last year in April (April 28, 2018), this innocuous meat enclosure was run through the paces on our top four favorite cloud and AI/ML platforms, giving us all deep insight into how machines work, how they think, and what kinds of things we like to eat for lunch on a random day—the 4th day of July, perhaps. These are our findings.


I think we can comfortably say Google is the leader in machine learning and artificial intelligence. You can disagree with that, but you'd likely be wrong. Consider one of the leading machine learning platforms, TensorFlow (TF), which is an open-source project by Google. TF was recently released in 2.0 as an Alpha, so you might think "Yeah, immature product is more like it." But when you peel back that onion a bit and realize Google has been using it internally for 20 years to power search and other internal products, you might feel it more appropriate to call it version 25.0, but I digress.


What does Google think a hotdog is?



As you can see, if we ask Google, "Hey, what does that look like to you?" it seems pretty spot on. It doesn't come right out and directly say, "that’s a hot dog," but we can't exactly argue with any of its findings. (I'll let you guys argue over eating hot dogs for breakfast.) So, we're in good hands. Google is invited to MY 4th of July party!


But seriously, who cares what Google thinks? Who even uses Google anyway? According to the 2018 cloud figures, Microsoft is actually the leader in cloud revenue, so I'm not likely to even use Google. Instead, I'm a Microsoft-y through and through. I bleed Azure! So, forget Google. I only care about what my platform of choice believes in, because that's what I'll be using.


What Microsoft KNOWS a hotdog is!



Let’s peel back the chopped white onion and dig into the tags a bit. Microsoft has us pegged pretty heavily with its confidences, and I can't say I agree more. With 0.95 confidence, Microsoft can unequivocally and definitively say, without a doubt, I am certain:


This is a carrot.


We're done here. We don't need to say any more on this matter. And to think we Americans are only weeks away from celebrating the 4th of July. So, throw some carrots on the BBQ and get ready, because I know how I'll be celebrating!


Perhaps that's too much hyperbole. It’s also 0.94 confident that it's "hot" (which brings me back to my Paris Hilton days, and I'm sure those are memories you've all worked so hard to block out). But... 0.66 gives us relative certainty that if it wasn't just a carrot, it's a carrot stick. I mean, obviously it's a nicely cut, shaved, and cleaned-off carrot stick. Throw it on the BBQ!


OK, but just as we couldn't figure out what color “the dress” was, Microsoft knows what’s up, at 0.58 confidence. It's orange. Hey, it got something right. I mean, I figure if 0.58 were leveraged as a percentage instead of (1, 0, -1) it would be some kind of derivative of 0.42 and it would think it’s purple. (These are the jokes, people.)


But, if I leave you with nothing more than maybe, just MAYBE, if it's not a carrot and definitely NOT a carrot stick... it's probably bread, coming in at 0.41 confidence. (Which is cute and rings very true of The Neural Net Dreams of Sheep. I'm sure the same is true of bread, too. Even the machines know we want carbs.)
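To make the tag-and-confidence idea concrete, here is a minimal sketch using the scores quoted above. The dictionary and the helper function are illustrative only and are not the response format of any vision API.

```python
# Tags and confidence scores as quoted in the text above.
tags = {
    "carrot": 0.95,
    "hot": 0.94,
    "carrot stick": 0.66,
    "orange": 0.58,
    "bread": 0.41,
}


def accepted_tags(scores, min_confidence):
    """Return tag names at or above the threshold, highest score first."""
    kept = [name for name, score in scores.items() if score >= min_confidence]
    return sorted(kept, key=lambda name: scores[name], reverse=True)


print(accepted_tags(tags, 0.9))
print(accepted_tags(tags, 0.5))
```

No threshold rescues this result: raising it only makes the service more sure about "carrot," and lowering it never surfaces "hot dog," because high confidence is not the same thing as being right.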


But who really cares about Microsoft, anyway? I mean, I'm an AWS junkie, so I use Amazon as MY platform of choice. Amazon was the platform of choice of many police departments to leverage its facial recognition technology to profile criminals, acts, and actions to do better policing and protect us better. (There are so many links on this kind of thing being used; picking just one was difficult. Feel free to read up on how police are trying to use this, and how some states and cities are trying to ban it.)


Obviously, Amazon is the only choice. But before we dig into that, I want to ask you a question. Do you like smoothies? I do. My favorite drink is the Açai Super Anti-Oxidant from Jamba Juice. So, my fellow THWACKers, I have a business opportunity for you here today.


Between you, me, and Amazon, I think we can build a business to put smoothie stores out of business, and we can do that through the power of robots, artificial intelligence, and machine learning powered by Amazon. Who's with me? Are you as excited as I am?


Amazon prefers you to DRINK your Hotdogs



The first thing I'll order will be a sweet caramel dessert smoothie. It will be an amazing confectionery that will put your local smoothie shop out of business by the end of the week.


At this point, you might be asking yourself, "OK, this has been hilarious, but wasn't the topic something about businesses? Did you bury the lede?”


So, I'll often ask this question, usually of engineers: do businesses want machine learning? And they'll often say yes. But the truth is really, WE want machine learning, but businesses want machine knowledge.


It may not seem so important in the context of something silly like a hot dog, but, when applied at scale and in diverse situations, things can get very grave. The default dystopian future or “realistic” future I lean towards is plasma pumps in a hospital. Imagine a fully machine-controlled and AI-leveraged device like a plasma pump. It picks out your blood type based on your chart or perhaps it pricks your finger, figures out what blood it should give you, and starts to administer it. Not an unrealistic future, to be honest. Now, what if it was accurate 99.99999% of the time? Pretty awesome, right? But let's say it was as accurate as Google was with a hot dog, at 98% confidence. A business can hardly accept the liability that 99.99999% might give. Drop that down to 98%, or the more realistic 90-95%, and that is literal deaths on our hands.
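The arithmetic behind that worry is simple. The sketch below assumes a hypothetical one million procedures and treats each accuracy figure as the chance that any single procedure goes right.

```python
def expected_errors(procedures, accuracy):
    """Expected number of wrong outcomes for a given accuracy (0 to 1)."""
    return procedures * (1 - accuracy)


# Assumed workload: one million procedures at each accuracy level.
for accuracy in (0.9999999, 0.98, 0.90):
    print(accuracy, round(expected_errors(1_000_000, accuracy), 1))
```

Under these assumptions, the gap between the first figure and 98% is the gap between roughly one mistake per ten million procedures and twenty thousand mistakes per million.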


Yes, businesses don't really want machine learning. They say they do because it's popular and buzzword-y, but when lives are potentially on the line, tiny mistakes come with a cost. And those mistakes add up, which can affect the confidence the market can tolerate. But hey, how much IS a life really worth? If you can give up your life to make a device, system, or database run by machine learning/AI better for the next person, versus, oh I don't know, maybe training your models better and a little longer—is it worth it?


There will come a point or many points where there are acceptable tolerances, and we'll see those tolerances and accept them for what they are at a certain point because, "It doesn't affect me. That's someone else's problem." Frankly, the accuracy of machine learning has skyrocketed in the past few years alone. That compounded with better, faster, smarter, and smaller TPUs (tensor processing units) means we truly are in a next-generation era in the kinds of things we can do. 


Google Edge TPU on a US Penny

Image: Google unveils tiny new AI chips for on-device machine learning - The Verge


Yes, indeed, the future will be here before we know it. But mistakes can come with dire costs, and business mistakes will cost in so many more ways because we "trust" businesses to do the better or the right thing.


"Ooh, ooh! You said four cloud providers. Where's IBM Watson?"


Yes, you're right. I wasn't going to forget IBM Watson. After all, they're the businessman's business cloud. No one ever got fired for buying IBM, blah blah blah, so on and so forth.


I always include this for good measure. No, not because it’s better than Google, Microsoft, or Amazon, but because of how funny it is.


IBM is like a real business cloud AI ML!



I’ll give IBM one thing—they're confident in the color of a hot dog, and really, what more can we ask for?


Hopefully, this was helpful, informative, and funny (I was really angling on funny). But seriously, you should have a better foundation for the constantly “learning” nature of the technology we work with. Consider that machines have a large knowledge base of information, and what they can learn in such a short time to make determinations about what a particular object is. Then also consider if you have young children. Send them into the kitchen and say, "Bring me a hot dog," and you're far more likely to get a hot dog back than you are to get a carrot stick (or a sweet caramel dessert). The point being, the machines that learn and the "artificial" intelligence we expect them to work with and operate from have the approximate intelligence of a 2- or 3-year-old.


They will get better, especially as we throw more resources at them (models, power, TPUs, CPU, and so forth) but they're not there yet. If you liked this, you'll equally enjoy the next article (or be terrified when we get to the actual subject matter).


Are you excited? I know I am. Let us know in the comments below what you thought. It's a nice day here in Portland, so I'm going to go enjoy a nice hot dog smoothie. 

This week's Actuator comes to you from sunny San Diego and Cisco Live! If you are here, and reading this, then you still have time to stop by Booth #1621 and say hello. I enjoy talking data with anyone, but especially with our customers.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


Baltimore ransomware attack will cost the city over $18 million

I wonder how much it would have cost for the city to keep their systems up to date, and able to be patched in a timely manner.


Many iOS Developers Don’t Use Encryption: Report

So, Apple tries to force the use of encryption, but allows for developers to easily override. This is why we can't have secure things.


EU will force electric cars to emit a noise below 20 km/h on July 1

I would pay good money for my car to sound like George Jetson's.


Employees are almost as dangerous to business security as hackers and cybercriminals

Always a difficult conversation, when you want to enforce security standards and your colleagues take offense, as if you don't trust them. And yet, they are often the weakest link. Security is necessary to keep good people from doing dumb things.


Uber’s Path of Destruction

A bit long, but worth every minute. Uber, along with other tech companies, are horrible for our economy. And they will remain so, until we elect legislators that understand not only technology, but the business finance of technology companies.


Microsoft and Oracle link up their clouds

The biggest shock in this article was discovering that Oracle has a cloud.


AWS launches Textract, machine learning for text and data extraction

Just a quick PSA here, but the code and model driving this service likely have inherent biases, in a similar manner to the issues with facial recognition tech. "Trust, but verify" should be the disclaimer pasted at the top of the page for this and every other AI-as-a-Service out there.



This is my first Cisco Live! and I think there's a chance I'm the only DBA here:

By Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering


Here’s an interesting article from my colleague Mav Turner. He offers a good overview of edge computing and suggestions for overcoming its challenges. I like the idea of using technology like this to reduce latency.


Edge computing has become a critical enabler for successful cloud computing as well as the continued growth of the Internet of Things (IoT). Edge computing does come with its own unique challenges, but the advantages far outweigh the challenges.


What Is Edge Computing?

To understand why edge computing is so important, let’s first look at cloud computing for government agencies, which is fast becoming a staple of most agency environments.


One of the primary advantages of cloud computing is the ability to move a vast amount of data and processing out of the agency and off-premises, so the federal IT pro no longer has to spend time, effort, and money maintaining it.


Logistically, this means agency data is traveling hundreds or thousands of miles back and forth between the end-user device and the cloud. Clearly, latency can become an issue. Enter edge computing, which essentially places micro data centers at the “edge” of the agency network. These micro data centers, or edge devices, serve as a series of distributed way stations between the agency network and the cloud.


With edge devices in place, computing and analytics power is still close to end-user devices—eliminating the latency issue—and the data that doesn’t have to be processed immediately is routed to a cloud data center at a later time. Latency problem solved.
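
The split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `EdgeNode` class, the `urgent` flag, and the batch size are hypothetical names made up for the example, and the "upload" is just a list standing in for a real transfer to a cloud data center.

```python
class EdgeNode:
    """Handles urgent readings locally and defers the rest for a later cloud upload."""

    def __init__(self, batch_size=3):
        self.batch_size = batch_size
        self.pending = []   # readings waiting to be sent to the cloud
        self.uploaded = []  # batches "sent" so far (stand-in for a real upload)

    def handle(self, reading):
        """Process a reading; return a local result for urgent ones, else None."""
        if reading.get("urgent"):
            # Latency-sensitive work stays at the edge, close to the device.
            return f"handled locally: {reading['id']}"
        self.pending.append(reading)
        if len(self.pending) >= self.batch_size:
            self.flush()
        return None

    def flush(self):
        """Move all pending readings into one batch bound for the cloud."""
        if self.pending:
            self.uploaded.append(self.pending)
            self.pending = []


if __name__ == "__main__":
    node = EdgeNode(batch_size=2)
    print(node.handle({"id": "door-sensor", "urgent": True}))
    node.handle({"id": "temp-1"})
    node.handle({"id": "temp-2"})
    print(len(node.uploaded), "batch(es) queued for the cloud")
```

The design choice worth noticing is that the urgent path never waits on the batch: anything time-sensitive is answered at the edge, and everything else travels later in bulk.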


Overcoming Edge Computing Challenges

The distributed nature of edge computing may seem to bring more complexity, more machines, greater management needs, and a larger attack surface.


Luckily, as computing technology has advanced, so have monitoring and visualization technologies to help the federal IT pro realize the benefits of edge computing without additional management or monitoring pains.



Start by creating a strategy; this will help drive a successful implementation.


Be sure to include compliance and security details in the strategy, as well as configuration information. Create thorough documentation to standardize your hardware and software requirements.


Be sure patch management is part of the security strategy. This is an absolute requirement for ensuring a secure edge environment, as is an advanced security information and event management (SIEM) tool that will ensure compliance while mitigating potential threats.


Monitoring, Visualizing, and Troubleshooting

Equally important to managing edge systems is monitoring, visualization, and troubleshooting.


Monitoring all endpoints on the network will be a critical piece of successful edge computing management. Choose a tool that not only monitors remote systems, but provides automated discovery and mapping, so the federal IT pro has a complete understanding of all edge devices.


Additionally, be sure to invest in tools that provide full infrastructure visualization, so the federal IT pro can get a complete picture of the network at all times. Add in full network troubleshooting to be sure the team can monitor its entire infrastructure as well.


Creating a sound edge computing implementation strategy and using the right tools to monitor and manage the network will ease the pains and let the benefits of edge computing be fully realized.


Find the full article on Government Technology Insider.


The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

If, like me, you believe we’re living in a hybrid IT world when it comes to workload placement, then it would make sense to also consider a hybrid approach when deploying IT infrastructure monitoring solutions. What I mean by this is deciding where to physically run a monitoring solution, be that on-premises or in the public cloud.


I work for a value-added reseller, predominantly involved with deploying infrastructure for virtualization, end-user computing, disaster recovery, and Office 365 solutions. During my time I’ve run a support team offering managed services for a range of different-sized customers from SMB to SME, and managing several different monitoring solutions. They range from SaaS products to bolster our managed service offering to customers’ locally installed monitoring applications. Each has its place and there are reasons for choosing one platform over another.


Licensing Models

Or what I really mean, how much does this thing cost?


Typically for an on-premises solution, there will be a one-off upfront cost and then a yearly maintenance plan. This is called perpetual licensing. The advantage here is that you own the product with the initial purchase, and then maintaining the product support and subscription entitles you to future upgrades and support. Great for a CapEx investment.


SaaS offerings tend to be a monthly cost. If you stop paying, you no longer have access to the solution. This is called subscription licensing. Subscription licensing is more flexible in that you can scale up or scale down the licenses required on a month-by-month basis. I will cover more on this in a future blog post.


Data Gravity

This is physically where the bulk of your application data resides, be that on-premises or in the public cloud. Some applications need to be close to their data so they can run quickly and effectively, and a monitoring solution should take this into consideration. It could be bad to have a SaaS-based monitoring solution if you have time-sensitive data running on-premises that needs near-instantaneous alerts. Consider time sensitivity and criticality when deciding what a monitoring solution should deliver for you. If being alerted about something up to 15 minutes after an event has occurred is acceptable, then a SaaS solution for on-premises workloads could be a good fit.


Infrastructure to Run a Monitoring Solution

Some on-premises monitoring solutions can be resource-intensive, which can prove problematic if the existing environment is not currently equipped to run it. This could lead to further CapEx expenditure for new hardware just to run the monitoring solution, which in turn is something else that will need to be monitored. SaaS, in this instance, could be a good fit.


Also, consider this: do you want to own and maintain another platform, or do you want a SaaS provider to worry about it? If we compare email solutions like Exchange and Exchange Online, Exchange is in many instances faster for user access and you have total control over how to run the solution. Exchange Online, however, is usually good enough and removes the hassle of managing an Exchange server. Ask yourself: do you need good enough or do you need some nerd knobs to turn?



A lot of this article may seem obvious, but the key to choosing a monitoring solution is to start out with a set of criteria that fits your needs and then evaluate the marketplace. Personally, I’ve started with lists with X, Y, and Z requirements with a weighting on each item. Whichever product scored the highest ended up being the best choice.


To give you an idea of those requirements, I typically start out with things like:

  • Is the solution multi-tenanted?
  • How much does it cost?
  • How is it licensed?
  • What else can it integrate with (like a SIEM solution, etc.)?
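
For anyone who wants to try the weighted-list approach, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the criteria names, the weights, the 1-to-5 ratings, and "Product A" and "Product B" are invented, not taken from a real evaluation.

```python
# Hypothetical criteria and weights; a higher weight means the item matters more.
WEIGHTS = {"multi_tenant": 3, "cost": 5, "licensing": 2, "integrations": 4}

# Hypothetical 1-5 ratings for each candidate product against each criterion.
RATINGS = {
    "Product A": {"multi_tenant": 5, "cost": 2, "licensing": 3, "integrations": 4},
    "Product B": {"multi_tenant": 3, "cost": 4, "licensing": 4, "integrations": 3},
}


def weighted_score(ratings, weights):
    """Sum each rating multiplied by the weight of its criterion."""
    return sum(weights[name] * ratings[name] for name in weights)


def rank_products(all_ratings, weights):
    """Return (product, score) pairs, highest score first."""
    scores = {name: weighted_score(r, weights) for name, r in all_ratings.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for product, score in rank_products(RATINGS, WEIGHTS):
        print(f"{product}: {score}")
```

With these made-up numbers, Product B edges out Product A (49 to 47) because cost carries the heaviest weight, which is exactly the kind of trade-off the weighting is meant to surface.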


  A good end product is always the result of meticulous preparation and planning.

The bigger the project, the more risk there is. Any decent-sized project needs a professional project manager to run it and help mitigate that risk. When it comes to big projects, great project managers are worth their weight in gold. Over the years, I’ve worked with some horrible project managers and some fantastic project managers.


So, what makes a good project manager? A successful project manager will do several things to help keep the project on schedule, including:


  • Check in with team members
  • Collate regular feedback from the team members
  • Help team members escalate issues to the right people to resolve them
  • Coordinate meetings with non-project team members
  • Provide positive feedback to help drive the project forward
  • Serve as a liaison between IT and the business
  • Work with the project team on release night to see the project through to completion


One of the best experiences I had with a project manager was back when SQL Server 2005 was released in November 2005. We pitched to management to upgrade the loan origination system at the company from SQL Server 2000 to SQL Server 2005. The project manager was fantastic, bringing in development and business resources as needed to test the system prior to the upgrade.


The part that sticks out in my mind the most is the night of the upgrade. For other projects with different project managers, the managers would leave by 6 p.m. no matter what. For this release, the project manager came in later in the day with the rest of the folks on the project, and stayed until 4 a.m. while we did the release. When the team needed to order some dinner, our project manager made sure everyone got dinner (and that vegetarians got vegetarian food) and that anything we needed was taken care of. Even though this release went until 4 or 5 a.m. (I don’t remember exactly when we finished), it went shockingly smoothly compared to others I’ve been a part of.


On the flip side, I’ve had plenty of project releases that were the exact opposite experience. The project manager collected status updates every week from team members, and the weekly project status meetings basically just regurgitated these updates back to the rest of the team. When it came to release night, the project manager was in many cases nowhere to be found, and sometimes unreachable via email or cell phone. This made questions or problems with the release difficult, if not impossible, to resolve.


When projects don’t have a good project manager guiding them to completion, it’s easy for the project to fail. On numerous occasions, I’ve seen IT team members attempt to step into the role of project manager. This means someone is filling this role on a part-time basis, which means the project management isn’t getting the full attention it deserves. Having someone not focused on the details means tasks are going to fall through the cracks and things that need to be done aren’t done (or they aren’t done in a timely fashion). All of this can lead to the project failing, sometimes in a rather spectacular or expensive way; and a failed project, for any reason, is never a good thing.

Hybrid IT merges private on-premises data centers with a public cloud vendor. These public cloud vendors have a global reach, near limitless resources, and drive everything through an API. Tapping into this new resource is creating some new technologies and new career paths. Every company today can run a business analytics system. This was never feasible in the past because these systems traditionally took months to deploy and required specialized analysts to maintain and run. Public cloud providers now offer these business analytics systems to anyone as a service and people can get started as easily as swiping their credit card.


We’re moving away from highly trained and specialized analysts. Instead, we require people to have a more holistic view of these systems. Businesses aren’t looking for specialists in one area; they want people who can adapt and merge their foundation with cross-siloed skillsets. People are gravitating more and more towards skills in development. Marketing managers, for example, are learning SQL to analyze data to see if a campaign is working or not. People are seeking to learn skills that allow them to interact with the cloud through programming languages because, while the public cloud may act as a traditional data center, it's also a highly programmable one.


The IT systems operator must follow this same trend and continue to add more software development and scripting abilities to their toolbelt. The new role of DevOps engineer has emerged because of hybrid IT. This role was originally created because of the siloed interaction between software developers, who are primarily concerned about code, and the systems operation team, who must support everything once it hits production. The DevOps role was created to help break the silos and create more communication between the two teams, but as time has passed, the two groups have merged together because they need to rely on each other. Developers showed operations how to run code effectively and efficiently, and SysOps folks showed developers how to monitor, support, and back up the systems on which their code runs.


A lot of new skills are emerging, but I want to focus on the three I feel are becoming the heavy favorites: infrastructure as code, CI/CD, and monitoring.


Infrastructure as Code (IaC)

According to this blog from Azure DevOps, “Infrastructure as Code is the management of infrastructure (networks, virtual machines, storage) in a descriptive model,” or a way to use code to deploy and manage everything in your environment. A lot of data centers share similar hardware from the same set of manufacturers, yet each is unique. That's because people configure their environments manually. If a change is required on a system, they make the change and complete the task. Nothing is documented, and nobody knows what the change was or how it was implemented. When the issue happens again, someone else repeats the same task or does it differently. As this goes on, our environments begin to morph and change and there's no resetting them, which leaves us with unique snowflakes.


A team that practices good IaC will always make their change in code, and for good reason. Once a task is captured in code, that code is a living document and can be shared and run by others for consistent deployment. The code can be checked into a CI/CD pipeline to test before it hits production and to version it for tracking purposes. As the list of tasks controlled by code gets larger, we can free up people's time by putting the code in a scheduler and start letting our computers manage the environment. This opens the doors for automation and allows people to work on meaningful projects.
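
To make the idea concrete, here is a minimal sketch in Python of what "capturing a task in code" can look like: a desired state that lives in version control, plus a function that works out what has to change. This is not how any particular IaC tool works; the server names, the `plan` and `apply` functions, and the data layout are all hypothetical.

```python
# Desired state, kept in version control. All names here are hypothetical.
DESIRED = {
    "web-01": {"cpu": 2, "memory_gb": 4},
    "web-02": {"cpu": 2, "memory_gb": 4},
}


def plan(desired, actual):
    """Compare desired and actual state and return the list of changes needed."""
    changes = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            changes.append(("create", name, spec))
        elif actual[name] != spec:
            changes.append(("update", name, spec))
    for name in sorted(actual):
        if name not in desired:
            changes.append(("delete", name, None))
    return changes


def apply(changes, actual):
    """Apply the planned changes to a copy of the actual state and return it."""
    result = dict(actual)
    for action, name, spec in changes:
        if action == "delete":
            del result[name]
        else:
            result[name] = dict(spec)
    return result


if __name__ == "__main__":
    # A "snowflake": web-01 was hand-edited and legacy-01 was never documented.
    actual = {"web-01": {"cpu": 4, "memory_gb": 4}, "legacy-01": {"cpu": 1, "memory_gb": 1}}
    changes = plan(DESIRED, actual)
    for change in changes:
        print(change)
    actual = apply(changes, actual)
    print(plan(DESIRED, actual))  # a second run finds nothing left to change
```

Because the second run produces an empty plan, the same code can sit in a scheduler and run repeatedly without causing drift, which is what makes the automation described above safe to hand over to the computers.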


Continuous Integration/Continuous Deployment (CI/CD)

When you're working in a team of DevOps engineers, you and your team are going to be writing a lot more code. You may be writing IaC code for deploying an S3 bucket, while your team member might be working on deploying an Amazon RDS instance, both writing in an isolated environment. The team needs a way to centrally store, manage, and test the code. Continuous integration helps merge those two different pieces into one unit, but the frequency of merging is very high. Continually merging and testing code helps shorten feedback loops and shorten the time it takes to find bugs in the code. Continuous delivery is the approach of taking what's in the CI code repo and continuously delivering and testing it in different environments. The end goal is to deliver a bug-free code base to our end customer, whether that customer is an internal employee or someone external paying for our services. The more frequently we can deliver our code to different environments like test or QA, the better our product will become.
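
As a rough illustration of the merge-and-test loop, here is a small Python sketch. It assumes two team members each contribute a dictionary of resource definitions, and the pipeline merges them and runs a couple of checks. The resource types `bucket` and `database` are simplified stand-ins for the S3 bucket and RDS instance mentioned above, not real AWS definitions, and the check functions are hypothetical.

```python
def merge_definitions(*parts):
    """Combine resource definitions from several contributors, rejecting name clashes."""
    merged = {}
    for part in parts:
        for name, resource in part.items():
            if name in merged:
                raise ValueError(f"duplicate resource name: {name}")
            merged[name] = resource
    return merged


def check_types(resources):
    """Report any resource that is missing a 'type' field."""
    return [f"{name}: missing type" for name, r in resources.items() if "type" not in r]


def check_storage_encrypted(resources):
    """Report storage-like resources that do not enable encryption."""
    return [
        f"{name}: encryption not enabled"
        for name, r in resources.items()
        if r.get("type") in {"bucket", "database"} and not r.get("encrypted", False)
    ]


def run_pipeline(resources, checks):
    """Run every check and return (passed, problems)."""
    problems = [p for check in checks for p in check(resources)]
    return (not problems, problems)


if __name__ == "__main__":
    mine = {"logs": {"type": "bucket", "encrypted": True}}
    theirs = {"orders-db": {"type": "database"}}
    merged = merge_definitions(mine, theirs)
    passed, problems = run_pipeline(merged, [check_types, check_storage_encrypted])
    print("PASS" if passed else "FAIL", problems)
```

Here the merge succeeds but the pipeline fails on the unencrypted database, so the bug is caught minutes after the code is merged instead of after it reaches production.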



Monitoring

We can never seem to get away from monitoring, but this makes sense given how important it is. Monitoring provides the insight we need into our environments. As we start to deploy more of our infrastructure and applications through code and deliver those resources quickly and through automation, we need a monitoring solution to alert us when there’s an issue. The resources we provide with IaC or ...manually... will be consumed, and our monitoring solution will help tell us how long those resources will last or whether we need to provide additional resources to prevent loss of service. Monitoring gives us historic trends of our environment, so we can compare how things are doing today to how they were doing yesterday. Did that new global load balancer help our traffic or hurt it? In today's security landscape, we need eyes everywhere, and a monitoring solution can notify us if there's a brute force attack or if a single user is trying to log in from another part of the country.
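
One way to picture the "how long will these resources last" question is a simple trend line over historic usage. The Python sketch below is only an assumption about how such a forecast might be done (a least-squares straight line through daily samples). Real monitoring products use their own methods, and the function name and sample data are made up.

```python
def days_until_full(samples, capacity):
    """Estimate days until usage reaches capacity from (day, used) samples.

    Fits a least-squares line through the samples. Returns None when usage
    is flat or shrinking, because no exhaustion date can be projected.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    n = len(samples)
    mean_x = sum(day for day, _ in samples) / n
    mean_y = sum(used for _, used in samples) / n
    var_x = sum((day - mean_x) ** 2 for day, _ in samples)
    if var_x == 0:
        raise ValueError("samples must span more than one day")
    slope = sum((day - mean_x) * (used - mean_y) for day, used in samples) / var_x
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    last_day = max(day for day, _ in samples)
    full_day = (capacity - intercept) / slope
    # Never report a negative number of days if capacity is already exceeded.
    return max(0.0, full_day - last_day)


if __name__ == "__main__":
    # Hypothetical disk usage in GB over four days on a 200 GB volume.
    history = [(0, 100), (1, 110), (2, 120), (3, 130)]
    print(days_until_full(history, capacity=200))
```

A straight line is a crude model, but even this much turns yesterday's and today's numbers into something actionable: a rough date by which more resources need to be in place.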


Hybrid IT is a fast-growing market bringing lots of change to businesses as well as creating new careers and new skillsets. From hybrid IT, new careers are emerging because of the need to merge different skillsets from traditional IT, development, and business units. This combination is stretching people’s knowledge. The most successful people are merging what they know about traditional infrastructure with the new offerings from the public cloud.

THWACKcamp is Back! - YouTube


Here at SolarWinds, convention season is just beginning to heat up. Whether you’re lucky enough to travel to these shows or are just following our exploits online, you’ll see us across the globe, from London (Info Security Europe) to New York (SWUG), Vegas (Black Hat), San Diego (Cisco Live!), San Francisco (VMworld, Oracle World), and Singapore (RSA), demoing and discussing the best monitoring features, whether they’re brand-new or just new-to-you.


But there’s one conference that, for us, is circled in red marker on our calendar: THWACKcamp 2019, which is so very happening October 16 – 17, 2019.


Running the Numbers

Now in its 8th year, THWACKcamp has grown in both quality and quantity each year. Last year, we saw more than 2,300 people attend, consuming 22 hours of content accompanied by real-time discussions in live chat. On top of that, people kept coming back for more, and viewed the recordings of those same sessions over 16,000 more times after THWACKcamp 2018 ended.


This coming year promises to be our most ambitious one yet—and not just because we expect more attendees, more content, or more amazing giveaway prizes.


Brand New Formula, Same Great Taste!

First, I want to talk about the things that AREN’T changing.


THWACKcamp 2019 is still 100% free and 100% online. That means you don’t have to beg your boss for budget, risk one of TSA’s “very personal” pat-down procedures, fight flocks of other IT pros thronging to the next session, or deal with less-than-optimal hotel options.


The event is still going to be two full days packed with content and live segments. A legion of SolarWinds folks ranging from Head Geeks to engineers to product managers will be on chat to field questions, offer insights, and take conversations offline if needed.


And of course, we’re still going to have some awesome prizes to give away throughout the event. (Look, we know you come for the information, but we also know the prizes add a whole ‘nother level of fun to it and we’re not about to give that up either. We have as much fun brainstorming what cool swag to give away as you do winning it. That said, this thread on will let you offer your ideas on what you’d like to see us give away:


SO... what about the “new and improved” part of THWACKcamp?

The thing you’ll notice most is every session is going to take the time it needs, rather than conforming to a standard 30- to 40-minute window. This allows us to intersperse deep-dive topics with quick 10-minute how-tos, and even a few funny “commercials” to make sure you’re paying attention.


The second thing you’ll notice is that we’ve broken out of the studio. We love our set and still have a bunch of the sessions there, but you’ll also see us in discussions in lounge areas, outside, and maybe even on-location at events. IT professionals are rarely “at rest” and THWACKcamp reflects that this year too.


Both of those elements allowed us to make one other big improvement: a single track of sessions each day. We’ll be able to cover more topics and ensure that everyone is in the right “room” at the right time to hear all the THWACKcamp-y goodness.


And the last thing you’ll notice is how THWACKcamp will be more interactive than ever. During the sessions, we’re adding an interactive question-and-answer system. If you’ve attended a SWUG (and if you haven’t, you really should!), you know exactly how this works. We’ll use it to get your feedback in real-time, find out how many people prefer one feature over another, and let you post your own questions for our staff to answer, where they won’t get lost in the mad banter of the live chat window. Speaking of chat, it’ll still be there, in a THWACKcamp “watercooler” section, where you can talk about your experiences with SolarWinds modules, ask for tips on configuring ACLs, or debate the supremacy of the MCU vs. DCU.


Take My Money!

By this point, I hope you’re shouting at your screen “BUT LEON, HOW DO I SIGN UP???” If this is you, maybe have the barista bring you a decaf on the next round. And while you’re waiting, head over to the THWACKcamp Registration page THWACKcamp 2019. You’ll be able to sign yourself up for the event and see the full schedule of sessions.


But you’ll also gain valuable insight and information between now and October 16. We’ll be sharing exclusive blog posts, videos, and even behind-the-scenes images to give you insight into how an event of this magnitude goes together, and prepare you to get the most out of the THWACKcamp experience.


Also exclusive to folks who register will be “Ask Me (Almost) Anything” sessions. After you complete your registration, you’ll get access to that same system I mentioned earlier. Once again, you will have an opportunity to submit questions or upvote questions from other folks. We’ll host five live on-camera sessions between now and October 16 to answer those questions for you. But remember, you only get access to that after you register.


So, what are you waiting for? Go register now: THWACKcamp 2019! And don’t be selfish, either. Share that link with coworkers, colleagues, and friends in IT who may be thinking “No way am I going to make it to a conference this year.” Sure you are. Here at SolarWinds we’ve got a solution for you, just like we do for so many of your IT challenges.

This week's Actuator comes to you from June, where the weather has turned for the better after what seems like an endless amount of rain. We were able to work in the yard and it reminded me that one of the best ways to reduce stress is some physical exercise. So wherever you are, get moving, even if it is a walk around the block.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


The Vehicle of the Future Has Two Wheels, Handlebars, and Is a Bike

Regular readers of the Actuator know I have a fondness for autonomous vehicles. This article made me think our future may be best served by bicycles, at least in more urban areas.


SpaceX Starlink satellites dazzle but pose big questions for astronomers

Is there any group of people Elon Musk hasn’t upset at this point?


How much does it cost to get an employee to steal workplace data? About $300

And for the low price of $1,200, you can get them to steal all the data. This is why security is hard: because humans are involved.


Real estate title insurance company exposed 885,000,000 customers' records, going back 16 years: bank statements, drivers' licenses, SSNs, and tax records

Setting aside the nature of the data, much of which I believe is public record, I want you to understand how the breach happened: because of lousy code. Until we hold individuals, not just companies, responsible for ignoring common security practices, we will continue to suffer data breaches.


Bad metadata means billions in unpaid royalties from streaming music services

Each paragraph I read made me a little sadder than the one before.


Artificial Intelligence Isn’t Just About Cutting Costs. It’s Also About Growth.

Let the machines do the tasks for which they are able, freeing up humans to do tasks for which they are able. Yes, automation and AI are about growth, and about efficiency (i.e., cutting costs).


What 10,000 Steps Will Really Get You

I never knew why Fitbit chose 10,000 steps as a default goal, but this may explain why. I’m a huge advocate for finding ways to get extra steps into your day. I hope this article gets you thinking about how to get more steps for yourself each day.


Pictured here is six yards of gravel. Not pictured is another 12 yards of topsoil. All moved by hand. I need a nap.


By Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering


Here’s an interesting article written by my colleague Jim Hansen. It seems that our BYO challenges are not over, and Jim offers some great steps agencies can take to help with these issues.


In 2017, the Department of Defense (DoD) released a policy memo stating that DoD personnel—as well as contractors and visitors to DoD facilities—may no longer carry mobile devices in areas specifically designated for “processing, handling, or discussion of classified information.”


For federal IT pros, managing and securing “allowable” personal and government devices is already a challenge. Factor in the additional restrictions and the real possibility that not everyone will follow the rules, and mobile-device management and security can seem even more overwhelming.


Luckily, there are steps federal IT pros can take to help get a better handle on managing this seemingly unmanageable Bring Your Own Everything (BYOx) environment, starting with policy creation and implementation, and including software choices and strategic network segmentation.


Agency BYOx Challenges


Some agencies allow personnel to use their own devices, some do not. For those that do, the main challenges tend to be access issues: which devices are allowed to access the government network? Which devices are not?


For agencies that don’t, there’s the added challenge of preventing unauthorized use by devices that “sneak through” security checkpoints.


Implementing some of the below best practices to support your government cybersecurity solutions can help ensure complete protection against a BYOx threat.


Three-Step BYOx Security Plan


Step One: Train and Test


Most agencies have mobile device management policies, but not every agency requires personnel to take training and pass a policy-based exam. Training can be far more effective if agency personnel are tested on how they would respond in certain scenarios.


Effective training emphasizes the importance of policies and their consequences. What actions will personnel face if they don’t comply or blatantly break the rules? In the testing phase, be sure to include scenarios to help solidify personnel understanding of what to do when the solution may not be completely obvious.


Step Two: Access Control


Identity-based access management is used to ensure only authorized personnel are able to access the agency network using only authorized devices. Add a level of security to this by choosing a solution that requires two-factor authentication.


Additionally, be sure to create, maintain, and carefully monitor access-control lists to help ensure that users have access to only the networks and resources they need to do their jobs. When establishing these access control lists, include as much information as possible about the users and resources—systems and applications—they are allowed to access. A detailed list could aid in discovering and thwarting fraudulent access from a non-authorized device.
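
As a rough sketch of why the extra detail helps, the Python example below keeps both the authorized devices and the permitted resources for each user, so an attempt from an unknown device is flagged even when the user and resource are valid. The `AclEntry` structure, the `check_access` function, and every user, device, and resource name are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AclEntry:
    """One user's allowed devices and resources. All values are illustrative."""
    user: str
    devices: frozenset = field(default_factory=frozenset)
    resources: frozenset = field(default_factory=frozenset)


def check_access(acl, user, device_id, resource):
    """Return (allowed, reason) for an access attempt."""
    entry = acl.get(user)
    if entry is None:
        return False, "unknown user"
    if device_id not in entry.devices:
        return False, "unauthorized device"
    if resource not in entry.resources:
        return False, "resource not permitted"
    return True, "ok"


if __name__ == "__main__":
    acl = {
        "jdoe": AclEntry("jdoe", frozenset({"laptop-123"}), frozenset({"payroll-app"})),
    }
    print(check_access(acl, "jdoe", "laptop-123", "payroll-app"))
    print(check_access(acl, "jdoe", "phone-999", "payroll-app"))
```

Returning a reason alongside the decision is a deliberate choice here: logging "unauthorized device" for a known user is the kind of signal that helps spot fraudulent access early.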


Step Three: Implement the Right Tools


Mobile phones are far and away today’s biggest BYOx issue for federal IT pros. As a result, access control (step two) is of critical importance. That said, ensuring the following basic security-focused tasks are being implemented is a critical piece of the larger security picture:


• Patch management – Patch management is a simple and effective security measure. Choose a product that provides automated patch management to make things even easier and keep your personnel’s devices patched, up to date, and free of vulnerabilities and misconfigurations.


• Threat detection – Users often have no idea their devices have been infected, so it’s up to the federal IT pro to be sure a threat detection system is in place to help ensure that compromised devices don’t gain access to agency networks.


• Device management – If a user tries to attach an unauthorized device to the network, the quicker the federal IT pro can detect and shut down access, the quicker a potential breach is mitigated.


• Access rights management – Provisioning personnel, deprovisioning personnel, and knowing and managing their access to the critical systems and applications across the agency is necessary to help ensure the right access to resources is granted to the right people.




Sticking to the basics and implementing a logical series of IT and end user-based solutions can help reduce the risk of mobile technologies.


Find the full article on our partner DLT’s blog Technically Speaking.





In the second post in this information security in a hybrid IT world series, let’s cover why even the best-designed security controls and measures are no match for the human element.


“Most people don’t come to work to do a bad job” is a sentiment with which most people will agree. So, how and why do well-meaning people sometimes end up injecting risk into an organization’s hardened security posture?


Maybe your first answer would be falling victim to social engineering tricks like phishing. However, there’s a more significant risk: unintentional negligence in the form of circumventing existing security guidelines or not applying established best practices. If you’ve ever had to troubleshoot blocked traffic or a user who can’t access a file share, you know that one quick fix is to disable the firewall or give the user access to everything. It’s easy to tell yourself you’ll revisit the issue later and re-enable that firewall or tighten down those share permissions. Later will probably never come, and you’ve inadvertently loosened some of the security controls.


It’s easy to blame the administrator who made what appears to be a short-sighted decision. However, human nature prompts us to take these shortcuts. In our days on the savannah, our survival depended on taking shortcuts to conserve physical and mental energy to get through the harsh times on the horizon. Especially on short-staffed or overwhelmed teams, you save energy in the form of shortcuts that let you move on to the next fire. For as many security issues as may exist on-premises, “62% of IT decision makers in large enterprises said that their on-premises security is stronger than cloud security,” according to Dimensional Research, 2018. The stakes are even higher when data and workloads move to the cloud, where exploits of your data can have further reach.


In 2017, one of the largest U.S. defense contractors was caught storing unencrypted application credentials and sensitive data related to a military project in a public, unprotected AWS S3 bucket. The number of organizations caught storing sensitive data in unprotected, public S3 buckets continues to grow. Dealing with the complexity of securing data in the cloud requires other tools for improving the security posture and helping to combat the human element in SaaS and cloud offerings: Cloud Access Security Brokers (CASBs).
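
To make the "public bucket" problem concrete, here is a minimal sketch of the kind of check a tool might run. It assumes an access control list shaped like the response of the S3 GetBucketAcl API (a `Grants` list whose grantees may be groups identified by URI) and flags grants to the two global groups. The function name `public_grants` and the sample data are hypothetical, no call to AWS is made, and a real assessment would also need to consider bucket policies and account-level public access settings.

```python
# Group URIs S3 uses for "everyone" and "any authenticated AWS user".
PUBLIC_GROUP_URIS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def public_grants(acl_response):
    """Return the permissions an ACL response dict grants to public groups."""
    found = []
    for grant in acl_response.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GROUP_URIS:
            found.append(grant.get("Permission"))
    return found


# Hand-written sample in the shape of a GetBucketAcl response (not real data).
sample_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group",
                     "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
         "Permission": "READ"},
    ]
}

# Any non-empty result means the ACL exposes the bucket to a public group.
print(public_grants(sample_acl))
```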


Gartner defines CASBs as “on-premises, or cloud-based security policy enforcement points, placed between cloud service consumers and cloud service providers to combine and interject enterprise security policies as the cloud-based resources are accessed.” By leveraging machine learning, CASBs can aggregate and analyze user traffic and actions across a myriad of cloud-based applications to provide visibility, threat protection, data security, and compliance in the cloud. Also, CASBs can handle authentication/authorization with SSO and credential mapping, as well as masking sensitive data with tokenization.


Nifty security solutions aside, the best security tools for on-premises and off-premises are infinitely more effective when the people in your organization get behind the whole mission of what you are trying to accomplish.


Continuing user education and training is excellent. However, culture matters. Environments in which people feel they have a role in information security increase an organization’s security posture. What do you think are some of the best ways to change an organization’s culture when it comes to security?

We've established that choosing a one-size-fits-all solution won't work. So, what does work? Let's look at why we need tools in the first place, and what kind of work these tools take off our hands.


Two Types of Work

Looking at the kinds of work IT people do, there are two broad buckets. First, there's new work: the kind of work that has tangible value, either for the end customer or for the IT systems you operate and manage. This is the creative work IT people do to create new software, new automation, or new infrastructure. It’s work coming from the fingers of a craftsman. It takes knowledge, experience, and creativity to create something new. It's important to realize that this type of proactive work is impossible to automate. Consider the parallel to the manual labor artists and designers do to create something new; it's just not something a computer can generate or do for us.


Second, there's re-work and toil. These kinds of reactive work are unwanted. Re-work needs to be done to correct quality issues in work done earlier, like fixing bugs in software, improving faulty automation code, and mitigating and resolving incidents on production. This also includes customer support work after incidents and paying down technical debt caused by bad decisions in the past or by badly managing the software lifecycle. The result of that mismanagement is outdated software, or systems and architectures that haven't been adapted to new ways of working, scalability, or performance requirements. For IT ops, physical machines, snowflake virtual machines, and on-premises productivity systems (like email, document management, or collaboration tools) are good examples.


How Do Tools Fit In?

Now that we understand the types of work we do, we can see where automation tools come in. They take away re-work and toil. A well-designed toolchain frees up software and infrastructure engineers to spend more time on net-new work, like new projects, new features, or improvements to architecture, systems, and automation. In other words: the more time you spend improving your systems, the better they'll get. Tools help you break the cycle of spending too much time fixing things that broke and not preventing incidents in the first place. Automation tooling helps remove time spent on repetitive tasks that go through the same process each time.


By automating, you're creating a representation of the process in code, which leads to consistent results each time. It removes the variation that comes from going through a process manually with checklists, which invariably leads to a slightly different process with unpredictable outcomes. It's easy to improve the automation code over time, and each improvement lowers the amount of re-work and faults. See how automating breaks the vicious circle? Instead, the circle goes up and up and up with each improvement.
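
As a rough illustration of "a representation of the process in code," the sketch below turns a new-user checklist into an ordered list of Python functions that run the same way every time. The step names and the `run_process` helper are made up for this example, and each step only updates a dictionary; a real version would call your directory, mail, or ticketing system at each step.

```python
def create_account(state):
    """Checklist step 1: create the account (simulated)."""
    state["account"] = state["username"]
    return state


def assign_groups(state):
    """Checklist step 2: assign groups in a fixed, sorted order (simulated)."""
    state["groups"] = sorted(state["requested_groups"])
    return state


def send_welcome(state):
    """Checklist step 3: send the welcome message (simulated)."""
    state["welcome_sent"] = True
    return state


# The checklist itself, expressed as code: the order is fixed and reviewable.
ONBOARDING_STEPS = [create_account, assign_groups, send_welcome]


def run_process(steps, state):
    """Run every step in order and record which steps ran."""
    log = []
    for step in steps:
        state = step(state)
        log.append(step.__name__)
    return state, log


result, log = run_process(
    ONBOARDING_STEPS,
    {"username": "jdoe", "requested_groups": ["vpn", "email"]},
)
print(log)
print(result["groups"])
```

Improving the process now means changing one function or the step list, and every later run picks up the improvement.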


A proper toolchain increases engineering productivity, which in turn leads to more, better, and quicker improvements, a lower failure rate of those improvements, and a quicker time to resolving any issues.


How Do I Know If Work Is a Candidate for Automation?

The answer is Value Stream Mapping (VSM), a Lean methodology. This is a way of visualizing the flow of work through a process from start to finish. Advanced mappings include red and green labels for each step, identifying customer value, much like the new work and re-work we talked about earlier. Good candidates include anything that follows a fixed process or can be expressed as code.


It's easy to do a VSM yourself. Start with a large horizontal piece of paper or Post-It notes on a wall, and write down all the steps chronologically. Put them from left to right. Add context to each step, labeling each with green for new work or red for toil. If you're on a roll, you can even add lead time and takt time to visualize bottlenecks in time.


See a bunch of red steps close to each other? Those are prime candidates for automation.
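
If you'd rather keep the map in a file than on a wall, the same idea can be sketched in a few lines of Python. The steps, labels, and lead times below are invented sample values, and `red_clusters` is a hypothetical helper that looks for runs of adjacent red steps and totals their lead time.

```python
from itertools import groupby

# (step name, label, lead time in minutes); invented sample values.
steps = [
    ("Write code", "green", 240),
    ("Run security scan by hand", "red", 60),
    ("Fix scan findings", "red", 90),
    ("Fill in change form", "red", 30),
    ("Design next feature", "green", 120),
    ("Restart failed deploy", "red", 20),
]


def red_clusters(steps, min_length=2):
    """Return (step names, total minutes) for each run of adjacent red steps."""
    clusters = []
    # groupby only groups consecutive items, which is exactly "close to each other".
    for label, group in groupby(steps, key=lambda step: step[1]):
        run = list(group)
        if label == "red" and len(run) >= min_length:
            names = [name for name, _, _ in run]
            total = sum(minutes for _, _, minutes in run)
            clusters.append((names, total))
    return clusters


for names, total in red_clusters(steps):
    print(f"{total} min of toil: {', '.join(names)}")
```

With this sample data, the three manual steps in the middle form one cluster, while the single red step at the end falls below the `min_length` threshold.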


Some examples are:

  1. If a piece of software is always tested for security vulnerabilities
  2. If you make changes to your infrastructure
  3. If you test and release a piece of new software using a fixed process
  4. If you create a new user using a manual checklist
  5. If you have a list of standard changes that can go to production after checking with the requirements of the standard change


But What Tools Do I Choose?

While the market for automation tooling has exploded, there are some great resources to help you see the forest for the trees.

  1. First and foremost: keep it simple. If you use a Microsoft stack, use Microsoft tools for automation. Use the tool closest to the thing you're automating. Stay within your ecosystem as a starting point. Don't worry about a tool that encompasses all the technology stacks you have.
  2. Look at overviews like the Periodic Table of DevOps Tools.
  3. Look at what the popular kids are doing. They're usually doing it for a reason, and tooling tends to come in generations. Configuration management from three generations ago is completely different than modern infrastructure-as-code tools, even if they do the same basic thing.


Next Up

Happy hunting for the tools in your toolchain! In the next post, I'll discuss a problem many practitioners have after their first couple of successful automation projects: tool sprawl. How do you manage the tools that manage your infrastructure? Did we just shift the problem from one system to another? A toolchain is supposed to simplify your work, not make it more complex. How do you stay productive and not be overloaded with the management of the toolchain itself? We'll look at integrating the different tools to keep the toolchain agile as well as balancing the number of tools.
