
Geek Speak


Building IT software isn’t always a secure process, and the reason is simple economics: companies can’t always afford to build in the security features their software needs.


Let’s walk through a typical IT software project.


As the IT software project is planned, the security of the software and of the data it contains is accounted for. But after the initial design comes budget planning, and the budget requested by the developers is almost never approved in full; management typically approves somewhere between 50% and 80% of the request. That means features need to be cut. The business unit that requested the project will want to keep most, if not all, of the features it asked for, so the development team has to find somewhere else to cut. Something has to give, and in typical cases the cuts land on data security measures and security testing. Data that was going to be stored encrypted will likely now be stored in plain text, and security testing either won’t be done at all or will be scaled way back.


While these types of cuts are not uncommon, as IT leaders, we need to make the business case for investing in enhanced security. We need to demonstrate that budget cuts in security lead to software that’s less secure than end users deserve. From a business perspective, this leaves the company open to data breach liability and to the remedies required by the states and countries in which the software operates. In the United States, for example, California and Massachusetts have data protection laws covering data breaches and data encryption.


The issue with data breaches is that you can’t fix the cause of the problem after the fact. Once customer data has been released to unauthorized parties, it doesn’t matter how much money the company spends or what they do to improve the software to ensure a breach doesn’t happen again. At this point, it’s too late—the customer's data has already been breached, and it’s in the hands of people that shouldn’t have the data. There’s no getting the data back out of the public eye. Once it has been released, there’s simply no putting the genie back in the bottle.


As IT professionals, we need to build software that isn’t easily breached so customer data isn’t released. The fact that in recent years we’ve heard about problems like databases with blank passwords holding millions of customers’ records, or files sitting in cloud services with no security at all, is simply inexcusable.


While budget will always be a major consideration, security also needs to be a driving factor as we consider software development. We shouldn’t have databases without passwords—it doesn’t matter that the default is no password. We shouldn’t have cloud-based file shares with millions of customer records sitting with no security. Once these breaches happen, there’s no getting the data back.


We have to build more secure platforms and software that don’t have simple, easy-to-correct issues in their configuration. The more we can ingrain this thinking into our organizations, the better off we all will be.

Over the past few weeks, I've written a lot about hybrid IT and its benefits. New tools, new technologies, new skillsets. Hybrid IT is still new to a lot of companies, and you might be wondering how to get started. Below are some areas to look into as you think about your own hybrid IT journey. There are a lot of traditional services you can transfer to the public cloud as a starting point.


Backups and Archive

Backups are a great way for companies to dip their toes into the hybrid IT pool. Tapes are cheap and hold a ton of information...but they're tapes. The media can become corrupt, so tapes need to be tested regularly to confirm their reliability. Rotating tapes offsite is a good practice, but it takes discipline and good organizational skills to keep track of the many cataloged tapes. Transferring tapes also poses a risk during transit, when a tape can be damaged or stolen. And managing backups across decentralized sites can be a real nightmare: companies might have multiple remote sites small enough not to warrant a systems operator, leaving an office manager to switch out tapes. Restoring data can be a lengthy process, as technicians need to retrieve tapes from offsite locations, re-catalog the data, and determine whether the correct tape was retrieved.


Instead, look at using the public cloud. Most backup software vendors today recognize an object store as a data target. Configure the software to send test data to the public cloud and start experimenting. See how you can exchange tape rotation for data lifecycle management: instead of physically moving tapes around, set policies in the cloud to automatically move data to a lower storage tier after a set number of days to help save money. Move the data to its lowest tier for archiving, where it can be recovered in minutes or hours, depending on the cloud provider. Data at rest can be encrypted with your keys, so your company is the only one that can read it. No more transferring tapes to an offsite location and worrying about theft or damaged media. The public cloud also lets you send data to not just one region, but to multiple regions; if you're worried the U.S. west coast will go down, send your data to a different continent. Stop worrying about and managing physical media, and reduce your administration work to focus on areas helpful to the business.
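As a sketch of what replacing tape rotation with lifecycle policies can look like, here's a small Python function that builds an S3-style lifecycle configuration. The bucket prefix, tier names, and day counts are illustrative assumptions modeled on AWS's lifecycle rules, not a specific provider's requirements.

```python
# Hypothetical sketch: build an S3-style lifecycle policy that mirrors
# a tape-rotation schedule. Prefix, storage-class names, and day counts
# are illustrative assumptions.

def build_lifecycle_policy(warm_after_days=30, archive_after_days=90):
    """Return a lifecycle configuration that moves backup objects to
    cheaper storage tiers as they age, replacing manual tape rotation."""
    return {
        "Rules": [
            {
                "ID": "backup-tiering",
                "Status": "Enabled",
                "Filter": {"Prefix": "backups/"},
                "Transitions": [
                    # After ~30 days, move to an infrequent-access tier.
                    {"Days": warm_after_days, "StorageClass": "STANDARD_IA"},
                    # After ~90 days, move to a deep-archive tier.
                    {"Days": archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }

policy = build_lifecycle_policy()
# With a client library such as boto3, a policy like this could be applied
# via s3.put_bucket_lifecycle_configuration(...); shown here only as data.
```

The point is that the "rotation schedule" becomes a declarative policy the provider enforces for you, rather than a calendar reminder and a box of tapes.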


Infrastructure as Code

So many vendors today provide their APIs to allow for custom configurations. Being able to hit an API or use a software development kit (SDK) speeds up the provisioning process and provides a living, breathing, shareable document. I'm all about reducing the learning curve. It's much easier for me to understand new technology by mapping it back to something I already know and recognize the differences from there. Learning a new language like infrastructure as code (IaC) and learning how infrastructure in the public cloud works at the same time draws out the learning curve. Instead, focus strictly on IaC.


IaC is a flexible tool that can provision workloads from a number of providers, including on-premises data centers. Instead of using the native tools built inside your data center to deploy workloads, use IaC to deploy them. Understand the syntax and how it abstracts the underlying infrastructure. Get a feel for the shift from manually managing workloads to managing infrastructure as code. Understand the pros and cons. Learning one new technology and understanding how it works against traditional infrastructure helps pave the path for when the organization is ready to consume public cloud resources.
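The core idea behind IaC tools is declarative: you describe the desired state, and the tool computes a plan of what to create, change, or destroy. Here's a toy Python sketch of that diffing step, with made-up resource names; real tools like Terraform or Pulumi do this against live provider APIs.

```python
# Toy sketch of the declarative model behind infrastructure as code:
# describe desired state, and compute the plan needed to get there.
# Resource names and attributes here are illustrative assumptions.

def plan(current: dict, desired: dict) -> dict:
    """Diff current infrastructure state against the desired state,
    the way an IaC tool builds its execution plan."""
    create = [r for r in desired if r not in current]
    destroy = [r for r in current if r not in desired]
    change = [r for r in desired
              if r in current and current[r] != desired[r]]
    return {"create": create, "change": change, "destroy": destroy}

current = {"web-vm": {"size": "small"}, "old-db": {"size": "large"}}
desired = {"web-vm": {"size": "medium"}, "cache": {"size": "small"}}

print(plan(current, desired))
# -> {'create': ['cache'], 'change': ['web-vm'], 'destroy': ['old-db']}
```

Because the plan is just data, it can be reviewed, versioned, and shared — the "living, breathing, shareable document" mentioned above.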


Start Small

This might be obvious, but it's a good reminder. Start small. Don't feel like you have to pick the hardest problem to fix with the public cloud. Use a new project as a potential starting point. Instead of reviewing traditional data center technologies, cast a wider net and invite some vendors who only live in the cloud. Understand the differences and compare them to the needs of the business. When maintenance is coming due on a piece of equipment, think outside the box and question if this is something you need to continue to maintain. Is this piece of software or hardware critical to my business and can I offload the responsibility to the public cloud?


The public cloud makes it easy to transfer and grow a lot of traditional infrastructure technologies. Once people get a taste and see how easy it is to scale and manage, they rush in. Then they find out managing cloud infrastructure is different from managing traditional infrastructure. Not only is the cost structure different, but not having access to the lower layers of the stack makes people nervous and requires a lot of trust. To build that trust, start by moving mundane tasks that don't add a lot of value to the business. Control what initially goes into the public cloud so you can better understand how the cloud functions and learn the cost structure around its services. Reduce the learning curve, then add on pieces with which you're unfamiliar.


Public cloud and private data centers are on a collision course. It's better to start now and grow familiarity and trust before you're forced to.

Had a great week in Austin last week, even managed to play some touch rugby with members of Team USA. After a short turnaround at home, I'm in Las Vegas this week for Microsoft Inspire. If you're attending MS Inspire, stop by the booth. You know I'd love to talk data with you.


As always, here are some links I hope you find interesting. Enjoy!


Experiments show dramatic increase in solar cell output

This is an example of a problem that would benefit from advances in quantum computing, as it's difficult to build the research models and simulations necessary with classical computers.


Kindle and Nook readers: You know you don’t own those books, right?

SPOILER ALERT: You don't own your music, either.


There’s a Security Incident in the Cloud: Who’s Responsible?

Good reminder about the need to be clear with the duties, roles, and responsibilities between your office and your cloud service provider. My take is that security is a shared responsibility, and it requires constant conversations as new threats emerge at an accelerated rate these days.


Facebook’s $5 billion FTC fine is an embarrassing joke

Fines should serve as a penalty to a company, not a reward.


No limit: AI poker bot is first to beat professionals at multiplayer game

Well, we've taught the machines how to play games, ones that allow them to earn a living, too. Maybe we could spend some time on things like curing diseases and less on things like predicting stocks.


Zoom Mac flaw allows webcams to be hijacked - because they wanted to save you a click

Secure. Open. Convenient. Pick two.


Google workers can listen to what people say to its AI home devices

At first this story seems horrible, just another example of our trust betrayed by a software giant. But I see it as an acceptable use of customer data, stripped of any personally identifiable information, to make their product better. Now, if users aren't reading their agreements, and don't know what is happening, well... security is a shared responsibility, folks.


When you get a chance to have a run with members of Team USA, you do it, regardless if it is 98F in the shade at 7 p.m. in Austin:



FWIW, I'm on vacation next week. The Actuator will return in two weeks, providing I get plenty of rest, bourbon, and bacon. Wish me luck!

By Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering


Here’s an interesting article by my colleague Jim Hansen about the role of SIEM tools. SIEM products can offer a powerful defense and our SIEM tool has always been one of my favorite SolarWinds products.


While there is no one single solution to guard agencies against all cyberthreats, there are tools that can certainly go a long way toward managing and understanding the cyberthreat landscape. One such tool is Security Information and Event Management (SIEM) software. SIEM tools combine Security Information Management (SIM) with Security Event Management (SEM) capabilities into a single solution with the intent of delivering comprehensive threat detection, incident response, and compliance reporting capabilities.


SIEM tools work by collecting information from event logs from most (if not all) agency devices, from servers and firewalls to antimalware and spam filters. The software then analyzes these logs, identifies anomalous activity, and issues an alert—or, in many cases, responds automatically.
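The collect → analyze → alert loop described above can be sketched in a few lines of Python. The log format, field names, and threshold below are assumptions for illustration; real SIEM tools perform far richer correlation across many log sources.

```python
# Minimal sketch of the collect -> analyze -> alert loop a SIEM automates.
# Event fields ("src_ip", "action") and the threshold are illustrative
# assumptions, not any particular product's schema.

from collections import Counter

def failed_login_alerts(events, threshold=3):
    """Flag source IPs with repeated failed logins in consolidated logs."""
    failures = Counter(
        e["src_ip"] for e in events if e["action"] == "login_failed"
    )
    return [ip for ip, count in failures.items() if count >= threshold]

events = [
    {"src_ip": "10.0.0.5", "action": "login_failed"},
    {"src_ip": "10.0.0.5", "action": "login_failed"},
    {"src_ip": "10.0.0.5", "action": "login_failed"},
    {"src_ip": "10.0.0.9", "action": "login_ok"},
]

print(failed_login_alerts(events))  # ['10.0.0.5']
```

The value of a SIEM is doing this kind of correlation continuously, across every device's logs at once, and then alerting or responding automatically.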


While log data comes from many locations, SIEM software consolidates and analyzes this data as a whole; the federal IT pro can then view all the data from a single dashboard. A single, unified view can help identify trends, easily spot unusual activity, and help establish a proactive (vs. reactive) response.


Choosing a SIEM Tool


A wide variety of SIEM tools are available today, each offering its own advantages. SIEM tools can offer everything from big data analytics to centralized forensic visibility to artificial intelligence-driven behavior analytics. It can be a challenge to choose the tool that fits agency requirements. The following questions and criteria can help:


• Does the SIEM provide enough native support for all relevant log sources? Be sure the chosen toolset matches well with the types of devices from which it will be collecting and analyzing information.


• If the SIEM doesn’t have native support for a relevant log source, how quickly and easily can it be created, and can it support custom log sources for applications the agency has developed in-house?


• Reducing the time to detection (TTD) is critical to prevent exposure, data loss, and compromise. Choose a SIEM tool with the ability to provide advanced analysis quickly, with little security team intervention.


• Does the SIEM include useful, relevant, and easy-to-use out-of-the-box reports? The value in a single-pane-of-glass approach provided through SIEM software is the ability to see one report or one chart that encompasses a vast amount of data. Be sure the agency’s chosen tool provides templates that can be easily implemented and just as easily customized where necessary.


• Does the SIEM make it easy to explore the log data and generate custom reports? Choose a tool that simplifies the data exploration and reporting function to help you get answers quickly and with minimal effort.




The bad guys continue to get smarter, are well funded, and know most federal agencies aren’t funded well enough to thwart their continuously changing tactics. As the world becomes more interconnected and complex, and as cloud and Internet of Things (IoT) devices become part of the federal landscape, federal agencies need to be thoughtful and smart about how they combat the threats actively targeting them.


A SIEM tool can dramatically ease the burden of every federal IT pro, saving valuable time and providing an additional security checkpoint across the agency’s systems.


Find the full article on Government Technology Insider.


The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

Labyrinth or Layered Approach

Ladies and gentlemen, we’ve reached the fifth and final post of this information security in hybrid IT series. I hope you’ve found as much value in these posts as I have in your thoughtful comments. Thanks for following along.


Let’s take a quick look back at the previous posts.


Post #1: When "Trust but Verify" Isn’t Enough: Life in a Zero Trust World

Post # 2: The Weakest (Security) Link Might Be You

Post #3: The Story Patching Tells

Post #4: Logs, Logs, and More Logs


Throughout the series, we’ve covered topics vital to an organization’s overall security posture, including zero trust, people, patching, and logs. These are a pivotal part of the people, process, and technology model vital to an organization’s Defense in Depth security strategy.


What Is Defense in Depth?

Originally a military term, Defense in Depth, also known as layered security, operates from the premise that if one layer of your defense is compromised, another layer is still in place to thwart would-be attackers. These preventative controls typically fall into one of the following categories.


  • Technical controls use hardware or software to protect assets. Micro-segmentation, multi-factor authentication, and data loss protection are examples of technical controls.
  • Administrative controls relate to security policies and procedures. Examples of this could include policies requiring least-privilege and user education.
  • Physical controls are ones you can physically touch. Security badges, security guards, fences, and doors are all examples of physical controls.
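The layered premise above can be made concrete with a toy sketch: access is granted only if every independent control passes, so compromising one layer alone isn't enough. The control names and request fields are illustrative, not a prescriptive control set.

```python
# Toy illustration of Defense in Depth: each control is an independent
# layer, and access requires passing all of them. Control names and
# request fields are illustrative assumptions.

CONTROLS = [
    ("badge_valid", lambda req: req.get("badge") is True),   # physical
    ("mfa_passed",  lambda req: req.get("mfa") is True),     # technical
    ("least_priv",  lambda req: req.get("role") in req.get("allowed_roles", ())),  # administrative
]

def evaluate(request):
    """Return (allowed, first_failing_layer). If one layer is bypassed,
    the next one is still in place to stop the attacker."""
    for name, check in CONTROLS:
        if not check(request):
            return False, name
    return True, None

# A stolen badge gets past the physical layer but fails at MFA:
print(evaluate({"badge": True, "mfa": False}))  # (False, 'mfa_passed')
```

The same structure applies whether the layers are fences and guards or micro-segmentation and data loss protection: defeat in one layer should never be total defeat.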


Why Defense in Depth?

If you’ve ever looked into setting up an investment strategy, you’ve heard the phrase “Diversify, diversify, diversify.” In other words, you can’t predict if a fund will flop completely, so it’s best to spread funds across a broad category of short-term and long-term investments to minimize the risk of losing all your money on one fund.


Similarly, because you can’t know what vulnerabilities an attacker will try to exploit to gain access to your data or network, it’s best to implement a layered and diverse range of security controls to help minimize the risk.


Here’s a simple example of layered security controls. If an attacker bypassed physical security to gain access to your facility, 802.1x, or a similar port-based security technical control, stops them from simply plugging in a laptop and gaining access to the network.


Off-Premises Concerns

Because of the shared responsibilities for security in a hybrid cloud environment, the cloud adds complexity to the process of designing and implementing a Defense in Depth strategy. While you can’t control how a cloud provider handles the physical security of the facility or facilities hosting your applications, you still have a responsibility to exercise due diligence in the vendor selection process. In addition, SLAs can be designed to act as a deterrent for vendor neglect and malfeasance. However, ultimately, the liability for data loss or compromise rests with the system owner.



When an organization’s culture treats security as more than a compliance requirement, they have an opportunity to build more robust and diverse security controls to protect their assets. Too often, though, organizations fail to recognize security as more than the Information Security team's problem. It's everyone's problem, and thoroughly implementing a Defense in Depth strategy takes an entire organization.

All aboard the hype train—departing soon!


I think most people love to have the latest and greatest thing. The lines outside Apple stores ahead of each new product launch are proof enough.


I’m terrible for it myself, like a magpie drawn to a shiny object. If something is new and exciting, I tend to want to check it out. And it doesn’t just apply to tech products, either. I like to read up about new cars as well… but I digress.


So, what’s the purpose of my article here? HYPE! Hype is the bane of anyone whose job is to identify whether a new tech product has any substance to it. I want to help you identify whether you’re looking at marketing hype or something potentially useful.


Does any of this sound familiar?


  • Everything will move to the cloud; on-premises is dead.

  • You don’t need tape for backups anymore.

  • Hyper-converged infrastructure will replace all your traditional three-tier architectures.

  • Flash storage will replace spinning disk.


The above statements are just not true, certainly not as blanket statements. Of course, there are companies with products out there to help move you in the direction of the hype, but it can be for a problem that never needed to be solved in the first place.


The tech marketing machines LOVE to get a lot of information out into the wild and say their product is the best thing since sliced bread. This seems to be particularly prevalent with start-ups who’ve secured a few successful rounds of venture capital funding and can pump it into marketing their product and bringing it to your attention. Lots of company-branded swag is usually available to try and entice you to take a peek at the product on offer. And who can blame them? At the end of the day, they need to shift product.


Unfortunately, this makes choosing products tough for us IT professionals, like trying to find the diamond amongst the coal. If there’s a lot of chatter about a product, it could be mistaken for word-of-mouth referrals. You know, like, “Hey Jim, have you used Product X? I have and it’s awesome.” The conversation might look more like this if it’s based on hype: “Hey Jim, have you seen Product X? I’m told it’s awesome.”


The key difference here is giving a good recommendation based on fact vs. a bad recommendation based on hearsay. Now, I’m not pooh-poohing every new product out there. There are some genuinely innovative and useful things available. I’m saying, don’t jump on the bandwagon or hype train and buy something just because of a perception in the marketplace that something’s awesome. Put those magpie tendencies to one side and exercise your due diligence. Don’t buy the shiny thing because it’s on some out-of-this-world deal (it probably isn’t). Assess the product on its merits and what it can do for you. Does it solve a technical or business problem you may have? If yes, excellent. If not, just walk away.


A Little Side Note

If I’m attending any IT trade shows with lots of exhibitors, I apply similar logic to identify to whom I want to speak. What are my requirements? Does vendor X look like they satisfy those requirements? Yes: I will go and talk to you. No: Walk right on by. It can save you a lot of time and allows you to focus on what matters to you.

In Austin this week, where the weather is boring: a constant 97°F and bright blue skies. The locals tell me it doesn't really get warm until August. I'm not going to stick around to find out.


As always, here are some links I hope you find interesting, enjoy!


People in Japan are renting cars but not driving them

OK, I would never think about renting a car just to take a nap, because we don't rent cars by the hour here in the U.S. Well, not yet, anyway. And at $4 for 30 minutes, I might think differently.


How the Dutch Made Utrecht a Bicycle-First City

I'm always amazed when cities are able to successfully implement bike lanes into their core. Many European cities make it look so easy. I sometimes think how such designs wouldn't work where I live, and then I realize that they *could* work, if the people wanted it to work. Public transportation is broken for much of America; bike lanes offer a partial solution.


Digital license plates now in 3 states, with more on the way

I honestly don't understand what problem they're trying to solve here. Consumers lose privacy, and pay more for that privilege than using regular license plates.


7-Eleven Japan shut down its mobile payment app after hackers stole $500,000 from users

I've said this before, and I'll say it again. Until we hold developers responsible for building applications with poor security choices, we will continue to have incidents like this one.


British Airways faces record £183 million GDPR fine after data breach

Finally, a fine that might cause a company to rethink how they handle security! Thank you GDPR!


Warning: free hotel wifi is a hacker’s dream

As a frequent traveler I can attest that I am uneasy about public and shared Wi-Fi systems such as those in hotels. In a few weeks I'll be at Black Hat, and will probably wrap all my devices in foil, leave them in my room, and burn them on the way to the airport.


User Inyerface - A worst-practice UI experiment

I know you'll hate me for this one, but I want you to try to understand how the things you build are seen by others.


It's Carny season! The best place for $8 draft beers and $6 fried dough.

By Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering


Here’s an interesting article by my colleague Brandon Shopp about the government’s renewed interest in cloud. He offers considerations to help improve agency readiness and prepare for eventual migration.


A recent report regarding the modernization of Federal Information Technology (IT), coordinated by the American Technology Council (ATC) in 2017, called for agencies to “identify solutions to current barriers regarding agency cloud adoption.”


Couple the report with the White House draft release of a new “Cloud Smart” policy, which updates the “Cloud First” policy introduced in 2010 based on where technology and agencies are today, and once again cloud becomes a primary focus for federal IT pros.


Moving to a cloud environment can bring well-documented advantages, including flexibility, the potential for innovation, and cost savings. Agencies certainly seem to be heading in that direction. According to the 2018 SolarWinds IT Trends Report, IT professionals are prioritizing investments related to hybrid IT and cloud computing.


• 97% of survey respondents listed hybrid IT/cloud among the top five most important technologies to their organization’s IT strategies


• 50% listed hybrid IT/cloud as their most important technologies


That said, barriers may still loom for many federal IT pros. Factors such as current workloads, data center capacity, and the type of applications being used can all affect an agency’s preparedness for a move to the cloud.


How do you know if your agency is ready for cloud adoption?


The Bottom Line


Every agency should be ready to begin assessing its current IT environment and consider starting the journey.


To use an appropriate cliché: there’s no silver bullet. The secret is to move slowly, carefully, and realistically.


Start by examining and completely understanding your infrastructure, applications, and interdependencies. Consider your data center complexity.


Finally, if you’ve made the decision to move to the cloud, how do you know which applications to move first? This decision is easier than it may seem.


There are three primary considerations.


Size – Look at the amount of data your applications accumulate and the amount of storage they take up. The potential for cost savings can be far greater by moving data-heavy applications into the cloud.


Complexity – Consider keeping your most complex, mission-critical applications on-premises until you’ve moved other applications over, and you understand the process and its implications.


Variable usage – Nearly every agency has some applications that experience heavy use during very specific and limited time periods. These are good targets for early migration, as a cloud environment provides the ability to scale up and down; you only pay for what you use. For the same reasons, applications requiring batch processing are also good candidates.
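One informal way to apply the three considerations above is to score each application so that a large data footprint and variable usage raise its priority while complexity lowers it. The weights, fields, and application names below are purely illustrative assumptions, not an established methodology.

```python
# Hypothetical sketch: rank applications as early cloud-migration
# candidates using the three considerations above. Weights, fields,
# and application names are illustrative assumptions.

def migration_score(app):
    """Higher score = better early-migration candidate: data-heavy
    and bursty apps score up, complex mission-critical apps score down."""
    return (app["size_tb"] * 2
            + (5 if app["variable_usage"] else 0)
            - app["complexity"] * 3)

apps = [
    {"name": "archive-store", "size_tb": 40, "complexity": 1, "variable_usage": False},
    {"name": "core-erp",      "size_tb": 10, "complexity": 5, "variable_usage": False},
    {"name": "tax-portal",    "size_tb": 5,  "complexity": 2, "variable_usage": True},
]

ranked = sorted(apps, key=migration_score, reverse=True)
print([a["name"] for a in ranked])
# -> ['archive-store', 'tax-portal', 'core-erp']
```

Even a rough ranking like this makes the "move the data-heavy and bursty apps first, keep the complex ones on-premises for now" advice actionable.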


The GSA’s Data Center Optimization Initiative (DCOI) Program Management Office published a whitepaper specifically designed to help agencies with cloud migration. Called “Cloud Readiness: Preparing Your Agency for Migration,” the paper provides strategies for successful migration, including security needs, conducting inventories in advance of migration, and much more about cloud computing solutions for government agencies.




Migrating to a cloud environment is neither quick nor simple; it requires a great amount of time and effort. That said, it’s a project worth undertaking. My advice: perform exhaustive research, make informed decisions, and take it slowly. This strategic, intentional approach will provide the best results for your migration journey—flexibility, opportunities for innovation, and high levels of cost savings.


Find the full article on our partner DLT’s blog Technically Speaking.



IT departments have always been an important part of the business. Critical business models rely on systems managed by IT. Businesses have been using IT for years, but IT is now finding a seat at the table and is helping to lead businesses into a new digital era. As the business puts more pressure on IT to deliver, we're having to re-structure our IT departments. One path that has been drawn up is the Gartner Bimodal IT model.


What Is Bimodal IT?


Gartner released their bimodal IT model back in 2014. It was their attempt to help guide enterprises through the new era of digital transformation. Gartner broke down IT into two different styles of work, defined as "The practice of managing two separate, coherent modes of IT delivery, one focused on stability and the other on exploration."


  1. Mode 1 "focuses on what is known while renovating the legacy environment fit for the digital world."
  2. Mode 2 is "exploratory and experimental to areas of uncertainty."


Mode 1 is focused on legacy applications with an emphasis on strict recordkeeping. When I first read about Mode 1, I thought of mainframes and legacy code bases. These technologies are still very important, but traditionally make it hard to add new features and functionality.


Mode 2 is focused on new applications and greenfield deployments, which led me to think about mobile applications and web-based apps.


This might have been a good starting point for enterprises to lead their IT departments into the next wave of technologies and into the digital era, but with the rise of DevOps and hybrid IT, there are gaps in the bimodal model. Gartner's model doesn't really address culture, or how the people supporting these systems need to interact with each other. Bimodal IT doubles the processes, because you've created two separate pipelines and you're only as fast as your slowest link. Lastly, newer practices like DevOps and Agile development can be applied across both modes.


What Does Bimodal IT Mean for People?


People are a company's greatest asset, and making people choose between Mode 1 and Mode 2, two distinctly different development paths, is not a good idea.


Mode 1 gives people who are less inclined to learn a place to hide, be comfortable, and not innovate in an area that needs it more than ever. It gives people an excuse to not have to learn new methods or a reason for not knowing because they weren't sent to training or to a conference. Natural silos get built up in all departments of a business and management struggles with trying to have different teams communicate with each other. Now we're encouraging these walls to be built up. Mode 1 might be legacy, but it holds a lot of the critical data that needs to be fed into the newer front-end systems maintained by the Mode 2 group. We need to blend these two groups together to encourage better communication because the systems they interact with are linked.


Process for DevOps


DevOps is about combining processes and tools to help an organization deliver quality applications at a high speed. I normally see groups from Mode 2 and operations referenced when talking about DevOps. We get excited about the new shiny toys in IT and lose sight of the existing infrastructure. Not all deployments are greenfield—they’re normally brownfields. There's already a piece of technology or a process we need to incorporate with our new process. It’s the same with DevOps.


We need to include both Modes 1 and 2 under DevOps and use the same model of development for both modes. Both will benefit when the teams can be merged into a single team and the range of skills can be developed instead of limited to a single practice.


The Interaction of Technologies


When I first read about the Gartner Mode 1, it threw me back to when I worked for a manufacturing company that had a fleet of COBOL programmers who administered the IBM Z-series mainframe. These administrators and this system basically ran the entire organization because they contained the most critical data for the business. But rolling out any new change took a lot of effort. Making the system run faster usually required a hardware refresh. It took months of planning to add new features, and there wasn't an easy way to roll out into production. We maintained two separate pipelines of work. We had two distinctly different groups of people with different skillsets. One side didn't really understand the level of effort of the other side.


I don't think companies are looking at releasing multiple versions of code to their legacy systems. Instead, businesses are looking for their legacy systems to be more flexible and agile in nature—to be able to change when the business changes, pivot when the business needs to pivot. We can learn from each other by working closely with each other. Continuous integration and continuous deployment (CI/CD) are key components of DevOps. CI/CD enables automated testing and uses technology to track code versions for quick delivery, stable releases, and quality. Instead of letting only one group benefit from these new technologies, we need to apply similar tools to all systems so the whole business can benefit.
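As one small, concrete flavor of the CI/CD automation described above, here's a sketch of how a pipeline might decide the next release version from commit messages. This is a hypothetical helper, not from any specific CI tool; the `feat:`/`fix:` prefixes follow the Conventional Commits style, and the same logic applies whether the codebase is Mode 1 or Mode 2.

```python
def next_version(current: str, commit_messages: list[str]) -> str:
    """Bump MAJOR.MINOR.PATCH based on the commits since the last release."""
    major, minor, patch = (int(part) for part in current.split("."))
    # Breaking changes win over features, which win over fixes.
    if any("BREAKING CHANGE" in msg for msg in commit_messages):
        return f"{major + 1}.0.0"
    if any(msg.startswith("feat:") for msg in commit_messages):
        return f"{major}.{minor + 1}.0"
    if any(msg.startswith("fix:") for msg in commit_messages):
        return f"{major}.{minor}.{patch + 1}"
    return current  # nothing release-worthy; ship no new version
```

A pipeline step like this is trivial for a greenfield app, but there's nothing stopping it from versioning releases for a COBOL codebase too.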




The introduction of the public cloud disrupted enterprise businesses and introduced new models to quickly deliver applications. A new management approach was needed to maintain new applications while supporting legacy systems, and Gartner was one of the first to put out a new model. Since then, new approaches to software development and system management have evolved and become the preferred method. Research has shown it's better for different teams to communicate with one another and share ideas instead of building silos. We've learned that it's good to merge teams from the development side as well as the operations team. Now it's time to include the legacy team.

Management loves uptime, but they rarely want to pay for it. It seems like that line pretty much explains a third of the meetings IT professionals have to sit through.


When we have conversations about uptime, they tend to go something like this:


     IT Worker: What are the uptime requirements for this application?


     Manager: 100%.


     IT Worker: OK, we can do that, but it’s going to cost you about $1,000,000,000,000. What’s the budget code you want me to bill that expense to? (OK, I made up the number, but you get the idea).


     Manager: I’m not paying that much money. You have $35 in annual budget. That’s all we can afford from the budget. Make it happen.


     IT Worker: We can’t get you 100% uptime for $35. For that we can get 9.9% uptime.


At this point, there’s a long discussion about corporate priorities, company spending, the realities of hardware purchasing costs, physics (the speed of light is important for disaster recovery configurations), and, depending on your corporate environment and how personally people take the conversation, something about someone’s parenting skills may come up.


No matter how the discussion goes, this conversation always comes down to the company's need for uptime versus the company’s willingness to pay for the uptime. When it comes to uptime, there has to be a discussion of cost, because uptime doesn’t happen for free. Some systems are more natural to design uptime for than others. With a web tier, for example, we can scale the solution wider and handle the workload through a load balancer.
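To give that cost conversation a sense of scale, here's what each uptime target actually permits per year. This is plain arithmetic, not tied to any vendor's SLA.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760

def allowed_downtime_hours(uptime_percent: float) -> float:
    """Hours of downtime per year permitted by a given uptime target."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)

for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% uptime allows {allowed_downtime_hours(target):.2f} hours/year of downtime")
```

Going from 99.9% (about 8.8 hours a year) to 99.999% (about 5 minutes a year) is where the budget conversation gets painful, because each extra nine typically means more redundant hardware, more sites, and more process.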


But what about the hardware running the VMs behind your web tier? What if our VM farm is a two-node farm running at 65% capacity? For day-to-day operations, that's a decent number. But what happens when one of those nodes fails? The surviving node now has to carry the load of both, which works out to 130% of its capacity. That's going to be a problem: roughly 30% of the company's servers (or more) won't be running, because there's no capacity left to run them. And depending on the support agreement for your hardware, they could be down for hours or days.
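The n-1 scenario above is worth checking with a quick back-of-the-envelope calculation. This is a hypothetical helper, not from any capacity-planning tool.

```python
def utilization_after_failure(nodes: int, utilization_pct: float, failed: int = 1) -> float:
    """Utilization of the surviving nodes once `failed` nodes go down."""
    survivors = nodes - failed
    if survivors <= 0:
        raise ValueError("no surviving nodes left to carry the load")
    # The total workload stays the same but is spread across fewer nodes.
    return utilization_pct * nodes / survivors

# Two-node farm at 65% utilization: one failure leaves the survivor at 130%.
print(utilization_after_failure(2, 65.0))
```

The same math shows why a three-node farm at 60% is much safer: losing one node leaves the survivors at 90%, which is tight but survivable.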


Buying another server may be an expensive operation for a company, but how much is that failed server going to cost the company? We may have planned for availability within the application, but if we don’t think about availability at the infrastructure layer, availability at the application layer may not matter.


The converse also applies. If we have a new application critical to our business, and the business doesn't want to pay for availability, will they be happy when the application goes down because a physical server failed? Will they be OK with the application being down for hours or days because there's nowhere to run it? Odds are they won't be OK with this sort of outage, but the time to address it is before the outage occurs.


Designing availability for some applications is a lot harder than putting some web servers behind a load balancer. How should HA be handled for file servers, profile servers, database servers, or network links? These quickly become very complex design decisions, but they’re necessary discussions for the systems that need availability. If you build, manage, or own systems that the business cannot afford to go down for a few seconds, much less a few hours, then a discussion about availability, cost, and options needs to happen.

We've covered a lot in this series on DevOps tooling:

  1. Start Your DevOps Journey by Looking at the Value You Add

  2. Why Choosing a One-Size-Fits-All Solution Won’t Work

  3. How to Choose the Right Tools in Your DevOps Toolchain

  4. How To Prevent Tool Sprawl in Your DevOps Toolchain


In this installment, we'll talk about how to prevent lock-in for tooling. Lock-in makes a customer dependent on a vendor for products and services, unable to move to another vendor without incurring substantial switching costs.

There are different variations of lock-in. In IT, lock-in is usually created by either a vendor or a technology.


Vendor Lock-In

Vendors try to keep you locked into their products and services by creating a complementary, integrated ecosystem. By offering a single service at very low cost, or with very low barriers to entry, they draw customers in. Once a customer is using that initial service or product, it's easier to persuade them to adopt more services in the portfolio. A notable recent example is Amazon Web Services (AWS), which drew in the masses by offering cheap, easy-to-use cloud object storage. By offering services adjacent to or on top of S3, AWS made it easy for customers to increase their AWS spend. Other examples include proprietary file formats that only work with the vendor's software.


While these are relatively benign examples, there are many less appealing strategies vendors use to keep you from leaving their products behind. In many cases, they raise barriers by increasing the cost of switching to another service.


Technology Lock-In

In some cases, leaving a piece of technology behind is nearly impossible. Often caused by technical debt or age, technological lock-in is about more than the high cost of switching; it's about the impact on business processes. Classic examples are the mainframe systems and the COBOL programming language many banks still use. Switching away from these (very old) technologies has a big impact on business processes, and the risk of switching is simply too high. In many cases, a lack of knowledge or documentation on the old systems is the cause of the lock-in. Admins are afraid to touch, change, or upgrade the systems, and won't fix what ain't broke. If systems are ignored long enough, the mountain of debt grows so high that switching is no longer viable.


How Do We Prevent Lock-In?

There's an odd parallel between preventing technological lock-in and the DevOps movement itself. If DevOps is about accelerating the delivery of work, making each change as small as possible, incorporating feedback, and learning from data and information, then preventing technological lock-in is about continuously changing a legacy system to keep the documentation, skills, and experience around it up-to-date. That prevents knowledge from seeping away and ensures IT engineers have hands-on experience. Simply put, it's a matter of not ignoring a system, but changing and transforming it continuously.


Vendor lock-in, on the other hand, is harder to spot and fight. You should actually want a certain, manageable level of lock-in, because it has real advantages: it lowers barriers, cuts costs, and makes solutions easier to implement. Remember the AWS example? It's easier to spin up a virtual machine in EC2 to process your data stored in S3 than to move the storage to another cloud provider first. So realistically, there's always lock-in; it's a matter of how much switching cost an organization is willing to bear.


The Exit-Before-You-Enter Strategy

There's always lock-in, be it technological or vendor-based, and the amount you're willing to accept depends on the products you use. Before entering a lock-in willingly, use the exit-before-you-enter strategy: think through, at least roughly, what the future cost of switching away will be before you start using a service or product.


The Loosely-Coupled Strategy

By loosely coupling different products or services, there's less lock-in. By using APIs or other standard integrating interfaces between services or applications, switching out one service for another becomes much easier, as long as the interface between them doesn't change significantly. Many DevOps tools for CI/CD, monitoring, and software development offer open APIs that create loosely coupled, but tightly integrated solutions.
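As a minimal sketch of the loosely-coupled strategy, the idea is to write application code against a small interface of your own rather than directly against one vendor's SDK. The class and method names below are invented for illustration; a real S3-backed implementation would wrap the vendor's client library behind the same interface.

```python
from abc import ABC, abstractmethod

class ObjectStore(ABC):
    """The seam between your application and any storage vendor."""
    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...
    @abstractmethod
    def get(self, key: str) -> bytes: ...

class InMemoryStore(ObjectStore):
    """Stand-in backend; swapping vendors only touches this layer."""
    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}
    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data
    def get(self, key: str) -> bytes:
        return self._objects[key]

def archive_report(store: ObjectStore, name: str, body: bytes) -> None:
    # Application code depends only on the interface, not the vendor.
    store.put(f"reports/{name}", body)
```

The switching cost doesn't disappear, but it shrinks to the size of one adapter class instead of being smeared across the whole codebase.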

Happy 4th of July everyone! I hope everyone reading this is enjoying time well spent with family and friends this week. My plans include landscaping, fireworks, and meat sweats, not necessarily in that order.


As always, here are some links I hope you find interesting.


Your database is going to the cloud and it's not coming back

There's still time to brush up on cloud security, to minimize your risk for a data breach.


When It Comes to Ransomware, Air Gaps Are the Best Defense

Nice reminder about the 3-2-1 rule for backups. Three copies, two different media, and one stored offsite. If everyone followed that rule, there'd be no need to pay any ransom.


US generates more electricity from renewables than coal for first time ever

I'm a huge fan of renewable energy, and I hope to see this trend continue.


Former Equifax executive sentenced to prison for insider trading prior to data breach

But not for allowing the data breach to happen. We need stronger laws regarding responsibility for a data breach.


Fortune 100 passwords, email archives, and corporate secrets left exposed on unsecured Amazon S3 server

As I was just saying...


You’re Probably Complaining the Wrong Way

Sound advice for everyone, especially those of us in IT that need to get a message across to various teams.


14 Fantastic Facts About the Fourth of July

Some interesting pieces of trivia for you to digest along with your traditional holiday salmon dinner.


Happy Birthday America!

By Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering


Here’s an interesting article by my colleague Mav Turner with suggestions for building a battle-hardened network for the Army. The Army has been using our products to troubleshoot their networks for years, and our tools can help with Mav’s suggestions as well.


The U.S. Army is leading the charge on the military’s multidomain battle concept—but will federal IT networks enable this initiative or inhibit it?


The network is critical to the Army’s vision of combining the defense domains of land, air, sea, space, and cyberspace to protect and defend against adversaries on all fronts. As Gen. Stephen Townsend, USA, remarked to AFCEA conference attendees earlier this year, the Army is readying for a future reliant on telemedicine, 3-D printing, and other technologies that will prove integral to multidomain operations. “The network needs to enable all that,” said Townsend.


But the Army’s network, as currently constituted, isn’t quite ready to effectively support this ambitious effort. In response, Maj. General Peter Gallagher, USA, director of the Army's Network Cross Functional Team, has called for a flat network that converges these disparate systems and is more dynamic, automated, and self-healing.


The Army has employed a three-part strategy to solve these challenges and modernize its network, but modernization can open up potential security gaps. As the Army moves forward with readying its network for multidomain battles, IT professionals will want to take several steps to ensure network operations remain rock solid and secure.


Identify and Sunset Outdated Technologies


The Army may want to consider identifying and sunsetting outdated legacy technologies that can introduce connectivity, compatibility, and security issues. Legacy technologies may not be easily integrated into the new network environment, which could pose a problem as the service consolidates its various network resources. They could slow down network operations, preventing troops from being able to access vital information in times of need. They could also introduce security vulnerabilities that may not be patchable.


Update and Test Security Procedures


It’s imperative that Army IT professionals maintain strong and consistent security analysis to ensure the efficacy of new network technologies. This is especially true during the convergence and integration phase, when security holes may be more likely to arise.


Consider utilizing real-world operational testing, event simulation, and red and blue team security operations. Networks are evolutionary, not revolutionary, and these processes should be implemented every time a new element is added to the network.


Monitor the Network, Inside and Out


IT professionals will need to strengthen their existing network monitoring capabilities to identify and remediate issues, from bottlenecks to breaches, quickly and efficiently. They will need to go beyond traditional network monitoring to adopt agile and scalable monitoring capabilities that can be applied across different domains to support different missions.


Look no further than the Army’s Command Post Computing Environment for an initiative requiring more robust monitoring than typical on-premises monitoring capabilities. Similarly, a network that enables multidomain operations will need to be just as reliable and secure as traditional networks, even though the demands placed on the network will most likely be far more intense than anything the Army is accustomed to handling.


For the multidomain concept to succeed, the Army needs a network that can enable the initiative. Building such a network starts with modernization and continues with deploying the necessary processes and technologies to ensure secure and reliable operations. Those are the core tenets of a network built to handle whatever comes its way, from land, air, sea, space, or cyberspace.


Find the full article on SIGNAL.


The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

Binary Noise for Logs.  Photo modified from Pixabay.

Four score and one post ago, we talked about Baltimore’s beleaguered IT department, which is in the throes of a ransomware-related recovery.


Complicating the recovery mission is the fact that the city’s IT team didn't know when the systems were initially compromised. They knew when the systems went offline, but not whether the systems were infected earlier. And the IT team can’t go back and check a compromised system’s logs, because the ransomware rendered the infected computers inaccessible.


Anyone who has worked in IT operations knows logs can contain a wealth of valuable information. Logs are usually the first place you go to troubleshoot or detect problems. Logs are where you can find clues of security events. Commonly, though, you can end up having to sift through a lot of data to find the information you need.


In any ransomware or security attack, a centralized logging server or syslog collector is an invaluable resource for tracing back and correlating events across a plethora of servers and network devices. Aggregation and correlation are jobs for a SIEM.
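To see why centralization matters, here's a toy illustration of the correlation step: merge per-host logs and pull every event in a window around a suspected compromise. The log format and field layout are invented for the example; real SIEMs do this at vastly larger scale.

```python
from datetime import datetime, timedelta

def correlate(logs_by_host: dict[str, list[str]], pivot: datetime,
              window: timedelta) -> list[tuple[datetime, str, str]]:
    """Return (timestamp, host, message) tuples near `pivot`, time-ordered."""
    merged = []
    for host, lines in logs_by_host.items():
        for line in lines:
            stamp, _, message = line.partition(" ")
            ts = datetime.fromisoformat(stamp)
            if abs(ts - pivot) <= window:
                merged.append((ts, host, message))
    return sorted(merged)
```

With logs only on the individual machines, this merge-and-sort is impossible once ransomware locks you out of them; with a central collector, the evidence survives even when the hosts don't.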


All About Those SIEMs


SIEM is mostly referred to by its acronym rather than its full name: Security Information and Event Management. SIEMs serve an essential role in security threat detection and commonly form a valuable part of an organization’s defense-in-depth strategy.


SIEM tools also form the basis for many regulatory auditing requirements for PCI-DSS, HIPAA, and NIST security checks, as well as aid with threat detection.


In a video recording of a session at RSA 2018, a presenter asked the audience who was happy with their current SIEM. When no hands went up, the presenter quipped that maybe the happy people were in the next room.


If I were in that room, I wouldn’t have raised my hand either. On a previous contract, our SIEM tool consumed terabytes upon terabytes of space. When it came time to pull information, the application was slow and unresponsive. Checking the logs ourselves was a more efficient use of time. So why did we run it at all? Our SIEM was a compliance checkbox.


Extending SIEMs With UEBA and SOAR


SIEMs are much more than compliance checkboxes. User and Entity Behavior Analytics (UEBA) and Security Orchestration, Automation, and Response (SOAR), when bundled with SIEMs, offer additional features to extend security event management features.


UEBAs look for normal and abnormal behavior for both users and entities to improve visibility across an organization. By using advanced policies and machine learning, UEBAs improve visibility to help protect against insider threats. However, like SIEMs, UEBAs may require fine-tuning to weed out the false positives.
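As a toy illustration of the baseline idea behind UEBA, the sketch below flags a user whose activity count deviates sharply from their own history. Real products use far richer behavioral models; the 3-sigma threshold here is an arbitrary assumption, and tuning it is exactly the false-positive weeding mentioned above.

```python
from statistics import mean, stdev

def is_anomalous(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """True if today's count is more than `threshold` std devs from the user's mean."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        # Perfectly uniform history: any change at all is a deviation.
        return today != mu
    return abs(today - mu) / sigma > threshold
```

A user who normally logs in about ten times a day and suddenly logs in fifty times would trip this check; a day of eleven logins would not.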


SOARs, on the other hand, are designed to automate and respond to low-level security events or threats. SOARs can provide similar functionality to an Intrusion Detection System (IDS) or Intrusion Prevention System (IPS), without the manual intervention.




At the end of the day, SIEMs, SOARs, UEBAs, and other security tools can be challenging to configure and manage. It makes sense for organizations to begin outsourcing part of this responsibility. Also, you could argue that applications reliant on machine learning belong in cloud-like environments, where you could build large data lakes for additional analytics.


In traditional SIEMs, feeding in more information probably won’t result in a better experience. Without dedicated security analysts to fine-tune the data collected, many organizations struggle with unwieldy SIEMs. While it’s easy to blame the tool, in many cases, the people and processes contribute to the implementation woes. 

The title of this post says it all. In IT, we’ve said for years you need to take and test backups regularly to ensure your systems are being backed up correctly. There’s an adage in IT: “The only good backup is a tested backup.”


Why Do You Need Backups?


I bring this up because I have people coming to our consulting firm after some catastrophic event has occurred, and the first thing I ask for is the backups. Then the conversation usually gets awkward as they try to explain why there are no backups available. The reasons usually run the gamut from “we forgot to set up backups” to “the backup system ran out of space” to “the backup system failed months ago, and no one fixed it.”


Whatever the reason, the result is the same: there are no backups, there’s a critical failure, and no one wants to explain to the boss why they can’t recover the system. The system is down, and the normal recovery options aren’t available for one reason or another. In these cases, when there’s no backup available, the question becomes, “How critical is this to your business staying operational?” If it’s a truly critical part of your infrastructure, then the backups should be just as critical to the infrastructure as the system is. Those backups need to be taken and then tested to ensure the backup solution meets the needs of the company (and that the backups are being taken).


When planning the backups for a key system, the business owners need to be involved in setting the backup and recovery policies; after all, they’re responsible for the data. In IT terms, these policies are the Recovery Point Objective (RPO) and the Recovery Time Objective (RTO). In layman’s terms, they define how much data the organization can afford to lose and how long it can take to bring the system back online. These numbers can be anything the business needs them to be, but the smaller they are, the higher the financial cost of meeting them. If the business wants an RPO and RTO of 0, and is willing to pay for it, then it’s IT's job to deliver, even if we don’t agree with the request. And that means running test restores of those backups frequently, perhaps very frequently, to ensure the backups we’re taking of the system are working.
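A simple operational consequence of an agreed RPO is that someone (or something) should be checking backup freshness against it. Here's a small sketch of that check; the function and field names are illustrative, not from any backup product.

```python
from datetime import datetime, timedelta

def rpo_violated(last_backup: datetime, rpo: timedelta, now: datetime) -> bool:
    """True if the time since the last good backup exceeds the agreed RPO.

    A violation means the business is currently exposed to more data loss
    than it signed up for, even though nothing has failed yet.
    """
    return now - last_backup > rpo
```

Wiring a check like this into monitoring turns a quiet backup failure into a visible alert long before anyone needs a restore.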


Why Is It Important to Test Backups?


Testing backups should be done no matter what kind of backups are being taken. If you think you can skip test restores because the backup platform reports the backup was successful, then you’re failing at taking backups. One of the mantras of IT is “trust but verify.” We trust that the backup software did the backup it says it did, but we verify by testing it. If a backup can’t be restored, it’s much better to find out during a test restore than when you need to restore the production system. If you wait until the production system needs to be restored to discover the backup failed, there’s going to be a lot of explaining to do about why the system can’t be restored and what sort of impact that might have on the business—including, potentially, the company going out of business.
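One concrete form "trust but verify" can take is comparing checksums of restored files against a manifest recorded at backup time. This is a hedged sketch; the manifest format and paths are invented for the example, and a real test restore would also exercise application-level recovery, not just file integrity.

```python
import hashlib

def verify_restore(manifest: dict[str, str], restored: dict[str, bytes]) -> list[str]:
    """Return the paths whose restored contents don't match the backup manifest.

    `manifest` maps each backed-up path to the SHA-256 hex digest recorded
    when the backup was taken; `restored` maps paths to restored contents.
    """
    failures = []
    for path, expected_sha in manifest.items():
        data = restored.get(path)
        if data is None or hashlib.sha256(data).hexdigest() != expected_sha:
            failures.append(path)  # missing or corrupted after restore
    return failures
```

An empty failure list from a scheduled test restore is the evidence you want on hand before the boss ever asks whether the backups actually work.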
