Geek Speak

By Joe Kim, SolarWinds EVP, Engineering and Global CTO


Though cybercriminals are usually incentivized by financial gain, the reality is that a cyber-attack can create far more damage than just hitting an organization fiscally. This is especially the case when it comes to healthcare organizations. Health data is far more valuable to a cybercriminal, going for roughly 10 or 20 times more than a generic credit card number. Therefore, we can expect to see a surge in healthcare breaches. However, the impact of this won’t just cripple a facility financially. It’s possible a cybercriminal could take over a hospital, manipulate important hospital data, or even compromise medical devices.


It’s already started

These sorts of breaches are already happening. At the start of 2016, three UK hospitals in Lincolnshire managed by the North Lincolnshire and Goole NHS Foundation Trust were infected by a computer virus. The breach was so severe it resulted in hundreds of planned operations and outpatient appointments being cancelled.


The event, which officials were forced to declare a “major incident,” also made it difficult to access test results and identify blood for transfusions, and some hospitals struggled to process blood tests. This is one of the first examples of a healthcare cyber security breach directly impacting patients in the UK, but it won’t be the last.


Follow in the footsteps of enterprises

Breaches like these have put a great deal of pressure on healthcare IT professionals. Though there has been a shift in mentality in enterprise, with security becoming a priority, the same can’t be said for the healthcare sector. This needs to change. The situation is made worse by the budget cuts many healthcare organizations face, which make security hard to prioritize.


It doesn’t need to break to be fixed

Many healthcare IT professionals assume that management will only focus on security once a significant breach occurs, but it’s time healthcare organizations learned from enterprises that have seen breaches occur and acted. In the meantime, there is work that requires little investment that IT professionals can do to protect the network.


Educate and enforce

Employees are often the weakest link when it comes to security in the workplace. An awareness campaign should encompass both education and enforcement. By approaching an education initiative in this way, employees will have a better understanding of potential threats that could come from having an unauthorized device connected to the network.


For example, healthcare workers need to be shown how a cybercriminal could infiltrate the network through hacking someone’s phone. This would also start a dialogue between healthcare employees, helping them to prioritize security and thus giving the IT department a better chance of protecting the organization from a breach.


It’s naturally assumed that a healthcare IT professional should be able to effectively protect his or her organization from an attack. However, even the most experienced security professional would struggle to do so without the right tools in place. Protecting healthcare organizations from disastrous attacks requires funding, investment, and cooperation from employees.


Find the full article on Adjacent Open Access.

I'm not aware of an antivirus product for network operating systems, but in many ways, our routers and switches are just as vulnerable as a desktop computer. So, why don't we all protect them in the same way as our compute assets? In this post, I'll look at some basic tenets of securing the network infrastructure that underpins the entire business.


Authentication, Authorization, and Accounting (AAA)


Network devices intentionally leave themselves open to user access, so controlling who can get past the login prompt (Authentication) is a key part of securing devices. Once logged in, it's important to control what a user can do (Authorization). Ideally, what the user does should also be logged (Accounting).


Local Accounts Are Bad, Mmkay?


Local accounts (those created on the device itself) should be limited solely to backup credentials that allow access when the regular authentication service is unavailable. The password should be complex and changed regularly. In highly secure networks, access to the password should be restricted (kind of a "break glass for password" concept). Local accounts don't automatically disable themselves when an employee leaves, and far too often, I've seen accounts still active on devices for users who left the company years ago, with some of those accessible from the internet. Don't do it.


Use a Centralized Authentication Service


If local accounts are bad, then the alternative is to use an authentication service like RADIUS or TACACS. Ideally, those services should in turn defer authentication to the company's existing authentication service, which, in most cases, is Microsoft Active Directory (AD) or a similar LDAP service. This not only makes it easier to manage who has access in one place, but by using things like AD groups, it's possible to determine not just who is allowed to authenticate successfully, but what access rights they will have once logged in. The final, perhaps obvious, benefit is that it's only necessary to grant a user access in one place (AD), and they are implicitly granted access to all network devices.


The Term Process


A term (termination) process defines the list of steps to be taken when an employee leaves the company. While many of the steps relate to HR and payroll, the network team should also have a well-defined term process to ensure that after a network employee leaves, things such as local fallback admin passwords are changed, or perhaps SNMP read/write strings are changed. The term process should also include disabling the employee's Active Directory account, which will also lock them out of all network devices, because we're using an authentication service that authenticates against AD. It's magic! This is a particularly important process to have when an employee is terminated by the company, or may for any other reason be disgruntled.


Principle of Least Privilege


One of the basic security tenets is the principle of least privilege, which in basic terms says: don't give people access to things unless they actually need it; default to giving no access at all. The same applies to network device logins, where users should be mapped to the privilege group that allows them to meet their (job) goals, while not granting permissions to do anything for which they are not authorized. For example, a NOC team might need read-only access to all devices to run show commands, but they likely should not be making configuration changes. If that's the case, one should ensure that the NOC AD group is mapped to have only read-only privileges.
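
To make the "default to no access" idea concrete, here is a minimal Python sketch of mapping directory groups to device privileges. The group names and privilege labels are hypothetical, invented for illustration; a real deployment would express this in the RADIUS or TACACS server's own policy language.

```python
# Hypothetical AD group names and privilege labels, for illustration only.
GROUP_PRIVILEGES = {
    "NET-NOC": "read-only",
    "NET-ENGINEERS": "read-write",
}

# Ordering used to pick the strongest privilege a user is entitled to.
PRIVILEGE_RANK = {"none": 0, "read-only": 1, "read-write": 2}


def resolve_privilege(user_groups):
    """Return the highest privilege granted by any mapped group, else 'none'."""
    best = "none"  # least privilege: start from no access at all
    for group in user_groups:
        granted = GROUP_PRIVILEGES.get(group, "none")  # unmapped groups grant nothing
        if PRIVILEGE_RANK[granted] > PRIVILEGE_RANK[best]:
            best = granted
    return best


if __name__ == "__main__":
    print(resolve_privilege(["NET-NOC"]))    # NOC staff get read-only access
    print(resolve_privilege(["HR-STAFF"]))   # no mapped group, so no access
```

The important design choice is that an unknown group maps to "none" rather than to some baseline level, so forgetting to add a mapping fails closed.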


Command Authorization


Command authorization is a long-standing security feature of Cisco's TACACS+, and while sometimes painful to configure, it allows granular control over which commands can be issued. It's often possible to configure command filtering within the network OS configuration, typically by defining privilege levels or user classes at which a command can be issued, and using RADIUS or TACACS to map the user into that group or user class at login. One company I worked for created a "staging" account on Juniper devices, which allowed the user to enter configuration mode, enter commands, and run commit check to validate the configuration, but did not allow an actual commit to make the changes active on the device. This provided a safe environment in which to validate proposed changes without the risk of the user forgetting to add check to their commit statement. Juniper users: tell me I'm not the only one who ever did that, right?
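
The "staging" account idea can be sketched as an ordered permit/deny list. The Python below is only an illustration of the logic, assuming a first-match-wins rule set that I made up; it is not TACACS+ or Junos configuration syntax.

```python
import re

# Hypothetical rule set for a "staging" user class. First matching rule wins.
STAGING_RULES = [
    ("permit", re.compile(r"^commit check$")),   # validation is allowed
    ("deny", re.compile(r"^commit\b")),          # any other commit is refused
    ("permit", re.compile(r"^(show|set|delete|configure)\b")),
]


def is_authorized(command, rules=STAGING_RULES):
    """Return True if the command is permitted; default to deny."""
    normalized = " ".join(command.split())  # collapse repeated whitespace
    for action, pattern in rules:
        if pattern.search(normalized):
            return action == "permit"
    return False  # nothing matched: deny


if __name__ == "__main__":
    for cmd in ["commit check", "commit", "set system host-name r1"]:
        print(cmd, "->", "permit" if is_authorized(cmd) else "deny")
```

Rule order matters here: the specific permit for commit check has to come before the general deny for commit, otherwise validation would be blocked too.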


Command Accounting


This one is simple: log everything that happens on a device. More than once in the past, we have found the root cause of an outage by checking the command logs on a device and confirming that, contrary to the claimed innocence of the engineer concerned, they actually did log in and make a change (without change control either, naturally). In the wild, I see command accounting configured on network devices far less often than I would have expected, but it's an important part of a secure network infrastructure.


NTP - Network Time Protocol


It's great to have logs, but if the timestamps aren't accurate, it's very difficult to align events from different devices to analyze a problem. Every device should use NTP to ensure that it has an accurate clock. Additionally, I advise choosing one time zone for all devices (servers included) and sticking to it. Configuring each device with its local time zone sounds like a good idea until, again, you're trying to put those logs together, and suddenly it's a huge pain. Typically, I lean towards UTC (Coordinated Universal Time, despite the letters being in the wrong order), mainly because it does not implement summer time (daylight saving time), so it's consistent all year round.
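
To show why mixed time zones hurt, here is a small Python sketch that merges log entries from three devices. The device names, timestamps, offsets, and messages are all invented sample data; the point is only that the true order of events appears once everything is converted to UTC.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical log entries: (device, local timestamp, UTC offset in hours, message).
RAW_EVENTS = [
    ("nyc-rtr1", "2017-08-15 09:00:05", -4, "BGP neighbor down"),
    ("lon-sw1", "2017-08-15 14:00:01", 1, "Interface Gi0/1 down"),
    ("sfo-fw1", "2017-08-15 06:00:03", -7, "VPN tunnel down"),
]


def to_utc(local_text, offset_hours):
    """Parse a local timestamp and convert it to an aware UTC datetime."""
    local = datetime.strptime(local_text, "%Y-%m-%d %H:%M:%S")
    aware = local.replace(tzinfo=timezone(timedelta(hours=offset_hours)))
    return aware.astimezone(timezone.utc)


def merge_timeline(events):
    """Return (utc_time, device, message) tuples sorted by UTC time."""
    converted = [(to_utc(ts, off), dev, msg) for dev, ts, off, msg in events]
    return sorted(converted)


if __name__ == "__main__":
    for when, device, message in merge_timeline(RAW_EVENTS):
        print(when.isoformat(), device, message)
```

Sorted by local wall-clock time, the San Francisco event would appear first; in UTC all three fall within the same five seconds, with the London switch going first.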


Encrypt All the Things


Don't allow telnet to the device if you can use SSH instead. Don't run an HTTP server on the device if you can run HTTPS instead. Basically, if it's possible to avoid using an unencrypted protocol, that's the right choice. Don't just enable the encrypted protocol; go back and disable the unencrypted one. If you can run SSHv2 instead of SSHv1, you know what to do.


Password All the Protocols


Not all protocols implement passwords perfectly, with some treating them more like SNMP strings. Nonetheless, consider using passwords (preferably using something like MD5) on any network protocols that support it, e.g., OSPF, BGP, EIGRP, NTP, VRRP, HSRP.


Change Defaults


If I catch you with SNMP strings of public and private, I'm going to send you straight to the principal's office for a stern talking to. Seriously, this is so common and so stupid. It's worth scanning servers as well for this; quite often, if SNMPd is running on a server, it's running the defaults.
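
If you want to check for this yourself, a few lines of Python will do as a first pass. This sketch assumes Cisco IOS-style `snmp-server community` lines in a saved configuration; the sample configuration is invented, and other platforms would need a different pattern.

```python
import re

DEFAULT_COMMUNITIES = {"public", "private"}

# Matches Cisco IOS-style lines such as "snmp-server community public RO".
COMMUNITY_LINE = re.compile(r"^\s*snmp-server community (\S+)", re.IGNORECASE)


def find_default_communities(config_text):
    """Return (line_number, line) pairs that use a default community string."""
    findings = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        match = COMMUNITY_LINE.match(line)
        if match and match.group(1).lower() in DEFAULT_COMMUNITIES:
            findings.append((number, line.strip()))
    return findings


# Invented sample configuration for demonstration.
SAMPLE_CONFIG = """hostname edge-rtr1
snmp-server community public RO
snmp-server community n0t-the-default RO
snmp-server community private RW
"""

if __name__ == "__main__":
    for number, line in find_default_communities(SAMPLE_CONFIG):
        print(f"line {number}: {line}")
```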


Control Access Sources


Use the network operating system's features to control who can connect to the devices in the first place. This may take the form of a simple access list (e.g., a vty access-class in Cisco speak) or could fall within a wider Control Plane Policing (CoPP) policy, where control for any protocol can be implemented. Access Control Lists (ACLs) aren't in themselves secure, but they are another obstacle for any bad actor wishing to illicitly connect to the devices. If there are bastion management devices (aka jump boxes), perhaps make only those devices able to connect. Restrict from where SNMP commands can be issued. This all applies doubly for any internet-facing devices, where such protections are crucial. Don't allow management connections to a network device on an interface with a public IP. Basically, protect yourself at the IP layer as well as by using passwords and AAA.
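
The decision an access list makes is simple enough to sketch in Python with the standard `ipaddress` module. The subnets below are hypothetical stand-ins for a jump-box range and a single management host; on a real device this check lives in the ACL or CoPP policy, not in a script.

```python
import ipaddress

# Hypothetical permitted sources: a jump-box subnet and one management host.
ALLOWED_SOURCES = [
    ipaddress.ip_network("10.10.0.0/28"),
    ipaddress.ip_network("192.0.2.10/32"),
]


def management_access_allowed(source_ip, allowed=ALLOWED_SOURCES):
    """Return True only if the source address falls inside a permitted network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in allowed)


if __name__ == "__main__":
    for ip in ["10.10.0.5", "10.10.0.16", "203.0.113.7"]:
        verdict = "permit" if management_access_allowed(ip) else "deny"
        print(ip, "->", verdict)
```

Anything not explicitly listed is denied, which mirrors the implicit deny at the end of a typical access list.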


Ideally, all devices would be managed using their dedicated management ports, accessed through a separate management network. However, not everybody has the funding to build an out-of-band management network, and many are reliant on in-band access.


Define Security Standards and Audit Yer Stuff


It's really worth creating a standard security policy (with reference configurations) for the network devices, and then periodically auditing the devices against it. If a device goes out of compliance, is that a mistake, or did somebody intentionally weaken the device's security posture? Either way, just because a configuration was implemented once doesn't mean it has remained in place ever since, so a regular check is worthwhile.
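
A periodic audit can start out very simply. The Python sketch below compares a saved configuration against a tiny example standard; the required and forbidden lines are Cisco IOS-style commands I picked for illustration, not a recommended baseline, and the sample configuration is invented.

```python
# Example standard, for illustration only: lines that must and must not appear.
REQUIRED_LINES = {"service password-encryption", "ip ssh version 2"}
FORBIDDEN_LINES = {"ip http server"}


def audit_config(config_text):
    """Report required lines that are missing and forbidden lines that are present."""
    present = {line.strip() for line in config_text.splitlines() if line.strip()}
    return {
        "missing": sorted(REQUIRED_LINES - present),
        "forbidden": sorted(FORBIDDEN_LINES & present),
    }


# Invented sample configuration for demonstration.
SAMPLE_CONFIG = """hostname core-sw1
ip ssh version 2
ip http server
"""

if __name__ == "__main__":
    report = audit_config(SAMPLE_CONFIG)
    print("missing:", report["missing"])
    print("forbidden:", report["forbidden"])
```

Exact line matching is crude (it ignores context such as which interface or line a command sits under), but even this level of checking catches drift that would otherwise go unnoticed.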


Remember Why


Why are we doing all of this? The business runs over the network. If the network is impacted by a bad actor, the business can be impacted in turn. These steps are one part of a layered security plan; by protecting the underlying infrastructure, we help maintain availability of the applications. Remember the security CIA triad (Confidentiality, Integrity, and Availability)? The steps I have outlined above, and many more that I can think of, help maintain network availability and ensure that the network is not compromised. This means that we have a higher level of trust that the data we entrust to the network transport is not being siphoned off or altered in transit.


What steps do you take to keep your infrastructure secure?

Previously, I discussed the origins of the word “hacking” and the motivations around it from early phone phreakers, red-boxers, and technology enthusiasts.


Today, most hackers can be boiled down to Black Hats and White Hats. The hat analogy comes from old Western movies, where the good guys wore white and the bad guys wore black. The two groups have different reasons for hacking.


Spy vs. Spy

The White Hat/Black Hat analogy always makes me think of the old Spy vs. Spy comic in Mad Magazine. These two characters—one dressed all in white, the other all in black—were rivals who constantly tried to outsmart, steal from, or kill each other. The irony was that there was no real distinction between good or evil. In any given comic, the White Spy might be trying to kill the Black Spy or vice versa, and it was impossible to tell who was supposed to be the good guy or the bad guy.


Black Hat hackers are in it to make money, pure and simple. There are billions of dollars lost every year to information breaches, malware, cryptoware, and data ransoming. Often tied to various organized crime syndicates (think Russian Mafia and Yakuza), these are obviously the “bad guys” and the folks that we, as IT professionals, are trying to protect ourselves and our organizations from.


The White Hats are the “good guys,” and if we practice and partake in our own hacking, we would (hopefully) consider ourselves part of this group. Often made up of cybersecurity and other information security professionals, the goal of the White Hat is to understand, plan for, predict, and prevent the attacks from the Black Hat community.


Not Always Black or White

There does remain another group of people whose hacking motivations are not necessarily determined by profit or protection, but instead, are largely political. These would be the Gray Hats, or the hackers who blur the distinction between black and white, and whose designation as “good or bad” is subjective and often depends on your own point of view. As I mentioned, the motivation for these groups is often political, and their technical resources are frequently used to spread a specific political message, often at the expense of a group with an opposing view. They hack websites and social media accounts, and replace their victims’ political messaging with their own.


Groups like Anonymous would fall into this category, the Guy Fawkes mask-wearing activists who are heavily involved in world politics, and who justify their actions as vigilantism. Whether you think what they do is good or not depends on your own personal belief structure, and which side of the black/white spectrum they land on is up to you. It’s important to consider such groups when trying to understand motivation and purpose, if you decide to embark on your own hacking journey.


What’s in It for Us?

Because hacking has multiple meanings, which approach do we take as IT pros when we sit down for a little private hacking session? For us, it should be about learning, solving problems, and dissecting how a given technology works. Let’s face it: most of us are in this industry because we enjoy taking things apart, learning how they work, and then putting them back together. Whether that’s breaking down a piece of hardware like a PC or printer, or de-compiling some software into its fundamental bits of code, we like to understand what makes things tick, and we’re good at it. Plus, someone actually pays us to do this!


Hacking as part of our own professional development can be extremely worthwhile because it helps us gain a deep understanding of a given piece of technology. Whether it is for troubleshooting purposes, or for a deep dive into a specific protocol while working toward a certification, hacking is one more tool you can use to become better at what you do.


Techniques you use in your everyday work may already be considered “hacks.” Some tools you have at your disposal may be the same tools that hackers use in their daily “work.” Have you ever fired up Wireshark to do some packet capturing? Used a utility from a well-known tool compilation to change a lost Windows password? Scanned a host on your network for open ports using Nmap? All of these are common tools that can be used by an IT professional to accomplish a task, or by a malicious hacker trying to compromise your environment.


As this series continues, we will look at a number of different tools (both software and hardware) that have this kind of utility, and how you can use them to improve your understanding of the technology you support and develop a respect for the full spectrum of hacking that may impact your business or organization.


There are some fun toys out there, but make sure to handle them with care.


As always, "with great power comes great responsibility." Please check your local, state, county, provincial, and/or federal regulations regarding any of the methods, techniques, or equipment outlined in these articles before attempting to use any of them, and always use your own private, isolated test/lab environment.

Hey everybody! It’s me again! In my last post, "Introducing Considerations for How Policy Impacts Healthcare IT," we started our journey discussing healthcare IT from the perspective of the business, as well as the IT support organization. We briefly touched on HIPAA regulations, EMR systems, and had a general conversation about where I wanted to take this series of posts. The feedback and participation from the community was AMAZING, and I hope we can continue that in this post. Let's start by digging a bit deeper into two key topics (and maybe a tangent or two): protecting data at rest and in motion.


Data at Rest

When I talk about data at rest, what exactly am I referring to? Well, quite frankly, it could be anything. We could be talking about a Microsoft Word document on the hard drive of your laptop that contains a healthcare pre-authorization for a patient. We could be talking about a patient's medical test results that reside in a SQL database in your data center. We could even be talking about the network passwords document on the USB thumb drive strapped to your key chain. (Cringe, right?!) Data at rest is just that: it’s data that’s sitting somewhere. So how do you protect data at rest? Let us open that can of worms and talk about that, shall we?


By now you’ve heard of disk encryption, and hopefully you’re using it everywhere. It’s probably obvious to you that you should be using disk encryption on your laptop, because what if you leave it in the back seat of your car over lunch and it gets stolen? You can’t have all that PHI getting out into the public, now can you? Of course not! But did you take a minute to think about the data stored on the servers in your data center? While it might not be as likely that somebody swipes a drive out of your RAID array, it CAN happen. Are you prepared for that? What about your SAN? Are those disks encrypted? You’d better find out.


Have you considered the USB ports on your desktop computers? How hard would it be for somebody to walk in with a nice 500 GB thumb drive, plug it into a workstation, grab major chunks of sensitive information in a very short period of time, and simply walk out the front door? Not very hard if you’re not doing something to prevent that. There are a bunch of scenarios we haven’t talked about, but at least I've made you think about data at rest a little bit now.


Data in Motion

Not only do we need to protect our data at rest, we also need to protect it in motion. This means we need to talk about our networks, particularly the segments of those networks that cross public infrastructure. Yes, even "private lines" are subject to being tapped. Do you have VPN connectivity, either remote-access (dynamic) or static to remote sites and users? Are you using an encryption scheme that’s not susceptible to man-in-the-middle or other security attacks? What about remote access connections for contractors and employees? Can they just "touch the whole network" once their VPN connection comes up, or do you have processes and procedures in place to limit what resources they can connect to and how?


These are all things you need to think about in healthcare IT, and they’re all directly related to policy. (They are either implemented because of it, or they drive the creation of it.) I could go on for hours and talk about other associated risks for data at rest and data in motion, but I think we’ve skimmed the surface rather well for a start. What are you doing in your IT environments to address the issues I’ve mentioned today? Are there other data at rest or data in motion considerations you think I’ve omitted? I’d love to hear your thoughts in the comments!


Until next time!

The SolarWinds Virtualization Monitoring with Discipline VMworld tour is about to start and we are bringing solutions, SMEs, and swag.


VMworld US

At VMworld Las Vegas, the SolarWinds family is bringing a new shirt, new stickers, buttons, socks, and a new morning event. And that’s not all we’re bringing to VMworld.


  • Join us on Tuesday morning for the inaugural Monitoring Morning as KMSigma and I talk about monitoring at scale and troubleshooting respectively.

  • Next, don’t forget to attend sqlrockstar's two speaking sessions. He'll speak about monster database VMs, and join a panel session on best practices when virtualizing data. Also, be sure to check out chrispaap's talk on mastering the virtual universe using foundational skills, such as monitoring with discipline.


    • Monday, August 28, 2:50 – 3:10 p.m. (Solutions Exchange)
      Chris Paap
      Monitoring With Discipline To Master your Virtualized Universe

    • Tuesday, August 29, 11:30 a.m. – 12:30 p.m.
      Thomas LaRock
      Performance Tuning and Monitoring for Virtualized Database Servers

    • Wednesday, August 30, 4:00 – 5:00 p.m.
      Thomas LaRock
      SQL Server on vSphere: A Panel with Some of the World’s Most Renowned Experts

  • Lastly, visit us at booth number 224 to talk to our SMEs, get your questions answered, and pick up your swag.

VMworld Europe

Another first is that SolarWinds will be on the Solutions Expo floor at VMworld Europe in Barcelona. In the lead-up to the event, we’ll be hosting a pre-VMworld Europe webcast to talk shop about Virtualization Manager and its virtue for empowering troubleshooting in the highly virtualized domain of hybrid IT.

  • sqlrockstar will again be speaking in the following session.
    • Wednesday, September 13, 12:30 – 1:30 p.m.
      Thomas LaRock
      Performance Tuning and Monitoring for Virtualized Database Servers

  • chrispaap and I, along with our SolarWinds EMEA SMEs, will be in the booth to answer your questions, talk shop about monitoring with discipline, and hand out swag.


I’ll update this section with details as they become available.


Let me know in the comment section if you will be in attendance at VMworld US or VMworld Europe. If you can’t make it to one of these events, let me know how we at SolarWinds can better address your virtualization pain points.

The cloud is no longer a new thing. Now, we’re rapidly moving to an “AI-first” world. Even Satya Nadella updated the Microsoft corporate vision recently to say “Our strategic vision is to compete and grow by building best-in-class platforms and productivity services for an intelligent cloud and an intelligent edge infused with AI.” Bye bye cloud first, mobile first.


In reality, some organizations still haven't taken the plunge into cloud solutions, even if they want to. Maybe they’ve had to consolidate systems or remove legacy dependencies first. The cloud is still new to them. So, what advice would you give to someone looking at cloud for the first time? Have we learned some lessons along the way? Has cloud matured from its initial hype, or have we just moved on to new cloud-related hype subjects (see AI)? What are we now being told (and sold) that we are wary of until it has had some time to mature?


Turn off your servers
Even in the SMB market, cloud hasn’t resulted in a mass graveyard of on-premises servers. Before advising the smallest of organizations on a move to the cloud, I want to know what data they generate, how much there is, how big it is, and what they do with it. That knowledge, coupled with their internet connection capability, determines if there is a case for leaving some shared data or archive data out of the cloud. That’s before we’ve looked at legacy applications, especially where aging specialist hardware is concerned (think manufacturing or medical). I’m not saying it’s impossible to go full cloud, but the dream and the reality are a little different. Do your due diligence wisely, despite what your friendly cloud salesperson says.


Fire your engineers
Millions of IT pros have not been made redundant because their organizations have gone to the cloud. They’ve had to learn some new skills, for sure. But even virtual servers and Infrastructure as a Service (IaaS) require sizing, monitoring, and managing. The cloud vendor is not going to tell you that your instance is over-specced and you should bump it down to a cheaper plan. Having said that, I know organizations that have slowed down their hiring because of the process efficiencies they now have in place with cloud and/or automation. We don’t seem to need as much technical head count per end-user to keep the lights on.


Virtual desktops
Another early cloud promise was that we could all run cheap, low-specced desktops with a virtual desktop in the cloud doing all the processing. Yes, it sounded like terminal services to me too, or even back to dumb terminal + mainframe days. Again, this is a solution that has its place (we’re seeing it in veterinary surgeries with specialist applications and Intel Compute Sticks). But it doesn’t feel like this cloud benefit has been widely adopted.


Chatbots are your help desk
It could be early days for this one. Again, we haven’t fired all of the Level 1 support roles and replaced them with machines. While they aren’t strictly a cloud-move thing (other than chatbots living in the cloud), there is still a significant amount of hype around chatbots being our customer service and ITSM saviors. Will this one fizzle out, or do we just need to give the bots some more time to improve (knowing, ironically, that this happens best when we use them and feed them more data)?


Build your own cloud
After being in technical preview for a year, Microsoft has released the Azure Stack platform to its hardware partners for certification. Azure Stack gives you access to provision and manage infrastructure resources like you’d do in Azure, but those resources are in your own data center. There’s also a pay-as-you-go subscription billing option. The technical aspects and use cases seem pretty cool, but this is a very new thing. Have you played with the Azure Stack technical preview? Do you have plans to try it or implement it?


So, tell me the truth
One thing that has become a cloud truth is automation, whether that’s PowerShell scripts, IFTTT, or Chef recipes. While much of that automation is available on-premises, too (depending on how old your systems are), many Software-as-a-Service (SaaS) solutions are picked over on-premises for their interoperability. If you can pull yourself away from GUI habits and embrace the console (or hand your processes off to a GUI like Microsoft Flow), those skills are a worthwhile investment to get you to cloud nirvana.


I’ve stayed vendor-agnostic on purpose, but maybe you have some vendor-specific stories to share? What cloud visions just didn’t materialize? What’s too “bleeding edge” now to trust yet?

Back from Austin and THWACKcamp filming, and now gearing up for VMworld. I've got one session and a panel discussion. If you are attending VMworld, let me know; I'd love to connect with you while in Vegas. If you time it right, you may catch me on my way to a bacon snack.


As always, here's a bunch of links I think you might enjoy.


Don't Take Security Advice from SEO Experts or Psychics

There's a LOT of bad advice on the internet, folks. Take the time to do the extra research, especially when it comes to an expert offering expert opinions for free.


10 Things I’ve Learned About Customer Development

"What features your customers ask for is never as interesting as why they want them." Truth.


Researchers encode malware in DNA, compromise DNA sequencing software

This is why we can't have nice things.


An Algorithm Trained on Emoji Knows When You’re Being Sarcastic on Twitter

Like we even need such a thing.


Password guru regrets past advice

It's not just you, we all regret this advice.


The InfoSec Community is Wrong About AI Being Hype

There's more than one tech community that is underestimating the impact that AI and machine learning will have on our industry in the next 5 to 8 years.


Researchers Find a Malicious Way to Meddle with Autonomous Tech

Then again, if we can keep fooling systems with tricks like this, maybe it will be a bit longer before the machines take over.


I think I'm going to play this game every time I visit Austin.

By Joe Kim, SolarWinds EVP, Engineering and Global CTO


The technology that government end-users rely on is moving beyond the bounds of on-premises infrastructures, yet employees still hold IT departments accountable for performance.


According to a recent SolarWinds “IT is Everywhere” survey of government IT professionals, 84% say the expectation to support end-users’ personal devices connecting to agency networks is greater than it was 10 years ago. The survey also found that 70% of IT pros estimate that end-users use non-IT sanctioned, cloud-based applications at least occasionally.


Here are more insights from federal IT pros:


  • 63% claim end-users expect work-related applications used remotely to perform at the same level as (or better than) they do in the office
  • 79% say they provide support to remote workers at least occasionally
  • 53% say end-users expect the same time-to-resolution for issues with both cloud-based applications and local applications managed directly by IT
  • 40% say end-users expect the same time-to-resolution for issues with both personal and company-owned devices and technology
  • 68% claim to provide at least occasional support for personal devices


All of this amounts to a tall order for government IT professionals. However, there are some strategies to help ensure that users are happy and productive while agency systems remain secure.


Closely monitor end-user devices


User device tracking can provide a good security blanket for those concerned about unsanctioned devices. IT professionals can create watch lists of acceptable devices and be alerted when rogue devices access their networks. They can then trace those devices back to their users. This tracking can significantly mitigate concerns surrounding bring-your-own-device security.
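
The watch-list idea boils down to comparing what is seen on the network with what is approved. Here is a minimal Python sketch under that assumption; the MAC addresses and switch ports are invented, and a real tracking tool would collect the sightings from switches and wireless controllers for you.

```python
def normalize_mac(mac):
    """Normalize common MAC formats (colons, dashes, dots) to aa:bb:cc:dd:ee:ff."""
    hex_digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(hex_digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(hex_digits[i:i + 2] for i in range(0, 12, 2))


# Hypothetical watch list of acceptable devices.
WATCH_LIST = {normalize_mac(m) for m in ["00:1A:2B:3C:4D:5E", "00-1a-2b-3c-4d-5f"]}


def find_rogue_devices(sightings):
    """Return (mac, port) for every sighting whose MAC is not on the watch list."""
    rogues = []
    for mac, port in sightings:
        normalized = normalize_mac(mac)
        if normalized not in WATCH_LIST:
            rogues.append((normalized, port))
    return rogues


if __name__ == "__main__":
    # Invented sightings: (MAC address as reported, switch port).
    seen = [("001a.2b3c.4d5e", "Gi0/1"), ("AA:BB:CC:DD:EE:FF", "Gi0/7")]
    for mac, port in find_rogue_devices(seen):
        print(f"rogue device {mac} on {port}")
```

Keeping the switch port alongside each sighting is what lets you trace a rogue device back to a location, and from there to a user.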


Gain a complete view of all applications


Having a holistic view of all applications results in a better understanding of how the performance of one application may impact the entire application stack. Administrators will also be able to quickly identify and rectify performance issues and bottlenecks.


Beyond that, administrators must also account for all of the applications that users may be accessing via their personal devices, such as social media apps, messaging tools, and others. Network performance monitoring and network traffic analysis can help IT managers detect the causes behind quality-of-service issues and trace them back to specific applications, devices, and users.


Look out for bandwidth hogs


IT managers should make sure their toolkits include network performance and bandwidth monitoring solutions that allow them to assess traffic patterns and usage. If a slowdown or abnormality occurs, administrators can take a look at the data and trace any potential issues back to individual users or applications. They can then take action to rectify the issue.
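

To make the idea of tracing a slowdown back to users or applications more concrete, here is a minimal sketch that totals bytes per user and application and lists the largest consumers. The flow records, field names, and the top_talkers function are assumptions for illustration only; they do not reflect the output format of any particular monitoring product.

```python
from collections import Counter

def top_talkers(flows, n=3):
    """Sum bytes per (user, application) pair and return the n largest consumers."""
    usage = Counter()
    for flow in flows:
        usage[(flow["user"], flow["app"])] += flow["bytes"]
    return usage.most_common(n)

# Hypothetical flow records, such as a traffic analysis tool might export.
flows = [
    {"user": "asmith", "app": "video-streaming", "bytes": 900_000_000},
    {"user": "bjones", "app": "email", "bytes": 20_000_000},
    {"user": "asmith", "app": "video-streaming", "bytes": 600_000_000},
    {"user": "cdoe", "app": "file-sync", "bytes": 450_000_000},
]

for (user, app), total in top_talkers(flows, n=2):
    print(f"{user} / {app}: {total / 1e6:.0f} MB")
```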


Fair or not, IT pros are officially the go-to people whenever a problem arises. While IT managers may not be able to do everything their end-users expect, they can certainly lay the groundwork for tackling most challenges and creating a secure, reliable, and productive environment.


Find the full article on Government Computer News.

During times of rapid increase in technology, it is better to be a generalist than a specialist. You need to know a little bit about a lot of different things. This is most applicable to any database administrator who is responsible for managing instances in the cloud. These DBAs need to add to their skills the ability to quickly troubleshoot network issues that may be affecting query performance.


In my THWACKcamp 2017 session, "Performance Tuning the Accidental Cloud DBA," fellow Head Geek Leon Adato and I will discuss the skills that are necessary for DBAs to have and practice over the next three to five years.


We are continuing our expanded-session, two-day, two-track format for THWACKcamp 2017. SolarWinds product managers and technical experts will guide attendees through how-to sessions designed to shed light on new challenges, while Head Geeks and IT thought leaders will discuss, debate, and provide context for a range of industry topics.


In our 100% free, virtual, multi-track IT learning event, thousands of attendees will have the opportunity to hear from industry experts and SolarWinds Head Geeks -- such as Leon and me -- and technical staff. Registrants also get to interact with each other to discuss topics related to emerging IT challenges, including automation, hybrid IT, DevOps, and more.


Check out our promo video and register now for THWACKcamp 2017! And don't forget to catch my session!

Best practices, I feel, mean different things to different people. For me, best practices are a few things. They are a list of vendor recommendations for product implementation, they come from my own real-world experiences, and they are informed by what I see my peers doing in the IT community. I have learned to accept that not all best practices come from vendors, and that the best practices list I have compiled is essentially a set of implementation guidelines aimed at ensuring the highest quality of deployment.


So how does this apply to virtualization and the best practices you follow? Let’s chat!


Getting ready to virtualize your servers or workstations?


According to Gartner, enterprise adoption of server virtualization has nearly doubled in the past couple of years. That doesn’t even include workstation virtualization, which is also becoming more relevant to the enterprise as product options mature. So, if your organization isn’t virtualizing an operating system today, it’s highly probable that it will in the future. Understanding how to prepare for this type of business transformation according to the latest best practices/guidelines will be key to your deployment success.


Preparing according to best practices/guidelines


As mentioned, it’s important to have a solid foundation of best practices/guidelines for your virtualization implementation. Diving right in, here are some guidelines that will get you started with a successful virtualization deployment:


  • Infrastructure sizing – Vendors will provide you with great guidance on where to begin sizing your virtual environment, but at the end of the day, all environments are different. Take time to run a proof of concept (POC) and test within your environment, then build out your resource calculations. Be sure to involve the business users to help ensure that you are providing the best possible performance experience before you finalize your architectural design. And when sizing, don’t use averages: users feel the peaks, so sizing to the average will leave you short and performance will suffer.


  • Know your software – A key part of the performance you will get from your virtualized environment will depend on the applications you are running. It’s important to run baseline tests and compile a solid list of the applications in your environment. Then take this a step further to understand the amount of resources used by those applications. Even the smallest software upgrade can impact performance. For example, Microsoft Office 2016 consumes up to 20% more resources than previous versions (2007/2010). That’s a big deal if it wasn’t considered in advance because it could severely impact the user performance experience.


  • Image Management – One of the best things about virtualization is that it can greatly reduce your work effort when it comes to patch management and operating system maintenance. The value of this can only be seen when you deploy as few operating systems as possible. So, when you are deciding on use cases, keep this in mind.


  • Use application whitelisting instead of anti-virus – Anti-virus solutions have proven to impact the performance of virtualization environments. If you must run something at the operating system level, I would strongly suggest using application whitelisting instead. Having an enforced list of approved applications can provide a more secure platform without taking a performance hit.


  • Protect your data – You just spent all this time deploying virtualization, so make sure that your virtualization databases are backed up. Heck, your entire environment should be backed up. Taking this even one step further, be sure to include high availability and even disaster recovery in your design. In my experience, if an environment isn’t ready for the worst, you can end up in a pretty bad situation that could include an entire rebuild. If you cannot afford the business downtime in a worst-case scenario, then button things up to be sure that your plan includes proper data protection.


  • The right infrastructure – Vendors are pretty good about creating guidelines about the type of infrastructure their virtualization platforms will run on, but I strongly suggest that you take a look at both hyper-converged infrastructure, and use of GPUs. If you expect the performance of your virtual systems (especially with virtual workstations) to be the same as what your users experience today, these infrastructure options should at least be part of your conversation. They'll likely end up being a part of your deployment design.


  • Automate everything you can – Automation can be a very powerful way to help ensure that you are using your time efficiently and wisely. When it comes to automation, keep the following in mind: If you are going to write your own automation scripts, remember that building and maintaining them takes time. In some cases, a third-party tool that can help with automation may be worth considering. Third-party automation tools typically come with an upgrade path that you won’t get with home-grown code. And when the person who wrote the code leaves, there goes that support, too. There isn’t one single answer here. Just remember that automation is important, so you should be thinking about this if you aren’t already.
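

To illustrate the infrastructure sizing guideline above about not using averages, the sketch below compares the average of a set of demand samples with a high percentile of the same samples. The CPU figures are invented, and the nearest-rank percentile helper is just one simple way to estimate peak demand; treat it as an assumption rather than a vendor sizing method.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical CPU demand (in GHz) sampled across a working day for one host's VMs.
cpu_demand_ghz = [12, 14, 13, 15, 40, 42, 16, 14, 13, 41]

average = sum(cpu_demand_ghz) / len(cpu_demand_ghz)
p95 = percentile(cpu_demand_ghz, 95)

print(f"average demand: {average:.1f} GHz")
print(f"95th percentile demand: {p95} GHz")
```

In this made-up data set, a host sized to the average would have roughly half the capacity it needs during the busy periods.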


For virtualization success, be sure to fully research your environment up front. This research will help you easily determine if any/all of the above best practices/guidelines will create success for your virtualization deployment. Cheers!

Working in IT is such a breeze. The industry never changes, and the infrastructure we work with doesn’t impact anyone very much. It’s really a pleasure cruise more than anything else.


And if you believed that, you’ve probably never worked in IT.


The reality for many IT professionals is the opposite. Our industry is in a constant state of change, and from the Level One helpdesk person to the senior architect, everything we do impacts others in significant ways. Corporate IT, for many, is a maze of web conferences, buzzwords, and nontechnical leadership making technical decisions.


Not losing my mind in the midst of all this sometimes feels just as much a part of my professional development plan as learning about the next new gadget. Over the last 10 years, I’ve developed some principles to help me negotiate with this dynamic, challenging, fulfilling, but also frustrating and unnerving world.


Own my own education


The first principle is to own my own education. One thing I've had to settle with deep down inside is that I’ll never really “arrive” in technology. There is no one degree or certification that I can earn that covers all technology, let alone all future technology. That means that I’ve had to adopt a personal routine of professional development apart from my employer to even attempt to keep pace with the changes in the industry.


Now that I have three children and work in a senior position, it’s much harder to maintain consistent motivation to keep pushing. Nevertheless, what I’ve found extremely helpful is having a routine that makes a never-ending professional development plan more doable. For me, that means setting aside time every morning before work or at lunch to read, work in a lab, or watch training videos. This way, my professional development doesn’t impede much on family life or compete with the many other obligations I have in life. This requires getting to bed at a reasonable time and getting my lazy rear end out of bed early, but when I get in the routine, it becomes easier to maintain.


I don’t rely on my colleagues, my employer, or my friends to hand me a professional development plan or motivate me to carry it out. This is a deeply personal thing, but I’ve found that adopting this one philosophy has changed my entire career and provided a sense of stability.


Choosing what to focus on is a similar matter. There’s a balance between what you need to learn for your day job and the need to continually strengthen foundational knowledge. For example, my day job may require that I get very familiar with configuring a specific vendor’s platform. This is perfectly fine since it’s directly related to my ability to do my job well, but I’ve learned to make sure to do this without sacrificing my personal goal to develop foundational networking knowledge.




This leads into my second principle: community. Engaging in the networking community on Twitter, through blog posts, and in Slack channels gives me an outlet to vent, bounce ideas around, and find serious inspiration from those who’ve gone before me. It also helps me figure out what I should be working on in my professional development plan.


I’d be remiss if I didn’t mention that too much engagement in social media can be detrimental because of how much time it can consume, but when done properly and within limits, reading my favorite networking blogs, interacting with nerds on Twitter, and doing my own writing has been instrumental in refining my training plan.


Taking a break


A third principle is taking a break. A friend of mine calls it taking a sanity day.


I can get burned out quickly. Normally, it’s not entirely because of my day job, either. Working in IT means I’m stressed about an upcoming cutover, getting some notes together for this afternoon’s web conference, ignoring at least a few tickets in the queue, worried I’ll fail the next cert exam and waste $400, and concerned that I’m not progressing in my professional development the way I'd hoped.


For years I just kept pushing in all these areas until I’d snap and storm out of the office or find myself losing my temper with my family. I’ve learned that taking a few days off from all of it has helped me tremendously.


For me, that’s meant going to work but being okay with delegating and asking for some help on projects. It’s also meant backing off social media a bit and either pausing the professional development routine for a few days or working on something unrelated.


Recently I’ve mixed learning Python into my routine, and I’ve found that a few days of that is an amazing mental break when I can’t bear to dial into another conference bridge or look at another CLI. And sometimes I need to shut it all down and just do some work in the yard. 


This isn’t giving up. This isn’t packing it in and waiting to retire. This is taking some time to decrease the noise, to think, to re-evaluate, and to recuperate.


I admit that I sometimes feel like I need permission to do this. Permission from who? I’m not sure, but it’s sometimes difficult for me to detach. After I do, though, I can go back to the world of five chat windows, back-to-back meetings, and all of the corporate IT nonsense with a new energy and a better attitude. 


These are principles I’ve developed for myself based on my own experiences, so I’d love to learn how others work in IT without losing their minds, as well. IT is constantly changing, and from the entry level folks to the senior staff, everything we do impacts others in significant ways. How do you navigate the maze of web conferences, buzzwords, and late night cutovers?    

My name is Josh Kittle and I’m currently a senior network engineer working for a large technology reseller. I primarily work with enterprise collaboration technologies, but my roots are in everything IT. For nearly a decade, I worked as a network architect in the IT department of one of the largest managed healthcare organizations in the United States. Therefore, healthcare security policy, the topic I’m going to introduce to you here today, is something I have quite a bit of experience with. More specifically, I’m going to talk about healthcare security concerns in IT, and how IT security is impacted by the requirements of healthcare, and conversely, how health care policy is impacted by IT initiatives. My ultimate goal is to turn this into a two-way dialogue. I want to hear your thoughts and feedback on this topic (especially if you work in healthcare IT) and see if together we can take this discussion further!


Over the next five posts, I’m going to talk about a number of different considerations for healthcare IT, both from the perspective of the IT organization and the business. In a way, the IT organization is serving an entirely different customer (the business) than the business is serving (in many cases, this is the consumer, but in other cases, it could be the providers). Much of the perspective I’m going to bring to this topic will be specific to the healthcare system within the United States, but I’d love to have a conversation in the forum below about how these topics play out in other geographical areas, for those of you living in other parts of the world. Let’s get started!


There are a number of things to consider as we prepare to discuss healthcare policy and IT, or IT policy and health care for that matter since we’re going to dip our toes into both perspectives. Let's start by talking about IT policy and health care. A lot of the same considerations that are important to us in traditional enterprise IT apply in healthcare IT, particularly around the topic of information security. When you really think about it, information security is as much a business policy as it is something we deal with in IT,  and information security is a great place to start this discussion. Let me take a second to define what I mean by information security. Bottom line, information security is the concept of making sure that information is available to the people who need it while preventing access to those who shouldn’t have it. This means protecting both data-at-rest as well as data-in-motion. Topics such as disk encryption, virtual private networks, as well as preventing data from being exposed using offline methods all play a key role. We will talk about various aspects of many of these in future posts!


The availability of healthcare-related information as it pertains to the consumer is a much larger subject than it has ever been. We have regulations such as HIPAA that govern how and where we are able to share and make data available. We have electronic medical records (EMR) systems that allow providers to share patient information. We have consumer-facing, internet-enabled technologies that allow patients to interact with caregivers from the comfort of their mobile device (or really, from anywhere). It’s an exciting time to be involved in healthcare IT, and there is no shortage of problems to solve. In my next couple of posts, I’m going to talk about protecting both data-at-rest and data-in-motion, so I want you to think about how these problems affect you if you’re in a healthcare environment (and feel free to speculate and bounce ideas off the forum walls even if you’re not). I would love to hear the challenges you face in these areas and how you’re going about solving them!


As mentioned above, I hope to turn this series into a dialogue of sorts. Share your thoughts and ideas below -- especially if you work in healthcare IT -- so we can take this discussion further.

In Austin this week for some THWACKcamp filming. There's a lot of reasons to enjoy Austin, even in August, but being able to sit and talk with my fellow Head Geeks is by far the best reason.


Here's a bunch of links from the intertubz you might find interesting. Enjoy!


App sizes are out of control

I've been frustrated about this situation ever since I accidentally started an update to an app while in another country and wasn't connected to Wi-Fi.


UK Writes GDPR into Law with New Data Protection Bill

Here's hoping that this is the start of people understanding that data is the most valuable asset they own.


Half of US Consumers Willing to Trade Data for Discounts

Then again, maybe not.


Will Blockchain End Poverty?



How a fish tank helped hack a casino

I know that IoT is a security nightmare and all, but hackers may want to think twice about the people they steal from.


“E-mail prankster” phishes White House officials; hilarity ensues

This just shows that the folks in the White House are just as gullible as everyone else.


What’s in the path of the 2017 eclipse?

Interactive map showing you the path for the upcoming eclipse.


Event Season is starting and I may need to find a new place for all my conference badges.

Whatever business one might choose to examine, the network is the glue that holds everything together. Whether the network is the product (e.g. for a service provider) or simply an enabler for business operations, it is extremely important for the network to be both fast and reliable.


IP telephony and video conferencing have become commonplace, taking communications that previously required dedicated hardware and phone lines and moving them to the network. I have also seen many companies mothball their dedicated Storage Area Networks (SANs) and move them closer to Network Attached Storage, using iSCSI and NFS for data mounts. I also see applications utilizing cloud-based storage provided by services like Amazon's S3, which also depend on the network to move the data around. Put simply, the network is critical to modern companies.


Despite the importance of the network, many companies seem to have only a very basic understanding of their own network performance even though the ability to move data quickly around the network is key to success. It's important to set up monitoring to identify when performance is deviating from the norm, but in this post, I will share a few other thoughts to consider when looking at why network performance might not be what people expect it to be.



MTU (Maximum Transmission Unit) determines the largest frame of data that can be sent over an ethernet interface. It's important because every frame that's put on the wire contains overhead; that is, data that is not the actual payload. A typical ethernet interface might default to a physical MTU of around 1518 bytes, so let's look at how that might compare to a system that offers an MTU of 9000 bytes instead.


What's in a frame?

A typical TCP datagram has overhead like this:


  • Ethernet header (14 bytes)
  • IPv4 header (20 bytes)
  • TCP header (usually 20 bytes, up to 60 if TCP options are in play)
  • Ethernet Frame Check Sequence (4 bytes)


That's a total of 58 bytes. The rest of the frame can be data itself, so that leaves 1460 bytes for data. The overhead for each frame represents just under 4% of the transmitted data.


The same frame with a 9000-byte MTU can carry 8942 bytes of data with just 0.65% overhead. Less overhead means that the data is sent more efficiently, and transfer speeds can be higher. Enabling jumbo frames (frames larger than 1500 bytes) and raising the MTU to 9000 if the hardware supports it can make a huge difference, especially for systems moving a lot of data around the network, such as Network Attached Storage.
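

The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: it follows the post in treating both 1518 and 9000 as the full frame size and in expressing overhead as a percentage of the payload, and the function name payload_and_overhead is made up for the example.

```python
# Header sizes from the list above, in bytes.
ETHERNET_HEADER = 14
IPV4_HEADER = 20
TCP_HEADER = 20      # assumes no TCP options are in play
ETHERNET_FCS = 4

OVERHEAD = ETHERNET_HEADER + IPV4_HEADER + TCP_HEADER + ETHERNET_FCS  # 58 bytes

def payload_and_overhead(frame_size):
    """Return (payload bytes, overhead as a percentage of the payload) for one frame."""
    payload = frame_size - OVERHEAD
    return payload, 100 * OVERHEAD / payload

for size in (1518, 9000):
    payload, pct = payload_and_overhead(size)
    print(f"{size}-byte frame: {payload} bytes of data, {pct:.2f}% overhead")
```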


What's the catch?

Not all equipment supports a high MTU because it's hardware dependent, although most modern switches I've seen can handle 9000-byte frames reasonably well. Within a data center environment, large MTU transfers can often be achieved successfully, with positive benefits to applications as a result.


However, Wide Area Networks (WANs) and the internet are almost always limited to 1500 bytes, and that's a problem because those 9000-byte frames won't fit into 1500 bytes. In theory, a router can break large packets up into appropriately sized smaller chunks (fragments) and send them over links with reduced MTU. In practice, many firewalls are configured to block fragments, and many routers refuse to fragment because of the need for the receiver to hold on to all the fragments until they arrive, reassemble the packet, then route it toward its destination.


The solution to this is PMTUD (Path MTU Discovery). When a packet doesn't fit on a link without being fragmented, the router can send a message back to the sender saying, "It doesn't fit, the MTU is..." Great! Unfortunately, many firewalls have not been configured to allow those ICMP messages back in, for a variety of technical or security reasons, with the ultimate result of breaking PMTUD.


One way around this is to use one ethernet interface on a server for traffic internal to a data center (like storage) using a large MTU, and another interface with a smaller MTU for all other traffic. Messy, but it can help if PMTUD is broken.


Other encapsulations

The ethernet frame encapsulations don't end there. Don't forget there might be an additional 4 bytes required for VLAN tagging over trunk links, VXLAN encapsulation (50 bytes), and maybe even GRE or MPLS encapsulations (4 bytes for the basic GRE header, plus the new outer IP header it needs, and 4 bytes for each MPLS label). I've found that despite the slight increase in the ratio of overhead to data, 1460 bytes is a reasonably safe MTU for most environments, but it's very dependent on exactly how the network is set up.



I had a complaint one time that while file transfers between servers within the New York data center were nice and fast, when the user transferred the same file to the Florida data center (basically going from near the top to the bottom of the Eastern coast of the United States) transfer rates were very disappointing, and they said the network must be broken. Of course, maybe it was, but the bigger problem without a doubt was the time it took for an IP packet to get from New York to Florida, versus the time it takes for an IP packet to move within a data center.


AT&T publishes a handy chart showing their current U.S. network latencies between pairs of cities. The New York to Orlando pair currently shows a 33ms latency, which is about what we were seeing on our internal network as well. Within a data center, I can move data in a millisecond or less, which is 33 times faster. What many people forget is that when using TCP, it doesn't matter how much bandwidth is available between two sites. A combination of end-to-end latency and congestion window (CWND) size will determine the maximum throughput for a single TCP session.
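

A quick way to see why latency matters so much is the rule of thumb that a single TCP session can send at most one window of data per round trip. The sketch below applies that rule; the 65,535-byte window is an assumption (the classic limit when window scaling is not in use), and real stacks may negotiate larger windows.

```python
def max_tcp_throughput_mbps(window_bytes, rtt_seconds):
    """Upper bound for one TCP session: at most one window of data per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6

WINDOW = 65_535  # assumed window in bytes (no window scaling)

paths = (
    ("within a data center (1 ms)", 0.001),
    ("New York to Orlando (33 ms)", 0.033),
)
for label, rtt in paths:
    print(f"{label}: {max_tcp_throughput_mbps(WINDOW, rtt):.1f} Mbps")
```

With the same window, the longer path is limited to roughly one thirty-third of the throughput, regardless of how fast the link is.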


TCP session example

If it's necessary to transfer 100,000 files from NY to Orlando, which is faster:


  1. Transfer the files one by one?
  2. Transfer ten files in parallel?


It might seem that the outcome would be the same because a server with a 1G connection can only transfer 1Gbps, so whether you have one stream at 1Gbps or ten streams at 100Mbps, it's the same result. But actually, it isn't, because the latency between the two sites will effectively limit the maximum bandwidth of each file transfer's TCP session. Therefore, to maximize throughput, it's necessary to utilize multiple parallel TCP streams (an approach taken very successfully for FTP/SFTP transfers by the open source FileZilla tool). It's also the way that tools like those from Aspera can move data faster than a regular Windows file copy.
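

The effect of parallel streams can be sketched with a simple model: each TCP session has a latency-limited ceiling, and sessions add up until the link itself becomes the bottleneck. The 16 Mbps per-stream ceiling below is a hypothetical figure chosen for illustration, and the model ignores congestion and fairness between streams.

```python
def aggregate_throughput_mbps(streams, per_stream_mbps, link_mbps):
    """Total throughput: streams add up until the link itself becomes the limit."""
    return min(streams * per_stream_mbps, link_mbps)

PER_STREAM_MBPS = 16   # assumed latency-limited ceiling for one TCP session
LINK_MBPS = 1000       # the 1G server connection from the example

for n in (1, 10, 100):
    total = aggregate_throughput_mbps(n, PER_STREAM_MBPS, LINK_MBPS)
    print(f"{n:>3} parallel streams: {total} Mbps")
```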


The same logic also applies to web browsers, which typically will open five or six parallel connections to a single site if there are sufficient resource requests to justify it. Of course, each TCP session requires a certain amount of overhead for connection setup: usually a three-way handshake, and if the session is encrypted, there may be a certificate or similar exchange to deal with as well. Another optimization that is available here is pipelining.



Pipelining uses a single TCP connection to issue multiple requests back to back. In HTTP, this depends on persistent connections, requested with the HTTP header Connection: keep-alive, which is the default behavior in HTTP/1.1. The header asks the destination server to keep the TCP connection open after completing the HTTP request in case the client has another request to make. Being able to do this allows the transfer of multiple resources with only a single TCP connection overhead (or as many TCP connection overheads as there are parallel connections). Given that a typical web page may make many tens of calls to the same site (50+ is not unusual), this efficiency stacks up quite quickly. There's another benefit too, and that's the avoidance of TCP slow start.
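

As a back-of-the-envelope illustration of the connection reuse described above, the sketch below counts how many TCP handshakes a page load needs with and without keep-alive. The function name and the figure of six parallel connections are assumptions; the 50 resources come from the "50+ is not unusual" remark.

```python
def handshakes_needed(resources, parallel_connections, keep_alive):
    """Count TCP handshakes needed to fetch `resources` items from one site."""
    if keep_alive:
        # Each parallel connection is opened once and then reused.
        return min(resources, parallel_connections)
    # Without reuse, every resource pays for its own connection setup.
    return resources

RESOURCES = 50   # calls to the same site
PARALLEL = 6     # assumed number of parallel browser connections

without_reuse = handshakes_needed(RESOURCES, PARALLEL, keep_alive=False)
with_reuse = handshakes_needed(RESOURCES, PARALLEL, keep_alive=True)
print(f"handshakes without keep-alive: {without_reuse}")
print(f"handshakes with keep-alive:    {with_reuse}")
```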


TCP slow start

TCP is a reliable protocol. If a datagram (packet) is lost in transit, TCP can detect the loss and resend the data. To protect itself against unknown network conditions, however, TCP starts off each connection being fairly cautious about how much data it can send to the remote destination before getting confirmation back that each sent datagram was received successfully. With each successful loss-free confirmation, the sender exponentially increases the amount of data it is willing to send without a response, increasing the value of its congestion window (CWND). Packet loss causes CWND to shrink again, as does an idle connection during which TCP can't tell if network conditions changed, so to be safe it starts from a smaller number again. The problem is, as latency between endpoints increases, it takes progressively longer for TCP to get to its maximum CWND value, and thus longer to achieve maximum throughput. Pipelining can allow a connection to reach maximum CWND and keep it there while pushing multiple requests, which is another speed benefit.
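

The ramp-up cost described above can be approximated with an idealized model in which CWND doubles every round trip until it reaches the size needed to fill the path. The initial window of 10 segments and the target of 1000 segments are assumptions for illustration, and the model ignores loss and the switch to congestion avoidance.

```python
def round_trips_to_reach(target_segments, initial_cwnd=10):
    """Count round trips of doubling needed for CWND to reach target_segments."""
    cwnd, round_trips = initial_cwnd, 0
    while cwnd < target_segments:
        cwnd *= 2          # idealized slow start: CWND doubles every round trip
        round_trips += 1
    return round_trips

TARGET = 1000  # hypothetical CWND (in segments) needed to fill the path

for label, rtt_ms in (("1 ms RTT", 1), ("33 ms RTT", 33)):
    trips = round_trips_to_reach(TARGET)
    print(f"{label}: {trips} round trips, about {trips * rtt_ms} ms to ramp up")
```

The number of round trips is the same on both paths; it is the cost of each round trip that makes the long-distance connection slow to reach full speed.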



I won't dwell on compression other than to say that it should be obvious that transferring compressed data is faster than transferring uncompressed data. For proof, ask any web browser or any streaming video provider.


Application vs network performance

Much of the TCP tuning and optimization that can take place is a server OS/application layer concern, but I mention it because even on the world's fastest network, an inefficiently designed application will still run inefficiently. If there is a load balancer front-ending an application, it may be able to do a lot to improve performance for a client by enabling compression or Connection: keep-alive, for example, even when an application does not.


Network monitoring

In the network itself, for the most part, things just work. And truthfully, there's not much one can do to make it work faster. However, the network devices should be monitored for packet loss (output drops, queue drops, and similar). One of the bigger causes of this is microbursting.



Modern servers are often connected using 10Gbps ethernet, which is wonderful except they are often over-eager to send out frames. Data is prepared and buffered by the server, then BLUURRRGGGGHH it is spewed at the maximum rate into the network. Even if this burst of traffic is relatively short, at 10Gbps it can fill a port's frame buffer and overflow it before you know what's happened, and suddenly the latter datagrams in the communication are being dropped because there's no more space to receive them. Anytime the switch can't move the frame from input to output port at least as fast as it's coming in on a given port, the input buffer comes into play and puts it at risk of getting overfilled. These are called microbursts because a lot of data is sent over a very short period. Short enough, in fact, for it to be highly unlikely that it will ever be identifiable in the interface throughput statistics that we all like to monitor. Remember, an interface running at 100% for half the time and 0% for the rest will likely show up as running at 50% capacity in a monitoring tool. What's the solution? MOAR BUFFERZ?! No.
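

The way averaging hides microbursts can be shown with a tiny example. The per-millisecond samples below are invented: the link runs flat out for half of a one-second polling interval and then sits idle, which is the 100%/0% case described above.

```python
def average_utilization(per_ms_utilization):
    """Average utilization over a polling interval, as a monitoring tool might report it."""
    return sum(per_ms_utilization) / len(per_ms_utilization)

# Hypothetical one-second interval sampled per millisecond:
# the link runs at 100% for 500 ms, then sits idle for 500 ms.
samples = [100] * 500 + [0] * 500

print(f"reported average: {average_utilization(samples):.0f}%")
print(f"actual peak:      {max(samples)}%")
```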


Buffer bloat

I don't have space to go into detail here, so let me point you to a site that explains buffer bloat, and why it's a problem. The short story is that adding more buffers in the path can actually make things worse because it actively works against the algorithms within TCP that are designed to handle packet loss and congestion issues.


Monitor capacity

It sounds obvious, but a link that is fully utilized will lead to slower network speeds, whether through higher delays via queuing, or packet loss leading to connection slowdowns. We all monitor interface utilization, right? I thought so.


The perfect network

There is no perfect network, let's be honest. However, understanding not only how the network itself (especially latency) can impact throughput, but also how the network is used by the protocols running over it, might help with the next complaint that comes along. Optimizing and maintaining network performance is rarely a simple task, but given the network's key role in the business as a whole, the more we understand, the more we can deliver.


While not a comprehensive guide to all aspects of performance, I hope that this post might have raised something new, confirmed what you already know, or just provided something interesting to look into a bit more. I'd love to hear your own tales of bad network performance reports, application design stupidity, crazy user/application owner expectations (usually involving packets needing to exceed the speed of light) and hear how you investigated and hopefully fixed them!

By Joe Kim, SolarWinds EVP, Engineering and Global CTO


With hybrid IT on the rise, I wanted to share a blog written earlier this year by my SolarWinds colleague Bob Andersen.


Hybrid IT – migrating some infrastructure to the cloud while continuing to maintain a significant number of applications and services onsite – is a shift in the technology landscape that is currently spreading across the federal government. Are federal IT professionals ready for the shift to this new type of environment?


Government IT pros must arm themselves with a new set of skills, products, and resources to succeed in the hybrid IT era. To help with this transition, we have put together a list of four tips that will help folks not only survive, but thrive within this new environment.


#1: Work across department silos.


Working across department silos will help speed up technology updates and changes, software deployments, and time-to-resolution for problems. What is the best way to establish these cross-departmental relationships? A good place to start is by implementing the principles of a DevOps approach, where the development and operations teams work together to achieve greater agility and organizational efficiency. DevOps, for example, sets the stage for quick updates and changes to infrastructure, which makes IT services – on-premises or within the cloud – more agile and scalable.


#2: Optimize visibility with a single version of the truth.


In a hybrid environment, the federal IT pro must manage both on-premises and cloud resources. This can present a challenge. The solution? Invest in a management and monitoring toolset that presents a single version of the truth across platforms. There will be metrics, alerts, and other collected data coming in from a broad range of applications, regardless of their location. Having a single view of all this information will enable a more efficient approach to remediation, troubleshooting, and optimization.


#3: Apply monitoring. Period.


Monitoring has always been the foundation of a successful IT department. In a hybrid IT environment, monitoring is absolutely critical. A hybrid environment is highly complex. Agencies must establish monitoring as a core IT function; only then will they realize the benefit of a more proactive IT management strategy, while also streamlining infrastructure performance, cost, and security.


#4: Improve cloud-service knowledge and skills.


As more IT services become available through the cloud – and, in turn, through cloud providers – it becomes increasingly important for the federal IT pro to fully understand available cloud services. It’s also important to understand how traditional and cloud environments intersect. Areas worth knowing include service-oriented architectures, automation, vendor management, application migration, distributed architectures, application programming interfaces, hybrid IT monitoring and management tools, and metrics. Knowledge across boundaries will be the key to success in a hybrid IT environment.


Working through a technology shift is never easy, especially for the folks implementing, managing, and maintaining all the changes. That said, by following the above tips, agencies will be able to realize the benefits of a hybrid cloud environment, while the IT team thrives within the new environment.


Find the full article on Government Computer News.
