
Data, data, data. You want all of the data, right? Of course you do. Collecting telemetry and logging data is easy. We all do it and we all use it from time to time. Interrupt-driven networking is a way of life and is more common than any other kind (i.e., automation and orchestration-based models) because that is how the vast majority of us learned. “Get it working, move to the next fire.” What if there were more ways to truly understand what is happening within our sphere of control? Well, I believe there are -- and the best part is that the barrier to entry is pretty low, and likely right in your periphery.

 

 

Once you have all of that data, the next step is to actually do something with it. All too often, we as engineers poll, collect, visualize, and store a wealth of data, yet only rarely leverage even a fraction of the potential it can provide. In previous posts, we touched on the usefulness of correlating collected and real-time data. This will take that a step further. It should be noted that this is not really intended to be a tutorial, but instead more of a blueprint or, more accurately, a high-level recipe with a rotating and changing ingredient list. We all like different flavors and levels of spiciness, right?

 

 

As noted in the previous post on related attributes, there is a stealthy enemy in our network -- a gremlin, if you will. That gremlin's name is “grey failure,” and it is very hard to detect, and even more difficult to plan around. Knowing this, and realizing that there is a large amount of data with related attributes, similar causes, and noticeable effects, we can start to build a framework to aid in this endeavor. We talked about the related attributes of SNMP and NetFlow. Now, let us expand that further into their familial brethren: interface errors and syslog data.

 

 

While syslog data may be a wide, wide net, there are some interesting bits and pieces we can glean from even the most rudimentary logs. Interface error detection will manifest in many ways depending on the platform in use. There may be logging mechanisms for this. It may come as polled data. It could possibly reveal itself as an SNMP trap. The mechanism isn’t really important. However, having the knowledge to understand that a connection is causing an issue with an application is critical. In fact, the application may be a key player in discovering an interface issue. Let’s say that an application is working one day and the next there are intermittent connectivity issues. If the lower protocol is TCP, it will be hard to run down without a packet capture, because TCP retransmissions hide the loss. If, however, this application generates connectivity error logs and sends them to syslog, then that can be an indicator of an interface issue. From there it can be ascertained that there is a need to look at a path, and the first step of investigating a path is looking at interface errors.

Here is the keystone, though. Simply looking at interface counters on a router can uncover incrementing errors, but looking at long-term trends will make such an issue very obvious. In the case of UDP, this can be very hard to find since UDP is functionally connectionless. This is where viewing the network as an ecosystem (as described in a previous blog post) can be very useful. Application, system, network, all working together in symbiosis. Flow data can help uncover these UDP issues, and with the help of syslog from an appropriately built application, the job simply becomes a task of correlation.
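To make that correlation concrete, here is a minimal sketch of the idea, assuming a syslog export and polled interface error counters in the illustrative formats noted in the comments (your collector's formats will differ): bucket the application's connectivity errors by time window and flag the windows where an interface error counter also moved.

```python
import csv
import re
from collections import Counter
from datetime import datetime

# Assumed formats (illustrative only):
#   syslog.txt lines like: "2018-08-07T14:03:11 app01 myapp: connection timed out to 10.1.1.5"
#   if_errors.csv rows like: timestamp,interface,in_errors,out_errors  (polled every 5 minutes)
SYSLOG_PATTERN = re.compile(r"^(?P<ts>\S+)\s+\S+\s+myapp: .*connection (timed out|reset|refused)")
BUCKET_MINUTES = 5

def bucket(ts: datetime) -> datetime:
    """Round a timestamp down to the nearest polling bucket."""
    return ts.replace(minute=ts.minute - ts.minute % BUCKET_MINUTES, second=0, microsecond=0)

def app_error_buckets(path: str) -> Counter:
    """Count application connectivity errors per time bucket."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            m = SYSLOG_PATTERN.match(line)
            if m:
                counts[bucket(datetime.fromisoformat(m.group("ts")))] += 1
    return counts

def interface_error_deltas(path: str) -> dict:
    """Turn cumulative interface error counters into per-bucket deltas."""
    deltas, last = {}, {}
    with open(path) as f:
        for row in csv.DictReader(f):
            ts = bucket(datetime.fromisoformat(row["timestamp"]))
            iface = row["interface"]
            errors = int(row["in_errors"]) + int(row["out_errors"])
            if iface in last:
                deltas.setdefault(ts, {})[iface] = errors - last[iface]
            last[iface] = errors
    return deltas

if __name__ == "__main__":
    app_errors = app_error_buckets("syslog.txt")
    if_errors = interface_error_deltas("if_errors.csv")
    # Flag buckets where the application logged trouble AND an interface error counter moved.
    for ts, count in sorted(app_errors.items()):
        for iface, delta in if_errors.get(ts, {}).items():
            if delta > 0:
                print(f"{ts}  {count} app errors coincide with {delta} new errors on {iface}")
```

In a real environment a monitoring platform does this correlation for you; the value of sketching it out is seeing how little is actually required to relate the two data sets.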

 

 

Eventually these tasks will become more machine-driven, and the operator and engineer will only need to feed data sources into a larger, smarter, more self-sustaining (and eventually self-learning) operational model. Until then, understanding the important components and relations between them will only make for a quieter weekend, a more restful night, and a shorter troubleshooting period in the case of an issue. 



Systems monitoring has become a very important piece of the infrastructure puzzle. There might not be a more important part of your overall design than having a good systems monitoring practice in place. There are good options for cloud-hosted infrastructures, on-premises deployments, and hybrid designs. Whatever situation you are in, it is important that you choose a systems monitoring tool that works best for your organization and delivers the metrics that are crucial to its success. When the decision has been made and the systems monitoring tool(s) have been implemented, it’s time to look at the best practices involved in ensuring the tool delivers all it is expected to for the most return on investment.

 

 

The term “best practice” has been known to be overused by slick salespeople the world over; however, there is a place for it in the discussion of monitoring tools. The last thing anyone wants to do is purchase a monitoring tool and install it, only for it to slowly die and become shelfware. So, let’s look at what I consider to be the top 5 best practices for systems monitoring.

 

1. Prediction and Prevention              

We’ve all heard the adage that “an ounce of prevention is worth a pound of cure.”  Is your systems monitoring tool delivering metrics that help point out where things might go wrong in the near future? Are you over-taxing your CPU? Running out of memory? Are there networking bottlenecks that need to be addressed? A good monitoring tool will include a prediction engine that will alert you to issues before they become catastrophic. 
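Under the hood, a prediction engine is doing something like the sketch below: fit a trend to recent utilization samples and estimate when a threshold will be crossed. This is a minimal illustration, assuming you can export a history of (timestamp, percent-used) samples from your tool; it is not how any particular product implements its forecasting.

```python
from datetime import datetime, timedelta

def hours_until_threshold(samples, threshold=90.0):
    """
    samples: list of (datetime, percent_used) points, oldest first.
    Returns the estimated hours until percent_used crosses `threshold`,
    or None if usage is flat or shrinking.
    """
    t0 = samples[0][0]
    xs = [(ts - t0).total_seconds() / 3600.0 for ts, _ in samples]   # hours since first sample
    ys = [pct for _, pct in samples]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Ordinary least-squares slope (percent per hour) and intercept.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None
    hours_at_threshold = (threshold - intercept) / slope
    return hours_at_threshold - xs[-1]

if __name__ == "__main__":
    # Fabricated history: usage climbing roughly 0.8% per hour over the last day.
    now = datetime.now()
    history = [(now - timedelta(hours=24 - h), 60 + h * 0.8) for h in range(24)]
    remaining = hours_until_threshold(history, threshold=90.0)
    if remaining is not None and remaining < 72:
        print(f"Warning: projected to hit 90% in about {remaining:.0f} hours")
```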

 

2. Customize and Streamline Monitoring        

For an administrator tasked with implementing systems monitoring, the prospect can bring lots of anxiety and visions of endless, seemingly useless emails filling up your inbox. It doesn’t have to be that way. The admin needs to triage what will trigger an email alert and customize the reporting accordingly. Along with email alerts, most tools allow you to create custom dashboards to monitor what is most important to your organization. Without a level of customization involved, systems monitoring can quickly become an annoying, confusing mess.

 

3. Include Automation

Automation can be a very powerful tool, and can save the administrator a ton of time. In short, automation makes life better, so long as it’s implemented correctly. Many tools today have an automation feature where you can either create your own automation scripts or choose from a list of common, out-of-the-box automation scripts. This best practice goes along with the first one in this list, prediction and prevention. When the tool notices that a certain VM is running low on memory, it can reach back to vCenter and add more before it’s too late, assuming it has been configured to do so. This makes life much easier, but proceed with caution, as you don’t want your monitoring tool doing too much. It’s easy to be overly aggressive with automation.
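As a purely hypothetical sketch of automation with guardrails, the helpers below (get_memory_usage, add_memory, notify) stand in for whatever API your monitoring tool or vCenter integration actually exposes. The point is the caps and the audit trail, not the specific calls.

```python
# Hypothetical remediation policy: grow a VM's memory automatically, but only
# within hard limits, and always leave an audit trail for the operator.
MAX_STEP_MB = 1024        # never add more than 1 GB in one pass
MAX_TOTAL_MB = 16384      # never grow a VM beyond 16 GB without a human
USAGE_TRIGGER = 0.90      # act only when memory usage exceeds 90%

def remediate_memory(vm, get_memory_usage, add_memory, notify):
    """Apply a bounded memory increase to `vm`, or escalate to a human."""
    usage, current_mb = get_memory_usage(vm)          # e.g., (0.94, 8192)
    if usage < USAGE_TRIGGER:
        return "no action"
    if current_mb + MAX_STEP_MB > MAX_TOTAL_MB:
        notify(vm, f"memory at {usage:.0%} but cap reached; manual review needed")
        return "escalated"
    add_memory(vm, MAX_STEP_MB)                       # the tool's configured action
    notify(vm, f"added {MAX_STEP_MB} MB (was {current_mb} MB at {usage:.0%} used)")
    return "remediated"

if __name__ == "__main__":
    # Stub implementations so the sketch runs stand-alone.
    log = []
    result = remediate_memory(
        "app-vm-01",
        get_memory_usage=lambda vm: (0.94, 8192),
        add_memory=lambda vm, mb: log.append((vm, mb)),
        notify=lambda vm, msg: print(f"[{vm}] {msg}"),
    )
    print(result, log)
```

Capping both the step size and the total is one way to keep an over-eager rule from quietly growing a VM forever, which is exactly the "doing too much" trap described above.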

 

4. Documentation Saves the Day

Document, document, document everything you do with your systems monitoring tool. The last thing you want is to have an alert come up and the night shift guy on your operations team not know what to do with it. “Ah, I’ll just acknowledge the alarm and reset it to green, I don’t even know what IOPS are anyways.” Yikes! If you have a “run book” or manual that outlines everything about the tool, where to look for alerts, who to call, how to log in, and so on, then you can relax and know that if something goes wrong, you can rely on the guy with the manual to know what to do. Ensure that you also track changes to the document because you want to monitor what changes are being made and check that they are legit, approved changes.

 

5. Choose Wisely

Last, but definitely not least, pick the right tool for the job. If you migrated your entire workload to the cloud, don’t mess around with an on-premises solution for systems monitoring. Use the cloud provider’s proprietary tool and run with it. That being said, get educated on their tool and make sure you can customize it to your liking. Don’t pick a tool based on price alone. Shop around and focus on the options and customization the tool offers. Always choose a tool that achieves your organization's goals in systems monitoring. The latest isn’t always the greatest.

 

Putting monitoring best practices in place is a smart way to help ensure your tool of choice performs its best and gives you the metrics you need to feel good about what’s going on in your data center.

Back from the beach and back in the saddle bringing you this week’s Actuator. I’m also gearing up for VMworld, which is coming up fast! If you are attending, let me know, as I’ll be there to work the booth and deliver a session titled "Performance Deep Dive for Demanding Virtual Database Servers."

 

As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!

 

The Defense Department has produced the first tools for catching deepfakes

Deliberate misinformation has been around since humans learned to communicate. For more than a thousand years, we have tried to discover what information is real, and what is fake. This “arms race” isn’t new, it’s as old as human civilization.

 

#1 Microsoft Widens Lead Over #2 Amazon In Cloud Revenue

I love that Azure can make a case for being a leader over AWS in some areas of cloud computing. What I love even more is that after years of telling you that there are only two cloud providers (AWS and Azure), everyone else is finally starting to see that truth as well.

 

Apple hangs onto its historic $1 trillion market cap

The reason this is so amazing is the fact that Apple is a company with essentially one product, the iPhone. But with this leverage, they could buy Alphabet, and then create a third player in the Cloud Wars.

 

While Music Streaming Sales Surge, Singers Still Get Paid a Song

Something tells me that artists have been getting less than their fair share for a long, long time.

 

Blue Apron shares sink as customers ditch its meal-kits

We tried Blue Apron a while back and were underwhelmed by the service. Now that grocery stores have upped their game, companies like Blue Apron need to bring something else to the table otherwise they won’t be invited for dinner much longer.

 

A collection of dataviz caveats

Brilliant summary of things not to do with data visualizations. If you read this, and still use a pie chart as a default, shame on you.

 

Blockchain, Once Seen as a Corporate Cure-All, Suffers Slowdown

Blockchain is at best a very slow distributed database, a glorified linked list. But here’s what Blockchain is not: a security protocol. While Blockchain will provide the ability to verify the authenticity of a transaction, the fact is you could be conducting a transaction with Satan. As companies start to figure this out, the hype pushing the Blockchain Train seems to be losing steam.

 

I could get used to life at the beach:

In the previous blog in this series, we reviewed several types of attacks and threats, and some ways they are perpetrated. In this blog, we will become familiar with several methodologies that can be part of an enterprise protection plan.

 

 

Let’s first clarify “protection.” There is no silver bullet for preventing all attacks. Threats evolve with the ever-changing world that is IT. There is a cliché in the industry today: “It’s not a matter of if you are compromised, it’s a matter of when.” Even though it may seem like a daunting task to protect and detect in a dynamic threat landscape, it is still considered fundamental to define and deploy foundational security best practices and controls that become the first line of defense for an organization. Many of these methods require a security policy that forces security professionals to discover, audit, and understand their environment. A hacker will spend time surveying a potential target; if you can’t stop the attack, you must at least be able to detect and contain it, and this isn’t possible if the placement, role, and configuration of the network and its assets are not well-defined. Also, in the event of an attack, even a failure of your protection methods can yield information that is useful during incident response and remediation planning.

 

Whenever a new asset or entity is added to a network, its role and access control levels should be clearly defined and fall into one of the following.

 

Discretionary Access Control (DAC): A security access control that authorizes object access via an access control policy that requires supplied credentials during authentication, such as username and password. This type of authorization is discretionary because the owner/admin determines object access privileges for each user. An example is an access control list (ACL) authorization based on user identification. A security check on this type of access control is commonly a limit on the number of failed authentications.

 

Mandatory Access Control (MAC): A set of specific security policies defined according to system classification, configuration, and authentication. MAC is characterized by the centralized enforcement of confidential security policy parameters under the control of identified system administrators. Because a MAC is so well-defined and policed, its policies reduce security errors and establish an action/owner audit trail in the event of an incident.

 

Non-discretionary Access Control: A means of access control where access is not explicitly mapped to a specific user. Instead, it is wider in scope and can be based on a set of rules, privileges, or roles to provide access. Some examples are a role-based access control (RBAC) that grants access to an admin login, and access to certain systems and applications during business hours only using a time-based ACL.
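A minimal sketch of that last example, assuming an illustrative role table: access is granted by role rather than to a named user, with a business-hours rule layered on top like a time-based ACL.

```python
from datetime import datetime, time

# Illustrative role assignments and role-to-resource grants (not tied to any real system).
USER_ROLES = {"alice": {"admin"}, "bob": {"helpdesk"}}
ROLE_GRANTS = {"admin": {"router-config", "billing-app"}, "helpdesk": {"ticketing"}}
BUSINESS_HOURS = (time(8, 0), time(18, 0))   # time-based rule applied on top of roles

def is_authorized(user: str, resource: str, when: datetime) -> bool:
    """Role-based check plus a business-hours restriction."""
    roles = USER_ROLES.get(user, set())
    has_role = any(resource in ROLE_GRANTS.get(role, set()) for role in roles)
    start, end = BUSINESS_HOURS
    in_hours = start <= when.time() <= end
    return has_role and in_hours

if __name__ == "__main__":
    print(is_authorized("alice", "router-config", datetime(2018, 8, 7, 10, 30)))  # True
    print(is_authorized("alice", "router-config", datetime(2018, 8, 7, 23, 0)))   # False: after hours
    print(is_authorized("bob", "router-config", datetime(2018, 8, 7, 10, 30)))    # False: no role grant
```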

 

Once you’ve determined how an asset is accessed, the next step is planning a management lifecycle for that asset. Here are some key considerations.

 

Information Technology Asset Management (ITAM): Includes activities such as the tracking of software licensing, upgrades, and installations, as well as tracking actions and logon locations to provide an up-to-date timeline of asset state and usage. Missed updates can flag an asset for quarantine or restricted access. Making sure that all security updates are performed quickly, and maintaining software version control, can mitigate risk.

 

Configuration Management: A process-oriented and best practices approach for handling changes to a system in such a way that it maintains integrity over time. It usually employs automation through scripting or an orchestration application to uniformly apply changes to all systems, reducing the time required for updates and the possibility of introducing errors. In the event of a breach, a centralized configuration console can quickly shut down several systems until a remediation can be pushed out.
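A configuration management or orchestration platform is the right tool for this, but the basic shape of the idea fits in a short sketch: apply one vetted change uniformly across an inventory and record which hosts failed, so nothing drifts silently. The host list and command below are placeholders.

```python
import subprocess
from datetime import datetime

# Placeholder inventory and change; a real deployment would pull these from
# an orchestration tool or a source-controlled configuration repository.
HOSTS = ["web01.example.com", "web02.example.com", "db01.example.com"]
CHANGE = "sudo sysctl -w net.ipv4.tcp_syncookies=1"

def apply_change(host: str, command: str, timeout: int = 30) -> bool:
    """Run one change on one host over SSH; return True on success."""
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.returncode == 0

if __name__ == "__main__":
    failures = []
    for host in HOSTS:
        ok = False
        try:
            ok = apply_change(host, CHANGE)
        except subprocess.TimeoutExpired:
            pass
        print(f"{datetime.now().isoformat()} {host} {'OK' if ok else 'FAILED'}")
        if not ok:
            failures.append(host)
    # Hosts that failed the uniform change are candidates for quarantine or follow-up.
    if failures:
        print("Needs attention:", ", ".join(failures))
```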

 

Patch Management: A strategy for managing patches or upgrades for software applications and technologies. A patch management plan can help a business or organization handle these changes efficiently. It is important that admins and stakeholders are well informed when it comes to patchable vulnerabilities through advisories from vendors. Success here depends on knowing which applications and versions are deployed on your assets and having a strategy to contain systems that have yet to be patched.

 

Vulnerability Management: The process in which vulnerabilities are identified and their risks evaluated. This evaluation leads to either removing the risk or accepting it based on an analysis of the impact of an attack versus the cost of correction and possible damages to the organization. Keeping abreast of the latest vulnerabilities that affect an organization requires tracking vendor-issued vulnerability notices as well as advisories issued by vendor PSIRTs (Product Security Incident Response Teams) and industry groups. These advisories offer more information about potential impacts as well as interim workarounds in cases where an update is not yet released or will take time to be deployed.

 

When something goes wrong, a well-defined security policy in terms of access and controls will help in the discovery and mitigation process. It’s often that forgotten unpatched server or a group of users with vulnerable applications that leaves an organization open to potential threats. Be aware of your surroundings.

 

In the next blog, we will look at protection methods that are geared to some specific threats and also touch on how data science is becoming an important tool in the cybersecurity space.

By Paul Parker, SolarWinds Federal & National Government Chief Technologist

 

It turns out that the lowest hanging fruit for hackers comes from user-generated passwords. According to the Verizon® 2017 Data Breach Investigation Report, 81% of hacking-related breaches were the result of a weak or stolen password.

 

What does this mean for federal agencies? It means that along with creating a sound security posture through a solid foundation of processes and tools, password security should be top of mind.

 

Creating a Solid Password

 

Users tend to create short, simple passwords or reuse passwords across multiple accounts. Or, they resort to common strategies like switching out every “a” for a “4,” every “e” for a “3,” and so on. The challenge here is that humans are not the ones guessing passwords; humans use machines to guess passwords. So, while the letter-replacement strategy may be difficult for humans to figure out, it’s simple for a computer.

 

What’s the solution, then? How can a federal IT security pro help ensure users create stronger passwords?

 

The National Institute of Standards and Technology (NIST) has been working for several years to provide updated rules and regulations for protecting digital identities. NIST published these new rules in June 2017. The overall theme of NIST’s guidance on passwords in particular is to keep it simple. Let users create long, easy-to-remember passwords without the complexity of special characters and uppercase and lowercase letters. The use of a “pass-phrase” instead of a “password” is a key component of aligning with the new NIST recommendation.
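As a quick illustration of the passphrase idea, the sketch below strings together randomly chosen, unrelated words using a cryptographically secure random source. The tiny wordlist is purely illustrative; a real generator should draw from a large list (a diceware-style wordlist, for example) so the result has enough entropy.

```python
import secrets

# Illustrative wordlist only; real generators use thousands of words
# so the resulting passphrase is hard to guess even when the list is public.
WORDS = [
    "granite", "otter", "maple", "copper", "lantern", "meadow",
    "harbor", "velvet", "thunder", "saddle", "orbit", "pepper",
]

def make_passphrase(word_count: int = 4, separator: str = "-") -> str:
    """Join randomly chosen, unrelated words into an easy-to-remember passphrase."""
    return separator.join(secrets.choice(WORDS) for _ in range(word_count))

if __name__ == "__main__":
    print(make_passphrase())          # e.g., "otter-lantern-copper-meadow"
    print(make_passphrase(5, " "))    # longer phrase, space-separated
```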

 

Within the overall guidance, NIST provides the following basic guidelines that every agency can follow specifically for creating and protecting passwords.

 

First, do not rely on passwords alone for protection. Be sure end-users are taking advantage of all possible methods of protecting security—such as implementing multi-factor authentication.

 

Next, train users to have a better understanding of what a strong password looks like. Having a combination of uppercase and lowercase letters, numbers, and symbols is old thinking. A phrase with multiple unrelated words is a far better choice.

 

Ask users to adopt a passphrase password that would be difficult to hack based on its length and random combination of words, but can be easy to remember through a visual cue.

 

Third, be sure users are using different passwords for different accounts (banking, email, etc.). It is incredibly common for users to have the same password for multiple things; this is highly insecure and should be just as highly discouraged. Their government network password should not be the same one that they use in everyday life. This can limit the exposure should a breach occur.

 

Finally, encourage users to consider implementing a password management solution. A password manager generates and stores all user passwords—and any other security-related information, such as PINs, credit card numbers, or CVV codes—across all online accounts, in a single location. With a password manager, users need only remember one password. Easy.

 

In our federal environments, we aren’t lucky enough to simply grab a best-in-breed commercial password management solution. System architects and engineers should consider a business case for privileged access and password management at an enterprise level. There are many robust and approved ways to help keep the systems safe and secure. Hackers are creative, and IT teams should be too.

 

Creating a Foundation for Solid Passwords

 

While creating the password itself is ultimately the user’s responsibility, there are things that federal IT security pros can do. Start with the NIST guidance, ensure that your agency-specific policy is up to date, and implement proper controls and solutions to meet the established goals. Beyond password creation and protection, federal IT security pros should work with internal security teams to regularly scan the network and ensure proper compliance.

 

Be sure to have a solid security foundation and routine security awareness training, and implement testing and validation processes as often as possible. Reducing your exposure and being proactive in addressing weaknesses will make your agency a far more difficult and less appealing target.

 

Find the full article on our partner DLT’s blog Technically Speaking.

If you work in engineering or support, you've probably spent a lot of time troubleshooting things. You've probably spent just as much time trying to figure out why things were broken in the first place. As much as we might like to think about things being simple when it comes to IT troubleshooting, the fact of the matter is that most of the time the problems are so complex as to be almost impossible to solve at first glance.

The real thing we're looking for here is root cause analysis. It's a fancy term for "find out what caused things to break." What root cause analysis is focused on is a proven, repeatable methodology for determining the root cause of the problem. And the process is deceptively simple: if you remove a suspected factor and the problem still happens, that factor is not part of the root cause. How can we do root cause analysis on problems that are organization-wide or that have so many component factors as to make them difficult to isolate? That's where the structure comes into play.

Step One: Do You Have A Problem?

It may sound silly, but to do root cause analysis on a problem, first you have to figure out if you have a problem. I originally talked about problem determination when I first started writing, as it was one of the biggest issues I saw in the field. People can't do problem determination. They can't figure out if something isn't behaving properly unless there is an error message.

Problems have causes. Most of the time they are periodic or triggered. Rarely, they may appear to be random but are, in fact, just really, really oddly periodic. To determine root cause, you first must figure out that the thing you are looking at is a problem. Is it something that is happening by design of the protocol or the implementation? Is it happening because of environmental factors or other external sources? You're going to be mighty upset if you spend cycles troubleshooting what you think is a failing power supply only to find out someone keeps shutting off the power to the room and causing the outage.

Problems also need to be repeatable. If something can't be triggered or observed on a schedule, you need to dig further until you can make it happen. Random chance isn't a problem. Cosmic rays causing data loss isn't something you can replicate easily. Real problems that can be solved with root cause analysis can be repeated until they are resolved.

Step Two: Box Your Problem

The next step in the troubleshooting process is the part we're the most familiar with: the actual troubleshooting. I wrote about my troubleshooting process years ago. I just start determining symptoms and isolating them until I get to the real problem. Sometimes that means erasing those boxes and redrawing them. You can't assume that any one solution will be the right one until you can determine for a fact that it solves the root cause.

This is where a lot of people tend to get caught up in the euphoria of troubleshooting. They like solving problems. They like seeing something get fixed. What they don't like is finding out their elegant solution didn't work. So, they'll often stop when they've drawn a pretty box around the issue and isolated it. Deal with things until you don't have to deal with them any longer. But with root cause analysis, you have to keep digging. You have to know that your process fixes the issue and is repeatable.

When I worked for Gateway 2000, every call we took had to follow the ARC method of documentation: steps to ADDRESS the issue, RESOLUTION of the issue, reason for the CALL. I always thought it should have been CAR - CALL, ADDRESS, RESOLUTION, but I kept getting overruled. We loved filling in C and R: why did they call and what eventually fixed it. What we didn't do so well was the middle part. Things don't get fixed by magic. You need to write down every step along the way and make sure that the process you followed fixes the problem. If you leave out the steps, you'll never know what fixed things.

Step Three: Make Sure You Really Fixed It

This is the part of root cause analysis that most people really hate. Not only do you have to prove you fixed the thing, but you also have to prove that the steps you took fixed it. As with determining the root cause above, if one of your steps didn't fix the problem, you have to eliminate it from the root cause analysis as being irrelevant.

Think about it like this. If you successfully solve a problem by kicking a server and fixing DNS, what actually fixed the issue? Root cause analysis says you have to try both solutions next time you're presented with the same issue. It's very likely that DNS fixes were the real solution and the root cause was DNS misconfiguration. But you can't discount the kick until you can prove it didn't fix the issue. Maybe you jostled a fan loose and made the CPU run cooler?

We have a real problem with isolating issues. Sometimes that means that when we change a setting and it doesn't fix the problem, we need to change it back. That sounds counter-intuitive until you realize that making fourteen changes until you find the right setting to fix the issue means you're not really sure which one solved the problem. That means you have to isolate everything to make sure that Solution Nine wasn't really the right one and it just took 30 minutes to kick in while you tried Solutions Ten through Fourteen.

Once you know that you fixed the issue and that this particular solution or solution path fixed the issue, you've successfully completed the majority of your root cause analysis. But you're not quite done yet.

Step Four: Blameless Reporting

This is a hard one. You need to do a report about the root cause. But what if the cause is something someone changed or did that made the issue come up? How do you do a report without throwing someone under the bus?

Fix the problem, not the blame. You can't have proper root cause analysis if the root cause is "David." People aren't the cause of issues. David's existence didn't cause the server to reboot. David's actions caused it. Focus on the actions that caused the problem and the resolution. Maybe it's as simple as revoking reboot rights from the group that David and other junior admins belong to. Maybe the root cause really is that David was mad at management and just wanted to reboot a production server to make them mad. But you have to focus on the actions and not the people. Blaming people doesn't solve problems. Correcting actions does.

Root cause analysis isn't easy. It's designed to help people get to the bottom of their problems with a repeatable process every time. If you follow these steps and make sure you're honest with yourself along the way, you'll quickly find that your problems are getting resolved faster and more accurately. You'll also find that people are more willing to quickly admit mistakes and get them rectified if they know the whole thing isn't going to come down on their head. And a better overall work environment means fewer problems for everyone else to have to solve.

The technology industry is about technology. Technology is the fastest-changing industry there is, but the often-overlooked part of this industry is the humans who work with the technology. Humans are not robots, even though some may feel or think that we are. There are many facets to humans. We are the most complex machine there is, yet we have the most difficult time understanding and taking care of ourselves. For this post I’m going to discuss two things about us humans in regard to working in tech. Both of these topics are hot items to discuss now, as we are realizing that it takes much more than showing up to work to be successful in today’s world. These two pillars are the soft skills needed for the tech industry and maintaining work-life balance.

Why Are Soft Skills Important?

Soft skills are something that’s needed for everyone to be successful in almost any job. These are different from hard skills because they are not typically attained through any formal education, training programs, or certifications. They are the interpersonal skills that are somewhat harder to define and evaluate, unlike a skill needed to deploy a highly complex set of systems.

We need to develop these skills to interact with each other. Some people are better at it than others. Long gone are the days of the IT guy/gal sitting in a dark cubicle that doesn’t interact with the rest of the company. IT professionals in today’s age must interact with business units, customers, partners, and especially each other within a team. It’s important to develop these soft skills so that you can work amongst each other and collaborate effectively. Soft skills are also the type of skills that can be transferable to any career. Keep in mind it takes time to develop them. Some key soft skills are communication and teamwork.

Communication – This includes listening, verbal, and written communication. The tones of our voices and how our words are written can be interpreted differently by many people, affecting how we are perceived. Listening is so important because oftentimes we are trying to form a response to someone without actually listening to what the other person has to say. Taking a step back and carefully listening to someone while they speak helps us truly understand what they are saying.

Teamwork – Enough can’t be said about teamwork. There is no “I” in team. You need to be able to work with others around you even if you don’t agree with them. Being able to negotiate with others is important because we can’t always have our way as much as we’d like to. It’s give and take. 

Everybody Needs Work-Life Balance

I’m going to start off by saying I am by no means a mental health professional, but what I can attest to is that I am a recovering burned-out IT professional. Like soft skills, handling stress and maintaining balance in your life comes in different ways for each person. What is clear is that everybody needs it; otherwise, you will get burned out. If you need more details on burnout, the Mayo Clinic has a great article written about Job Burnout.

Maintaining balance is critical in the IT industry, as it is often a very stressful job with extremely long hours and crazy demands. A lot of us work on-call and when there are issues, you can be working 24-36 hours straight with very few breaks. The occasional long hours are usually not an issue. It becomes an issue when they are repeatedly done. We all need sleep, some more than others, but we still need a break. Our brains need to shut off and take some time to recoup to be refreshed.

Our relationships with our families and loved ones are affected by how we work. Working 24/7 does not help your family, even though we may tell ourselves we are working to provide for them. We are no good to them if we are not engaged and present with them. Constantly working and not disengaging has long-lasting effects. This is something I am all too familiar with and work on improving every day. The saying, “Work to Live, Don’t Live to Work,” is so true. There is no time machine. You can’t go back in time to reclaim the special occasions or memories you missed. Take the time needed for family and yourself.

Having balance in our lives keeps us healthy. It keeps us in a good state of mind and helps stave off burnout. We need this to be successful people in our personal and professional lives; otherwise, we are just like zombies walking around.

 

 


As I explained in previous posts on building a culture of data protection, we in the technology world must embrace data protection by design:

 

To Reward, We Must Measure. How do we fix this?  We start rewarding people for data protection activities. To reward people, we need to measure their deliverables.

 

We need an enterprise-wide security policy and framework that includes specific measures at the data category level:

    • Encryption design, starting with the data models
    • Data categorization and modeling
    • Test design that includes security and privacy testing
    • Proactive recognition of security requirements and techniques
    • Data profiling testing that discovers unprotected or under-protected data
    • Data security monitoring and alerting
    • Issue management and reporting

 

Traditionally, we relied on security features embedded in applications to protect our data. But in modern data stories, data is used across many applications and end-user tools. This means we must help ensure our data is protected as close as possible to where it persists. That means in the database.

 

Data Categorization

 

Before we can properly protect data, we have to know what data we steward and what protections we need to give it. That means we need a data inventory and a data categorization/cataloging scheme. There are two ways that we can categorize data: syntactically and semantically.

 

When we evaluate data items syntactically, we look at the names of tables and columns to understand the nature of the data. For this to be even moderately successful, we must have reliable and meaningful naming standards. I can tell you from my 30+ years of looking at data architectures that we aren't good at that. Tools that start here do 80% of the work, but it's that last 20% that takes much more time to complete. Add to this the fact that we also have a shameful habit of changing the meaning of a column/data item without updating the name, and we have a lot of manual work to do to properly categorize data.

 

Semantic data categorization involves looking at both item names and actual data via data profiling. Profiling data allows us to examine the nature of data against known patterns and values. If I showed you a column of fifteen- to sixteen-digit numbers that all had a first character of three, four, five, or six, you'd likely be looking at credit card data. How do I know this? Because these numbers have an established standard that follows those rules. Sure, it might not be credit card numbers. But knowing this pattern means you know you need to focus on this column.
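A rough sketch of that kind of profiling check follows: it combines the length-and-first-digit pattern described above with the Luhn check digit that card numbers must satisfy, and flags a column when most of its values match. The sample values are well-known test numbers, not real cards.

```python
import re

CARD_PATTERN = re.compile(r"^[3-6]\d{14,15}$")   # 15-16 digits, first digit 3 through 6

def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn check used by card numbers."""
    total, double = 0, False
    for digit in reversed(number):
        d = int(digit)
        if double:
            d = d * 2
            if d > 9:
                d -= 9
        total += d
        double = not double
    return total % 10 == 0

def looks_like_card_column(values, threshold=0.8) -> bool:
    """Flag a column when most non-empty values match the card pattern and Luhn check."""
    candidates = [v.replace(" ", "").replace("-", "") for v in values if v.strip()]
    if not candidates:
        return False
    hits = sum(1 for v in candidates if CARD_PATTERN.match(v) and luhn_valid(v))
    return hits / len(candidates) >= threshold

if __name__ == "__main__":
    sample = ["4111 1111 1111 1111", "5500005555555559", "4012-8888-8888-1881"]
    print(looks_like_card_column(sample))   # True: every value looks like a card number
```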

 

Ideally we'd use special tools to help us catalog our data items, plus we'd throw in various types of machine learning and pattern recognition to find sensitive data, record what we found, and use that metadata to implement data protection features.

 

Data Modeling

 

The metadata we collect and design during data categorization should be managed in both logical and physical data models. Most development projects capture these requirements in user stories or spreadsheets. These formats make these important characteristics hard to find, hard to manage, and almost impossible to share across projects.

 

Data models are designed to capture and manage this type of metadata from the beginning. They form the data governance deliverables around data characteristics and design. They also allow for business review, commenting, iteration, and versioning of important security and privacy decisions.

 

In a model-driven development project, they allow a team to automatically generate database and code features required to protect data. It's like magic.

 

Encryption

 

As I mentioned in my first post in this series, for years, designers were afraid to use encryption due to performance trade-offs. However, in most current privacy and data breach legislation, the use of encryption is a requirement. At the very least, it significantly lowers the risk that data is actually disclosed to others.

 

Traditionally, we used server-level encryption to protect data. But this type of encryption only protects data at rest. It does not protect data in motion or in use. Many vendors have introduced end-to-end encryption to offer data security between storage and use. In SQL Server, this feature is called Always Encrypted.  It works with the .Net Framework to encrypt data at the column level and it provides the protection from disk to end use. Because it's managed as a framework, applications do not have to implement any additional features for this to work. I'm a huge fan of this holistic approach to encryption because we don't have a series of encryption/decryption processes that leave data unencrypted between steps.

 

There are other encryption methods to choose from, but modern solutions should focus on these integrated approaches.

 

Data Masking

 

Data masking obscures data at presentation time to help protect the privacy of sensitive data. It's typically not a true security feature because the data isn't stored as masked values, although they can be. In SQL Server, Dynamic Data Masking allows a designer to specify a standard, reusable mask pattern for each type of data. Remember that credit card column above? There's an industry standard for masking that data: all but the last four characters are masked with stars or Xs. This standard exists because the other digits in a credit card number have meanings that could be used to guess or social engineer information about the card and card holder.
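Purely to illustrate the mask pattern itself, here is an application-side sketch; in SQL Server, Dynamic Data Masking applies the equivalent rule at the database level so every client sees the same masked value (which, as noted below, is the better place for it).

```python
def mask_card_number(card_number: str, keep: int = 4, mask_char: str = "X") -> str:
    """Mask all but the last `keep` digits, preserving spaces and dashes for readability."""
    total_digits = sum(c.isdigit() for c in card_number)
    masked, digit_index = [], 0
    for c in card_number:
        if c.isdigit():
            digit_index += 1
            # Only the final `keep` digits pass through unmasked.
            masked.append(c if digit_index > total_digits - keep else mask_char)
        else:
            masked.append(c)
    return "".join(masked)

if __name__ == "__main__":
    print(mask_card_number("4111 1111 1111 1111"))   # XXXX XXXX XXXX 1111
```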

 

Traditionally, we have used application or GUI logic to implement masks. That means that we have to manage all the applications and client tools that access that data. It's better to set a mask at the database level, giving us a mask that is applied everywhere, the same way.

 

There are many other methods for data protection (row level security, column level security, access permissions, etc.) but I wanted to cover the types of design changes that have changed recently to better protect our data. In my future posts, I'll talk about why these are better than the traditional methods.

This week's Actuator comes to you from the beach, where I am on vacation.

 

I'll be back next week with links to share. In the meantime, enjoy this picture of Sir Francis.

 

(And don't forget that THWACKcamp registration is now open!)

 

By Paul Parker, SolarWinds Federal & National Government Chief Technologist

 

Federal IT professionals know that practicing good information security (InfoSec) is a must, but instilling InfoSec habits into an IT culture is often easier said than done. Luckily, there are steps federal administrators can take to embed good InfoSec practices within their operations.

 

Build Security into the Community

 

Administrators should consider embedding security practices and conversations about good security habits within the daily office environment. For example, gamifying security training by using fun and engaging activities to convey an agency’s position on the importance of constant vigilance can help create a lasting, effective, and deep-seated security culture.

 

Implement Strong IT Controls

 

According to respondents of a recent Federal Cybersecurity Survey, agencies with evidence of strong IT controls are more likely to possess the hallmarks of strong InfoSec environments. They experience fewer threats and are able to respond more quickly to those that do occur. They also enjoy more positive results when implementing IT modernization initiatives, and are ready to comply with regulations, such as HIPAA and FISMA. These agencies are using a myriad of controls for configuration and patch management, web application security, file integrity monitoring, and, of course, security.

 

Building strong IT controls requires a deep level of visibility into one’s IT infrastructure, which network and application performance monitoring tools provide. They continuously collect data on operations and alert IT administrators to anomalies, such as lags in performance or intrusion attempts, providing constant and valuable insight into network activities.

 

Invest in Physical Security

 

A solid InfoSec posture involves protecting agencies from insider threats just as much as it does fortifying against external hackers. Indeed, 54 percent of respondents to the cybersecurity survey cited careless or untrained insiders as their top threats, with 40 percent designating “malicious insiders” as security concerns. The reality is that sizeable portions of security risks come from inside the house.

 

Monitoring and logging when someone accesses sensitive data can allow managers to trace breaches back to their sources and discourage malicious insiders. Additionally, video surveillance of areas like data centers can dissuade potential breaches. Consider video analytics tools to help identify patterns and anomaly events, which can help identify the source of, or even prevent, potential breaches.

 

Consider Investing in Security Consultants

 

With so much at stake, it pays to have an experienced professional around whose primary goal is finding holes in an agency’s security. Outside security consultants can bring a fresh perspective to the status of an agency’s security posture, and are well versed in testing, reviewing, and consulting on potential security risks. They can work with in-house personnel to create tailor-made security plans.

 

Agencies cannot afford to take InfoSec lightly. Taking these steps can help government IT teams build a strong security culture. They can then support that culture through knowledge and insights gleaned from strong IT controls, physical security measures, and outside consultants. The result will be a strong InfoSec footing that can be used to curb even the most sophisticated threats before they take hold.

 

Find the full article on Government Technology Insider.

("The Echo Chamber" is a semi-regular series where I reflect back on a talk, video, or essay I've published. More than just a re-hashing of the same topic, I'll add insights as to what has changed, or what I would say differently if I were doing it today.)

 

Back in March 2018, I gave an online talk about monitoring, mapping, and data visualizations in general titled, "If an Application Fails in the Datacenter and No Users Are On It, Will it Cut a Ticket?" If you'd like to listen to the original, you can find it online here: http://video.solarwinds.com/watch/GUHjEnraRAJCKYMDHkDK8D.

 

The talk focused on the power that visualization has in our lives as humans navigating the world, but more importantly as IT practitioners practicing our craft. It looked at how the correct visualization can transform raw data not just into "information" (meaningful data that has a context and a story) but further into action.

 

Looking back now, I realize that I missed a few opportunities to share some ideas—and I plan to correct that in this essay.

 

What Is a Map, Anyway?

In the webinar, I focused on several methods of visualization and how they help us. But I never quite defined the essential features that make a map more than just a pretty picture. For that, I'm going to turn to the preeminent voice speaking about maps as they relate to technology and business: Simon Wardley (@SWardley on Twitter). In short, he states that a picture must portray two things to be a map: position and movement.

 

The best example of a map that doesn't look like a map but IS one, according to Mr. Wardley's definition, is a chess board. If you showed a picture of a chess game at any point in play, it would convey (for those who can read it) both the current position of pieces and where each piece could potentially move in the future. Moreover, to someone VERY familiar with the game, a snapshot of the current board can also provide insight into where the pieces were. All with a single picture. THAT is a map: position and movement.

 

With that definition out of the way, the next missed opportunity is for me to dig into the different types of network maps. In my mind, this breaks down into three basic categories: physical, logical, and functional.

 

Mapping the Physical

Mapping the actual runs of cable, their terminations, etc., may be tantalizing in its concreteness. It is, in fact, the closest visual representation of your "true" network environment. But there is a question of depth. Do you need every NIC, whether it has something plugged in or not? How about pin-outs? How about cable types? Cable manufacturers? Backup power lines? And of course, it's nearly impossible to generate this type of map automatically.

 

Mapping the Logical

Most network maps fall into this category. It is less interested in the physical layer than in the way data connections behave in the environment, and therefore it more accurately represents the movement of data even if you can't always tell how the cabling is run.

 

Mapping the Functional

This type of map is the one your users and systems administrators want to see: one that represents the way application traffic logically (but not physically) flows through an environment. That said, as a network map, it's sub-optimal because application servers aren't always physical. The depth of the map is in question, and it's purposely obfuscating the network infrastructure in favor of showing data flows, so its usefulness to network engineers is minimal.

 

For IT practitioners, the question that sits at the core of ALL of this—when to use maps, what kinds of maps to use, what tools to use to make those maps—is a single question:

"What will create those maps automatically, and keep them updated with ZERO effort on my part?"

Because, in my humble opinion (not to mention experience), if a map has to be manually built or maintained, it is more likely NOT to get built and it is almost certainly NOT going to be maintained, which means it is out of date almost as soon as it's published.

 

And take it from me, having a map that is wrong is worse than having no map at all.

 

As a side note, I recently revisited these themes in a larger way as part of a new SolarWinds eBook - Mapping Network Environments, which you can find here.

Do you know what's in your data center? How about your wide area network (WAN)? If you had to draw a map or produce a list of all the things that are connected to your systems in the next week, could you? It sounds like the simplest of things to have, but more often than not, most people have no idea what's really going on in their IT organization.

Years ago our VAR took on a new client. This client was in the medical field and had a really good idea of the technology in their organization. They knew everything that supported their mission to provide value to their customers. However, the senior engineer from our company that was supporting the client wanted to map the entire infrastructure before we took them on. The client told him that it wasn't necessary. He insisted. He spent weeks mapping out every connection. He looked at every device and traced every cable. He produced a beautiful Visio drawing that ended up hanging in their office for years like a work of art.

What did our senior engineer find out? Well, as it turns out, one big thing he found was a redundant wireless bridge on the roof that was used in the past to connect to a building across the street. When he first discovered it, no one knew what it was supposed to do. It took a few days of questions before he found someone that even remembered the time when the company rented space from the ancillary building and wanted it connected. When we brought up the old equipment to the client's IT team, you can imagine the quizzical looks on people's faces. Well, except for the security team. They were more worried than curious.

Why is it so hard to keep track of things? How is it that rogue equipment can appear in our organization before we realize what's going on? In part, it's because of the mentality that we've had for so long that things need to "just work." Instead of creating port security profiles and alerting people when someone plugs a device into the network, we instead choose to enable everything in case someone moves a computer or needs an additional drop activated. Instead of treating our user space as a hostile environment, we open it up in the hopes that our users don't call us for little things that need to be dealt with. This leads to us finding all kinds of fun things plugged into the network causing havoc by the end of the day.

Likewise, we also don't have a good plan for adding equipment behind the scenes. How many times has a vendor offered a proof-of-concept (PoC) trial of equipment and plugged it directly into the network? I'm sure that some of you out there with an Infosec background are probably turning colors right now, but I've seen it more times than I care to count. Rather than taking the time to test equipment with good testing data, the vendor would rather test the equipment against live workloads and push traffic through a PoC to show everyone what it really looks like or how easy their equipment really is to work with.

If you don't know what you're working with in your IT environment, you might as well be trying to work with a blindfold on. You may have switches from the last century running as the root of a spanning tree. You may have older virtualized hosts that aren't getting patched anymore. You may even find that someone has installed nefarious hardware or software to collect data without your knowledge. And all of that pales in comparison to what might happen if you work in a regulated environment and find out someone has been quietly exfiltrating data around a firewall because you don't have proper controls in place to prevent it.

How well do you know your IT organization? Do you know it well enough to point out every blinking light? If you had to disappear tomorrow would your co-workers know it as well as you? Do you document like your replacement will come looking for you when things go wrong? Leave a comment below and let everyone know how well you know your world.

For many engineers, operators, and information security professionals, traffic flow information is a key element of performing both daily and long-term strategic tasks. This data usually takes the form of NetFlow versions 5 and 9, IPFIX (also known as NetFlow version 10), and sFlow. This tool kit is widely utilized and enables insight into network traffic, performance, and long-term trends. When done correctly, it also lends itself well to security forensics and triage tasks.

 

Having been widely utilized in carrier and large service provider networks for a very long time, this powerful data set has only begun to really come into its own for enterprises and smaller networks in the last few years, as tools for collecting and, more importantly, processing and visualizing it have become more approachable and user-friendly. As the floodgates open to tool kits and devices that can export either sampled flow information or one-to-one flow records, more and more people are discovering and embracing the data. What many do not necessarily see, however, is that correlating this flow data with other information sources, particularly SNMP-based traffic statistics, can make for a very powerful symbiotic relationship.

 

By correlating traffic spikes and valleys over time, it is simple to cross-reference flow telemetry and identify statistically divergent users, applications, segments, and time periods. Now, this is a trivial task for a well-designed flow visualization tool. It can be accomplished without even looking at SNMP traffic statistics. However, where it provides a different and valuable perspective is in the valley periods when traffic is low. Human nature is to ignore that which is not out of spec, or obviously divergent from the baseline. So, the key is in looking at lulls in interface traffic statistics. View these anomalies as one would a spike, and mine flow data for pre-event traffic changes. Check TCP flags to find out more intricate details of the flows (note: this is a bit of a task, as TCP flags are exported as a numerical value in NetFlow v5 and v9 and must be decoded, but they can provide an additional view into other potential issues). Conversely, the flags may also be an indicator of soft failures of interfaces along a path, which could manifest as SNMP interface errors that are exported and can be tracked. Think about the instances where this may be useful: soft failures are notoriously hard to detect, and this is a step in the right direction toward doing so. Once this kind of mentality and correlation is adopted, adding even more data sources to the repertoire of relatable data is just a matter of consuming and alerting on it. This falls well within the notion and mentality of looking at the network and systems as a relatable ecosystem, as mentioned in this post. Everything is interconnected, and the more expansive the understanding of one part, the more easily it can be related to other, seemingly “unrelated” occurrences.
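Decoding those numeric TCP flag values is just bit arithmetic. Here is a small sketch, assuming you already have the cumulative tcp_flags value from a NetFlow v5/v9 record; the "suspicious" heuristic is only an example of the kind of rule you might start with.

```python
# Bit positions of the TCP flags as exported in the NetFlow v5/v9 tcp_flags field.
TCP_FLAG_BITS = {
    0x01: "FIN", 0x02: "SYN", 0x04: "RST",
    0x08: "PSH", 0x10: "ACK", 0x20: "URG",
}

def decode_tcp_flags(value: int) -> list:
    """Turn the cumulative numeric tcp_flags field into a list of flag names."""
    return [name for bit, name in TCP_FLAG_BITS.items() if value & bit]

def flags_look_suspicious(value: int) -> bool:
    """Very rough heuristic: resets, or SYNs that never saw an ACK in the flow."""
    flags = set(decode_tcp_flags(value))
    return "RST" in flags or ("SYN" in flags and "ACK" not in flags)

if __name__ == "__main__":
    for raw in (0x02, 0x12, 0x14, 0x1B):
        print(f"{raw:#04x} -> {decode_tcp_flags(raw)} suspicious={flags_look_suspicious(raw)}")
```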

 

 

This handily accomplishes two important tasks: building a table of related experiences in an engineer’s or operator’s mind and, if done correctly, producing a well-oiled, accurate, efficient, and documented workflow of problem analysis and resolution. When this needs to be sold to management, which it will in many environments, proving that most of these tracked analytics can be used in concert with each other for a more complete, more robust, more efficient network monitoring and operational experience may require some hard deliverables, which can prove challenging. However, the prospect of “better efficiency, less downtime” is typically enough to generate interest in at least a few conversations.

The IT journey is really nothing more than your career plan or goals that you have for your career. It’s the story of you. I call it the IT Journey because it’s a journey through the many phases of your life and not a sprint. It doesn’t have to be a defined plan of what you do, but it certainly helps if you have some sort of plan. Planning and defining your career goals is important because at some point in your working career you will want a raise or promotion. You WILL want more. You will want to take on more than just going to work and pushing buttons. We are humans and it is very natural for us to do that. Defining your goals, your desires for your career and where you are in a year, three years, or even five years helps you achieve more, but it also helps you know what your next steps are.

It feels like yesterday that I was just starting out in IT and my only goal was to find a job “fixing computers.” Back then I didn’t realize that I should’ve had some type of career goal or plan beyond finding a job and keeping it. It didn’t take me long to figure out that having goals would help me achieve more fulfillment. As I struggled to attain more than merely keeping my IT job, I began to understand the importance of defining what I wanted from my jobs and career. I had this itch for something more substantial. I was working like crazy, but the results were not happening. The promotions were not happening like I wanted them to. Some would say part of that was bias and some discrimination. I wouldn’t argue that wasn’t true, because I am sure that had a lot to do with it. Either way, it was part of my journey, and changes needed to be made if I wanted to achieve that “more.” The turning point came when I realized that I should be defining what my journey was going to be like, instead of letting the annual performance review nightmare define it. I began taking that time each year to reflect on what my journey was going to be. Taking my journey by the horns allowed me to advance further in my career.

You own your life. Your career. You are the author of your story. Build it. Don’t let someone own that position in your life.

At the very least, set some goals for yourself. While professional coaches will tell you to set long-term goals, that can be difficult for some given the type of industry we are in. Technology moves so fast that what you are doing now may not be relevant in a few years. Make some short-term goals and long-term goals. Determine what success means to you. Is it working remotely? Leading a team? Or as simple as specializing in a type of technology? Success means different things to different people. Envision yourself in 1 year or 5 years, and ask yourself, "What do I see myself doing?"

Write those goals down and start a plan on how to achieve them. Create milestones that are more attainable and realistic. Be the executive producer of your story. When defining your goals, take it back to these simple questions:

1. What – What are my goals?
2. Who – Who does it take to help me get there? Do I need a supporting cast member?
3. How – How will I do it?
4. Where – Do I need to move or change jobs?
5. When - When does it happen?

When you reach milestones or achievements, reward yourself. Remember this is a journey, not a sprint, and you need to celebrate those wins in your life. There are going to be ups and downs. Success is a result of many failures. Learn from those mistakes to be a better you. Stay focused, but also stay open-minded, because the journey may take you down a different path from the original plan. The old saying, “you don’t know until you try it,” holds very true, even when speaking about careers. You may realize that specializing in something isn’t what you like and decide to change course. That is perfectly okay. In fact, you should “recheck” your goals periodically to see if you are still on the correct path and if it is still what you want. You may discover, as people often do, that you change as a person. Your likes and dislikes can change over time, and this can affect how your career continues. Taking charge of your career and playing the starring role in how your journey plays out will help make it successful for you.

 

This week’s Actuator comes to you from sunny Austin, where the temperatures have cooled off to a nice 98F this week. I’m in town to record some episodes of SolarWinds Lab as well as THWACKcamp. It’s going to be a long week, but worth every minute!

 

As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!

 

Facial Recognition Software Wrongly Identifies 28 Lawmakers As Crime Suspects

And the software also identified an additional 28 as hot dogs.

 

Google: Security Keys Neutralized Employee Phishing

I am cautiously optimistic about this idea, but wonder what the workaround is for those of us that easily lose such devices.

 

Simulated Attacks Reveal How Easily Corporate Networks Fall Prey to Hackers

File this post under “not surprising.” It’s 2018, and that means there’s a good chance “2018” is being used as a password somewhere.

 

Snoopware installed by 11 million+ iOS, Android, Chrome, and Firefox users

Also filed under “not surprising,” companies are tracking your movements around the web. This has been true for 20 years now, and yet people still seem surprised that it happens.

 

Data Transfer Project

I thought XML solved this problem 30 years ago. Oh, wait.

 

Drug giant Glaxo teams up with DNA testing company 23andMe

If 23andMe takes the time to anonymize the data from users who have opted in, then I think this is a wonderful idea to advance research in areas that need this influx of data.

 

Air marshals have conducted secret in-flight monitoring of U.S. passengers for years

Setting aside the civil rights questions, my mind was fascinated with how this data was collected, stored, transmitted, verified, and analyzed. The idea that there is a database somewhere at Ft. Meade that knows I slept during an overnight trip to Germany last month seems…useless.

 

The time is drawing near!
