Device management tools have come a long way, just as the technology in our devices has. As a network or systems admin, you can probably relate to that in one way or another. Network admins have likely used device profiling at some point, and I imagine most server admins have pushed out a few changes with Group Policy. Devices that are known to your IT environment are not the issue anymore. While still important, the applications and other resources available to IT staff today let us make changes to devices on a large scale… usually on one condition: that the devices are under our control.

 

The Front Door: The Network

The network is often the first line of defense when it comes to BYOD devices; it is the first thing users with outside devices connect to upon arrival. Proper segmentation is a basic way to secure the network when dealing with BYOD devices, using firewalls or network access lists to control what these outside devices are able to reach. Short, sweet, and to the point. Some companies choose to handle BYOD devices exactly this way: give them web access and restrict access to internal company resources.

 

What happens the first time a vendor comes on-site and needs access to your network to fix a device or an application, though? You will not have control of their device, nor will a simple internet connection suffice. They will need permissive, yet secure, access to the internal network in one way or another. On the network side, device profiling can give both wired and wireless users an individualized access control list based on their user credentials or their device, for starters.

 

BYOD Devices And The Software They Bring

BYOD devices come in many different brands and models, and with that they carry a wide range of software as well. Some of these devices are more secure than others. The goal of server admins is the same, though: keep the internal systems and applications secure. One way this can be done is with device posturing, the process of ensuring that devices coming onto the network meet predetermined security standards. If they do not, they are not allowed to connect. Server admins are commonly tasked with ensuring that devices under their control have the latest security updates and are free of malware. Device posturing allows admins to ensure that the security standards they set are upheld both by employees with corporate assets and by visitors bringing their own devices onsite.
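To make the idea concrete, here is a minimal sketch, in Python, of the kind of posture check a NAC product performs before admitting a device. The policy fields, thresholds, and device attributes are all hypothetical; real posture agents collect this data for you.

```python
# Hypothetical posture policy: minimum OS build, patch recency, and antivirus status.
from datetime import date, timedelta

POSTURE_POLICY = {
    "min_os_build": 17134,          # assumed minimum acceptable OS build number
    "max_patch_age_days": 30,       # last security update must be this recent
    "antivirus_required": True,
}

def device_is_compliant(device: dict) -> bool:
    """Return True if the reported device attributes meet the posture policy."""
    patch_age = (date.today() - device["last_patched"]).days
    return (
        device["os_build"] >= POSTURE_POLICY["min_os_build"]
        and patch_age <= POSTURE_POLICY["max_patch_age_days"]
        and (device["antivirus_running"] or not POSTURE_POLICY["antivirus_required"])
    )

# Example: a visitor laptop reporting its posture data.
laptop = {"os_build": 17763, "last_patched": date.today() - timedelta(days=12),
          "antivirus_running": True}
print("allow" if device_is_compliant(laptop) else "quarantine")
```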

 

The Users: Who They Are and What They Need

The other area I want to mention that brings both challenge and control to admins dealing with BYOD is the users themselves. When users are looked at in a granular sense, security can be tightened very quickly. Too often, access to internal systems is controlled by nothing more than the wireless network users are on or the VLAN they are assigned to, and everyone in that subnet shares a similar set of firewall rules and permissions. Getting more granular starts with managing security based on user accounts and user security groups. Users can be given permissions to resources based on their position in the company, a team they are working on, or, in the case of a vendor, the application they are assigned to support. A step further is the topic I mentioned earlier: individualized access policies that work on a per-user basis. Regardless, one theme repeats whether you are in networking, server support, or desktop support: users should only be given the access that is required, and nothing more.
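As a rough illustration of that "only the access that is required" principle, the sketch below resolves a user's effective permissions from security-group membership rather than from the subnet they happen to sit on. The group names and resources are made up for the example; in practice they would come from a directory such as AD or LDAP.

```python
# Hypothetical group-to-resource mappings; in practice these come from AD/LDAP.
GROUP_PERMISSIONS = {
    "network-team": {"switch-mgmt", "monitoring-dashboard"},
    "vendor-appX":  {"appX-servers"},
    "finance":      {"erp-frontend"},
}

def effective_access(user_groups):
    """Union of the resources granted by the user's groups, and nothing more."""
    allowed = set()
    for group in user_groups:
        allowed |= GROUP_PERMISSIONS.get(group, set())
    return allowed

# A visiting vendor assigned to support application X gets only that application.
print(effective_access(["vendor-appX"]))              # {'appX-servers'}
print(effective_access(["network-team", "finance"]))  # both teams' resources, nothing else
```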

 

Those three topics are some of the most common things that come up whenever BYOD is discussed. Talking through them and developing a plan around them will ensure that you put the needed focus on such a sensitive topic. While this is not a complete guide to BYOD in your network, these three areas of focus are a good start toward securing BYOD devices in your IT environment.

It’s that time of year again. The SolarWinds 2018 IT Trends Report is out! See the SolarWinds IT Trends Index for the full story; I've covered the top takeaways below.

 

Cloud computing and hybrid IT remain IT professionals' top priority for the next five years because these elements meet today's business needs while serving as the foundation for constructs like machine learning and AI.

  • 94 percent of IT professionals surveyed indicate that cloud and/or hybrid IT is one of the top five most important technologies in their IT organization's technology strategy today, with 51 percent listing it as their number one most important technology.
  • IT professionals ranked cloud and hybrid IT as the most important technologies (by weighted rank) over the next three to five years in digital transformation and as technologies that have the greatest potential to provide productivity/efficiency benefits and ROI.

 

At the same time, IT professionals are prioritizing internal investments in containers as a proven solution to the challenges of cloud computing and hybrid IT, and a key enabler of innovation.

  • 44 percent of respondents ranked containers as the most important technology priority today, and 38 percent of respondents ranked containers as the most important technology priority three to five years from now.
  • Concurrently, AI and ML investments are expected to increase over the next three to five years.
  • 37 percent of respondents indicate AI is the biggest priority and 31 percent of respondents indicate ML is the biggest priority three to five years from now (compared to 29 percent and 21 percent today, respectively).

 

 

The results of the IT Trends Survey suggest a dissonance between the views of IT professionals and their senior managers on priorities for IT investment over the next three to five years.

  • On the weighted list of technologies IT professionals believe are needed for an IT organization's digital transformation over the next three to five years, AI did not make the top five.
    • This contrasts with a recent CEO survey, which found that 81 percent of CEOs consider AI and machine learning to be a priority for their business, up from just 54 percent in 2016 (Fortune).

 

 

While IT professionals continue prioritizing cloud computing and hybrid IT, adoption of these technologies has made it challenging to optimize performance of their systems and applications.

  • 58 percent of IT professionals surveyed indicated that, by weighted rank, cloud/hybrid IT presents the greatest challenges when it comes to implementation, roll-out, and day-to-day performance.
  • Nearly half (47 percent) of all IT professionals surveyed think that their IT environments are not operating at optimal levels.
    • Over half of all IT professionals surveyed spend less than 25 percent of their time proactively optimizing performance.
    • Nearly half of IT professionals spend 50 percent or more of their time reactively maintaining and troubleshooting their IT environment.

 

Many IT professionals cite a lack of organizational strategy and inadequate investment in areas such as user training as the most common barriers to system optimization.

  • Of IT professionals indicating their environments are not optimized, 43 percent ranked inadequate organizational strategy as one of the top three barriers to achieving optimization
    • To achieve true performance and work toward a successful digital transformation, IT professionals require deeper strategic collaboration with business leaders.

 

Do these takeaways mirror your organizational priorities and challenges? Let me know in the comment section below.

Who hasn’t spent more than a weekend going almost blind because something in your <replace with your beloved system, not necessarily IT> didn't work as it should and produced, hopefully, thousands of lines of error messages, almost all of which were the same? The key words here are “almost the same.”

 

A syslog system could save you precious time, and your family will be thankful for it, but it's up to you to find a way to filter the lines that are "almost" like the others, the ones that differ by just the single character that makes your day.

 

Syslog is a tool. In fact, it is the original logging tool. Put a screwdriver beside your PC and you won't be disappointed when it doesn't disassemble the machine by itself. A tool leverages your skills; it doesn’t replace them.

 

There are several tricks for filtering the lines produced by syslog. The key is to find the right pattern.

 

Today there are many GUI tools to perform these operations, but the core is the same: match a pattern against the bundle of lines, and keep the ones that will help solve your problem.
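Here is a minimal sketch of that idea in Python: grep-style pattern matching over a log file, keeping only the lines (and fields) you actually care about. The file path and the failed-login pattern are placeholders; swap in whatever your own "almost identical" lines look like.

```python
import re

# Hypothetical pattern: failed SSH logins, capturing the fields that differ line to line.
PATTERN = re.compile(r"Failed password for (?P<user>\S+) from (?P<ip>\d+\.\d+\.\d+\.\d+)")

def filter_log(path):
    """Yield only the lines matching the pattern, with the interesting fields extracted."""
    with open(path) as log:
        for line in log:
            match = PATTERN.search(line)
            if match:
                yield match.group("user"), match.group("ip"), line.rstrip()

for user, ip, line in filter_log("/var/log/auth.log"):   # placeholder path
    print(f"{user:12} {ip:15} {line}")
```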

Taking a step back: during syslog configuration, you can set it up to write the logs in a format that is closer to the patterns you plan to create.

 

You could put the timestamp at the beginning of the line, or maybe the severity, or the service involved. That's not what matters. What is important is that you do it in a way that suits you.

 

So, the pattern. This is the critical point, the difference between success and wasted nights. As with a Google search, you need to find the right keyword to get the best result.

There are several free tools to parse logs, and the same goes for patterns: there are many databases of them, though maybe your issue is more specific. Either way, a good starting point is log parsing.

 

Today there are many advanced utilities descended from syslog, a number of which are open source: rsyslog, syslog-ng, and logwatch, just to name a few. The main difference is that rsyslog applies filters to the logs produced by syslog in order to perform actions; for example, if the word "localhost" is present, send an email to someone@domain.com, or if the source IP 10.10.10.10 is present, write the line to the file "10101010".
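rsyslog expresses those filters in its own configuration syntax; purely to illustrate the "match, then act" logic, here is the same idea sketched in Python. The log path is a placeholder and the email call is stubbed out so the example stays self-contained.

```python
def send_email(to, line):
    # Stub: in a real setup this would hand the line to an SMTP relay.
    print(f"EMAIL to {to}: {line.rstrip()}")

def handle(line):
    """Apply rsyslog-style filter/action pairs to a single log line."""
    if "localhost" in line:
        send_email("someone@domain.com", line)
    if "10.10.10.10" in line:
        with open("10101010", "a") as out:    # file name taken from the example above
            out.write(line)

with open("/var/log/syslog") as log:          # placeholder path
    for line in log:
        handle(line)
```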

 

Syslog-ng is a more complex utility that not only filters, but also correlates and classifies. Logwatch is interesting, and I’ll cover it in a future post.

 

All of these tools use a set of precompiled patterns, in many cases modifiable. Let me say that it's quite unusual not to find the right filter for your very specific requirements.

Besides these tools, there's another subset to consider when talking about logs: SNMP traps and polling. They're usually used for monitoring rather than analysis, but the core concept is the same: one tool writes lines that are constantly filtered by another tool, which either sends traps or waits to be read by a poller that raises an alert. The first part of the process is the same, logging, and the second is similar too: filtering.

 

So, enjoy, try, install, reinstall, destroy, but above all, keep logging! It can save you a lot of precious free time, and it can help your peers as well, even those from other parts of the world.

March 31 was World Backup Day. If you are like me, you probably spent most of your day burning old CDs to tape storage.

 

I often see forum posts from accidental administrators who want to know how to recover data without a backup. The short answer is, “Now is a good time to work on your resume.” The longer answer is, “Recreate all your data.”

 

But the truth is that you shouldn’t ever be in this position. The number one job for any administrator is recovery. If you can’t recover, you can’t keep your job.

 

So, here are six ways for you to help protect your backups and your job.

 

Know What You Need

Many of those forum posts share a common thread, which is this: the contributor clearly does not know enough about the system he or she is tasked with recovering. So, the first step is to start making a list of all your servers and applications. Ask people the simple question, “What are the critical systems and applications you work with every day, week, and month?” Don’t forget to answer these questions yourself. Make a list, use it as a reference, and keep it updated. This is where monitoring tools with auto-discovery are your best friend.

 

Configure Your Backups

This seems like something Captain Obvious would tell you, but yes: configure the backups. It’s not enough to know what needs to be backed up; you need to make sure backups are in place. Pay attention to data volume here, as you may not want all your backups happening at the same time and flooding the network. Or, worse yet, having your backups run longer than 24 hours, causing your backup software to start a new day before the previous day is complete. Good times.

 

Verify Backups Are Happening

You must build a process to ensure that the backups are happening. My preference is to confirm three pieces of information: first, that the backup job ran without error; second, that the backup media is available; and third, that the backups remain consistent with our RTO and RPO requirements.
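A minimal sketch of that three-part check, assuming the backup job writes a status file and an archive to known locations (the paths and the 24-hour RPO are made up for the example):

```python
import os
from datetime import datetime, timedelta

BACKUP_FILE = "/backups/db/latest.bak"       # hypothetical backup location
STATUS_FILE = "/backups/db/latest.status"    # hypothetical job status file
RPO = timedelta(hours=24)                    # assumed recovery point objective

def verify_backup():
    checks = {}
    # 1. The backup job reported success.
    checks["job_ok"] = os.path.exists(STATUS_FILE) and \
        open(STATUS_FILE).read().strip() == "SUCCESS"
    # 2. The backup media (file) is actually there and non-empty.
    checks["media_ok"] = os.path.exists(BACKUP_FILE) and os.path.getsize(BACKUP_FILE) > 0
    # 3. The backup is recent enough to satisfy the RPO.
    if checks["media_ok"]:
        age = datetime.now() - datetime.fromtimestamp(os.path.getmtime(BACKUP_FILE))
        checks["rpo_ok"] = age <= RPO
    else:
        checks["rpo_ok"] = False
    return checks

print(verify_backup())
```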

 

Test Your Recovery

Backups are valuable, but restores are priceless. You should be testing your recovery process on a frequent basis. Many companies do DR testing once or twice a year. I find that the volume of data grows far too much in that length of time, making DR exercises difficult. I advocate frequent testing of the recovery process to verify that the backup media is good, and that the RTO and RPO requirements are being met.

 

Protect Your Backups

For database backups, I like using passwords and encryption. Any extra step you can take to protect that data is worth your time. Approach your backups with a very simple assumption: they will be lost or stolen. When that happens, make sure you have minimized your risk by protecting the backup in some way.
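As a simple illustration of adding that extra layer, the sketch below encrypts a backup file with the third-party `cryptography` package before it leaves the building. The file names are placeholders, and in practice the key would live in a secrets manager, never next to the backup itself.

```python
from cryptography.fernet import Fernet   # pip install cryptography

def encrypt_backup(src="db_backup.bak", dst="db_backup.bak.enc"):
    key = Fernet.generate_key()          # store this key safely, NOT with the backup
    cipher = Fernet(key)
    with open(src, "rb") as f:
        ciphertext = cipher.encrypt(f.read())
    with open(dst, "wb") as f:
        f.write(ciphertext)
    return key                           # hand the key to your secrets-management process

key = encrypt_backup()
print("Backup encrypted; key length:", len(key))
```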

 

Consider Extra Copies

If your data is critical, you want to consider having extra copies of your backups. I like the idea of using a mix of offsite tape storage and a cloud backup provider. That way I reduce my risk by storing different formats in different locations. Just make certain that you have defined an RPO and RTO for each method being used.

 

Summary

Backups are necessary for your business continuity planning. It is often easier to build a recovery plan first because that will often dictate your backup strategy. Whatever backup strategy you deploy, these six steps will help you ensure that your next disaster does not result in a resume-generating event.

This week's Actuator comes to you from spring, where, for the first time in months, it is not snowing as I write. I've already brought out the patio furniture and I'm hoping in the next week or two to move into the outdoor office. It's going to feel nice to sit outside and enjoy the sunshine.

 

As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!

 

AWS Lambda Still Towers Over the Competition, but for How Much Longer?

AWS is poised to own 70% of all future serverless applications. Not bad for a book store that recently decided to sell groceries.

 

Titus, the Netflix container management platform, is now open source

If your company is just getting started with containers and you have questions about how to manage them, check out what they do at Netflix. Chances are Netflix might know a thing or two about things like scalability and DevOps. You can benefit from that knowledge. For free.

 

Attackers exfiltrated a casino’s high-roller list through a connected fish tank

Ocean’s Eleven, no doubt.

 

Microsoft launches a phishing attack simulator and other security tools

It’s wonderful to see Microsoft taking such a proactive approach to security. We’ve spent decades building apps with performance and convenience priorities ahead of security. It’s nice to see security being placed first, where it belongs.

 

Facebook Admits to Tracking Non-Users Across the Internet

So, if you don’t want Facebook to track you, all you need to do is join Facebook and tell them to stop. This is why Zuckerberg is worth billions and I’m still tuning queries.

 

No boundaries for Facebook data: third-party trackers abuse Facebook Login

I’m shocked, shocked to discover these third-party applications may be a risk to my data privacy. Wait a minute. No, I am not shocked. This is expected behavior from people and companies that don’t mind abusing the principle of informed consent in order to earn a dollar.

 

Meet Boston Dynamics' Family Of Robots

We’re doomed.

 

Remember when you could buy software inside the Apple store? Here's the only software they sell now:

Monitoring has always been a loosely defined and somewhat controversial term in IT organizations. IT professionals have very strong opinions about the tools they use, because monitoring and alerting is one of the key components of keeping systems online, performing, and delivering IT’s value proposition to the business. However, as the world has shifted more toward a DevOps mindset, especially in larger organizations, the mindset around monitoring systems has also shifted. You are not going to stop monitoring systems, but you might want to rationalize which metrics you are tracking in order to give them better focus.

 

What To Monitor?

 

While operating systems, hardware, and databases all expose a litany of metrics that can be tracked, trying to watch that many performance metrics makes it hard to pay attention to the critical things that may be happening in your systems. Such deep-dive analysis is best reserved for troubleshooting one-off problems, not day-to-day monitoring. One approach to consider is classifying systems into broad categories and applying specific indicators that allow you to evaluate the health of each system.

 

User-facing systems like websites, e-commerce systems, and of course everyone’s most important system, email, have availability as their most important metric. Latency and throughput are secondary metrics, but for customer-facing systems they are nearly as important.

 

Storage and network infrastructure should emphasize latency, availability, and durability. How long does a read or write take to complete? How much throughput is a given network connection seeing?

 

Database systems, much like front-end systems, should be focused on end-to-end latency, but also on throughput: how much data is being processed, and how many transactions happen per time period.

 

It is also important to think about which aspects of each metric you want to alert on (that is, page an on-call operator about). I like to approach this with two key rules: any page should be for something actionable (service down, hardware failures), and always remember there is a human cost to paging, so if automation can respond to the page and fix the problem with a shell script, all the better.
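A toy sketch of those two rules: page only on actionable conditions, and let automation try a scripted fix before a human gets woken up. The condition names, the service, and the remediation command are placeholders for whatever your environment actually runs.

```python
import subprocess

ACTIONABLE = {"service_down", "hardware_failure"}    # conditions worth waking someone for

def try_auto_remediate(alert):
    """Attempt a scripted fix first; returns True if it worked."""
    if alert["condition"] == "service_down":
        try:
            result = subprocess.run(["systemctl", "restart", alert["service"]])
            return result.returncode == 0
        except FileNotFoundError:                    # no systemctl on this host
            return False
    return False

def handle_alert(alert):
    if alert["condition"] not in ACTIONABLE:
        return "log only"                            # informational: never page on this
    if try_auto_remediate(alert):
        return "auto-fixed, no page"
    return "page on-call"                            # a human is genuinely needed

print(handle_alert({"condition": "service_down", "service": "nginx"}))
```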

 

It is important to think about the granularity of your monitoring as well. For example, the availability of a database system might only need to be checked every 15 seconds or so, but the latency of the same system should be sampled every second or better to capture all query activity. You will want to think about this at each layer of your monitoring. It is the classic tradeoff: a higher volume of data collection in exchange for more detailed information.

 

Aggregation

 

In addition to real-time monitoring, it is important to understand how your metrics look over time. This can give you key insights into things like SAN capacity and the health of your systems, helps you identify anomalies and hot spots (e.g., end-of-month processing), and lets you plan for peak loads. This leads to another point: you should collect metrics as distributions of data rather than averages. For example, if most of your storage requests are answered in less than 2 milliseconds, but several take over 30 seconds, those anomalies will be masked by an average. By using histograms and percentiles in your aggregation, you can quickly identify out-of-bounds values in an otherwise well-performing system.
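A quick numerical example of how an average hides those outliers, using made-up latencies in milliseconds:

```python
from statistics import mean

# 980 fast requests (~1.5 ms) and 20 pathological ones (30,000 ms = 30 s).
latencies_ms = [1.5] * 980 + [30000.0] * 20

def percentile(data, p):
    """Simple nearest-rank percentile, good enough for the sketch."""
    ordered = sorted(data)
    index = min(len(ordered) - 1, int(round(p / 100 * len(ordered))) - 1)
    return ordered[max(index, 0)]

print(f"mean  : {mean(latencies_ms):10.1f} ms")                 # ~601 ms: hides the 30 s stalls
print(f"median: {percentile(latencies_ms, 50):10.1f} ms")       # 1.5 ms: most users are fine
print(f"p99   : {percentile(latencies_ms, 99):10.1f} ms")       # 30000 ms: the outliers jump out
```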

 

 

Standardize

 

Defining a few categories of systems and standardizing your data collection allows for common definitions and can drive toward common service level indicators. This lets you build a common template for each indicator and work toward a common goal of higher levels of service.

By Paul Parker, SolarWinds Federal & National Government Chief Technologist

 

Here is an interesting article from my colleague Joe Kim, in which he discusses the impact of artificial intelligence on cybersecurity.

 

Agencies are turning to artificial intelligence (AI) and machine learning to bolster the United States’ cybersecurity posture.

 

Agencies are dealing with enormous amounts of data and network traffic from many different sources, including on-premises and from hosted infrastructures—and sometimes a combination of both. Humans can’t sift through this massive amount of information, which makes managing security a task that cannot be exclusively handled manually.

 

AI alleviates many of these challenges. Machines can automatically comb through millions of packets of information and detect suspicious behavior. The more data these machines analyze, the more intelligent they become, and the better they are at noticing, predicting, and preventing security breaches.

 

But while AI offers many great benefits, it should not be considered a replacement for human intervention or existing network monitoring tools. Instead, AI should complement and support the people and tools that agencies are already using to keep their networks safe.

 

The human factor remains critical

 

The cyber threat landscape continues to change rapidly, and some aspects of that landscape require human intervention now more than ever before. Respondents to our Federal Cybersecurity Survey indicated a wide range of threat sources, from foreign governments to hackers, terrorists, and beyond.

 

The biggest threat, though, appears to come from careless or untrained insiders, with 54 percent of respondents listing them as their top concern. This point exemplifies why people still very much matter when it comes to cybersecurity. Even though machines and systems can be highly effective at preventing suspicious behavior, they are not great at training staff to adhere to agency policies or practice strong overall security hygiene.

 

Of course, AI can certainly help prevent malicious or careless insiders from doing damage. Automatic detection of suspicious activity and immediate alerts can help managers respond more quickly to potential threats. It can also be used to fill in gaps resulting from the lack of human resources or security training, and significantly decrease the time it takes to analyze data. As such, AI can reduce attack identification and response times from days to hours or even minutes.

 

Even so, humans will still be needed to react to and implement those responses. They remain a critical piece of the cybersecurity puzzle.

 

Traditional monitoring solutions are still vital

 

Just as humans will continue to play an important role in network security in the age of AI, tools such as security information and event management (SIEM) systems, network configuration management and user device monitoring programs should remain a foundational element of agencies’ initiatives. These solutions supplement AI by extracting information from the constant noise, allowing managers to focus on truly critical issues and pinpoint security threats.

 

Like AI tools, traditional network monitoring programs can analyze huge amounts of data. They complement this ability with continuous monitoring of user activity and network devices and provide automated threat intelligence alerts along with contextual information to help managers act on that information. Indeed, our survey indicated that these tools continue to play a significant role in keeping networks protected; for example, 44 percent of respondents using some form of device protection solution stated they are able to detect rogue devices within minutes.

 

In short, while AI is extremely useful, it should not be used exclusively. Instead, agencies should plan on augmenting existing best practices and the abilities of their staff with AI. Because although AI is good and here to stay, it’s the use of tried and true resources that will continue to lift up the machines as they rise.

 

Find the full article on SIGNAL.

Recently, ITWorld asked me to share some thoughts on "IT's Worst Addictions (And How to Cure Them)" (https://www.itworld.com/article/3268305/it-strategy/worst-it-addictions-and-how-to-cure-them.html). While I shared a number of thoughts on the topic, space and format constraints meant only a couple of my ideas made it into print. I wanted to share a more complete version with you here.

 

Sensitivity First

The tone of the original article was fairly light, using the word "addiction" in its informal, rather than medical, context. This is understandable, and in that framework it's easy to lapse into AA-style thinking and language that conflates “IT addictions” with true addictive behaviors and issues. I think doing so would be unfair to individuals (and their families, friends, and coworkers) who are dealing with the very real and very serious impact of actual addictions every day. I want to avoid trivializing something that has caused so much real trauma and pain, stolen years, and lost lives.

 

At the same time, I recognize that the obsessive behaviors we’re discussing can be remarkably similar to true addiction. Therefore, traditional conversations about addiction may be a source of guidance and wisdom for us.

 

In this post, I'm treading that line carefully, and I hope it's clear that I'm not making light of a serious topic.

 

That said, over the course of my career I have noticed there are certain behavioral traps and anti-patterns that IT professionals fall into.

 

Let’s start with the IT pro obsessions that everyone thinks of, which I have no desire to talk about because they are well-known and have been chewed over thoroughly:

  • Everything to do with your phone (duh)
  • Communication channels (email, slack, work IM, etc.) (duh)
  • Coffee (duh)

 

Those are the obvious ones. Now let's look at some that are not so obvious:

 

Checking that screen one more time

What “that screen” is differs for each IT pro, but we all have that one thing we compulsively check. It could be the NOC dashboard; it could be the performance tracker for our “baby” system; it could be the cloud statistics. One would hope that for many, it’s the monitoring dashboard.

 

The latest and greatest

This refers to the compulsive need to update, whether we can make a valid financial justification for it or not. Again, the specific manifestation varies. It could be the latest phone, tablet, or laptop, the newest phone service (Google Fi, anyone?), the fastest home internet service, or pro-sumer grade equipment.

 

Monitors

(The hardware kind. I wouldn't ever say you could have too many SolarWinds monitors!)

There are very few IT pros who would say "no" to adding one (or four) more screens to their system, if they had the option. Better still, this desire does not hinge on how many screens one already has. More is always better.

 

Training/Certifications

As strange as it sounds, some IT pros have to be on top of the latest learning. That means lifetime subscriptions to online courses, obsessively upgrading certifications, and more.

 

News

Many IT pros are hopeless news junkies. It may manifest in a single area (politics, sports, tech trends, entertainment) or a combination of those, but the upshot is that we want to know the latest updates, whether they come on our mobile device, the third screen of our main computer, or good old fashioned wood pulp dropped at our front door each morning.

 

Collectibles

Once again, this obsession has a nearly infinite number of variations, including LEGO sets, watches, comic books, figurines, and more. Many IT pros have “that thing” that they go out of their way (and often break their budget) for.

 

(It should be noted that SolarWinds, with our ever-expanding array of buttons and stickers sporting unique ideas, happily feeds into this obsession.)

 

Community

Contrary to the stereotype of the nerdy loner, IT pros tend to be very dedicated to building and being part of a community (or several). While these communities often have an online component, most focus on (and culminate in) an IRL meet-up where members can share stories, offer support, and just bask in the glow of like-minded folks. These communities might be vendor-supported (SWUG, CiscoLive, Microsoft Ignite, etc.); vendor-agnostic but professionally oriented (SQL Saturdays, DevOpsDays, PHP.ug, etc.); non-professional but infinitely geeky (D&D conventions and Comic Cons rank high on this list, but are by no means the only examples); or otherwise focused on cultures, medical challenges, car ownership, and more. The point is that IT pros often become deeply (some might say obsessively) involved in these communities and invested in seeing them thrive.

 

The sharing corner

So what are YOUR compulsive IT distractions? Let me (and the rest of us) know in the comments below. Based on feedback, I may even pull together some thoughts on how we all can address the negative aspects of these behaviors and become better for the effort.

One of the biggest draws of the public cloud is services like managed Kubernetes and serverless functions. Managed services like these let IT organizations consume higher-level services and focus their efforts on creating business value from technology.

 

Configuration management tools like Chef, Puppet, and Ansible are central to modern cloud deployments. These tools enable automated, consistent configuration of instances. This allows administrators to adopt cloud-native practices like immutable infrastructure, in which instances or servers are treated as low-value objects that can be easily recreated, as opposed to long-lived servers that are carefully maintained.

 

Each of the popular configuration management tools uses a server/agent model in which the agent, or node, is managed by the configuration management server and pulls its configuration from it. This introduces long-lived infrastructure in the form of the configuration management servers themselves, which must be maintained. Creating automation cookbooks or modules is challenging enough without also having to provision and maintain the infrastructure required to run that automation.

 

The benefits of managed configuration management are:

  1. Quickly test new versions - One of the challenges with configuration management tools is keeping up with the release cycle of the software. A managed solution lets IT teams quickly spin up a configuration management server and rapidly test new features with little hassle.
  2. Simplify upgrades - Once a new version of the configuration management tool has been tested and deemed production ready, the process of upgrading the server infrastructure begins. This normally requires a considerable amount of time and effort from the engineers. With a fully managed solution, all of that time and effort is given back to them.
  3. Enable isolated automation development environments - The ability to provision a production-like environment along with the configuration management platform gives automation engineers an isolated place to test their automation changes, with greater assurance that they won't break a shared environment.
  4. Scalability - Building configuration management infrastructure that scales properly as the environment reaches thousands or tens of thousands of nodes is incredibly complex, and it makes things like upgrades that much more painful. The ability to use a single solution for ten nodes or ten thousand is incredibly valuable.

 

Managed Deployment

The following solutions are managed deployments. This means the configuration management software company has added a deployment solution to the respective cloud provider's marketplace to allow the infrastructure to be provisioned with the click of a button.

 

Chef Automate

The Chef Automate platform is a configuration management platform created by Chef Software that provides an end-to-end solution for automation engineers to develop, test, and deploy their cookbooks. Both AWS and Azure offer a marketplace listing of Chef Automate that can be deployed and ready to use within minutes.

 

Puppet Enterprise

Puppet Enterprise is a configuration management platform created by Puppet that provides a standard set of configuration management functionality. Both AWS and Azure offer a marketplace deployment option.

 

Fully Managed

The following solutions are fully managed configuration management offerings: the cloud provider runs the configuration management platform on your behalf, allowing engineers to focus on their automation cookbooks or modules.

 

AWS OpsWorks

AWS OpsWorks is a fully managed solution that completely abstracts the server infrastructure related to configuration management. This allows organizations to take full advantage of configuration management tools like Chef and Puppet without the administrative overhead of managing a server.

 

Azure Automation

Azure Automation uses PowerShell DSC (Desired State Configuration) for managed configuration management. This fits perfectly in line with Microsoft's vision of managing all things with PowerShell.

 

Ultimately, the value proposition of configuration management tools lies in the consistent, automated configuration of instances, not in managing the infrastructure that supports the tools themselves. Future posts will delve into other core operational aspects that are critical to cloud environments but that don't provide much value in and of themselves.

I've been responsible for Disaster Recovery and Business Continuity Preparedness for my company for nearly eight years. In my role as a Certified Business Continuity Professional, I have conducted well over a dozen DR exercises of varying scope and scale. Years ago, I inherited a complete debacle. Like almost all other Disaster Recovery professionals, I am always on the lookout for better means and methods to strengthen and mature our DR strategy and processes. So, I roll my eyes and chuckle when I hear about all of these DRaaS solutions and DR software packages.

 

My friends, I am here to tell you that when it comes to overseeing a mature and reliable DR program, the devil is in the details. The bad news is that there is no quick-fix, one-size-fits-all, magic-wand solution available that will allow you to put that proverbial check in the "DR" box. Much more is needed. Just about all the DR checklists and white papers I've ever downloaded, at the risk of being harassed by the sponsoring vendor, give pretty much the same recommendations. What they neglect to mention are the specifics, the intangibles, the details that will make or break a DR program.

 

First, test. Testing is great, important, and required. But before you schedule that test and ask the IT department, as well as many members of your business units, to give up a portion of their weekend, you darn well better be ready. Remember, it is your name that is on this exercise. You don’t want to have to go back however many months later and ask your team to give up another weekend to participate. Testing processes only to fail hard after the first click will quickly call your expertise into question.

 

Second, trust but verify. If you are not in direct control of the mission critical service, then you audit and interview those who are responsible, and do not take their word for it when they say, "It'll work." Ask questions, request a demonstration, look at screens, walk through scenarios and always ask, "What if...?"

 

Third, work under the assumption that the SMEs aren't always available. Almost every interview reveals a Single Point of Failure (SPoF) by the third "What if?" question.

"Where are the passwords for the online banking interfaces stored?"

"Oh! Robert knows all of them." answered the Director of Accounts Payable.

"What if Robert is on vacation on an African safari?"

"Oh!" said the director. "That would be a problem."

"What if we didn't fulfill our financial obligations for one day? Two or three days. A week?" I asked.

"Oh! That would be bad. Real bad!"

Then comes the obligatory silence as I wait for it. "I need to do something about that." Make sure you scribble that down in your notes and document it in your final summary.

 

Fourth, ensure the proper programming for connectivity, IP/DNS, and parallel installations. This is where you will earn your keep. While the DRaaS software vendors will boast of simplicity, the reality is that they can only simplify connectivity so much. Are your applications programmed to use IP and not FQDNs? Does your B2B use FTP via public IP or DNS? And do they have redundant entries for each data center? The same questions apply to VPNs. And don't forget parallel installations, including devices such as load balancers and firewalls. Most companies must manually update the rules for both PRD and DR, and I've yet to meet a disciplined IT department that maintains both instances accurately.
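A small sketch of the kind of connectivity audit being described: resolve each B2B endpoint and flag hostnames that don't return redundant records for both data centers. The hostnames are placeholders; point it at your own dependencies.

```python
import socket

ENDPOINTS = ["ftp.partner-example.com", "vpn.partner-example.com"]  # placeholder FQDNs

def check_redundancy(hostname, minimum=2):
    """Resolve a hostname and report whether it has entries for more than one data center."""
    try:
        _, _, addresses = socket.gethostbyname_ex(hostname)
    except socket.gaierror as err:
        return f"{hostname}: FAILED to resolve ({err})"
    status = "OK" if len(set(addresses)) >= minimum else "single entry - DR gap?"
    return f"{hostname}: {sorted(set(addresses))} -> {status}"

for host in ENDPOINTS:
    print(check_redundancy(host))
```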

 

Fifth, no one cares about DR as much as you do. This statement isn't always true, but always work under the assumption that it is. Some will care a whole lot. Others will care a little. Most will hardly care at all. It is your job to sell the importance of testing your company's DR readiness. I consistently promote our company's DR readiness, even when the next exercise isn't scheduled. My sales pitch is to remind IT that our 5,000+ employees are counting on us. People's mortgage payments, health insurance, children’s tuition all rely on paychecks. It is our duty to make sure our mission-critical applications are always running so that revenue can be earned and associates can receive those paychecks. This speech works somewhat because, let's face it, these exercises are usually a nuisance. While many IT projects push the business ahead, DR exercises are basically an insurance policy.

 

Sixth, manage expectations. This is pretty straightforward, but keep in mind that each participant has his or her own set of expectations, whether it be the executives, the infrastructure teams and service owners, or the functional testers. For example, whenever an executive utters the words "pass" or "fail," immediately correct them by saying "productive," reminding them that there is no pass/fail. Three years ago I conducted a DR exercise that came to a dead stop the moment we disconnected the data center's network connectivity. Our replication software was incorrectly configured: the replicators in DR needed to be able to talk to the master in our production data center. All the participants were saying that the exercise was a failure, which triggered a certain level of panic. I corrected them and said, "I believe this was our finest hour!" Throughout your career, you should be prepared to correct people and help manage their expectations.

 

Seventh, delegate and drive accountability. Honestly, this isn't my strong suit. With every exercise I have conducted, the lead-up and prep find dozens of gaps and showstoppers. What I need to be better at is holding the service owners accountable and delegating the responsibility of remediation when a gap or showstopper is identified. Instead, I often fall back on my 20+ years of IT experience and try to fix it myself. This consumes my time AND lets the service owners off the hook. For example, while prepping for my most recent exercise, I learned that a 2TB disk drive containing critical network shares had stopped replicating months ago. The infrastructure manager told me that the drive was too big and volatile, and that it was consuming bandwidth and causing other servers to miss their RPO. Once I got over my urge to scream, I asked what space threshold had to be reached to turn replication back on. I then asked him what he could do to reduce disk space. He shrugged and said, "I don't know what is important and what isn't." So, I took the lead, identified junk data, and reduced disk usage by 60 percent. I should have made him own the task, but instead took the path of least resistance.

 

Eighth, documentation. Very few organizations have it. And those who do usually have documentation that is obsolete; the moment it is written down, some detail has changed. What I have also learned is that very few people refer to documentation after it is created.

 

So, there you have it. I have oodles more, but this article is long enough already. I hope you find what I shared useful in some capacity. And remember, when it comes to DR exercises, the devil is in the details.

When designing the underlying storage infrastructure for a set of applications, several metrics are important.

 

First, there’s capacity. How much storage do you need? This is a metric that’s well understood by most people. People see GBs and TBs on their own devices and subscription plans on a daily basis, so they’re well aware of it.

 

There’s also performance, which is a bit more difficult. People tend to think in terms of “slow vs. fast," but these are subjective measures. For storage, the most customer-centric metric is response time: how long does it take to process a transaction? Response time is, however, a function of a few other metrics, including I/O operations per second, the size of each I/O, and the queue depth of other I/O ahead of yours.

 

Sizing a storage system

If you size a storage system to meet both capacity and peak performance requirements, you will generally have low response times. Capacity is easy; I need X Terabytes. Ideally, you’d also have some performance numbers to base the size of your system on, including expected IOps, I/O size, and read:write ratio to name a few. If you don’t have these performance requirements, a guesstimate is often the closest you can get.

 

With this information, and an idea of the response time you’re aiming for, it’s possible to configure a system that sits in the sweet spot: small enough to be cost effective, yet large enough to absorb some growth and/or unexpected peaks in performance and capacity. Depending on your organization and budget, you might undersize it to cover only the 95th-percentile peak performance, or you might oversize it to accommodate growth in the immediate future.
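For illustration, here is a back-of-the-envelope sizing sketch under assumed numbers: an expected peak IOPS figure, a read:write ratio, a write penalty, and a headroom factor for growth. None of these values come from the article; plug in your own.

```python
# Assumed workload figures for the sketch.
capacity_tb      = 40          # raw data set today
annual_growth    = 0.25        # 25% growth per year
expected_iops    = 12_000      # 95th-percentile peak
read_write_ratio = (70, 30)    # reads : writes, in percent
headroom         = 1.3         # extra room for unexpected peaks

def size_system(years=3):
    cap = capacity_tb * (1 + annual_growth) ** years * headroom
    # Writes are usually more expensive on the back end; weight them for the sketch.
    reads  = expected_iops * read_write_ratio[0] / 100
    writes = expected_iops * read_write_ratio[1] / 100
    backend_iops = (reads + writes * 4) * headroom   # assumed RAID-5-style write penalty of 4
    return round(cap, 1), int(backend_iops)

cap_tb, iops = size_system()
print(f"Plan for roughly {cap_tb} TB usable and {iops} back-end IOPS")
```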

 

Let it grow, let it grow… and monitor it!

Over time, though, your environment will start to grow. Data sets increase and more users connect. Performance demands grow in step with capacity. This places additional demands on the system; demands it wasn't sized for initially.

 

Monitoring is crucial in this phase of the storage system lifecycle. You need to accurately measure the capacity growth over time. Automated forecasts will help immensely. Keep an eye on the forecasting algorithms and the statistics history. If the algorithm doesn’t use enough historical data, it might result in extremely optimistic or pessimistic predictions!
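A minimal sketch of such a forecast, fitting a simple linear trend to made-up monthly capacity samples; a real monitoring tool would use far more history and a more careful model.

```python
# Hypothetical monthly capacity usage in TB over the last 12 months.
history_tb = [18.2, 18.9, 19.4, 20.1, 20.9, 21.4, 22.3, 23.0, 23.8, 24.5, 25.3, 26.1]

def linear_forecast(samples, months_ahead):
    """Least-squares line through the samples, extrapolated months_ahead."""
    n = len(samples)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + months_ahead)

for ahead in (6, 12):
    print(f"Projected usage in {ahead:2d} months: {linear_forecast(history_tb, ahead):.1f} TB")
```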

 

Similarly, performance needs to be guaranteed throughout the life of the array. The challenge with performance monitoring is that it’s usually a chain of components that influence each other. Disks connect to busses, which connect to processors, which connect to front-end ports, and you need to monitor them all. Depending on the component that’s overloaded, you might be able to upgrade it. For example, connect additional front-end ports to the SAN or upgrade the storage processors. At some point though, you’re going to hit a limit. Then what?

 

Failure domain

Fewer, larger systems have several advantages over multiple smaller arrays. There are fewer systems to manage, which saves you time in monitoring and day-to-day maintenance. Plus, there's less waste, as siloed systems tend not to be fully utilized.

 

One important aspect to consider, though, is the failure domain. What's the impact if a system or component fails? Sure, you could grow your storage system to the largest possible size, but if it fails, how long would you need to restore all that data? In a multi-tenancy situation, how many customers would be impacted by a system failure? Licenses for larger systems are sometimes disproportionately more expensive than their smaller cousins; does this offset the additional hassle of managing multiple systems? There are multiple approaches possible. Let me know which direction you’d choose: fewer, bigger systems, or multiple smaller systems!

This week's Actuator comes to you from an unseasonably cold spring here in New England. It snowed this week, making for a chilly Boston Marathon. Here's hoping we get rewarded later with a few extra weeks of summer warmth through October.

 

As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!

 

In a Leaked Memo, Apple Warns Employees to Stop Leaking Information

Apple, one of the least security focused manufacturers on the planet, has issues keeping secrets. Sounds about right.

 

Strong feedback loops make strong software teams

Not just software, but all teams. An important part of the feedback loop is how a person handles the feedback. Different people have different tolerance levels for criticism of their work. It’s important to remember that when building teams, or assigning tasks.

 

Verizon 2018 Data Breach Investigations Report: Tales of dirty deeds and unscrupulous activities

Verizon published their annual data breach report, showing us details about who is behind the organized attacks, their motivations, and the industries targeted. This report should be required reading for everyone working in IT.

 

Waymo seeks permission to test fully driverless cars in California

It’s been a while since I talked about autonomous cars, so here’s a quick update. If you didn’t know, Waymo is a division of Alphabet, which is Google’s parent company. So, Waymo is Google. Let that sink in and think about the data that Google will collect about you and your travel habits.

 

Mark Zuckerberg's Congressional testimony showed that a bedrock principle of online privacy is a complete and utter fraud

Informed consent is the loophole that internet companies have exploited for decades. Maybe now we can work on closing that loophole.

 

User Privacy Isn't Solely a Facebook Issue

A nice reminder that data security and privacy is a bigger issue. Your ISP has the most control over your security and privacy online.

 

'Dear Mark, this is why I hate you.' An open letter to Zuckerberg

To opt-out of Facebook infringing on your privacy, you must first sign up to use Facebook. I can’t understand why more people aren’t outraged to the point that we just shut Facebook down completely.

 

Speaking of spring, now's a good time to get started on that re-wiring project you've been putting off for a while.

 

By Paul Parker, SolarWinds Federal & National Government Chief Technologist

 

There’s a lot of attention being paid to innovative and groundbreaking new technologies like machine learning (ML), artificial intelligence (AI), and blockchain. But does the reality surrounding these technologies match the hype?

 

That’s one of the key questions featured in the Public Sector version of the SolarWinds IT Trends Report 2018: The Intersection of Hype and Performance. We surveyed more than 100 IT practitioners, managers, and directors from public sector organizations. Our goal was to gauge their perspectives on the technologies that are making the biggest differences for their agencies.

 

Below are some of the top findings from this year’s report. Complete details can be found here.

 

Hybrid IT and Cloud Remain Top Priorities

 

Despite the excitement surrounding emerging technologies, IT professionals are continuing to prioritize investments related to hybrid IT and cloud computing. In fact:

 

  • 97 percent of survey respondents listed hybrid IT/cloud among the top five most important technologies to their organization’s IT strategies
  • 50 percent listed hybrid IT/cloud as their most important technologies

 

Automation, the Internet of Things, Big Data, and Software-Defined “Everything” are Gaining Steam

 

According to the respondents, the rest of the top five most important tools include:

 

  • Automation (#2)
  • Big data analytics (#3)
  • The Internet of Things (#4)
  • Software-defined everything (#5)

 

Automation and big data analytics scored highly in the category “most important technologies needed for digital transformation over the next three to five years.” Respondents were also bullish on the productivity and efficiency benefits of automation and big data analytics, as well as their potential to deliver a high ROI. Still, hybrid IT/cloud continued to set the pace, taking the top spot in all categories.

 

Reality Hasn’t Caught Up with the Hype around AI, Machine Learning, Blockchain, and Robotics

 

While there’s been a great deal of media attention and interest in these four technologies, the reality is, for now, IT professionals are focusing their attention on proven technologies that can deliver an immediate value in their predominantly complex hybrid IT environments. Respondents do not deny the importance of AI, machine learning, blockchain, and robotics technologies. They are simply not designating them as “mission critical”—at least, not yet.

 

However, these solutions are expected to gain increasing prominence over the next few years. Namely:

 

  • 34 percent of respondents believe that AI will be the primary technology priority over the next three to five years
  • 36 percent feel the same about machine learning
  • Other technologies mentioned include robotics (by 14 percent of respondents) and blockchain (7 percent)

 

Can’t Contain Containers

 

Conversely, open source containers turned out to be big movers this year. Forty-four percent of respondents ranked containers as one of their most important technology priorities today. That’s a huge jump over the 2017 IT Trends Report, in which just 16 percent of IT professionals indicated they were working on developing containerization skills.

 

Containers can help organizations increase agility while addressing some challenges introduced by hybrid IT/cloud environments. As noted by respondents, those challenges include:

 

  • Environments that are not optimized for peak performance (46 percent of respondents)
  • Significant time spent reactively maintaining and troubleshooting IT environments (45 percent of respondents)

 

Technology Adoption Barriers Remain

 

Survey respondents also identified some key challenges standing in the way of technology adoption. In addition to IT environments not operating at optimal levels, many IT professionals cite a lack of organizational strategy and inadequate investment in areas like user training.

 

  • 44 percent of respondents ranked inadequate organizational strategy as one of the top three obstacles to better optimization
  • User and technology training was also considered a barrier by 43 percent of respondents

 

More information on the SolarWinds annual IT Trends Report and the complete public sector results can be found here.

 

Are these findings consistent with what you’re seeing in your organization? Or, are you experiencing something completely different? Share your perspective in the comments.

Enterprise networks and IT environments can be unique organizations to work with. No matter which division is involved, change management can be a stressful thing for an IT environment if not handled correctly. With proper planning, changes can go smoothly! Regardless, there are some nuances you want to keep in mind and problems you want to be sure to avoid.

 

Other Teams Within The Change Management Process

 

When it comes to change management, the team making the change generally has all of their ducks in a row. They have the change thought through, tested, and planned out. When it comes to making that change, they know who is doing what and exactly what needs to happen. Then the curveballs get thrown. How many times have you come across a situation like this:

 

The network team wants to make a change bringing down the edge internet routers while the server admins are doing a mail server migration to the cloud for a group of users. The loss of an edge connection causes the mailbox upload to stop and the server admins to exceed their allotted downtime window to complete the migration.

 

This is probably not an uncommon scenario for many people working in enterprise IT environments. Too often, certain divisions become narrow-minded and lose regard for the other teams under the greater scope of the IT organization as a whole, choosing to focus only on their own changes and projects. Breakdowns in communication like this have a tendency to escalate into larger, more difficult issues. The reality of the situation is that the teams involved may not even be part of your organization; they may include providers like cloud vendors, for example. All of these teams, both internal and external, need to be included in the communication process when it comes to planning your changes.

 

Know What You Are Affecting Downstream

 

It’s no surprise that enterprise IT structures can be very complex topologies with many different technologies in play. When it comes to change management, so many devices rely on each other that serious thought needs to be given when planning upcoming technical changes, whether they are on the network, server, or desktop side of things. Take network changes, for instance. The simple addition of a route can fix issues for certain network devices while breaking end-to-end connectivity for others, and dynamic routing protocols can amplify these minor changes as they are shared between devices. Server environments can have this issue as well: virtual datacenter changes can affect multiple physical hosts containing a wide range of virtual servers. Again, even minor changes can be amplified to affect a large number of devices and users. The due diligence that goes into planning IT changes ensures that you, as the admin, are fully aware of all devices that will be affected when the changes are made.

 

Be Aware of your Hybrid Environment

 

Local IT changes are one thing. You can make your changes and always have local "console" access if needed in the event that something goes wrong. In hybrid IT environments, this may not always be the case. Remotely hosted servers, such as web servers or cloud-hosted domain controllers, need special consideration when it comes to administration. On-premises processes, such as restoring from a backup, can be very different when they take place on a device hosted in the cloud. Being aware of the devices affected by a change, their location, and the details of their management is ever more important as hybrid IT environments become so common.

 

Preparing a Strong Change Management Plan

 

My personal strategy for change management is made up of four steps that I always follow to ensure a smooth change process.

 

  1. Have a documented scope of work.
  2. Communicate the process to all affected parties ahead of time: on-premises and remote.
  3. Complete prep work beforehand where possible.
  4. Always have a backup plan and/or a rollback process.

 

A documented scope of work ensures that everyone is on the same page and that all of the steps that need to be accomplished during the change are laid out ahead of time, so required tasks get taken care of and not forgotten. Once this plan is developed, you can effectively communicate the process to all affected parties; as long as this is communicated in advance, other affected users and teams can send along any questions or concerns they may have. With maintenance windows getting smaller and smaller, prep work can be very beneficial to the change management process. This can include scripting changes, downloading updates ahead of time, and even scheduling automated tasks. Handling these tasks ahead of time can save you valuable time and effort when the window for your change arrives. Lastly, always have a backup plan. This could be as simple as a configuration or data backup, or a bit more involved, like a full rollback process. Either way, make sure that if things go south, you have a process laid out that you can follow. This ensures you are never in a situation where there is an “I don’t know what to do” moment, and that’s what is important.
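As a trivial example of the "prep work plus backup plan" steps, the sketch below snapshots a configuration file before a change and restores it if the change fails validation. The file path and the validation check are placeholders for whatever your change actually touches.

```python
import shutil
from datetime import datetime

CONFIG = "/etc/myapp/app.conf"                       # placeholder target of the change

def snapshot(path):
    backup = f"{path}.{datetime.now():%Y%m%d%H%M%S}.bak"
    shutil.copy2(path, backup)                       # prep work: backup taken before the window
    return backup

def change_validates() -> bool:
    # Placeholder post-change check (service healthy, config parses, tests pass...).
    return False

backup = snapshot(CONFIG)
try:
    # ... apply the change here ...
    if not change_validates():
        raise RuntimeError("post-change validation failed")
except Exception:
    shutil.copy2(backup, CONFIG)                     # rollback: restore the known-good config
    print("Change rolled back to", backup)
```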

 

The change management process does not need to be difficult, no matter the size of your organization. With a little planning and some attention to detail, you can ensure that your maintenance windows are stress free and go off without a hitch!

By Paul Parker, SolarWinds Federal & National Government Chief Technologist

 

Here is an interesting article from my colleague Joe Kim, in which he explores the impact of software-defined networking on agency networks.

 

It’s ironic, but true: software-defined networking (SDN) is tough to define. Perhaps that’s why agency network administrators are still trying to wrap their minds around the concept of SDN, despite its known benefits. They know that the tools will likely make their lives as network managers much easier and provide greater agility, security, and cost-savings. But still they ask: How do I approach SDN and make it work for my agency?

 

SDN implementation can pose significant challenges. Millions of dollars’ worth of legacy network equipment, accumulated over the years and well-integrated into the IT infrastructure, needs to be replaced. But there is the potential for a huge payoff on the other side. Adopting SDN sets your agency up for a more efficient future and lays the groundwork for greater innovation at less cost.

 

Figuring out if now is the time to begin building toward that future should be undertaken in the same manner as other major technology initiatives: through testing and analysis. Before diving into the SDN waters, it’s a good idea to set up a test environment, if possible. Simulate production so you can gain a better understanding of whether SDN is appropriate for your agency.

 

Monitor network performance for better quality of service

 

Two of the big reasons agencies are implementing SDN are to remove the potential for human error and to simplify network management. The idea is to make networks more automated so they deliver faster, more reliable, and overall better quality of service.

 

Monitoring the network during testing will provide some insight into whether this goal will be achievable through SDN. Tag what is affected by SDN and what is not, and closely track availability and uptime. Use network performance and configuration monitoring tools (which you may already have in your arsenal) to assess your SDN deployment. If you see that SDN is positively impacting uptime, you’ll feel more comfortable making the move.
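
 

As one rough way to put numbers behind that comparison, here is a minimal availability-polling sketch in Python. The device inventory is hypothetical and a TCP connection test stands in for a real monitoring tool, but it illustrates the idea of tagging devices as SDN or legacy and tracking uptime for each group during testing.

import socket
import time
from collections import defaultdict

# Hypothetical inventory: device name -> (address, port, tag)
devices = {
    "core-sw-01": ("10.0.0.1", 22, "sdn"),
    "core-sw-02": ("10.0.0.2", 22, "sdn"),
    "edge-rtr-01": ("10.0.1.1", 22, "legacy"),
}

def is_reachable(addr, port, timeout=2.0):
    """Crude availability probe: can we open a TCP connection?"""
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True
    except OSError:
        return False

def poll(cycles=10, interval=30):
    """Poll every device and tally availability per tag."""
    checks = defaultdict(int)
    successes = defaultdict(int)
    for _ in range(cycles):
        for name, (addr, port, tag) in devices.items():
            checks[tag] += 1
            if is_reachable(addr, port):
                successes[tag] += 1
        time.sleep(interval)
    for tag in checks:
        pct = 100.0 * successes[tag] / checks[tag]
        print(f"{tag}: {pct:.1f}% availability over {checks[tag]} checks")

if __name__ == "__main__":
    poll()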

 

Understand that SDN will cost you, but it could also save you

 

The cost of migrating toward SDN will, of course, vary depending on the agency and the scope of its needs, but one thing is certain: it’s going to be expensive. In addition to the extra employee hours required, SDN demands a deep analysis to determine necessary hardware updates. Layered on top of that is the purchase price of the SDN solutions themselves. Over time, costs can easily run into the hundreds of thousands of dollars, stretching federal IT budgets.

 

Run some numbers before embarking on a full-scale migration. Ascertain whether the cost of managing SDN is less than the costs incurred managing a manual network environment. Include the cost of set-up time and the processes that are being automated in this analysis. If the numbers work in SDN’s favor, you have a good bottom-line reason for taking the SDN plunge.
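
 

Here is a back-of-the-envelope version of that math in Python. Every figure below is a hypothetical placeholder to be replaced with your agency's own numbers; the structure of the comparison is what matters.

# Back-of-the-envelope comparison of annual cost: manual vs. SDN-managed network.
# All figures are hypothetical placeholders -- substitute your agency's numbers.

hours_per_change_manual = 4        # engineer hours per change, manual process
hours_per_change_sdn = 1           # engineer hours per change, automated via SDN
changes_per_year = 500
loaded_hourly_rate = 85            # fully loaded cost per engineer hour (USD)

sdn_licensing_per_year = 120_000   # controller licenses and support
sdn_setup_amortized = 60_000       # hardware refresh and setup, amortized per year

manual_cost = hours_per_change_manual * changes_per_year * loaded_hourly_rate
sdn_cost = (hours_per_change_sdn * changes_per_year * loaded_hourly_rate
            + sdn_licensing_per_year + sdn_setup_amortized)

diff = manual_cost - sdn_cost
print(f"Manual operations: ${manual_cost:,.0f}/year")
print(f"SDN operations:    ${sdn_cost:,.0f}/year")
if diff > 0:
    print(f"SDN saves roughly ${diff:,.0f}/year at these assumptions")
else:
    print(f"SDN costs roughly ${-diff:,.0f}/year more at these assumptions")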

 

Understand the risks so you can be prepared for them

 

Even after you’ve decided to move forward, know that SDN migrations can be fraught with risk. Therefore, they should not be done in a wholesale, “big bang” type of manner, but accomplished using a piecemeal, highly thoughtful approach. This makes the migration easier while helping to preserve uptime as much as possible.

 

Even so, sooner or later, changes on the core network will inevitably impact certain services, such as switching or routing. When this happens, there will be some downtime; that's unavoidable when you turn your network over to SDN and begin to rely on changes being made without human intervention. Knowing this in advance, however, can help you plan for it, making it easier to navigate these bumps in the road on the way to greater agility.

 

Adopt additional best practices to optimize your SDN

 

There are other best practices you should consider adopting after you’ve begun implementing SDN in your agency. Teams should get certified on SDN or get functional training on how to work in an SDN environment, which is much different than what most professionals are accustomed to managing. Establish a protocol of backing up policies on a regular basis, as opposed to just backing up configurations of network devices. Employ monitoring as an ongoing discipline, continuously and automatically analyzing your network for potential issues that could prove harmful so that you can react to them quickly.
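
 

As an example of what a regular policy backup might look like, here is a minimal sketch that pulls the policy set from a controller's REST API and writes it to a timestamped file. The controller URL, token, and endpoint are hypothetical, since every SDN platform exposes policies differently; the scheduling itself would come from cron or your job scheduler of choice.

import json
from datetime import datetime
from pathlib import Path

import requests  # third-party: pip install requests

# Hypothetical SDN controller endpoint and token -- adjust for your platform.
CONTROLLER_URL = "https://sdn-controller.example.gov/api/policies"
API_TOKEN = "REPLACE_ME"
BACKUP_DIR = Path("/var/backups/sdn-policies")

def backup_policies():
    """Pull the current policy set from the controller and save it with a timestamp.

    Run this from cron (or your scheduler of choice) so policies are backed up
    on a regular cadence, not just device configurations.
    """
    response = requests.get(
        CONTROLLER_URL,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=30,
    )
    response.raise_for_status()

    BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    outfile = BACKUP_DIR / f"policies-{stamp}.json"
    outfile.write_text(json.dumps(response.json(), indent=2))
    print(f"Saved policy backup to {outfile}")

if __name__ == "__main__":
    backup_policies()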

 

Above all, do not be afraid to experiment. Understand that mistakes will inevitably occur, and you will probably fail at least some of the time. Learn from these times. Improve upon processes. Make things better.

 

After all, that’s how we achieve progress. Perhaps for you, that progress will include a future network that is more agile, secure, reliable, and software-defined.

 

Find the full article on our partner DLT’s blog Technically Speaking.

A practice leader is an IT professional who can not only walk the walk and talk the talk, but also build and lead teams to do the same. Practice leaders have to be the calm within the ever-changing storm. Everyone is talking about technologies and forgetting that friction comes from two sources: people and process.

 

Tech is easy, but people are non-linear differential equations that are hard to solve because they change over time. People influence processes. Think of processes as forces that can be directed positively or negatively by people. People pollute processes with political motives, selfish decisions, and personal biases. These influences introduce friction into processes and further affect future personal interactions.

 

Leadership sets the edge for an organization’s culture. It is the driving force for either good or bad. Where will you lead your organization? And will it retain and attract people that move the organization’s culture forward while continuing to win business? Leaders need to have answers to these questions.

 

Becoming a practice leader is one path that an IT professional can take in their career. It combines the technical expertise and experience to deliver services into practice with the ability to communicate and lead through complex scenarios. Is practice leadership a path that you are taking? What are some of the things that have worked for you and your teams? Conversely, what are some of the things that have not worked? Let me know in the comment section.

The battle of the legends has come to an end.

Though we started with 33 only one could ascend.

Our winner is a beast who fights fire with flames.

Puff, Maleficent, Smaug, & Toothless are a few of the famed.

Hundreds of you jumped on the bandwagon,

The winner of your votes was none other than the Dragon!

 

Dragon won the final round claws down!

Nessie only managed to win 22% of the vote, and was a clear underdog from the start of this battle royal.

Dragon was a force of nature throughout this bracket and easily extinguished the competition each round.

 

Here’s a look back at Dragon’s other bracket victories:

Fairy Tales Round 1: Dragon vs. Unicorn

Fairy Tales Round 2: Leprechaun vs. Dragon

Fairy Tales Round 3: Dragon vs. Phoenix

Gruesomes vs. Fairy Tales Round 4: Kraken vs. Dragon

 

What are your final thoughts on this year’s bracket?

 

Do you have any bracket theme ideas for next year?

 

Tell us below!

This week's Actuator comes to you from Austin, where I am visiting HQ for a few days in an effort to escape the cold spring we are having in New England. Here's hoping the cold doesn't follow me like last time, when I brought that ice storm.

 

As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!

 

Let's Stop Giving Retailers a Free Pass on Data Breaches

This. Companies will never change until they have a financial incentive to change.

 

Stop Talking About IoT Security And Do Something About It

Maybe we could stop writing blog posts about IoT security and start providing lists of companies for us to avoid (see above).

 

Microsoft plans to invest $5 billion in IoT over the next 4 years globally

And there is the first shoe dropping, as a major cloud provider pushes a truckload of cash into IoT. I am certain that Microsoft will spend dollars on improving IoT security.

 

The space race is over and SpaceX won

This is great news for Elon, and it’s likely going to serve as a distraction from the current financial status of Tesla.

 

MIT is making a device that can 'hear' the words you say silently

This is a horrible idea. Someone needs to tell MIT to shut this down. Maybe if we all think it, they will get the message.

 

Facebook reported in 7 countries for breaking European privacy law

And just when Zuck thought things couldn’t get worse.

 

T-Mobile Austria is OK with Storing Passwords Partly in Clear Text

Just when you thought we had moved beyond corporations doing dumb things in public, T-Mobile in Austria wants you to know that they are also amazingly good at social media.

 

I like this approach to GDPR compliance:

By Paul Parker, SolarWinds Federal & National Government Chief Technologist

 

Here is an interesting article from my colleague Joe Kim, in which he points out the shift in responsibilities caused by hybrid IT.

 

Moving to a hybrid environment, where part of your infrastructure is in the cloud while the rest of it remains on-premises, may require a far greater shift in responsibilities for the federal IT team than anticipated.

 

In a traditional on-premises environment, the federal IT manager needs three things to be successful: responsibility, accountability, and authority.

 

In a hybrid IT environment, the federal IT manager is still responsible and accountable. However, part of the cloud dynamic is that a manager’s level of authority and control will vary depending on the cloud provider and its offerings. But what about authority over the network you use to access cloud resources? A carrier’s network usually won’t give you the authority to make changes.

 

Here’s where the issue of visibility comes in. The only way to mitigate the loss of pure authority over a hybrid network is to have visibility into the details of its performance and health. This visibility is key when troubleshooting and dealing with service providers and carriers who, when service is slow or has failed, often revert to the default answer, “Everything looks fine on this end.”

 

The importance of visibility

 

Why is visibility key? Because carriers are not always up front about what they’re seeing, or whether they’re even looking into a slowdown within your infrastructure.

 

Let’s say your service is not responding. A data center manager can check the internal network, and it will probably be just fine. That manager can then call the Software as a Service (SaaS) provider, who will likely say everything is fine on their end as well. The next step is to initiate a support call to the internet service provider (ISP) to find out whether the problem is somewhere in the middle, within the service provider’s realm.

 

The support person will likely say, “Everything looks fine here,” which is a challenge. Exacerbating the challenge is the federal IT manager’s inability to see into the service provider’s network.

 

The federal IT manager must be able to see any latency introduced by any device as packets flow through it. This information, both for the current state and historical usage, will show where the packets are going once they leave your premises, as well as how fast they’re traveling.
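
 

One lightweight way to start collecting that per-hop picture is to wrap the system traceroute utility and log the results over time. The sketch below assumes a Linux or macOS host with traceroute installed, a hypothetical target hostname, and deliberately simple parsing; a real monitoring platform goes much further, but even this shows where latency jumps once packets leave your premises.

import re
import subprocess
from datetime import datetime

TARGET = "saas.example.gov"   # hypothetical cloud-hosted service
LOGFILE = "hop_latency.log"

def trace(target):
    """Run the system traceroute and return (hop, avg_ms) pairs.

    Parsing is deliberately simple: it averages whatever millisecond
    values appear on each hop line, which is enough to spot where
    latency jumps once traffic leaves your premises.
    """
    output = subprocess.run(
        ["traceroute", "-n", target],
        capture_output=True, text=True, timeout=120,
    ).stdout

    hops = []
    for line in output.splitlines():
        match = re.match(r"\s*(\d+)\s+(.*)", line)
        if not match:
            continue
        hop = int(match.group(1))
        times = [float(t) for t in re.findall(r"([\d.]+)\s*ms", match.group(2))]
        avg = sum(times) / len(times) if times else None
        hops.append((hop, avg))
    return hops

if __name__ == "__main__":
    stamp = datetime.now().isoformat(timespec="seconds")
    with open(LOGFILE, "a") as log:
        for hop, avg in trace(TARGET):
            avg_str = f"{avg:.1f} ms" if avg is not None else "no reply"
            log.write(f"{stamp} hop {hop}: {avg_str}\n")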

 

Complete visibility—a necessity for a successful hybrid IT transition—comes in the form of IT monitoring tools that provide a view of your entire environment: on-premises, in the cloud, and everything in between. These tools must be able to show a variety of device types (routers, load balancers, storage, servers, etc.) from a range of vendors.

 

Two last pieces of advice. First, be sure the IT monitoring tools you choose account for the virtual layer, whether it’s virtual servers or virtual networking, as much as the physical layer. Second, because your IT environment will only grow larger and more complex as it extends further into the cloud, the tools you select must be able to scale with the number and type of devices.

 

Find the full article on Federal Technology Insider.

I was off last week to celebrate Pesach / Passover so I thought it would be a good time to offer you a taste of an upcoming eBook I'm working on, "The Four Questions of Monitoring," which uses that holiday both as its inspiration and as a thematic framework. I'll be publishing snippets of it here and there.

 

**************************


 

Once a year, Jews around the world gather together to celebrate Pesach (also known as “Passover,” “The Feast of Matzah,” or even “The Feast of the Paschal Lamb”). More a ceremonial meal than an actual “feast,” this gathering of family and friends can last until the wee hours of the morning. The dinnertime dialogue follows a prescribed order (or “seder,” which actually means “order” in Hebrew) that runs the gamut from leader-led prayers to storytelling to group singalongs to question-and-answer sessions and even—in some households—a dramatized retelling of the exodus narrative replete with jumping rubber frogs, ping-pong ball hail stones, and wild animal masks.

 

At the heart of it all, the Seder is designed to do exactly one thing: to get the people at the table to ask questions. Questions like, "Why do we do that? What does this mean? Where did this tradition come from?" To emphasize: the Seder is not meant to answer questions, but rather provoke them.

 

As a religion, Judaism seems to love questions as much as (or more than) the explanations, debates, and discussions they lead to. I'm fond of telling co-workers that the answer to any question about Judaism begins with the words, "Well, that depends..." and ends two hours later when you have three more questions than when you started.

 

The fact that I grew up in an environment with such fondness for questions may be what led me to pursue a career in IT, and to specialize in monitoring. More on that in a bit.

 

But the ability to ask questions is nothing by itself. An old proverb says, "One fool can ask more questions than seven wise men can answer." And that brings me back to the Pesach Seder. Near the start of the Seder meal, the youngest person at the table is invited to ask the Four Questions. They begin with the question, "Why is this night different from all other nights?" The conversation proceeds to observe some of the ways that the Pesach meal has taken a normal mealtime practice and changed it so that it's off-kilter, abnormal, noticeably (and sometimes shockingly) different.

 

As with many Jewish traditions, there is a simple answer to the Four Questions. On the surface, it's done to demonstrate to children that questions are always welcome. It's a way of inviting everyone at the table to take stock of what is happening and ask about anything unfamiliar. But it doesn't stop there. If you dig just a bit beneath that easy surface reasoning, you'll find additional meaning that goes surprisingly deep.

 

In Yeshivah — a day-school system for Jewish children that combines secular and religious learning — the highest praise one can receive is, "Du fregst un gut kasha," which translates as, "You ask a good question."

 

This is proven out in a story told by Rabbi Abraham Twersky, a deeply religious psychiatrist. He says that when he was young, his teacher would relish challenges to his arguments. In his broken English, the teacher would say, “You right! You 100 prozent right!! Now, I show you where you wrong!”

 

The impact of this culture of questioning does not limit itself to religious thinking. Individuals who study in this system find that it extends to all areas of life, including the secular.

 

When asked why he became a scientist, Isidor I. Rabi, the Nobel laureate in physics, answered,

“My mother made me a scientist without ever intending it. Every other mother in Brooklyn would ask her child after school, ‘So? Did you learn anything today?’ But not my mother. She always asked me, ‘Did you ask a good question today?’ That difference—asking good questions—made me become a scientist!”

 

The lesson for us, as monitoring professionals, is twofold. First, we need to foster that same sense of curiosity, that same willingness to ask questions, even when we think the answers may be a long time in coming. We need to question our own assumptions. We need to relish the experience of asking so that it pushes us past the inertia of owning an answer, which is comfortable. And second, we need to find ways to invite questions from our colleagues, as well. Like the Seder, we may have to present information in a way that is shocking, noticeable, and engaging, so that people are pushed beyond their own inherent shyness (or even apathy) to ask, "What is THAT all about?"

 

The deeper message of the Passover Seder speaks to the core nature of questions, and the responsibility of those who attempt to answer. "Be prepared," it seems to say. "Questions can come from anywhere, about anything. Be willing to listen. Be willing to think before you speak. Be willing to say, 'I don't know, but let's find out!' You must also be willing to look past trite answers. Be ready to reconsider, and to defend your position with facts. Be prepared to switch, at a moment's notice, from someone who answers, to someone who asks."

 

Once again, I believe that being exposed to this tradition of open honesty and curiosity is what makes the discipline of monitoring resonate for me.

Before we get into the results of round 4, I have to give a shout-out to the only two THWACKsters to correctly guess the final 4 in this year’s bracket battle. Congrats to ebradford & jessem! You have been awarded the 1,000 bonus THWACK points. There were several of you who got 3 out of 4 correct and I must say I’m impressed! Personally, I voted with my heart instead of my head and was #bracketbusted before we even got started…

 

Ahem, anyways… let’s look at who’s moving on to the championship round from the final 4!

 

 

Does anyone else share sparda963's feelings about this round? “I don't even know what is going on anymore. This bracket is all out of wack. Up is down, Down is pink, Blue tastes like cherry! It's madness!!!!”

 

For the last time this year, it’s time to check out the updated bracket and vote for the greatest legend of all time! This is it folks!

 

You will have until Sunday, April 8th @ 11:59 PM CDT to submit your votes & campaign for your favorite legend. Don't forget to share your thoughts and predictions for the winner on social #SWBracketBattle!

 

We’ll post a final recap and announce the winner on April 11th.

 

Access the bracket and place your vote for the winner HERE>>

I hope everyone enjoyed a nice holiday weekend with friends and family. This week marks the 2nd anniversary for The Actuator. Every week, for two years, I've produced an odd assortment of links for you to enjoy. Thank you for taking the time to read them. Here's to the next two years.

 

As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!

 

Former Uber Backup Driver: 'We Saw This Coming'

Given Uber's track record in, well, just about everything, I guess we should have seen this coming. And I'm not talking about the accident. No, I mean the number of people looking to pile on Uber right now and remind everyone that Uber isn't the best-run company.

 

Microsoft starts rolling out Azure Availability Zones for datacenter failure protection

One of the differences between Azure and AWS data center architecture has to do with availability zones. Microsoft is closing that gap, fast. It won't be long before Azure and AWS are nearly identical in services.

 

Announcing 1.1.1.1: the fastest, privacy-first consumer DNS service

Cloudflare is offering free DNS service. I applaud the effort, but I remain skeptical of any company that provides a service such as DNS for free. It's all about the data, folks. If you use the internet, someone is tracking your data, for one reason or another.

 

Using Machine Learning to Improve Streaming Quality at Netflix

Another brilliant piece from the Netflix blog, this time showing a practical use case for machine learning and network streaming quality. So, the next time someone wants to know about a practical use case for machine learning, I'm going to show them this.

 

Machine Learning for kids

And since I'm talking about machine learning, here's a great website to help kids (or anyone) get started. You might want to take some data from your favorite monitoring tool and use it in one of the projects here. Who knows, you may be able to build a model that can predict the next time Brad is about to drop a production database... again.

 

Georgia Passes Anti-Infosec Legislation

From the state where the capital city government allowed itself to be attacked by a ransomware virus that was two years old, you are now forbidden to test websites for security flaws. Suddenly I understand why Atlanta was held for ransom.

 

It's April and snowing, so I need to remind everyone that spring is just around the corner:

 

Here's an interesting article from my colleague Joe Kim, in which he offers suggestions to reduce vulnerabilities.

 

Agencies should focus on the basics to protect against attacks

 

The government’s effort to balance cybersecurity with continued innovation was underscored in late 2016 with the publication of the Commission on Enhancing National Cybersecurity’s Report on Securing and Growing the Digital Economy. The report included key recommendations for cybersecurity enhancements, while also serving as a sobering reminder that “many organizations and individuals still fail to do the basics” when it comes to security.

 

But in today’s environment, agencies must focus on some basic yet highly effective fundamentals to protect against potential attackers. Some of these involve simple security hygiene; others require more of an investment, both in capital and human resources, and longer-range thinking.

 

Let’s take a look at five fundamental strategies that can help agencies build an advanced and solid security posture.

 

Embrace network modernization

 

The report says, “The President and Congress should promote technology adoption and accelerate the pace at which technology is refreshed within the federal sector … the government needs to modernize and ensure that this modernization can be sustained at a faster pace.”

 

Modern network technologies are better equipped to handle cyberattacks, are often easier to manage, and are more efficient. Most can work in any environment and adapt to changing threat conditions. They can also automatically detect and respond to potential attacks without the need for human intervention, mitigating the threats before damage occurs. 

 

Modernization often leads to standardization, which means fewer device types and configurations to manage. This reduces vulnerability, because configurations can be refined, deployed, and maintained more easily.

 

Implement continuous monitoring

 

The commission states that “a security team has to protect thousands of devices while a malicious actor needs to gain access to only one.” This makes automated continuous monitoring extremely important.

 

A proper continuous monitoring solution contains a variety of components working together to strengthen an agency’s defenses against many attack methods. Those solutions could include log and event management tools that track login failures and make it easier to spot potential security incidents; device tracking solutions that can detect unauthorized network devices; or network configuration management solutions that can improve network compliance and device security. All of these can be done without human intervention, and most can be easily updated.
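
 

As a small illustration of the log-and-event side of that picture, here is a minimal sketch that scans a Linux auth log for failed SSH logins and flags source addresses that cross a threshold. The log path, format, and threshold are assumptions to adjust for your environment; a full log and event management tool obviously does far more.

import re
from collections import Counter

AUTH_LOG = "/var/log/auth.log"   # typical location on Debian/Ubuntu; adjust as needed
THRESHOLD = 5                    # failed attempts from one source before we flag it

# Matches lines like:
#   "Failed password for invalid user admin from 203.0.113.7 port 53211 ssh2"
FAILED_LOGIN = re.compile(r"Failed password for .* from (\d{1,3}(?:\.\d{1,3}){3})")

def flag_suspicious_sources(log_path=AUTH_LOG, threshold=THRESHOLD):
    """Count failed SSH logins per source IP and report repeat offenders."""
    counts = Counter()
    with open(log_path, errors="replace") as log:
        for line in log:
            match = FAILED_LOGIN.search(line)
            if match:
                counts[match.group(1)] += 1

    for source, count in counts.most_common():
        if count >= threshold:
            print(f"ALERT: {count} failed logins from {source}")

if __name__ == "__main__":
    flag_suspicious_sources()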

 

Remember to patch

 

Keeping software current with the latest patches and updates is an important threat deterrent, and almost impossible to do manually, given the amount of software that powers federal networks.

 

Automated patch management tools can analyze various software programs and scan for known vulnerabilities and available updates. These updates can be automatically applied as they become available, keeping software up-to-date and well-fortified against the latest threats.
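
 

To illustrate the idea on a single Debian or Ubuntu host, here is a minimal sketch that lists pending package updates and can optionally apply them from a scheduled job. It is a stand-in for a real patch management tool, not a replacement for one.

import subprocess

def pending_updates():
    """Return the list of packages with updates available (Debian/Ubuntu hosts)."""
    result = subprocess.run(
        ["apt", "list", "--upgradable"],
        capture_output=True, text=True,
    )
    # First line is the "Listing..." header; the rest are upgradable packages.
    lines = result.stdout.splitlines()[1:]
    return [line.split("/")[0] for line in lines if line.strip()]

def apply_updates():
    """Apply all available updates unattended (requires root)."""
    subprocess.run(["apt-get", "update"], check=True)
    subprocess.run(["apt-get", "-y", "upgrade"], check=True)

if __name__ == "__main__":
    packages = pending_updates()
    if packages:
        print(f"{len(packages)} packages have updates pending:")
        for pkg in packages:
            print(f"  - {pkg}")
        # Uncomment to apply automatically, e.g. from a nightly scheduled job:
        # apply_updates()
    else:
        print("All packages are up to date.")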

 

Implement strong encryption

 

In the words of Edward Snowden, “Properly implemented strong encryption systems are one of the few things that you can rely on.” However, ensuring the security of data at rest and in flight is not necessarily an easy task, considering the hybrid cloud and IT environments that many agencies have adopted.

 

Still, strong encryption protocols must remain in place regardless of where the data resides, and data that travels from a hosted site must receive the same level of encryption as data that exists on-premises, or perhaps an even greater level. The slightest vulnerability in an unencrypted network can be a window for cyber attackers, while solid, end-to-end encryption remains extremely difficult to penetrate, regardless of where the data exists.
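
 

For data at rest, the mechanics can be as simple as the sketch below, which uses the well-known Python cryptography library's Fernet recipe (AES with an HMAC for integrity). The record is a made-up example, and in a real deployment the key would live in a key management service or HSM rather than being generated inline.

from cryptography.fernet import Fernet  # third-party: pip install cryptography

# In practice the key comes from a key management service or HSM,
# never from a file sitting next to the data. This is only a sketch.
key = Fernet.generate_key()
cipher = Fernet(key)

record = b"SSN=000-00-0000; clearance=none"   # hypothetical sensitive record

token = cipher.encrypt(record)       # authenticated encryption (AES-CBC + HMAC)
print("Stored ciphertext:", token[:40].decode(), "...")

recovered = cipher.decrypt(token)    # raises InvalidToken if tampered with
assert recovered == record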

 

Adopt the Cybersecurity Framework

 

While many agencies have adopted the NIST Cybersecurity Framework, there’s room for more to get on board. There are signs that the government plans to increase use and is working to ensure the framework’s continued growth. In March, the House Committee on Science, Space, and Technology passed a bill designed to encourage adoption of the framework.

 

This shows how serious the government is about balancing proactive cybersecurity with innovative technology. Agencies can support this effort by combining a few basic strategies with some long-term investments that will ultimately pay big security dividends.

 

Find the full article on SIGNAL.

As predicted, each matchup in the elite 8 had me on the edge of my seat! The vote was nearly evenly split in each battle, making this one of the closest races to the final 4 we’ve ever seen.

 

Here’s a look at who’s moving on from this round:

 

  • Cryptids Round 3: Thunderbird vs. Loch Ness Monster. Easily one of the most surprising outcomes of this round, Nessie shocks everyone and swims away with this one! Thunderbird had a lot of support in the comment section because of its electric abilities, but ebradford had a solid argument for why Nessie should win: “...Of course, this contest isn't really fair since a Thunderbird is fictional, and Nessie is real.”
  • Half & Halfs Round 3: Griffin vs. Minotaur. This was stacking up to be a close race, but in the end it was Griffin’s fly skillz that tipped the scale in its favor. As rschroeder put it: “…Traditionally, the Minotaur is always defeated in stories. Not so the Griffin, which attains a nobility and seems to be a higher, more enlightened entity than a Minotaur. Will the body of a lion, with its powerful legs and long, sharp claws, combined with the strong feet and talons and beak of an eagle, be weapons superior to the bovine horns and human arms and legs of the Minotaur? I think so. This one should go to the Griffin.”
  • Gruesomes Round 3: Medusa vs. Kraken. Without question the biggest rivalry of the whole bracket battle, Kraken had a lot to prove this round given its history with Medusa. The comment section showed a lot of support for Medusa, as she had easily won in the Clash of the Titans—not once, but twice! For the first time in this bracket battle, the underdog fish came away with the W!!!
  • Fairy Tales Round 3: Dragon vs. Phoenix. This battle was on fire! Though the Phoenix possesses the ability to rise from the ashes every time the Dragon unleashes another attack, it wasn’t enough to secure the win and move on to the next round. I’m sure a lot of brackets were busted over this one!

 

Were you surprised by any of the winners this round? Comment below!

 

It’s time to check out the updated bracket & start voting in the ‘Mythical’ round! We need your help & input as we get one step closer to crowning the ultimate legend!

 

Access the bracket and make your picks HERE>>

 

I can’t wait to see who the community picks to face off in the final round!
