
Geek Speak


convention2.png

As I mentioned a while ago, I've returned to the world of the convention circuit after a decades-long hiatus. As such, I find I'm able to approach events like Cisco Live and Interop with eyes that are both experienced ("I installed Slack from 5.25 floppies, kid. Your Linux distro isn't going to blow my socks off,") and new (“The last time I was in Las Vegas, The Luxor was the hot new property”).

 

This means I'm still coming up to speed on the tricks of the trade show circuit. Last week I talked about the technology and ideas I picked up at Interop 2016. This week, here are some of the things I learned about attending the show itself. Feel free to add your own lessons in the comments below.

 

  • A lot of shows hand you a backpack when you register. While this bag will probably not replace your $40 ThinkGeek Bag of Holding, it is sufficient to carry around a day's worth of snacks, plus the swag you pick up at vendor booths. But some shows don’t offer a bag. After a 20-minute walk from my hotel to the conference, I discovered Interop was the latter kind.
    LESSON: Bring your own bag. Even if you're wrong, you'll have a bag to carry your bag in.
  • What happens in Vegas – especially when it comes to your money – is intended to stay in Vegas. I'm not saying don't have a good time (within the limits of the law and your own moral compass), but remember that everything about Las Vegas is designed to separate you from your hard-earned cash. This is where your hard-won IT pro skepticism can be your superpower. Be smart about your spending. Take Uber instead of cabs. Bulk up on the conference-provided lunch, etc.
    LESSON: As one Uber driver told me, "IT guys come to Vegas with one shirt and a $20 bill and don't change either one all week."
  • Stay hydrated. Between the elevation, the desert, the air conditioning, and the back-to-back schedule, it's easy to forget your basic I/O subroutines. This can lead to headaches, burnout, and fatigue that you don't otherwise need to suffer.
    LESSON: Make sure your bag (see above) always has a bottle of water in it, and take advantage of every break in your schedule to keep it topped off.
  • Be flexible, Part 1. No, I'm not talking about the 8am Yoga & SDN Session. I mean that things happen. Sessions are overbooked, speakers cancel at the last minute, or a topic just isn't as engaging as you thought it would be.
    LESSON: Make sure every scheduled block on your calendar has a Plan B option that will allow you to switch quickly with minimal churn.
  • Be flexible, Part 2. As I said, things happen. While it's easy in hindsight (and sometimes in real time) to see the mistake, planning one of these events is a herculean task with thousands of moving parts (you being one of them). Remember that the convention organizers are truly doing their best. Of course, you should let staff know about any issues you are having, and be clear, direct, and honest. But griping, bullying, or making your frustration ABUNDANTLY CLEAR is likely not going to help the organizers regroup and find a solution.
    LESSON: Instead of complaining, offer suggestions. In fact, offer to help! That could be as simple as saying, "I see your room is full. If you let me in, I'll Periscope it from the back and people in the hall can watch remotely." They might not take you up on your offer, but your suggestion could give them the idea to run a live video feed to a room next door. (True story.)
  • VPN or bust. I used to be able to say, "You’re going to a tech conference and some savvy person might..." That's no longer the case. Now it is, "You are leaving your home/office network. Anybody could..." You want to make sure you are being smart about your technology.
    LESSON: Make sure every connected device uses a VPN 100% of the time. Keep track of your devices. Don't turn on radios (Bluetooth, Wi-Fi, etc.) that you don't need and/or can't protect.
  • Don't bail. You are already in the room, in a comfortable seat, ready to take notes. Just because every other sentence isn't a tweetable gem, or because you feel a little out of your depth (or above it), doesn't mean the session will have nothing to offer. Your best interaction may come from a question you (or one of the other attendees) ask, or a side conversation you strike up with people in your area.
    LESSON: Sticking out a session is almost always a better choice than bailing early.
  • Tune in. Many of us get caught up in the social media frenzy surrounding the conference and feel the urge to tweet out every idea as it occurs to us. Resist that urge. Take notes now – maybe even with pen and paper – and tweet later. A thoughtfully crafted post on social media later is worth 10 half-baked live tweets now.
    LESSON: You aren't working for the Daily Planet. You don't have to scoop the competition.
  • Pre-game. No, I'm not talking about the after-party. I mean make sure you are ready for each session prior to each session. Have your note-taking system (whether that's paper and pen, Evernote, or email) preloaded with the session title, the speaker's name, related info (Twitter handle, etc.), and even a list of potential going-in questions (if you have them). It will save you from scrambling to capture things as they slide off the screen later.
    LESSON: Ten minutes prepping the night before is worth the carpal tunnel you avoid the following day.
  • Yes, you have time for a survey. After a session, you may receive either an electronic or hard copy survey. Trust me, you aren't too busy to fill it out. Without this feedback, organizers and speakers have no way of improving and providing you with a better experience next time.
    LESSON: Take a minute, be thoughtful, be honest, and remember to thank people for their effort, in addition to offering constructive criticism.

 

Do you have any words of advice for future conference attendees? Do you take issue with anything I’ve said above? I’d love to hear your thoughts! Leave a note in the comments below and let’s talk about it!

I remember a dark time in my life when I didn't know where I was going. I scrambled to find direction but I couldn't understand the way forward. It was like I was lost. Then, that magic moment came. I found the path to my destination. All thanks to GPS.

It's hard to imagine the time before we had satellite navigation systems and very accurate maps that could pinpoint our location. We've come to rely quite a bit on GPS and the apps that use it to find out where we need to go. Gone are the huge road atlases. Replacing them are smartphones and GPS receivers that are worlds better than the paper of yesteryear.

But even GPS has limitations. It can tell you where you are and where you need to be. It can even tell you the best way to get there based on algorithms that find the fastest route. But what if that fastest route isn't so fast any more? Things like road construction and traffic conditions can make the eight-lane super highway slower than a one-lane country road. GPS is infinitely more useful when it is updated with fresh information about the best route to a destination for a given point in time.

Let's use GPS as a metaphor for your network. You likely have very accurate maps of traffic flows inside your network. You can tell which path traffic is going to take at a given time. You can even plan for failure of a primary link. But how do you know that something like this occurred? Can you tell at a moment's notice that something isn't right and you need to take action? Can you figure it out before your users come calling to find out why everything is running slow?

How about the traffic conditions outside your local or data center network? What happens when the links to your branch offices are running suboptimally? Would you know what to say to your provider to get that link running again? Could you draw a bullseye on a map to say this particular node is the problem? That's the kind of information that service providers will bend over backwards to get from you to help meet their SLAs.

This is the kind of solution that we need. We need visibility into the network and how it's behaving instantly. We need to know where the issues are before they become real problems. We need to know how to keep things running smoothly for everyone so every trip down the network is as pleasant as an afternoon trip down the highway.

If you read through this entire post nodding your head and wanting a solution just like this, stay tuned. My GPS tells me your destination is right around the corner.

sqlrockstar

The Actuator - May 18th

Posted by sqlrockstar Employee May 18, 2016

I am in Redmond this week to take part in the SQL Server® 2016 Reviewer's Workshop. Microsoft® gathers a handful of folks into a room and we review details of the upcoming release of SQL Server (available June 1st). I'm fortunate to be on the list so I make a point of attending when asked. I'll have more details to share later, but for now let's focus on the things I find amusing from around the internet...

 

How do you dispose of three Petabytes of disk?

And I thought having to do a few of these for friends and family was a pain; I can't imagine having to destroy this many disks. BTW, this might be a good time to remind everyone that data can never be created or destroyed, but it most certainly can be lost or stolen.

 

The Top Five Reasons Your Application Will Fail

Not a bad list, but the author forgot to list "crappy code, pushed out in a hurry, because agile is an excuse to be sloppy". No, I'm not bitter.

 

Audit: IT problems with TSA airport screening equipment persist

"The TSA's lack of server updates and poor oversight caused a plethora of IT security problems". Fortunately no one has any idea how many problems are in a plethora. Also? I know a company that makes tools to help fix such issues.

 

AWS Discovery Service Aims To Ease Legacy Migration Pain

Something tells me this tool is going to cause more pain when companies start to see just how much work needs to be done to migrate anything.

 

Bill Gates’ open letter

Wonderful article on how much the software industry has changed over the past 40 years. It will keep changing, too. I see the Cloud as a way for the software industry to change their licensing model from feature driven (Enterprise, Standard, etc.) to one driven by scalability and performance.

 

How to Reuse Waste Heat from Data Centers Intelligently

While this might sound good to someone, the reality is the majority of companies in the world do not have the luxury of building a data center from scratch, or even renovating existing ones. Still, it's interesting to understand just how much electricity data centers consume, and understand that the power has to come from somewhere.

 

How Much Does the Xbox One’s “Energy Saving” Mode Really Save?

Since we're talking about power usage, here's a nice example to help us understand how much extra it costs us to keep our Xbox always on. If it seems cheap to you then you'll understand how the cost of a data center may seem cheap to a company.

 

This week marks the fifth anniversary of my seeing the final launch of Endeavour, so I wanted to share something related to STS-134:

LaRockLaunch.jpg

 

Lastly, if you've been enjoying The Actuator please like, share, and/or comment. Thanks!

In the past few years, there has been a lot of conversation around the “hypervisor becoming a commodity." It has been said that the underlying virtualization engines, whether they be ESXi, Hyper-V, KVM etc. are essentially insignificant, stressing the importance of the management and automation tools that sit on top of them.

 

These statements do hold some truth: in its basic form, the hypervisor simply runs a virtual machine, and as long as end-users get the performance they need, there's nothing else to worry about. In truth, the three major hypervisors on the market today (ESXi, Hyper-V, KVM) do this, and they do it well, so I can see how the “hypervisor becoming a commodity” argument works in these cases. But SysAdmins, the people managing everything behind the VM, don't buy the commoditized hypervisor theory quite so easily.

 

When we think about the word commodity in terms of IT, it's usually defined as a product or service that is indistinguishable from its competitors, except perhaps on price. With that said, if the hypervisor were a commodity, we shouldn't care which hypervisor our applications are running on. We should see no difference between VMs sitting inside an ESXi cluster or a Hyper-V cluster. In fact, to be a commodity, these VMs should be able to migrate between hypervisors. The reality is that VMs today are not interchangeable between hypervisors, at least not without changing their underlying anatomy. While it is possible to migrate between hypervisors, there is a process we have to follow, covering configurations, disks, etc. The files that make up a VM are proprietary to the hypervisor they run on and cannot simply be moved to another hypervisor and run in their native forms.
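To make that concrete, here's a minimal sketch of the disk-conversion step alone, using the generic qemu-img utility (my own illustration, not any vendor's migration tool, and the paths are hypothetical). Even after the disk file is converted, the VM definition, drivers, and guest integration tools still have to be rebuilt for the target hypervisor.

    import subprocess
    from pathlib import Path

    def convert_vm_disk(src: Path, dst_format: str = "qcow2") -> Path:
        """Convert a hypervisor-specific disk image (e.g., a VMware VMDK) to
        another on-disk format with qemu-img. This only handles the disk file;
        the VM's configuration and guest tooling must still be recreated on
        the target hypervisor."""
        dst = src.with_suffix("." + dst_format)
        subprocess.run(
            ["qemu-img", "convert", "-p",      # -p prints conversion progress
             "-O", dst_format,                 # output format: qcow2, vhdx, vmdk, ...
             str(src), str(dst)],
            check=True,
        )
        return dst

    # Hypothetical usage:
    # convert_vm_disk(Path("/exports/app01-flat.vmdk"), "vhdx")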

 

Also, we stressed earlier the importance of the management tools that sit above the hypervisor, and how the hypervisor didn't matter as much as the management tools did. This is partly true. The management and automation tools put in place are the heart of our virtual infrastructures, but the problem is that these management tools often create a divide in the features they support on different hypervisors. Take, for instance, a storage array providing support for VVols, VMware's answer to per-VM, policy-based storage provisioning. This is a standard that allows us to completely change the way we deploy storage, eliminating LUNs and making VMs and their disks first-class citizens on the storage arrays beneath them. That said, these are storage arrays connected to ESXi hosts, not Hyper-V hosts. Another example, this time in favor of Microsoft, is in the hybrid cloud space. With Azure Stack coming down the pipe, organizations will be able to easily deploy and deliver services from their own data centers, but with Azure-like agility. The comparable VMware solution, involving vCloud Air and vCloud Connector, is simply not at the same level as Azure when it comes to simplicity, in my opinion. These are two very different feature sets that are only available on their respective hypervisors.

 

So with all that, is the hypervisor a commodity? My take: No! While all the major hypervisors on the market today do one thing – virtualize x86 instructions and provide abstraction to the VMs running on top of them – there are simply too many discrepancies between the compatible third-party tools, features, and products that manage these hypervisors for me to call them commoditized. So I'll leave you with a few questions. Do you think the hypervisor is a commodity? When/if the hypervisor fully becomes a commodity, what do you foresee our virtual environments looking like? Single or multi-hypervisor? Looking forward to your comments.

The other day we were discussing the fine points of running an IT organization and the influence of People, Process, and Technology on systems management and administration, and someone brought up one of their experiences. Management was frustrated that snapshots on their storage and virtualization platform took days to deliver, and was looking to replace the storage platform to solve the problem. Since this was clearly a technology problem, they sought out a solution that would tackle it and address the technology needs of their organization! Chances are one or more of us have been in this situation before, and they did the proper thing and evaluated the options. Vendors were brought in, solutions were spec'd, technical requirements were established, and features were vetted. Every vendor was given the hard-and-fast requirement of "must be able to take snapshots in seconds and present them to the operating system in a writable fashion." Once all of the options were reviewed, confirmed, demo'd, and validated, they had chosen a solid solution!

 

Months followed as they migrated off of their existing storage platform onto the new one; the light at the end of the tunnel was there, and the panacea to all of their problems was in sight! Finally, they were done. The old storage system was decommissioned and the new storage system was put in place. Management patted themselves on the back and went about dealing with their next project. First and foremost on that list was standing up a new Dev environment based off of their production SAP data. This being a pretty reasonable request, they followed their standard protocol to get it stood up, with snapshots taken and presented. Several days later, the snapshot was presented as requested to the SAP team so they could stand up this Dev landscape. And management was up in arms!

 

What exactly went wrong here? Clearly a technology problem had existed for the organization, and a technology solution was delivered to act on those requirements. Yet had they taken a step back for a moment and looked at the problem for its cause and not its symptoms, they would have noticed that their internal SLAs and processes were really what was at fault, not the choice of technology. Don't get me wrong: sometimes technology truly is at fault and new technology can solve it, but to say that is the answer to every problem would be untrue, and some issues need to be looked at in the big picture. The true cause of their problem (their original storage platform COULD have met the requirement) was that their ticketing process required multiple sign-offs for Change Advisory Board management, approval, and authorization, and the SLAs given to the storage team allowed a 48-hour response time. In this particular scenario the storage admins were actually pretty excited to present the snapshot, so instead of waiting until the 48th hour to deliver, they provided it within seconds of the ticket making it into their queue.

 

Does this story sound familiar to you or your organization? Feel free to share some of your own personal experiences where one aspect of People, Process, or Technology was blamed for a lack of agility in an organization, and how you (hopefully) were able to overcome it. I'll do my best to share some other examples, stories, and morals over the coming weeks!

 

I look forward to hearing your stories!

It was all about the network

 

In the past, when we thought about IT, we primarily thought about the network. When we couldn’t get email or access the Internet, we’d blame the network. We would talk about network complexity and look at influencers such as the number of devices, the number of routes data could take, or the available bandwidth.

 

As a result of this thinking, a myriad of monitoring tools was developed to help network engineers keep an eye on the availability and performance of their networks, providing basic network monitoring.

 

It’s now all about the service

 

Today, federal agencies cannot function without their IT systems being operational. It’s about providing critical services that will improve productivity, efficiency, and accuracy in decision making and mission execution. IT needs to ensure the performance and delivery of the application or service, and understand the application delivery chain.

 

Advanced monitoring tools for servers, storage, databases, applications, and virtualization are widely available to help diagnose and troubleshoot the performance of these services, but one fact remains: the delivery of these services relies on the performance and availability of the network. And without these critical IT services, the agency’s mission is at risk.

 

Essential monitoring for today’s complex IT infrastructure

 

Users expect to be able to connect anywhere and from anything. Add to that, IT needs to manage legacy physical servers, new virtual servers, and cloud infrastructure as well as cloud-based applications and services, and it is easy to see why basic monitoring simply isn’t enough. This growing complexity requires advanced monitoring capabilities that every IT organization should invest in.

 

Application-aware network performance monitoring provides visibility into the performance of applications and services as a result of network performance by tapping into the data provided by deep packet inspection and analysis.

 

With proactive capacity forecasting, alerting, and reporting, IT pros can easily plan for future needs, making sure that forecasting is based on dynamic baselines and actual usage instead of guesses.
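As a rough illustration of what "dynamic baselines" can mean in practice (my own sketch, not any product's actual algorithm), assume you keep historical utilization samples keyed by hour of day and alert only on deviation from that hour's own normal:

    from statistics import mean, stdev
    from collections import defaultdict

    def build_baselines(samples):
        """samples: iterable of (hour_of_day, utilization_pct) pairs from history.
        Returns {hour: (mean, stdev)} so 'normal' varies by time of day
        instead of being one static threshold."""
        by_hour = defaultdict(list)
        for hour, util in samples:
            by_hour[hour].append(util)
        return {h: (mean(v), stdev(v)) for h, v in by_hour.items() if len(v) > 1}

    def is_anomalous(baselines, hour, util, n_sigma=2.0):
        """Alert only when usage deviates from that hour's own baseline."""
        if hour not in baselines:
            return False          # not enough history yet for this hour
        mu, sigma = baselines[hour]
        return abs(util - mu) > n_sigma * sigma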

 

Intelligent topology-aware alerts with downstream alert suppression will dramatically reduce the noise and accelerate troubleshooting.
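Here is a minimal sketch of the idea behind downstream suppression, assuming you already have a map of which nodes sit behind which (an illustration of the concept, not a particular tool's implementation):

    from collections import deque

    def suppressed_nodes(children, root_down):
        """children: {node: [nodes reached only through it]}.
        When root_down fails, everything reachable only through it is
        unreachable too, so alerts for those nodes are just noise."""
        suppressed, queue = set(), deque(children.get(root_down, []))
        while queue:
            node = queue.popleft()
            if node not in suppressed:
                suppressed.add(node)
                queue.extend(children.get(node, []))
        return suppressed

    # Hypothetical topology:
    # topology = {"core1": ["dist1"], "dist1": ["access1", "access2"]}
    # suppressed_nodes(topology, "core1") -> {"dist1", "access1", "access2"}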

 

Dynamic real-time maps provide a visual representation of a network with performance metrics and link utilization. And with the prevalence of wireless networks, adding wireless network heat maps is an absolute must to understand wireless coverage and ensure that employees can reach critical information wherever they are.

 

Current and detailed information about the network’s availability and performance should be a top priority for IT pros across the government. However, federal IT pros and the networks that they manage are responsible for delivering services and data that ensure that critical missions around the world are successful and that services are available to all citizens whenever they need them. This is no small task. Each network monitoring technique I discussed provides a wealth of data that federal IT pros can use to detect, diagnose, and resolve network performance problems and outages before they impact missions and services that are vital to the country.

 

Find the full article on our partner DLT’s blog, TechnicallySpeaking.

The increasing rate of change in applications, and the growing footprint of that change, are causing a lot of consternation within IT organizations. It's no coincidence, either, since everything revolves around the application, which is innovation personified. It's the revenue-generating, value-added differentiator, and potentially an industry game changer. Think Uber, Facebook, Netflix, Airbnb, Amazon, and Alibaba.

 

Accordingly, the rate and scale of change are products of the application lifecycle. For instance, applications deployed in a virtualization stack will live for months or years, while applications deployed in a cloud stack will live for hours or weeks. Applications deployed in containers or with microservices will live for microseconds or milliseconds.

AppLifeCycle.png

From my Interop 2016 DART Framework presentation.

 

For IT professionals, it’s good to know where job security is. As such, I’ve been keeping monthly tabs of the number of jobs with the key words virtualization, cloud, or (containers AND microservices), on dice.com. In the past year, since June 2015, the number of jobs with the key word "virtualization" has remained flat with around 2600 job openings. In that same time frame, the number of cloud jobs has increased by over 30% to 8900 job openings, while the number of container/microservices jobs has more than doubled, reflecting almost 600 job openings.

 

These trends reaffirm the hybrid IT paradigm and the need for IT professionals to deal efficiently and effectively with change in their application ecosystems. Let me know what you think in the comment section below.

The vast majority of my customers are highly virtualized, and quite possibly using Amazon or Azure in a shadow IT kind of approach. Some groups within the organization have deployed workloads into these large public provider spaces, simply because they need access to resources and the ability to deploy them as rapidly as possible.

 

Certainly, development and testing groups have been building systems and destroying them as testing moves toward production. But marketing and other groups may also find that the IT team is less than agile in providing these services on a timely basis. Thus, a credit card is swiped, and development occurs. The first indication that these things are taking place is when the bills come in.

 

Often, the best solution is a shared environment, in which certain workloads are deployed into AWS, Azure, or even SoftLayer, while others go into peer data centers for a shared, but less public, footprint that provides ideal circumstances for the organization.

 

Certainly these services are quite valuable to organizations. But are they secure, or do they potentially expose the company to data vulnerabilities and/or an entrée into the corporate network? Are there compliance issues? How about the costs? If your organization could provide these services in a way that would satisfy the user community, would that be a more efficient, cost-effective, compliant, and consistent platform?

 

These are really significant questions, and the answers are rarely simple. Today there are applications, such as CloudGenera, that will analyze a new workload and advise the analyst as to whether any of these issues are significant. They'll also lay out current cost models to prove out the costs over time. Having that knowledge prior to deployment could be the difference between agility and vulnerability.

 

Another issue to be addressed when opening your environment up to a hybrid or public workload is the learning curve of adopting a new paradigm within your IT group. This can be daunting. To address these kinds of shifts in approach, a new world of public cloud ecosystem partners has emerged, with tools that create workload deployment methodologies bridging the gap between your internal virtual environment and the public cloud, easing or even facilitating that transition. Platform9, for example, provides what is essentially a software tool that allows the administrator to decide, from a Platform9 panel within vCenter, where to deploy a workload. Deployment of the tool is as simple as downloading an OVF and deploying it into your vCenter. Platform9 leverages the VMware APIs and the AWS APIs to integrate seamlessly into both worlds. Simple, elegant, and the learning curve is minimal.

 

There are other avenues to be addressed, of course. For example, what about latencies to the community? Are there storage latencies? Network latencies? How about security concerns?

 

Well, analytics against these workloads, as well as those within your virtual environment, will no longer be a nice-to-have, but a must-have.

 

Lately, I've become particularly enthralled with the sheer level of log detail provided by Splunk. There are many SIEM (Security Information and Event Management) tools out there, but in my experience, no other tool provides the functional utility that Splunk does. To be sure, other tools, like SolarWinds, provide this level of analytics as well, and do so with aplomb. Splunk as a data collector is unparalleled, but beyond that, it lets you tailor your dashboards to show trends, analytics, and pertinent data across all of that volume in a functional, at-a-glance way. It stretches across all your workloads, security events, thresholds, etc., and presents them so that a monitor panel or dashboard can show you, quite simply, where your issues and anomalies lie.

 

There is a large open-source community of SIEM software as well. Tools such as OSSIM, Snort, OpenVAS, and BackTrack are all viable options, but remember, as open-source tools, they rarely provide the robust dashboards that SolarWinds or Splunk do. They will cost far less, but may require much more hand-holding, and support will likely be far less functional.

 

When I was starting out in the pre-sales world, we began talking of the Journey to the Cloud. It became a trope.  We’re still on that journey. The thing is, the ecosystem that surrounds the public cloud is becoming as robust as the ecosystem that exists surrounding standard, on-prem workloads.

interop.logo.2.jpg

I'm flying home after another incredible Interop experience. It’s the perfect time to capture the conversations, ideas, and feelings I experienced this week in the desert, before they fade like the tan lines I got while waiting ten minutes outside for an Uber.

 

100Gbps (The summary)

 

If money was no object, I would honestly say that this should be on our MUST ATTEND list every year. Even as a conference newbie who probably missed a ton of opportunities along the way, Interop generated an incredibly diverse set of interactions, stories, and ideas.

 

Even if money is an object (which happens to be true for most people and organizations), I would still say that making Interop a priority would reap rewards that totally justify the expense.

 

While vendors are certainly present at Interop, the overall tone is refreshingly agnostic compared to events like Cisco Live, Microsoft Ignite, and VMworld. That means sessions are more focused on the real shortcomings of products and solutions, which allows for conversations about work-arounds, alternatives, and comprehensive solutions.

 

It's not hard to guess what the big stories were at the show this year: cloud, security, and SDN all had places in the sun. More surprising was the level to which the DevOps narrative bled into conversations that were once considered pure networking.

 

Fat Pipe (The details)

  • One example of that DevOps/NetOps transition was a talk by Jason Edelman about using Ansible to perform configuration backups on legacy (meaning SSH-connected, command-line driven) network devices. While it might sound strange to the THWACK community, familiar as we are with tools like NCM, it represents an extension of existing skills and technology to teams that are used to using Ansible to deploy and manage cloud- and hybrid-cloud based environments.

 

  • There were also a few deep-dive sessions on building and leveraging coding skills, such as Python, for network outcomes, mostly in relationship to SDN, NFV, and the like.

 

This, in turn, led to an ongoing dialogue between speakers and attendees in several sessions on the best ways for network professionals to identify, acquire, and develop new skills that will allow them to make the leap to the new age of networking.

 

All of this built up to a narrative that was best championed during Martin Casado's keynote. In one of the best comparisons I've heard to date, Casado compared the current movement away from traditional data center, networking, server, and storage solutions to the evolution from in-car navigation systems to running Waze on your phone.

 

He pointed out that every layer of the data center that once featured specialized hardware-based solutions is now completely contained at the software layer.

 

This overall shift is leading to the "rise of the developer,” as Casado put it. This means no silo will be safe from hardware being optimized by a software solution. It also means developers will have more influence over choosing operational frameworks, i.e., the solutions that run the business.

 

  • Developers, Casado pointed out, care little for Gartner, or vendor-specific certifications that tie IT pros to specific solutions, or sales relationships, or the vagaries of bureaucratic procurement cycles.

 

The result is that this shift in software-as-infrastructure has the potential to disrupt everything we used to know about the business of IT. 

 

Packet Footer (Summary)

Were you at Interop and saw/heard/discussed something I missed? Do you have a different take than mine? Do you want to hear more on a specific topic? Let me know in the comments below!

 

All of this and more (I haven't even gotten into the discussions about IoT, SDN, or IPv6 that I was able to participate in), made this one of the best conferences I have attended in a very long time.

 

It got me even more excited for conferences to come. Next up is Cisco Live in Las Vegas, July 10-14. I hope to see you there!

Interop 2016 kicked off the week with two days of IT summits that covered an amazing range of topics, including cloud, containers, and microservices, IT leadership, and cybersecurity, plus hands-on hacking tutorials. The following three days included the Expo floor opening as well as the session tracks.

 

Since the IT Leadership Summit was sold out, I decided to join Day 1 of the Dark Reading Cyber Security Summit. I was only planning on attending Day 1, but the content was so good that I eschewed the Container Summit and attended Dark Reading's Day 2 as well. To kick things off, the editors at Dark Reading shared some interesting insights, followed by industry thought leaders.

DevOps-Sec.png

DevOps - SecOps Relational image via @petecheslock and his Austin DevOps Days 2015 presentation.

 

My top 10 takeaways from the Dark Reading Cybersecurity Summit Days are below.

  1. $71.1B was spent on cybersecurity last year.
  2. Security pros spend most of their time patching legacy stuff and fixing vulnerabilities versus addressing targeted, sophisticated attacks, which happens to be their primary security concern. Number two is phishing and social engineering attacks.
  3. Security is one of the most important priorities and one of the least resourced by IT organizations. Security pros make policy decisions, but non-security people make purchasing decisions.
  4. The weakest link is end-users, who make up the surface area of vulnerability.
  5. There are not enough skilled security ops people. 500K to 2M more security pros are needed by 2020.
  6. The most talented security pros are hackers.
  7. The average time to detect an intrusion is 6-7 months.
  8. 92% of the intrusions, incidents, and attacks of the past 10 years fall into nine distinct patterns, which can be further reduced down to three.
  9. The cost of a breach is roughly $254 per record for breaches involving 100 records, versus $0.09 per record for breaches involving 100M records. Note that the cost is a multi-variable function with many dimensions to factor in.
  10. Only 40% of attacks are malware, so stopping malware is not enough.

 

Attached below is my DART IT Skills Framework presentation from my Interop IT Leadership speaking session. One of the CIO's SLAs is security, so the Cybersecurity Summit was timely.

 

Let me know what you think of the security insights, as well as my presentation below, in the comment section. I would be happy to present my DART session to our community if there is enough interest, so let me know and I will make it so.

sqlrockstar

The Actuator - May 11th

Posted by sqlrockstar Employee May 11, 2016

I'm back from Liverpool and SQLBits. It was a brilliant event, as always. If you were there I hope you came by to say hello.

 

Here's this week's Actuator, filled with things I find amusing from around the Internet...

 

What is ransomware and how can I protect myself?

You recover from backups. If you don't have backups then you are hosed.

 

Ivy League economist ethnically profiled, interrogated for doing math on American Airlines flight

To be fair, he is a member of the al-Gebra movement, and was carrying weapons of math instruction.

 

The Year That Music Died

Wonderful interactive display of the top five songs every day since 1958. Imagine if you had this kind of interaction with your monitoring data, with some machine learning on top.

 

Apple Stole My Music. No, Seriously.

Since we are talking about music, here's yet another reason why reading the fine print is important.

 

Apple's Revenue Declines For The First Time In 13 Years

I am certain it has *nothing* to do with the issues inherent in their software and services like Apple Music. None.

 

The Formula One Approach to Security

This article marks the first time I have seen the phrase "security intelligence" and now I'm thinking it will be one of the next big buzzwords. Still a great read and intro to NetFlow for those that haven't heard about that yet.

 

Study: Containers Are Great, but Skilled Admins Are Scarce

I wonder how long they spent studying this. I believe it's always been the case that skilled admins are scarce, which is why we have so many accidental admins in the world. There's more tech work available than tech people available.

 

My secret to avoiding jet lag for events revealed:

NDQE7187 copy.jpg

In the world of networking, you would be hard-pressed to find a more pervasive and polarizing topic than SDN. The concept of controller-based, policy-driven, and application-focused networks has owned the headlines for several years as network vendors have attempted to create solutions that allow everyone to operate with the same optimization and automation as the large Web-scale companies. The hype started in and around data center networks, but over the past year or so, the focus has sharply shifted to the WAN, for good reason.

 

In this three-part series we are going to take a look at the challenges of current WAN technologies, what SD-WAN brings to the table, and what some drawbacks may be in pursuing an SD-WAN strategy for your network.

 

Where Are We Now?

 

In the first iteration of this series, we’re going to identify and discuss some of the limitations in and around WAN technology in today’s networks. The lists below are certainly not comprehensive, but speak to the general issues faced by network engineers when deploying, maintaining, and troubleshooting enterprise WANs.

 

Perspective – The core challenge in creating a policy-driven network is perspective. For the most part, routers in today's networks make decisions independent of the state of peer devices. While there certainly are protocols that share network state information (routing protocols being the primary example), actions based off of this exchanged information are exclusively determined through the lens of the router's localized perspective of the environment.

 

This can cause non-trivial challenges in the coordination of desired traffic behavior, especially for patterns that may not follow the default/standard behavior that a protocol would choose for you. Getting every router to make uniform decisions, each utilizing a different perspective, can be a difficult challenge and can add significant complexity depending on the policy being enforced.

 

Additionally, not every protocol shares every piece of information, so it is entirely possible that one router is making decisions off of considerably different information than what other routers may be using.

 

Application Awareness - Routing in current-generation networks is remarkably simple. A router considers whether or not it is aware of the destination prefix, and if so, forwards the packet on to the next hop along the path. Information outside of the destination IP address is not considered when determining path selection. Deeper inspection of the packet payload is possible on most modern routers, but that information does not play into route selection decisions. Due to this limitation in how we identify forwarding paths, it is incredibly difficult to differentiate routing policy based off of the application traffic being forwarded.
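A minimal sketch of that destination-only lookup (longest-prefix match) shows just how little information a traditional router consults; the routing table below is hypothetical:

    import ipaddress

    ROUTES = {  # hypothetical routing table: prefix -> next hop
        "0.0.0.0/0":    "203.0.113.1",   # default route via the ISP
        "10.0.0.0/8":   "10.255.0.1",    # internal core
        "10.20.0.0/16": "10.255.0.2",    # branch WAN link
    }

    def next_hop(dst_ip):
        """Pick the most specific matching prefix. Nothing about the
        application, loss, or jitter is consulted -- only the destination."""
        dst = ipaddress.ip_address(dst_ip)
        matches = [
            (net.prefixlen, hop)
            for net, hop in ((ipaddress.ip_network(p), h) for p, h in ROUTES.items())
            if dst in net
        ]
        return max(matches)[1]  # longest prefix wins

    # next_hop("10.20.5.9") -> "10.255.0.2"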

 

Error Detection/Failover – Error detection and failover in current-generation routing protocols is a fairly binary process. Routers exchange information with their neighbors, and if they don't hear from them within some pre-determined time window, they tear down the neighbor relationship and remove the information learned from that peer. Only at that point will a router choose to take what it considers to be an inferior path. This solution works well for black-out style conditions, but what happens when there is packet loss or significant jitter on the link? The answer is that current routing protocols do not take these conditions into consideration when choosing an optimal path. It is entirely possible for a link to have 10% packet loss, which significantly impacts voice calls, while the router plugs along like everything is okay because it never loses contact with its neighbor long enough to tear down the relationship and choose an alternate path. Meanwhile, a perfectly suitable alternative may be sitting idle, providing no value to the organization.
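The liveness check described above really is binary, something like the following sketch (my own simplification of hello/dead-interval behavior, not any vendor's code); note that loss and jitter never enter the verdict:

    import time

    DEAD_INTERVAL = 40  # seconds without a hello before the neighbor is declared down

    class Neighbor:
        def __init__(self, name):
            self.name = name
            self.last_hello = time.monotonic()

        def hello_received(self):
            self.last_hello = time.monotonic()

        def is_alive(self):
            # Binary verdict: as long as *some* hellos arrive within the dead
            # interval, the path stays "up" -- 10% loss or terrible jitter on
            # the link never factors into the decision.
            return time.monotonic() - self.last_hello < DEAD_INTERVAL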

 

Load Balancing/Efficiency - Also inherent in the way routing protocols choose links is the fact that all protocols look to identify the single best path (or paths, if they are equal cost) and make it active, leaving all other paths passive until the active link(s) fail. EIGRP could be considered an exception to this rule, as it allows for unequal-cost load balancing, but even that is less than ideal, since it won't detect brown-out conditions on a primary link and move all traffic to the secondary. This means that organizations have to purchase far more bandwidth than necessary to ensure each link, passive or active, has the ability to support all traffic at any point. Since routing protocols do not have the ability to load balance based off of application characteristics, load balancing and failover is an all-or-nothing proposition.

 

As stated previously, the above list is just a quick glance at some of the challenges faced in designing and managing the WAN in today’s enterprise network.  In the second part of this series we are going to take a look at what SD-WAN does that helps remediate many of the above challenges.  Also keep your eyes peeled for Part 3, which will close out the series by identifying some potential challenges surrounding SD-WAN solutions, and some final thoughts on how you might take your next step to improving your enterprise’s WAN.

Did the title of this blog entry scare you and make you think, "Why in the world would I do that?"  If so, then there is no need to read further.  The point of this blog post is not to tell you why you should be doing so, only why some have chosen to do so, and what issues they find themselves dealing with after having done so. If you still think that the idea of moving any of your data center to the cloud is simply ludicrous, you may go back to your regularly scheduled programming.

 

If the demand on your company's IT resources is consistent throughout the week and year, then the biggest reason for moving to the cloud really doesn't apply to you. Consider how Amazon Web Services (AWS) got built. Amazon discovered that most of the demand on the company's IT resources came from a few days of the year: Black Friday, Mother's Day, Christmas, etc. The rest of the year, the bulk of their IT resources were going unused. They asked themselves whether there might be other people who needed those IT resources when Amazon wasn't using them, and AWS was born. It has, of course, grown well beyond the simple desire to sell excess capacity into one of their most profitable business lines.


If your company's IT systems have a demand curve like that, then the public cloud might be for you. Why pay for servers to sit there for an entire year when you can rent them when demand is high and give them back when demand is low?  In fact, some companies even rent extra computing capacity by the hour when the demand is high. Imagine being able to scale the capabilities of your data center within minutes in order to meet the increased demand created by a Slashdot article or a viral video. This is the reason to go to the cloud. Then, once the demand goes down, simply give that capacity back.

 

The challenge for IT people looking to replace portions of their data center with the public cloud is automating it, and making sure that what they automate fits within the budget.  While a public cloud vendor can typically scale to whatever demand level you find yourself with, the bill will automatically scale as well. Unless the huge spike in demand is directly related to a huge spike in sales, your CFO might not take kindly to an enormous bill when your video goes viral. Make sure you plan for that ahead of time so you don't end up having to pay a huge and unexpected cost. Perhaps the decision will be made to just let things get slow for a while. After all, that ends up in the news, too. And if you believe all publicity is good publicity, then maybe it wouldn't be such a bad thing.
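One way to "plan for that ahead of time" is a simple budget guardrail in your scaling logic. The sketch below is my own illustration and assumes a flat hourly rate, which is a big simplification of real cloud pricing:

    def projected_monthly_cost(instances, hourly_rate, hours_per_day=24, days=30):
        """Rough spend projection for an autoscaled tier."""
        return instances * hourly_rate * hours_per_day * days

    def scale_decision(wanted_instances, hourly_rate, monthly_budget):
        """Cap scale-out at what the budget allows, so a viral spike degrades
        gracefully instead of producing a surprise invoice."""
        cost = projected_monthly_cost(wanted_instances, hourly_rate)
        if cost <= monthly_budget:
            return wanted_instances
        return int(monthly_budget / projected_monthly_cost(1, hourly_rate))

    # Hypothetical numbers:
    # scale_decision(wanted_instances=40, hourly_rate=0.096, monthly_budget=2000)
    # -> 28 instances instead of 40, and a conversation with the CFO instead of a shock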

 

There are plenty of companies that have replaced all their data centers with the cloud. Netflix is perhaps the most famous company that runs their entire infrastructure in AWS.  But they argue that the constant changes in demand for their videos make them a perfect match for such a setup. Make sure the way your customers use your services is consistent with the way the public cloud works, and make sure that your CFO is ready for the bill if and when it happens. That's how to move things into the cloud.

As an avid cloud user, I'm always amused by people who suggest that moving things to the cloud means you don't have to manage them.  And, of course, when I say "amused," what I really mean is I feel like Inigo Montoya in The Princess Bride.  "You keep using that word.  I do not think it means what you think it means."

 

Why do I say this?  Because I am an avid cloud user and I manage my cloud assets all the time.  So where do we get this idea?  I'd say it starts with the idea that you don't have to manage the hardware.  Push a few buttons and a "server" magically appears in your web browser.  This is so much easier than creating a real server, which actually works similarly these days.  Push a few buttons on the right web site, and an actual server shows up at your front door in a few days.  All you have to do is plug it in, load the appropriate OS and application stack and you're ready to go.  The cloud VM is a little bit easier.  It appears in minutes and comes preloaded with the OS and application stack that you specified during the build process.

 

I think what most people think when they say their cloud resources don't need to be managed is that they don't have to worry about the hardware.  They know that the VM is running on highly resilient hardware that is being managed for them.  They don't have to worry about a failed disk drive, network controller, PCI card, etc.  It just manages itself. But anyone who thinks this is all that needs to be managed for a server must never have actually managed any servers.

 

There are all sorts of things that must be managed on a server that have nothing to do with hardware.  What about the filesystems?  When you create the VM, you create it with a volume of a certain size.  You need to make sure that volume doesn't fill up and take your server down with it.  You need to monitor the things that would fill it up for no reason, such as web logs, error logs, database transaction logs, etc.  These need to be monitored and managed.  Speaking of logs, what about those error logs?  Is anyone looking at them? Are they scanning them for errors that need to be addressed?  Somebody should be, of course.
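For the filesystem side of that list, the check itself can be as simple as the following sketch (hypothetical mount points and thresholds; in practice you'd feed the result into your monitoring system rather than return a list):

    import shutil

    WATCHED = {              # hypothetical mount points and alert thresholds
        "/var/log": 0.80,    # alert when more than 80% full
        "/var/lib/mysql": 0.85,
    }

    def check_filesystems():
        """Return human-readable alerts for any watched volume that is
        filling up -- the kind of thing web logs and transaction logs do."""
        alerts = []
        for path, threshold in WATCHED.items():
            usage = shutil.disk_usage(path)
            pct = usage.used / usage.total
            if pct >= threshold:
                alerts.append(f"{path} is {pct:.0%} full")
        return alerts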

 

Another thing that can fill up a filesystem is an excessive number of snapshots. They need to be managed as well. Older snapshots need to be deleted, and certain snapshots may need to be kept for longer periods of time or archived off to a different medium. Snapshots do not manage themselves.
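A minimal retention sketch, assuming a generic list of snapshot records rather than any particular platform's API, might look like this:

    from datetime import datetime, timedelta, timezone

    def snapshots_to_delete(snapshots, keep_days=14):
        """snapshots: iterable of dicts like
        {"id": "snap-123", "created": <tz-aware datetime>, "keep": False}.
        Anything older than keep_days and not flagged 'keep' becomes a
        candidate for deletion (or archival to cheaper storage)."""
        cutoff = datetime.now(timezone.utc) - timedelta(days=keep_days)
        return [s["id"] for s in snapshots
                if not s.get("keep") and s["created"] < cutoff]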

 

What about my favorite topic of backups?  Is that VM getting backed up?  Does it need to be?  If you configured it to be backed up, is it backing up?  Is anyone looking at those error logs?  One of the biggest challenges is figuring out when a backup didn't run. It's relatively easy to figure out when a backup ran but failed; however, if someone configured the backup to not run at all, there's no log of that.  Is someone looking for backups that just magically disappeared?
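Catching the backup that silently never ran usually comes down to comparing what should have been protected against what actually reported in. A minimal sketch, assuming you can pull both lists from your own inventory and backup logs:

    from datetime import datetime, timedelta, timezone

    def missing_backups(expected_hosts, completed_jobs, window_hours=24):
        """expected_hosts: set of hostnames that must be backed up.
        completed_jobs: iterable of (hostname, finished_at) pairs, with
        finished_at as timezone-aware datetimes. A host with *no* job in the
        window never even failed -- it silently didn't run, which is the case
        no error log will ever show you."""
        cutoff = datetime.now(timezone.utc) - timedelta(hours=window_hours)
        reported = {host for host, finished in completed_jobs if finished >= cutoff}
        return expected_hosts - reported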


Suffice it to say that the cloud doesn't remove the need for management. It just moves it to a different place. Some of these things may be able to be offloaded to the cloud vendor, of course. But even if that's the case, someone needs to watch the watcher. There is no such thing as a free lunch, and there is no such thing as a server that manages itself.

Network variation is hurting us

Network devices like switches, routers, firewalls and load-balancers ship with many powerful features. These features can be configured by each engineer to fit the unique needs of every network. This flexibility is extremely useful and, in many ways, it's what makes networking cool. But there comes a point at which this flexibility starts to backfire and become a source of pain for network engineers.

Variation creeps up on you.  It can start with harmless requests for some non-standard connectivity, but I've seen those requests grow to the point where servers were plugging straight into the network core routers.  In time, these one-off solutions start to accumulate and you can lose sight of what the network ‘should’ look like.  Every part of the network becomes its own special snowflake.

I'm not judging here. I've managed quite a few networks and all of them ended up with high degrees of variation and technical debt. In fact, it takes considerable effort to fight the storm of snowflakes. But if you want a stable and useful network, you need to drive out variation. Of course you still need to meet the demands of the business, but only up to a point. If you're too flexible, you will end up hurting your business by creating a brittle network which cannot handle changes.

Your network becomes easier and faster to deploy, monitor, map, audit, understand and fix if you limit your network to a subset of standard components. Of course there are great monitoring tools to help you manage messy networks, but you’ll get greater value from your tools when you point them towards a simple structured network.

What’s so bad about variety?

Before we can start simplifying our networks we have to see the value in driving out that variability. Here are some thoughts on how highly variable (or heterogeneous) networks can make our lives harder as network engineers:

  • Change control - Making safe network changes is extremely difficult without standard topologies or configurations. Making a change safely requires a deep understanding of the current traffic flows, and this will take a lot of time. Documentation makes this easier, but a simple, standardized topology is best. The most frustrating thing is that when you do eventually cause an outage, the lessons learned from your failed change cannot be applied to other, dissimilar parts of your network.
  • Discovery time can be high. How do you learn the topology of your network in advance of problems occurring? A topology-mapping tool can be really helpful to reduce the pain here, but most people have just an outdated Visio diagram to rely on.
  • Operations can be a nightmare in snowflake networks. Every problem will be a new one, but probably one that could have been avoided - it's likely that you'll go slowly mad. Often you'll start troubleshooting a problem and then realize, "oh yeah, I caused this outage with the shortcut I took last week. Oops." By the way, it's a really good sign when you start to see the same problems repeatedly. Operations should be boring; it means you can re-orient your Ops time towards 80/20 analysis of issues, rather than spending your days firefighting.
  • Stagnation - You won't be able to improve your network until you simplify and standardize it. Runbooks are fantastic tools for your Ops and Deployment teams, but a runbook will be useless if the steps are different for every switch in your network. Think about documenting a simple task... if network Y, do step 1, except if feature Z is enabled, then do something else, except if it's raining or it's a leap year. You get the message.
  • No automation - If your process is too complicated to capture in a runbook, you shouldn't automate it. Simplify your network, then your process, then automate.

 

Summary

Network variation can be a real source of pain for us engineers. In this post we looked at the pain it causes and why we need to simplify and standardize our networks. In Part 2 we'll look at the root causes of these complicated, heterogeneous networks and how we can begin tackling the problem.
