
The Rolling Stones once wrote a song about how time waits for no one, but the inverse is also true today. These days, no one waits for time; certainly not government personnel who depend on speedy networks to deliver mission-critical applications and data.

 

Fortunately, agency administrators can employ deep packet-level analysis to ensure the efficiency of their networks and applications. Packet-level analysis involves capturing and inspecting packets that flow between client and server devices. This inspection can provide useful information about overall network performance, including traffic and application response times, while fortifying network security.

 

Before we get into how this works, let’s take a minute to go back to the concept of time – specifically, network response time (NRT), also known as network path latency. NRT measures the amount of time required for a packet to travel across a network path from sender to receiver. When latencies occur, application performance can be adversely impacted.

 

Some applications are more prone to latency issues, and even lower bandwidth applications aren’t completely immune. End-users commonly think that these problems are the result of a “slow network,” but it could be the application itself, the network, or a combination of both.

 

Packet analysis can help identify whether the application or network is at fault. Managers can make this determination by calculating and analyzing both application and network response time. This allows them to attack the root of the problem.
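
To make those two measurements concrete, here's a minimal, hedged sketch in Python that separates a network-level number (the TCP handshake time, a rough proxy for path latency) from an application-level number (the full request/response time) against a hypothetical in-house web application. Real packet-level analysis derives these values from captured traffic rather than active probes, but the split is the same idea.

```python
import socket
import time
from http.client import HTTPConnection

HOST = "app.example.internal"  # hypothetical application server
PORT = 80

# Network response time: how long the TCP handshake takes (a proxy for path latency).
start = time.perf_counter()
sock = socket.create_connection((HOST, PORT), timeout=5)
network_rt = time.perf_counter() - start
sock.close()

# Application response time: a full request/response, which adds server processing time.
conn = HTTPConnection(HOST, PORT, timeout=10)
start = time.perf_counter()
conn.request("GET", "/")
conn.getresponse().read()
app_rt = time.perf_counter() - start
conn.close()

print(f"Network response time:     {network_rt * 1000:.1f} ms")
print(f"Application response time: {app_rt * 1000:.1f} ms")
print(f"Time beyond the network:   {(app_rt - network_rt) * 1000:.1f} ms")
```

If the gap between the two numbers is large, the application (or the server behind it) deserves a closer look; if the handshake itself is slow, the network path is the more likely culprit.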

 

They can also use analysis to calculate how much traffic is using their networks at any given time. This is critically important for two reasons: first, it allows administrators to better plan for spikes in traffic, and second, it can help them identify abnormal traffic and data usage patterns that may indicate potential security threats.

 

Additionally, administrators can identify which applications are generating the most traffic. Packets can be captured and analyzed to determine data volume and transactions, among other things. This can help managers identify applications and data usage that may be putting a strain on their networks.

 

The challenge is that packet-level analysis has traditionally been either too difficult or too expensive to manage. Wireshark is a free, powerful, open source tool, but it's a bit difficult to wrangle for those who aren't familiar with it. Many proprietary tools are full-featured and easier to use, but expensive.

 

The good news is that some standard network monitoring tools now include packet analysis as another key feature. That makes sense, because packet analysis can play an important – and very precise – role in making sure that networks continue to run efficiently. As a result, federal IT administrators now have more options to reach deep into their packets and honor the words that Mick Jagger once sang: “Hours are like diamonds. Don’t let them waste.”

 

Find the full article on our partner DLT’s blog, TechnicallySpeaking.

This is the last of a 3-part series, which is itself a longer version of a talk I give at conferences and conventions.

You can find part 1 here, and you can find part 2 here.

Now that I'm wrapping it up, I would love to hear your thoughts, suggestions, and ideas in the comments below!

 

In the last two sections of this series, I made a case for WHY unplugging should be important to us as IT Professionals, and I began to dig into specific examples of HOW we can make unplugging work for us. What follows are some additional techniques you can adapt for your own use, as well as some ways to frame your time away so that you avoid the FUD that can come with trying something new and potentially different from what our colleagues are doing.

 

Perspective is Key

Along with planning, another key to successfully disconnecting is to develop a healthy perspective.

 

Try this for the next few days: Note how you are contacted during real emergencies (and how often those emergencies actually happen).

 

It's easy to fall into the trap of answering every call, jumping screens at the sound of a bell or tweet, checking our phone at two-minute intervals, and so on, when NOTHING is actually that important or urgent.

 

Develop an awareness of how often the things you check turn out to be nothing, or at least nothing important.

 

Change the way you think about notifications. Mentally re-label them interruptions and then see which matter. Pay attention to the interruptions. That's where you lose control of your life.

interruptions.jpg

 

If someone really needed you or needed to tell you something, they wouldn't do it in a random tweet. They wouldn't tag you in a photo. They probably wouldn't even send it as a group text. When people want you to know something, they use a very direct method and TELL you.

 

So once again, take a deep breath. Learn to reassure yourself that you aren't going to miss anything important. Honest.

 

Prioritization is Key

For people like me, going offline is pretty much an all or nothing deal. As I said earlier, if it has an on switch, it's off limits for me and my family.

 

But that doesn't have to be the case. You can choose levels of connectivity as long as they don't get the best of you.

 

A good example of this is your phone. Most now support an ultra, super-duper power saving mode, which has the unintended benefit of turning off everything except... you know... the phone part. With one swipe you can prioritize direct phone calls while eliminating all the distractions that smartphones represent. You can also set different applications manually to interrupt – I  mean notify – you or not, so that you only receive the interruptions that matter.

 

As long as we're talking about prioritization, let's talk about getting work done. Despite your nagging suspicion to the contrary, your technology was not protecting you from the Honey Do list. It was just pushing the items on your list to the point where you had to work on them later in the day or week, and at a time when you are even less happy about it than you would have been otherwise.

 

Use your unplugged time to prioritize some of the IRL tasks that are dragging you down. I know it sounds counterintuitive, but it is actually easier to get back to work when you know the gutters are clean.

 

As challenging as it sounds, you might also need to prioritize who you get together with on your day off the grid. Don't purposely get involved with friends who spend their weekends gaming, live-tweeting, etc. There's nothing wrong with those things, of course, but you're not really offline if you keep telling your buddy, "Tweet this for me, okay?”

 

Yes, this may change who you associate with and when. But don't try to be offline when everyone else around you is online. That's like going on a diet and forcing your friends to eat vegan chili cheese dogs.

 

But What About...

Hopefully this has gotten you thinking about how to plan for a day away from the interwebz. But there's still that annoying issue of work. Despite claims of supporting work-life balance, we who have been in IT for more than 15 minutes understand that those claims go out the window when the order entry system goes down.

 

The answer lies partly with prioritization. If you've made your schedule clear (as suggested earlier) and the NOC still contacts you, you'll need to make a judgement call about how or if you respond.

 

Spoiler Alert: Always opt for keeping a steady paycheck.

 

Speaking of which, on-call is one of those harsh realities of IT life that mangle, if not outright destroy, work-life balance. It's hard to plan anything when a wayward email, text, or ticket forces you to go running for the nearest keyboard.

on-call.jpg

 

If you are one of those people who is on-call around the clock, every day of the year, I have very little advice to help you go offline, and honestly, you have bigger fish to fry, because that kind of rat race gets old fast.

 

On the other hand, I have a ton of experience coordinating rotating on-call with offline. Now, I don't want you to think that I've negotiated this upfront on every job I've held. I have had managers who respected my religious schedule and worked around it, and others who looked me in the eye and said my religion was my problem to solve. Here's what I've learned from both experiences:

 

First, the solution will ultimately rest with your coworkers. Not with your manager and certainly not with HR. If you can work out an equitable solution with the team first, and then bring it to management as a done deal, you're likely home free.

 

Second, nobody in the history of IT has ever said they loved an on-call schedule, and everyone wants more options. YOU, dear reader, represent those options. In exchange for your desired offline time, you can offer to trade with coworkers and cover their shifts. You wouldn't believe how effective this is until you try it. In a few rare cases, I've had to sweeten the deal with two-for-one sales ("I'll take your Sunday and Monday for every Saturday of mine"), but usually just swapping one day for another is more than enough. Another trick is to take your coworker's entire on-call week in exchange for them taking that number of your offline days during your on-call rotation.

 

Yet another trick: My kids' school schedule is extremely non-standard. They have school on Sunday and don't get days off for Christmas, Thanksgiving, or most of the other major national holidays. So I can graciously offer to cover prime-time days like Thanksgiving in exchange for them taking my time off. In essence, I'm leveraging time when my family isn't going to be home, anyway.

 

The lesson here is that if you have that kind of flexibility, use it to your advantage.

 

But what about perception? If you unplug regularly, won't people notice and judge you?

 

First, don't overthink it. When people get wind of what you are doing, you're more likely to receive kudos than criticism, and more than a few wistful comments along the lines of, “I wish I could do that."

 

Second, if you followed my suggestions about communicating and prioritizing - the right people knew about your plans AND you remained flexible in the face of an actual crisis - then there really shouldn't be any question. In fact, you will have done more than most IT folks ever do when they walk out the doors.

 

So that leaves the issue of using your evenings and weekends to get ahead with technology, so you can be the miracle worker when Monday rolls around.

 

While I understand the truth of this comic:

11th_grade.png

 

I'll put a stake in the ground and say that few - if any - people saved their job, won the bonus, or got the promotion because they consistently used personal time to get work done. And for those few who did, I'd argue that long term it wasn't worth it for all the reasons discussed at the beginning of this essay.

 

It's also important to point out that managers, departments, or companies that require this level of work and commitment are usually dangerously toxic. If you find yourself in that situation, moving on will do your long-term happiness and career a favor, even if your bank account isn't happy in the short term.

 

To sum up: Learning to disconnect regularly and for a meaningful amount of time offers benefits to your physical health, your peace of mind, and even your career; and there are no insurmountable challenges in doing so, regardless of your business sector, years of experience, or discipline within IT.

 

The choice is yours. At the start of this series, I dared you to just sit and read this article without flipping over to check your phone, email, twitter feed, etc. Now, if you made it to the end of these essays without checking those blinking interrupti... I mean notifications, then you have my heartfelt gratitude as well as my sincere respect.

 

If you couldn’t make it this far, you might want to think about why that is, and whether you are okay with that outcome. Maybe this is an opportunity to grow, both as an IT professional and as a person.

 

That's it folks! I hope you have gained a few insights that you didn't already have, and that you'll take a shot at making it work for you. Let me know your thoughts in the comments below.

This week we kicked off the December Writing Challenge, and the response has been incredible. Not just in volume of people who have read it (over 800 views) or commented (60 people and counting - all of whom will get 200 THWACK points for each of their contributions!), but also in the quality of the responses. And that's what I wanted to share today.

 

First, if you are having trouble finding your way, the links to each post are:

 

So here are some of the things that leapt out at me over the last 3 days:

 

Day 0: Prepare

First, I have to admit this one was a sneaky trick on my part, since it came out on Nov 30 and caught many people unprepared (#SeeWhatIDidThere?). Nevertheless, a few of you refused to be left out.

 

KMSigma pointed out:

"IT is (typically) an interrupt-driven job.  Sure, you have general job duties, but most are superseded by the high priority email from the director, the alert that there is a failing drive in a server, the person standing in your cube asking for the TPS report, the CIO stating that they just bought the newest wiz-bang that they saw at the trade show and you need to implement it immediately.  Regardless of what is causing the interruptions, your "normal" daily duties are typically defined by the those same interruptions.

 

So, how can you plan for interruptions?  Short answer is that you can't, but you can attempt to mitigate them."

 

Meanwhile, sparda963 noticed the connection to the old Boy Scout motto, and said:

"Instead of keeping rope, emergency food, matches, water filter, and other endless supplies within reasonable reach I keep things like tools, utilities, scanners and the such around."

 

Finally (for this day), zero cool noticed, not incorrectly, that

"Preparing for a days work in IT is like preparing for trench warfare.  You need to be a tech warrior and have a good plan of attack on how to communicate with EUs and prioritize their requests (demands). "

 

Moving to Day One (Learn), some highlights included:

bsciencefiction.tv spoke for many when he said

"The ability to learn is in my opinion one of the greatest tools in the kit for today’s IT professional. It is the ability to adapt and change to a world that is nowhere near static.  It is the skill to not just master a task but understand the concept as well."

 

Many others pointed out that learning is an active process that we have to be engaged with, not passively consume. And also that, as rschroeder commented,

"The day I stop learning is the day I die."

 

There were so many other amazing insights into how, why, and what to learn that you really should check them out.

 

But that brings us to today's word: Act.

miseri captured the essence of what many others said in the quote

"I don't trust words, I trust actions."

 

tinmann0715 was even able to honor the thoughts of his high school principal (even if he wasn't able to appreciate them at the time), who shared the motto:

"If it is to be it is up to me!"

 

And bleggett continued what is becoming a burgeoning Word Challenge trend to put thoughts into haiku with:

"Alerts that tell us

Charts that show us what we seek

Think before you act."

 

All in all there were some incredible ideas and personal stories shared. I appreciate everyone taking time out of their busy lives to share a piece of themselves in this way.

 

In the coming weeks, the "lead" article will come from other Head Geeks as well as folks from across the spectrum of the SolarWinds corporate community - members of the video team, editorial staff, product management, and more will share their stories, feelings, and reactions to each day's prompt.

 

Until next week...

Well hey, everybody. I hope the Thanksgiving holiday was kind to all of you. I had originally planned to discuss more DevOps with y’all this week; however, a more pressing matter came to mind in my sick and weakened state of stomach flu!

 

Lately we’ve been discussing ransomware, but more importantly, lately I’ve been seeing an even greater incidence of ransomware affecting individuals and businesses. Worse, when it hits a business it tends to cause a lot of collateral damage (such as encrypting a finance share that the infected user only had cursory access to).

 

KnowBe4 has a pretty decent infographic on ransomware that I’m tossing in here, and I’m curious what y’all have been seeing in this regard.

Do you find this to be true: an increased incidence, a decrease, or roughly the same?

 

Ransomware-Threat-Survey.jpg

 

Some hard and fast takeaways I’ve seen from those who aspire to mitigate ransomware attacks are to implement:

 

  • Stable and sturdy firewalls
  • Email filtering scanning file contents and blocking attachments
  • Comprehensive antivirus on the workstation
  • Protected Antivirus on the servers

 

Yet all too often I see all of this investment in trying to ‘stop’ ransomware from happening, without much left over for clean-up should it hit the environment anyway: basically, having some kind of backup/restore mechanism to recover files SHOULD you be infected.

 

Some of the top ways I’ve personally seen ransomware wreak havoc in an environment include:

  • Using a work laptop on an untrusted wireless network
  • Phishing / Ransomware emails which have links instead of files and opening those links
  • Opening a “trusted” file off-net and then having it infect the environment when connected
  • Zero Day Malware through Java/JavaScript/Flash/Wordpress hacks (etc)

 

As IT practitioners, we have to do our daily jobs, keep the lights on for the business, focus on innovating the environment, and keep up with the needs of the business. Worst of all, when things go bad (and few things are as bad as ransomware attacking and targeting an environment), we have to deal with it on a massive scale! Maybe we’re lucky and we DO have backups, and we DO have file redirection so we can restore from a VSS job, and we CAN detect encryption in flight and stop it from taking effect. But that’s a lot of “maybe” from end to end in any business, not to mention all of the home devices that may be in play.

 

There was a time when viruses would break out in a network and require time and effort to clean up, but at worst they were a minor annoyance. Worms would break out, and as long as we stopped whatever the zero-day trigger was, we could keep them from recurring. And while APTs and the like are more targeted threats, they were not a common enough occurrence to occupy our days as a whole. But ransomware gave thieves a way to monetize their activities, which gives them an incentive to infiltrate and infect our networks. I’m sure you’ve seen that ransomware now comes with a help desk to assist victims with paying?

 

 

It’s definitely a crazy world we live in, one which leaves us with more work to do on a daily basis and a constant effort to fend off and fight against attacks. This is a threat that has been growing at a constant pace and is spreading to infect Windows, Mac, AND Linux.

 

What about your experiences? Do you have any ransomware attack vectors you’d like to share, or other ways you were able to fend them off?

Software Defined WAN is easily the most mature flavor of SDN. Consider how many large organizations have already deployed some sort of SD-WAN solution in recent years. It’s common to hear of organizations migrating dozens or even thousands of their sites to an entirely SD-WAN infrastructure, and this suggests that SD-WAN is no longer an interesting startup technology but part of the mainstream of networking.

 

The reason is clear to me. SD-WAN provides immediate benefits to a business’s bottom line, so from a business perspective, SD-WAN just makes sense. SD-WAN technology reduces complexity, improves performance, and greatly reduces the cost of an organization’s WAN infrastructure. The technology offers the ability to replace super-expensive private MPLS circuits with cheap broadband without sacrificing quality and reliability. Each vendor does this somewhat differently, but the benefits to the business are so palpable that the technology really is an easy sell.

 

The quality of the public internet has improved greatly over the last few years, so being able to tap into that resource and somehow retain a high quality link to branch offices and cloud applications is very tempting for cost-conscious CIOs. How can we leverage cheap internet connections like basic broadband, LTE and cheap cable yet maintain a high-quality user experience?

 

Bye-bye private circuits.

 

This is the most compelling reason to adopt the technology. Ultimately, it boils down to getting rid of private circuits. MPLS links can cost thousands of dollars per month each, so if an SD-WAN solution can dramatically cut costs, provide fault tolerance, and retain a quality experience, the value lies in going with all public internet connections.

 

Vendors run their software on proprietary appliances that make intelligent path decisions and negotiate with remote end devices to provide a variety of benefits. Some offer the ability to aggregate dissimilar internet connections such as broadband and LTE, some tout the ability to provide granular QoS over the public internet, and some solutions offer the ability to fail over from one primary public connection to another public connection without negatively affecting very sensitive traffic such as voice or streaming video. Also, keep in mind that this is an overlay technology, which means that with SD-WAN your transport is completely independent of the ISP.

 

Sweet. No more 3-year contracts with a monolith service provider.

 

Most SD-WAN vendors offer some, if not all, of these features, and some are going a step further by offering their solution as a managed service. Think about it: if your company is already paying some large ISP thousands per month for Ethernet handoffs into their MPLS cloud, what’s the difference with an SD-WAN managed service handing off a combination of Ethernet, LTE, etc. interfaces into their SD-WAN infrastructure?

 

Especially for small and medium-sized multi-site businesses, the initial cost of switching from managed MPLS to a dramatically cheaper managed SD-WAN provider is nothing compared to the savings from dropping private circuits over only a few years.

 

For organizations such as high-transaction financial firms that want to manage their entire WAN infrastructure themselves and require almost perfect, lossless connectivity, SD-WAN may be a harder sell, but for most businesses it’s a no-brainer.

 

Picture a retail company with many locations such as a clothing store, bank, or chain restaurant that needs simple connectivity to payment processing applications, files, and authentication servers. These types of networks would benefit tremendously from SD-WAN because new branch locations can be brought online very quickly, very easily, and much more inexpensively than when using traditional private circuits. Not only that, but organizations wouldn’t be locked into a particular ISP anymore.

 

This is mainstream technology now, and it’s something to consider seriously when thinking about designing your next WAN infrastructure. It’s cheaper, easier to deploy, and easier to switch ISPs. That’s the real value of SD-WAN and why even huge organizations are switching to this technology in droves.

DRSTROATH001_cov.jpg

(image courtesy of Marvel)

 

...I learned from "Doctor Strange"

(This is part 2 of a 4-part series. You can find part 1 here)

 

When fate robs you of your skills, you can always seek others

The catalyst for the whole story was an accident that damaged Strange's hands beyond repair, or at least beyond his ability to ever hold a scalpel again.

 

The corollary for IT pros happens when we lose a technology. Maybe the software vendor is bought out and the new owner stops developing the old tools. Maybe your company moves in a different direction.  Or maybe the tool you know best  simply becomes obsolete. Whatever the reason, we IT professionals have to be ready to, in the words of my colleague Thomas LaRock (sqlrockstar), learn to pivot.

 

The interesting thing is that, very much like Stephen Strange, most of the time when we are asked (or forced) to pivot, we find we are able to achieve results and move our career forward in ways we couldn't have imagined previously.

 

Leverage the tools you have to learn new tools

One of the smaller jokes in the movie is when Wong the librarian asks, "How's your Sanskrit?" Strange glibly responds, "I'm fluent in Google Translate.” (Side note: Google translates Tamil, Telugu, Bengali, Gujarati, Kannada, and Sindhi among other Indic languages. But Sanskrit is not yet on the list).

 

The lesson for us in IT is that often you can leverage one tool (or the knowledge you gained in one tool) to learn another tool. Maybe the menuing system is similar. Maybe there are complementary feature sets. Or maybe knowing one solution gives you insight into the super-class of tools that both tools belong to. Or maybe it's as simple as having a subnet calculator or TFTP server that lets you get the simple jobs done faster.

 

There’s no substitute for hard work

It's important to note that Strange does, in fact, learn to read Sanskrit. He puts in the work so that he isn't reliant on the tool forever. In fact, Strange is rarely shown already knowing things. Most of the time, he's learning, adapting, and most frequently just struggling to keep up. But at the same time, the movie shows him putting in enormous amounts of work. He rips through books at a fearsome rate. He learns to project his astral form so that he can stretch out his sleeping hours and continue to read, absorb, and increase his base of knowledge. Obviously, he also has natural gifts, and tools, but he doesn't rest on either of those.

 

In IT, there really is no better way to succeed than to put in the work. Read the manual. Test the assumption. Write some sample code. Build a test network (even if you do it completely virtually). Join a forum (for example, http://www.thwack.com?), and ask some questions.

 

Experience, creativity, and powerful tools help save the day

At the climax of the movie, Strange defeats the Dread Dormammu, lord of the dark dimension, in a most curious way: He creates a temporal loop that only he can break, locking Dormammu and himself into an endless repetition of the same moment in time. Faced with the prospect of his own personal Groundhog Day, Dormammu agrees to leave the Earth alone. The interesting thing is that, by all accounts, Strange isn't the strongest sorcerer in the world. Nor is he the most experienced. He has a spark of creativity and a few natural gifts, but that's about it.

 

Anyone in IT should be all too familiar with this narrative. A willingness to use the tools at hand, along with some personal sacrifice to get the job done, is often how the day is saved. In the movie, the tool at hand was the Eye of Agamotto. In real life, the small but powerful tool is often monitoring, which provides key insights and metrics that help us cut straight to the heart of the problem with little effort or time wasted.

 

Ask people who’ve stood in your shoes how they moved forward

In the course of his therapy, Stephen Strange is referred to the case of Jonathan Pangborn, a man who suffered an irreparable spinal cord injury, but who Strange finds one day playing basketball with his buddies. Telling Pangborn that he is trying to find his own way back from an impossible setback, Strange begs him to explain how he did it. This is what sets the hero's path toward the mystical stronghold in Kathmandu.

 

In IT, we run up against seemingly impossible situations all the time. Sometimes we muscle through and figure it out. Sometimes we just slap together a kludgy workaround. But sometimes we find someone who has had the exact same problem, and solved it! We need to remember that many in our ranks have stood where we stand and solved what we hope to solve. There’s no need to struggle to re-invent an already existing solution. But to benefit from others' experience, we have to ASK.

 

That's where being part of a community, such as Stack Exchange or THWACK©, can pay off. I’m not talking about registering an account and then asking questions only when you get stuck. I mean joining the community, really getting involved, reading articles, completing surveys, adding comments, answering questions, and, yes, asking your own as they come up.

 

Even broken things can help you find your way

On his way to the mystical school of Kamar Taj, Doctor Strange is accosted by muggers and ordered to give up his watch. Even though he is rescued from what appears to be a brutal beating, his watch isn't so lucky. It's only later that we realize there's an inscription on the back that reads, "Only time will tell how much I love you,” indicating that the watch is from Christina, one of the few people Strange has made a personal connection with.

 

While the joke, "Even a broken clock is right twice a day" comes to mind, the lesson I'm thinking of is a little deeper. In IT, we often overlook the broken things, whether it's code that doesn't compile, a software feature that doesn't work as advertised, or hardware that's burnt out, in favor of systems and solutions that run reliably. And that's not a bad choice, generally speaking.

 

But our broken things can still teach us a lot. I've rarely learned anything from a server that ran like clockwork for months on end. But I've learned a lot about pin-outs, soldering, testing, timing, memory registers, and more when I've tried to get an old boat anchor working again.

 

Sometimes that knowledge transferred. Sometimes it didn't. But even if not, the work grounded me in the reality of the craft of IT, and gave me a sense of accomplishment and direction.

 

Did you find your own lesson when watching the movie? Discuss it with me in the comments below. And keep an eye out for parts 3-4, coming in the following weeks.

If you haven't read the earlier posts, here's a chance to catch up on the story so far:

 

  1. It's Not Always The Network! Or is it? Part 1 -- by John Herbert (jgherbert)
  2. It's Not Always The Network! Or is it? Part 2 -- by John Herbert (jgherbert)

 

Now that you're up to speed with the chaotic lives of the two characters whose jobs we are following, here's the third installment of the story, by Tom Hollingsworth (networkingnerd).

 

 

The View From Above: James (CEO)

 

I got another call about the network today. This time, our accounting department told us that their end-of-year closeout was taking much too long. They have one of those expensive systems that scans in a lot of our paperwork and uploads it to the servers. I wasn't sure if the whole thing was going to be worth it, but we managed to pay for it with the savings from no longer renting warehouse space to store huge file boxes full of old paper records. That's why I agreed to sign off on it.

 

It worked great last year, but this time around I'm hearing nothing but complaints. This whole process was designed to speed things up and make everyone's job easier. Now I have to deal with the CFO telling me that our reports are going to be late and that the shareholders and the SEC are going to be furious. And I also have to hear comments in the hallways about how the network team still isn't doing their job. I know that Amanda has done a lot recently to help fix things, but if this doesn't get worked out soon the end of the year isn't going to be a good time for anyone.

 

 

The View From The Trenches: Amanda (Sr Network Manager)

 

Fresh off my recent issues with the service provider in Austin, I was hoping the rest of the year was going to go smoothly. Until I got a hotline phone call from James. It seems that the network was to blame for the end of year reporting issues that the accounting department was running into. I knew this was a huge issue after sitting in on the meetings about the records scanning program before I took over the network manager role. The arguments about the cost of that thing made me glad I worked in this department. And now it was my fault the thing wasn't working? Time to get to the bottom of this.

 

I fired up SolarWinds NPM and started checking the devices that were used by the accounting department. Thankfully, there weren't very many switches to look at. NPM told me that everything was running at peak performance; all the links to the servers were green, as was the connection between the network and the storage arrays. I was sure that any misconfiguration of the network would have shown up as a red flag here and given me my answer, but alas the network wasn't the problem. I could run a report right now to show to James to prove that the network was innocent this time.

 

I stopped short, though. Proving that it wasn't the network was not the issue; the issue was that the scanning program wasn't working properly. I knew that if it ended up being someone else's bigger issue that they were going to be on the receiving end of one of those conference room conversations that got my predecessor Paul fired. I knew that I had the talent to help this problem get fixed and help someone keep their job before the holidays.

 

So, if the network wasn't the problem, then what about the storage array? I called one of the storage admins, Mike, and asked him about the performance on the array. Did anything change recently? Was the firmware updated? Or out of date? I went through my standard troubleshooting questions for network problems. The answers didn't fill me with a lot of confidence.

 

Mike knew his arrays fairly well. He knew what kind they were and how to access their management interfaces. But when I started asking about firmware levels or other questions about the layout of the storage, Mike's answers became less sure. He said he thought maybe some of the other admins were doing something but he didn't know for sure. And he didn't know if there was a way to find out.

 

As if by magic, the answer appeared in my inbox. SolarWinds emailed me about a free trial of their Storage Resource Monitor (SRM) product. I couldn't believe it! I told Mike about it and asked him if he'd ever tried it. He told me that he had never even heard of it. Given my luck with NPM and keeping the network running, I told Mike we needed to give this a shot.

 

Mike and I were able to install SRM alongside NPM with no issues. We gave it the addresses of the storage arrays that the accounting data was stored on and let it start collecting information. It only took five minutes before I heard Mike growling on the other end of the phone. He was looking at the same dashboard I was. I asked him what he was seeing and he started explaining things.

 

It seems that someone had migrated a huge amount of data onto the fast performance storage tier. Mike told me that data should have been sitting around in the near-line tier instead. The data in the fast performance tier was using up resources that the accounting department needed to store their scanned data. Since that data was instead being written to the near-line storage, the performance hit looked like the network was causing the problem when in fact the storage array wasn't working like it should.

 

I heard Mike cup his hand over the phone receiver and start asking some pointed questions in the background. No one immediately said anything until Mike was able to point out the exact time and date the data was moved into the performance tier. It turns out one of the other departments wanted to get their reports done early this year and talked one of the other storage admins into moving their data into a faster performance tier so their reports would be done quicker. That huge amount of data had caused lots of problems. Now, Mike was informing the admin that the data was going to be moved back ASAP and they were going to call the accounting department and apologize for the delay.

 

Mike told me that he'd take care of talking to James and telling him it wasn't the network. I thanked him for his work and went on with the rest of my day. Not only was it not the network (again), but we found the real problem with some help from SolarWinds.

 

I wouldn't have thought anything else about it, but Mike emailed me about a week later with an update. He kept the SRM trial running even after we used it to diagnose the accounting department issue. The capacity planning tool alerted Mike that they were going to run out of storage space on that array in about six more weeks at the rate it was being consumed. Mike had already figured out that he needed to buy another array to migrate data and now he knew he needed a slightly bigger one. He used the tool to plan out the consumption rate for the next two years and was able to convince James to get a bigger array that would have more than enough room. It's time to convert that SRM trial into a purchase, I think; it's great value and I'm sure Mike will be only too happy to pay.

 

 

>>> Continue reading this story in Part 4

The “cloud” can mean many things to different people. Depending on who you ask, it could mean SaaS (software as a service), such as running Salesforce in the cloud, while another person may say it's running servers on AWS. The definition of cloud can be cloudy, but the transition to the cloud is much the same regardless of what you're putting there.

 

When you make the decision to transition to the cloud, having a plan or tool kit is useful. It’s very similar to the upgrade or deployment plan that I blogged about last month on Geek Speak, called BACK TO BASICS TO HAVE A SUCCESSFUL UPGRADE. The same concept of project planning can be applied to transitioning to the cloud, with some minor tweaks and details added.

 

Building your own “Cloud” Avengers…

 

If you want a smooth transition, it’s always best to get all the players involved from the start. Yes, that means networking, the server team, the application team, and security. I would say getting security involved from the start is key, because they can shoot down plans that don't meet some compliance standard, which then delays your transition. With security involved from the start, you’re planning the right way from the beginning, and there is less chance of security delaying your project. Getting everyone together, including the business (if applicable), gives everyone a chance to air their grievances about the cloud and work together to make it a success.

 

“Cloud” Avengers assemble…

 

Now that you have your basic “Cloud” Avengers core team built, there are some common things that you should ask as part of every cloud plan.

 

Disaster Recovery - What is the DR plan for my cloud? If an application is being moved to the cloud, what are the DR plans for that application? Does the provider have DR plans for when their data center or servers decide to take a break and stop working? Are there DR plans for internet outages or DNS outages?


Backups - You should also be asking what your backup options are and what your recovery time is if you need a restore. Lawsuits are common, so ask how an e-discovery situation would be handled. Where are the backups retained, and for how long? How do you request data to be restored? Do the backup policies meet your in-house policies?

 

Data retention – Something often overlooked is data retention. How long does the data stay in the cloud? Each industry and business is different, with different data retention periods, so you will need to see if the provider meets your requirements. If there are differing data retention periods, how do they impact your in-house policies? Sometimes this may involve working with your legal and compliance teams to come up with the best solution. E-discovery could also impact data retention periods, so it's best to talk to legal and compliance to make sure you are safe.

 

Data security - We all want to make sure our data is secure, so this should be a standard question to ask. How is remote access handled, and how easily can someone request access to the data? Is it as simple as sending an email or filling out a form? Does the provider have other means of authenticating that the correct person is requesting the data access? If you are running servers in the cloud, you will want to know how the data centers are secured. You will also want to know how the data is protected by antivirus if you are using SaaS, and what the remediation plans are if data is compromised.

Back-out Plan - If you are planning to transition to the cloud, you should also have a back-out plan. Sometimes you may find out it’s not all rainbows and sunny skies in the cloud and decide to come back to land. Asking the provider what your options are for backing out of the cloud is a good question to raise upfront, because depending on your options this could impact your plan. You should also find out if there are additional costs or fees for backing out. Something else to ask is what happens to your data and backups (if any exist) if you want to leave the cloud and come back on premises. How is the data handed back, and can you get all of it, or does it get swallowed up by the cloud?

 

The cloud is the way of the future. As we move more and more data to the cloud, it may become less foggy. Until then, plan as much as you can and ask all the questions you can (even the stupid ones).


Hey Siri, fix my PC.

Posted by scuff Nov 30, 2016

If the machines are taking over the world, are they coming for our jobs too?

 

“Automate all the things!” is the current trend in our industry. Chef, Puppet and Ansible scream that they are the solution to the end of monotonous work. We script all the things, ending the days of clicking Next, Next, Next, Finish. We’re using machines and machine languages to build, update and alter other machines. Right now, they still need us. They’re just making our lives easier.

 

Or are they enabling us to take an acceptable step towards outsourcing our tasks …. to them?

 

This year Zendesk dipped their toes in the water with Automatic Answers. The feature “uses machine learning capabilities to analyze customer and agent actions over time, learning which articles solve tickets associated with specific keywords and topics. If a customer indicates their inquiry has been solved successfully, the ticket is closed. For tickets that remain unsolved, they proceed to the customer service team as normal.”  It’s easy to think of that in a B2C scenario, say if I’ve emailed a company asking about the status of a product return. Automatic Answers could glean enough information from my email to check another system and reply with an answer, minus any human interaction. With in-house tech support, maybe that frees up the Helpdesk from questions like “how do I give someone else access to my calendar?” or “how do I turn on my out of office replies?” DigitalGenius chief strategy officer Mikhail Naumov confirms that customer service is easy because a history of recorded answers is a goldmine for machines to learn appropriate responses from.

 

At the other extreme, we now have robots that can heal themselves and this technology has been around for more than 12 months.

 

Somewhere between the two sit our software machines. Without physical moving robot parts, the technology that we interact with from our desktops or mobiles boils down to a bunch of code on a hardware base. If it all comes down to binary, will it one day be able to fix itself?

 

Software developers might start to get worried. Grab a cup of coffee and read this article about how we’ll no longer write code to program machines; instead, we’ll train them like dogs. Yippee, says the girl who hates coding.

 

A toe in the water example is Microsoft’s ‘Troubleshooter’ capability. Still initiated by a human, it will look for known common causes of problems with Windows Updates, your network connectivity or Windows Store Apps.  Yes, I know, your results may vary, but it’s a start.

 

IBM was playing around with Autonomic Computing back in 2003. They mention automatic load balancing as an example of self-optimization which I guess is a very rudimentary autonomic task.

 

Now we’ve built some intelligence into monitoring, diagnostics, and remote management. Some monitoring systems can attempt a pre-programmed resolution step, for example (e.g., if a service stops, try to restart it). There are even a few conferences on cloud and autonomic computing: http://icac2016.uni-wuerzburg.de/ and http://www.autonomic-conference.org/iccac-2017/
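
As a rough illustration, the sketch below is the kind of pre-programmed resolution step a monitoring system might attempt, assuming a Linux host with systemd, sufficient privileges, and a hypothetical service name. A real platform would wrap this in alerting, retry limits, and escalation rather than restarting blindly forever.

```python
import subprocess
import time

SERVICE = "example-app.service"  # hypothetical service name
CHECK_INTERVAL = 60              # seconds between health checks

def service_is_active(name: str) -> bool:
    # `systemctl is-active --quiet` exits 0 when the unit is running.
    result = subprocess.run(["systemctl", "is-active", "--quiet", name])
    return result.returncode == 0

def restart_service(name: str) -> None:
    # Requires privileges to manage the unit (e.g., run as root or via sudo).
    subprocess.run(["systemctl", "restart", name], check=True)

while True:
    if not service_is_active(SERVICE):
        print(f"{SERVICE} is down; attempting restart")
        restart_service(SERVICE)
    time.sleep(CHECK_INTERVAL)
```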

 

But the autonomic computing of the future looks toward building systems that can monitor, react, protect, and manage themselves without human intervention. Systems will be self-healing, self-configuring, self-protecting, and self-optimizing. We won’t program automation anymore; we’ll train the systems what to try when they are failing (or maybe train them to aid each other? Paging Dr. Server!).

 

I’m not sure if that’s a future I’m really looking forward to or if it scares the heck out of me. When I get flashbacks to a server that won’t boot and log in after a failed Microsoft patch, I’d gladly settle for one that correctly identifies that it was a bad patch, reboots and uninstalls it, and actually returns itself to the previous good state, all automatically.

 

But maybe the service desk tickets and red dashboard icons are keeping me in a job? What would you do if the servers & networks could fix themselves?

I *may* have eaten my weight in turkey and stuffing last week. But the best part about the holiday was how I spent the better part of four days disconnected from just about everything. Disconnecting from time to time was the subject of a talk by adatole recently at a DevOpsDays event. Here's the video if you want to see Leon deliver a wonderful session on a topic that is important to everyone, inside and outside of IT.

 

Also, here's a bunch of other links I found on the Intertubz that you may find interesting. Enjoy!

 

Great. Now Even Your Headphones Can Spy on You

As if I needed more reasons to be paranoid, apparently even my tinfoil hat won't help stop this threat.

 

Madison Square Garden, Radio City Music Hall Breached

A. Full. Year.

 

Shift Your Point of View to When America Was “Better”

Because I love data visualizations, and so should you.

 

How Long Did It Take To Make Food in Ancient Times?

Pretty sure if I had to wait 20 days to make some coffee my head would explode.

 

Oracle Bare Metal Cloud: Top Considerations and Use-Cases

The more I read pieces like this, the more I think Oracle is a sinking ship. Take this quote for example: "we begin to see that there is a market for public cloud consumption and the utilization of cloud services". Hey, Larry, it's 2016; most of us knew there was a market 10 years ago. And telling me that your cloud will be better than other clouds because...why exactly?

 

Ransomware Result: Free Ticket to Ride in San Francisco

Get used to seeing more attacks like this one, but more disruptive. It wouldn't take much to shut down the trains altogether in exchange for a quick payout.

 

Fake News Is Not the Only Problem

A bit long, and politics aside, the takeaway for me here is the sudden realization by many that the Internet may not be the best source of news.

 

When your team is stuck, rub a little DevOps on your process and everything will be fine:

 

DevOps-lotion.jpg

great-db.png

I’ve stated often that great database performance starts with great database design. So, if you want a great database design you must find someone with great database experience. But where does a person get such experience?

 

We already know that great judgment comes from great experience, and great experience comes from bad judgment. That means great database experience is the result of bad judgment repeated over the course of many painful years.

 

So I am here today to break this news to you. Your database design stinks.

 

There, I said it. But someone had to be the one to tell you. I know this is true because I see many bad database designs out in the wild, and someone is creating them. So I might as well point my finger in your direction, dear reader.

 

We all wish we could change the design or the code, but there are times when it is not possible to make changes. As database usage patterns push horrible database designs to their performance limits, database administrators are handed an impossible task: make performance better, but don’t touch anything.

 

Imagine that you take your car to a mechanic for an oil change. You tell the mechanic they can’t touch the car in any way, not even open the hood. Oh, and you need it done in less than an hour. Silly, right? Well I am here to tell you that it is also silly to go to your database administrator and say: “we need you to make this query faster and you can’t touch the code”.

 

Lucky for us, the concept of "throwing money at the problem” is not new, as shown by this ancient IBM commercial. Of course, throwing money at the problem does not always solve the performance issue. That's usually the result of not knowing what the issue is to begin with. You don’t want to be the one to spend six figures on new hardware to solve an issue with query blocking. And even after ordering the new hardware, it takes time for it to arrive, get installed, and finally resolve the issue.

 

That's why I put together this list of things that can help you fix database performance issues without touching code. Use this as a checklist to research and take action upon before blaming code. Some of these items cost no money, but some items (such as buying flash drives) might. What I wanted to do was to provide a starting point for things you can research and do yourself.

 

As always: You’re welcome.

 

Examine your plan cache

If you need to tune queries, then you need to know which queries have run against your instance. A quick way to get such details is to look inside the plan cache. I’ve written before about how the plan cache is the junk drawer of SQL Server. Mining your plan cache for performance data can help you yield improvements such as optimizing for ad hoc workloads, estimating the correct cost threshold for parallelism, or identifying which queries are using a specific index. Speaking of indexes…
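
Before moving on to indexes, here's a hedged sketch of what that plan cache mining can look like in practice, assuming the pyodbc package and a hypothetical connection string; it pulls the top cached statements by total CPU time from the plan cache DMVs.

```python
import pyodbc

# Hypothetical connection string; adjust driver, server, and authentication for your environment.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=master;Trusted_Connection=yes;"
)

query = """
SELECT TOP (10)
    qs.execution_count,
    qs.total_worker_time AS total_cpu_time,
    qs.total_elapsed_time,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
          ELSE qs.statement_end_offset END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
"""

# Print execution count, total CPU time, and the first part of each statement.
for row in conn.cursor().execute(query):
    print(row.execution_count, row.total_cpu_time, row.statement_text[:80])
```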

 

Review your index maintenance

I'm assuming you are doing this already, but if not, now is the time to get started. You can use maintenance plans, roll your own scripts, or use scripts provided by some Microsoft Data Platform MVPs. Whatever method you choose, make certain you are rebuilding, reorganizing, and updating statistics only when necessary. I’d even tell you to take time to review for duplicate indexes and get those removed.
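
One hedged way to decide whether that maintenance is even needed is to check fragmentation first. The sketch below (same hypothetical pyodbc connection idea, pointed at the target database) applies the commonly cited reorganize-above-10%, rebuild-above-30% thresholds; treat those numbers as starting points, not rules.

```python
import pyodbc

# Hypothetical connection; point it at the database you want to inspect.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=MyAppDB;Trusted_Connection=yes;"
)

frag_query = """
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name                     AS index_name,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.page_count > 1000 AND i.name IS NOT NULL;
"""

for row in conn.cursor().execute(frag_query):
    frag = row.avg_fragmentation_in_percent
    if frag >= 30:
        action = "REBUILD"
    elif frag >= 10:
        action = "REORGANIZE"
    else:
        continue  # leave healthy indexes alone
    # Print the statement for review rather than running it blindly in production.
    print(f"ALTER INDEX [{row.index_name}] ON [{row.table_name}] {action};")
```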

 

Index maintenance is crucial for query performance. Indexes help reduce the amount of data that is searched and pulled back to complete a request. But there is another item that can reduce the size of the data searched and pulled through the network wires…

 

Review your archiving strategy

Chances are you don’t have any archiving strategy in place. I know because we are data hoarders by nature, and only now are we starting to realize the horrors of such things. Archiving data implies less data, and less data means faster query performance. One way to get this done is to consider partitioning. (Yeah, yeah, I know I said no code changes; this is a schema change to help the logical distribution of data on physical disk. In other words, no changes to existing application code.)
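
If partitioning is the route you take, a minimal sketch of a date-based partition function, scheme, and table might look like the following (hypothetical object names, boundary values, and filegroup layout; real sliding-window designs need matching filegroups, aligned indexes, and a tested switch-out process).

```python
import pyodbc

# Hypothetical connection; point it at the database that will hold the partitioned table.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=MyAppDB;Trusted_Connection=yes;"
)

partition_ddl = """
-- Boundary values put each year of data into its own partition.
CREATE PARTITION FUNCTION pf_OrderDate (date)
AS RANGE RIGHT FOR VALUES ('2015-01-01', '2016-01-01', '2017-01-01');

-- Map every partition to PRIMARY for simplicity; production designs often use separate filegroups.
CREATE PARTITION SCHEME ps_OrderDate
AS PARTITION pf_OrderDate ALL TO ([PRIMARY]);

-- The table is created on the partition scheme, keyed on the partitioning column.
CREATE TABLE dbo.OrdersArchive (
    OrderID   bigint NOT NULL,
    OrderDate date   NOT NULL,
    Amount    money  NOT NULL,
    CONSTRAINT PK_OrdersArchive PRIMARY KEY CLUSTERED (OrderDate, OrderID)
) ON ps_OrderDate (OrderDate);
"""

cursor = conn.cursor()
cursor.execute(partition_ddl)
conn.commit()
```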

 

Partitioning requires some work on your end, and it will increase your administrative overhead. Your backup and recovery strategy must change to reflect the use of more files and filegroups. If this isn’t something you want to take on, then you may instead want to consider…

 

Enable page or row compression

Another option for improving performance is data compression at the page or row level. The tradeoff for data compression is an increase in CPU usage. Make certain you perform testing to verify that the benefits outweigh the extra cost. For tables that have a low number of updates and a high number of full scans, data compression is a decent option. Here is the SQL Server 2008 best practices whitepaper on data compression, which describes in detail the different types of workloads and estimated savings.
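
Before enabling anything, you can ask SQL Server itself to estimate what compression would buy you. Here's a hedged sketch using the built-in estimation procedure against a hypothetical table, with the same pyodbc assumptions as the earlier snippets; if the estimate looks worthwhile, the actual change is a single rebuild along the lines of ALTER TABLE dbo.Orders REBUILD WITH (DATA_COMPRESSION = PAGE).

```python
import pyodbc

# Hypothetical connection; point it at the database that owns the table.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=MyAppDB;Trusted_Connection=yes;"
)

estimate_sql = """
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'Orders',   -- hypothetical table name
     @index_id         = NULL,       -- all indexes
     @partition_number = NULL,       -- all partitions
     @data_compression = 'PAGE';
"""

cursor = conn.cursor()
cursor.execute(estimate_sql)

# The result set includes current and estimated sizes; print each row generically.
columns = [col[0] for col in cursor.description]
for row in cursor.fetchall():
    print(dict(zip(columns, row)))
```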

 

But, if you already know your workload to that level of detail, then maybe a better option for you might be…

 

Change your storage configuration

Often this is not an easy option, if it's an option at all. You can’t just wish for a piece of spinning rust on your SAN to go faster. But technology such as Windows Storage Spaces and VMware’s VSAN makes it easy for administrators to alter storage configurations to improve performance. At VMworld in San Francisco I talked about how VSAN technology is the magic pixie dust of software-defined storage right now.

 

If you don’t have magic pixie dust then SSDs are an option, but changing storage configuration only makes sense if you know that disk is your bottleneck. Besides, you might be able to avoid reconfiguring storage by taking steps to distribute your I/O across many drives with…

 

Use distinct storage devices for data, logs, and backups

These days I see many storage admins configuring database servers to use one big RAID 10, or OBR10 for short. For a majority of systems out there the use of OBR10 will suffice for performance. But there are times you will find you have a disk bottleneck as a result of all the activity hitting the array at once. Your first step is then to separate out the database data, log, and backup files onto distinct drives. Database backups should be off the server. Put your database transaction log files onto a different physical array. Doing so will reduce your chance for data loss. After all, if everything is on one array, then when that array fails you will have lost everything.

 

Another option is to break out tempdb onto a distinct array as well. In fact, tempdb deserves its own section here…

 

Optimize tempdb for performance

Of course this is only worth the effort if tempdb is found to be the bottleneck. Since tempdb is a shared resource amongst all the databases on the instance it can be a source of contention. But we operate in a world of shared resources, so finding tempdb being a shared resource is not a surprise. Storage, for example, is a shared resource. So are the series of tubes that makes up your network. And if the database server is virtualized (as it should be these days) then you are already living in a completely shared environment. So why not try…

 

Increase the amount of physical RAM available

Of course, this only makes sense if you are having a memory issue. Increasing the amount of RAM is easy for a virtual machine when compared to having to swap out a physical chip. OK, swapping out a chip isn’t that hard either, but you have to buy one, then get up to get the mail, and then bring it to the data center, and…you get the idea.

 

When adding memory to your VM one thing to be mindful about is if your host is using vNUMA. If so, then it could be the case that adding more memory may result in performance issues for some systems. So, be mindful about this and know what to look for.

 

Memory is an easy thing to add to any VM. Know what else is easy to add on to a VM?

 

Increase the number of CPU cores

Again, this is only going to help if you have identified that CPU is the bottleneck. You may want to consider swapping out the CPUs on the host itself if you can get a boost in performance speeds. But adding physical hardware such as a CPU, same as with adding memory, may take too long to physically complete. That’s why VMs are great, as you can make modifications in a short amount of time.

 

Since we are talking about CPUs, I would also mention examining the Windows power plan settings; this is a known issue for database servers. But even with virtualized servers, resources such as CPU and memory are not infinite…

 

Reconfigure VM allocations

Many performance issues on virtualized database servers are the result of the host being over-allocated. Over-allocation by itself is not bad. But over-allocation leads to over-commit, and over-commit is when you see performance hits. You should be conservative with your initial allocation of vCPU resources when rolling out VMs on a host. Aim for a 1.5:1 ratio of vCPU to logical cores and adjust upwards from there, always paying attention to overall host CPU utilization. For RAM, you should stay below 80% total allocation, as that allows room for growth and migrations as needed.
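
As a quick worked example of those ratios (hypothetical host sizes; 1.5:1 and 80% are the starting points suggested above, not hard limits):

```python
# Hypothetical host: 32 logical cores and 512 GB of RAM.
logical_cores = 32
host_ram_gb = 512

vcpu_ratio = 1.5      # initial vCPU-to-logical-core target
ram_ceiling = 0.80    # stay below 80% total RAM allocation

vcpu_budget = int(logical_cores * vcpu_ratio)
ram_budget_gb = host_ram_gb * ram_ceiling

print(f"Initial vCPU allocation budget: {vcpu_budget} vCPUs")                      # 48
print(f"RAM allocation budget: {ram_budget_gb:.0f} GB of {host_ram_gb} GB total")  # 410
```

Anything beyond those budgets should be a deliberate decision made while watching actual host utilization, not a default.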

 

You should also take a look at how your storage network is configured. Your environment should be set up for multipathing, and you should know your current HBA queue depth as well as the value you actually want it to be.

 

Summary

We’ve all had times where we’ve been asked to fix performance issues without changing code. The items listed above are options for you to examine and explore in your effort to improve performance before changing code. Of course, it helps if you have an effective database performance monitoring solution in place to help you make sense of your environment. You need performance metrics and baselines in place before you start turning any "nerd knobs"; otherwise, you won't know whether you are having a positive impact on performance, no matter which option you choose.

 

With the right tools in place collecting performance metrics, you can understand which resource is the bottleneck (CPU, memory, disk, network, locking/blocking). Then you can try one or more of the options above. And then you can add up the amount of money you saved on new hardware and put that on your performance review.

Last week, we talked about monitoring the network from different perspectives. By looking at how applications perform from different points in the network, we get an approximation of the users' experience. Unfortunately, most of those tools are short on the details surrounding why there's a problem or are limited in what they can test.

On one end of our monitoring spectrum, we have traditional device-level monitoring. This is going to tell us everything we need to know that is device-specific. On the other end, we have the application-level monitoring discussed in the last couple of weeks. Here, we're going to approximate a view of how the end users see their applications performing. The former gives us a hardware perspective and the latter gives us a user perspective; the perspective of the network as a whole lies somewhere in between.

Using testing agents and responders on the network at varying levels can provide that intermediate view. They allow us to test against all manner of traffic, factoring in network latency and variances (jitter) in the same.

Agents and Responders

Most enterprise network devices have built-in functions for initiating and responding to test traffic. These allow us to test and report on the latency of each link from the device itself. Cisco and Huawei have IP Service Level Agreement (SLA) processes. Juniper has Real-Time Performance Monitoring (RPM) and HPE has its Network Quality Analyzer (NQA) functions, just to list a few examples. Once configured, we can read the data from them via Simple Network Management Protocol (SNMP) and track their health from our favourite network monitoring console.
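Once those operations are configured, pulling the results over SNMP is straightforward. Below is a hedged sketch using the pysnmp library; the OID is the one commonly documented for Cisco's rttMonLatestRttOperCompletionTime, and the device address, community string, and SLA entry number are all placeholders you would need to verify against your own gear and MIBs.

```python
# Hedged sketch: poll the latest test completion time from a device over SNMPv2c
# using pysnmp. Verify the OID against your vendor's MIB before relying on it.
from pysnmp.hlapi import (
    getCmd, SnmpEngine, CommunityData, UdpTransportTarget, ContextData,
    ObjectType, ObjectIdentity,
)

DEVICE = "10.0.0.1"      # assumption: management address of the test source
COMMUNITY = "public"     # assumption: read-only community string
SLA_ENTRY = 10           # assumption: the SLA/test operation number on the device
# Commonly documented as rttMonLatestRttOperCompletionTime (CISCO-RTTMON-MIB); verify.
LATEST_RTT_OID = f"1.3.6.1.4.1.9.9.42.1.2.10.1.1.{SLA_ENTRY}"

error_indication, error_status, _, var_binds = next(
    getCmd(
        SnmpEngine(),
        CommunityData(COMMUNITY, mpModel=1),   # SNMPv2c
        UdpTransportTarget((DEVICE, 161)),
        ContextData(),
        ObjectType(ObjectIdentity(LATEST_RTT_OID)),
    )
)

if error_indication or error_status:
    print(f"SNMP error: {error_indication or error_status.prettyPrint()}")
else:
    for oid, value in var_binds:
        print(f"{DEVICE} SLA {SLA_ENTRY} latest completion time: {value} ms")
```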

Should we be in the position of having an all-Cisco shop, we can have a look at SolarWinds' IP SLA Monitor and VoIP and Network Quality Manager products to simplify setting things up. Otherwise, we're looking at a more manual process if our vendor doesn't offer something similar.

Levels

Observing test performance at different levels gives us reports of different granularity. By running tests at the organization, site, and link levels, we can start with big-picture metrics and work our way down to specific problems.

Organization

Most of these will be installed at the edge devices or close to them. They will perform edge-to-edge tests against a device at the destination organization or cloud hosting provider. There shouldn't be too many of these tests configured.

Site

Site-to-site tests will be configured close to the WAN links and will monitor overall connectivity between sites. The point of these tests is to give a general perspective on intersite traffic, so they shouldn't be installed directly on the WAN links. Depending on our organization, there could be none of these or a large number.

Link

Each network device has a test for each of its routed links to other network devices to measure latency. This is where the largest number of tests is configured, but it is also where we are going to find the most detail.

Caveats

Agent and responder testing isn't passive. There's always the potential for unwanted problems caused by implementing the tests themselves.

Traffic

Agent and responder tests introduce traffic to the network for purposes of testing. While that traffic shouldn't be significant enough to cause impact, there's always the possibility that it will. We need to keep an eye on the interfaces and queues to be sure that there isn't any significant change.

Frequency and Impact

Running agents and responders on the network devices themselves is going to consume additional CPU cycles. Network devices as a whole are not known for having a lot of processing capacity, so the frequency of these tests may need to be adjusted to factor that in.

Processing Delay

Related to the previous paragraph, most networking devices aren't going to be performing these tests quickly. The results from these tests may require a bit of a "fudge factor" at the analysis stage to account for this.

The Whisper in the Wires

Having a mesh of agents and responders at the different levels can provide point-in-time analysis of latencies and soft failures throughout the network. But, it needs to be managed carefully to avoid having negative impacts to the network itself.

Thanks to Thwack MVP byrona for spurring some of my thinking on this topic.

Is anyone else building something along these lines?

For government agencies, network monitoring has evolved into something extremely important, yet unnecessarily complex. For instance, according to Gleanster Research, 62 percent of respondents use on average three separate monitoring tools to keep their networks safe and functioning properly.

 

Network monitoring tools have become an integral part of agencies’ IT infrastructures, as they allow administrators to more easily track overall network availability and performance. All of this can be handled in real-time and with accompanying alerts, making network monitoring a must for agencies seeking to bolster their security postures.

 

Below, we’ll break down three monitoring techniques that will help you get a handle on how effective network monitoring can solve numerous problems for your agency.

 

Slay Problems through IP SLA

 

IP SLA – short for Internet Protocol Service Level Agreement – sounds complex, but in reality its function is a simple one: verifying that latency-sensitive services, such as the voice-over-IP (VoIP) environment, are healthy. IP SLA allows IT administrators to set up certain operations to run on a network device and have the results of those operations reported back to a remote server.

 

For example, the operation may include checking if a Web page or DNS server is responding, or whether a DHCP server is responding and handing out IP addresses. This is a huge asset because it uses the existing devices within the network infrastructure rather than requiring you to set up separate devices (or agents on existing PCs or servers) to run tests.

 

Trace the NetFlow of “Conversations”

 

NetFlow has the ability to capture network “conversations” for you. NetFlow data is captured by one or more routers operating near the center of the network.

 

Simply put, if DesktopComputer_123 is sending a file to Server_ABC via FTP, that is one conversation. The same PC browsing a webpage on the same server using HTTP is another conversation. NetFlow operates in the middle of these conversations to collect data so that the monitoring server can then aggregate, parse, and analyze the data.
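To make the idea of a "conversation" concrete, here is an illustrative sketch, not a NetFlow collector, of how individual flow records roll up into conversations. The records are made-up tuples of the kind a collector would receive: source, destination, application, and bytes transferred.

```python
# Illustrative aggregation of made-up flow records into "conversations"
# keyed on (source, destination, application).
from collections import defaultdict

flow_records = [
    ("DesktopComputer_123", "Server_ABC", "FTP",  52_000_000),
    ("DesktopComputer_123", "Server_ABC", "FTP",  48_000_000),
    ("DesktopComputer_123", "Server_ABC", "HTTP",     120_000),
    ("Laptop_456",          "Server_ABC", "HTTP",     450_000),
]

conversations = defaultdict(lambda: {"flows": 0, "bytes": 0})
for src, dst, app, nbytes in flow_records:
    key = (src, dst, app)                 # one conversation per client/server/application
    conversations[key]["flows"] += 1
    conversations[key]["bytes"] += nbytes

for (src, dst, app), stats in sorted(conversations.items(), key=lambda kv: -kv[1]["bytes"]):
    print(f"{src} -> {dst} [{app}]: {stats['flows']} flows, {stats['bytes']:,} bytes")
```

A real monitoring server does the same aggregation at much larger scale, keyed on the full IP, port, and protocol tuple, which is what lets it answer "who is talking to whom, and how much."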

 

Hook Into API Monitoring

 

Using a network monitoring Application Programming Interface (API) can be the murkiest of all of the techniques we’ve discussed. In essence, to understand how an API is used, you must realize that there are hooks built into applications that allow for data requests. Each time such a request is received, a response is sent back to the monitoring software, giving you a better understanding of how your network is performing. Microsoft System Center Operations Manager (SCOM) is a proprietary example of a network monitoring API, while VMware’s API is published and generally available.
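The request/response pattern itself is simple. The sketch below shows the general shape using Python's requests library; the URL, token, and response fields are hypothetical placeholders, since real APIs such as SCOM's or VMware's each have their own authentication schemes and data models.

```python
# Generic request/response sketch against a hypothetical monitoring API.
# URL, token, and response fields are placeholders, not a real product's API.
import requests

BASE_URL = "https://monitoring.example.gov/api/v1"   # hypothetical endpoint
TOKEN = "REPLACE_ME"                                  # hypothetical API token

resp = requests.get(
    f"{BASE_URL}/devices/router-01/interfaces",
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=10,
)
resp.raise_for_status()

for interface in resp.json():                         # assumed shape: list of dicts
    print(interface.get("name"), interface.get("status"), interface.get("utilization_pct"))
```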

 

Make no mistake — maintaining network security in today’s environment is more complex and crucial than ever. Having the tools in place – and understanding what tools are out there for federal government agencies – is a must.  But the good news is that these tools do exist.  And with less work than you may have expected, you can quickly understand and appreciate what you can do to crack the case of network security.

 

Find the full article on our partner DLT’s blog, TechnicallySpeaking.

Over the past 5 postings, I’ve talked about some trends that we have seen happening and gaining traction within the Cloud space. I’ve spoken of:

 

  • Virtualization – Established trends toward virtualization, particularly VMware, are being challenged by a variety of newcomers whose market share continues to grow. Most notable here is OpenStack as a virtualization platform. VMware has answered the threat of Azure, AWS, and pure OpenStack by embracing them with a series of APIs meant to incorporate the on-prem virtual data center with those peers in the hybrid space.

 

  • Storage – In the case of traditional storage, the trend has been toward faster everything, with faster Ethernet or Fibre Channel as the interconnect, and of course solid state becoming the norm in any reasonably high-I/O environment. But the biggest sea change is the move to object-based storage. Object really is a different approach, with replication, erasure coding, and redundancy built in.

 

  • Software-Defined Networking – SDN is eating quite drastically into the data center space these days. The complexities of routing tables and firewall rules are being addressed within the virtual data center by tools like Cisco ACI and VMware NSX. While port reduction isn’t quite the play here, the ability to segment a network via these rules far surpasses any physical switch’s capabilities. In addition, these rules can be rolled out effectively, accurately, and with easy rollback. I find these two pieces truly compelling for maintaining and enhancing the elegance of the network while reducing the complexity laid onto the physical switch environment.

 

  • Containers – In the new world of DevOps, containers, a way to disaggregate the application from the operating system, have proven to be yet another compelling way into the future. DevOps calls for the ability to update parts and pieces of an application, and containers allow you to scale the application, update it, and deploy it wherever and whenever you want.

 

  • Serverless and Microservices – These also fall into the DevOps equation: small components assembled as building blocks make up the entire application, which keeps the whole quite dynamic and modifiable. The “serverless” piece is somewhat of a misnomer (any workload must still reside on some compute layer), but these workloads are dynamic, movable, and far less dependent on a particular hypervisor or location than on wherever the underlying architecture actually resides.

 

So… what’s next in data center infrastructure? We’ve seen tools that let the data center administrator easily deploy workloads to whatever destination makes sense. We’ve seen gateways that bridge the gap from traditional storage to object-based storage. We’ve seen orchestration tools that allow for the rapid, consistent, and highly managed deployment of containers in the enterprise and cloud space. And we’ve seen truly cross-platform approaches to serverless/microservice architectures that ease the adoption of a newer paradigm in the data center.

 

What we haven’t seen is a truly revolutionary unifier. For example, when VMware became the juggernaut it is today, the virtualization platform became the tool that tied everything together. Regardless of your storage, compute (albeit x86, particularly), and network infrastructure, with VMware as a platform you had one reliable and practically bulletproof tool with which to deploy new workloads, manage existing platforms, and scale up or down as required, all through the ease of a simple management interface. However, with all these new technologies, will we have that glue? Will we have the ability to build entire architectures and manage them easily? Will there be a level of fault tolerance, an equivalent to DRS or Storage DRS? As we seek the new brass ring and poise ourselves on the platforms of tomorrow, how will we approach these questions?

 

I’d love to hear your thoughts.

Part 2 of a 3-part series, which is itself a longer version of a talk I give at conferences and conventions.

You can find part 1 here.

I'd love to hear your thoughts in the comments below!

 

In the first part of this series, I made a case for why disconnecting some times and for some significant amount of time is important to our health and career. In this segment I pick up on that idea with specific things you can do to make going offline a successful and positive experience.

 

Don’t Panic!

If you are considering taking time to unplug, you probably have some concerns, such as:

  • how often and for how long should you unplug
  • how do you deal with a workload that is already threatening to overwhelm you
  • how will your boss, coworkers, friends perceive your decision to unplug
  • how do you maintain your reputation as a miracle worker if you aren’t connected
  • how do you deal with pseudo medical issues like FOMO
  • what about sev1 emergencies
  • what if you are on-call

 

Just take a deep breath. This isn't as hard as you think.

 

Planning Is Key

"To the well-organized mind, death is but the next great adventure."

- Albus Dumbledore

 

As true as these words might be for Nicolas Flamel as he faces his mortality, they are even truer for those shuffling off the mortal coil of internet connectivity. Because, like almost everything else in IT, the decisions you make in the planning phase will determine the ultimate outcome. Creating a solid plan can make all the difference between experiencing boring, disconnected misery and relaxed rejuvenation.

 

The first thing to plan out is how long you want to unplug, and how often. My advice is that you should disconnect as often, and for as long per session, as you think is wise. Period. It's far more important to develop the habit of disconnecting and experience the benefits than it is to try to stick to some one-size-fits-most specification.

 

That said, be reasonable. Thirty minutes isn't disconnecting. That’s just what happens when you're outside decent cell service. You went offline for an hour? I call that having dinner with Aunt Frieda, the one who admonishes you with a “My sister didn't raise you to have that stupid thing out at the table." Haven't checked Facebook for two or three hours? Amateur. That's a really good movie, or a really, REALLY good date.

 

Personally, I think four hours is a good target. But that's just me. Once again, you have to know your life and your limits.

 

At the other end of the spectrum, unless you are making some kind of statement, dropping off the grid for more than a day or two could leave you so shell shocked that you'll avoid going offline again for so long you may as well have never done it.

 

One suggestion is to try a no-screens-Sunday-morning every couple of weeks, and see how it goes. Work out the bugs, and then re-evaluate to see if you could benefit from extending the duration.

 

It's also important to plan ahead to decide what counts as online for you. This is more nuanced than it might seem. Take this seemingly clear-cut example: You plan to avoid anything that connects to the outside world, including TV and radio. There are still choices. Does playing a CD count? If so, can you connect to your favorite music streaming service, since it’s really just the collection of music you bought? What about podcasts?

 

The point here is that you don’t need to have the perfect plan. You just need to start out with some kind of plan and be open-minded and flexible enough to adjust as you go.

 

You also need to plan your return to the land of the connected. If turning back on again means five hours of hacking through email, Twitter feeds, and Facebook messages, then all that hard-won rest and recharging will have gone out the window. Instead, set some specific parameters for how you reconnect. Things like:

  • Limit yourself to no more than 30 minutes of sorting through email and deleting garbage
  • Another 30 to respond to critical social media issues
  • Decide which social media you actually HAVE to look at (Do you really need to catch up on Pinterest and Instagram NOW?)
  • If you have an especially vigorous feed, decide how far back (in hours) that you will scroll

 

As I said earlier, any good plan requires flexibility. These plans are more contingencies than tasks, and you need to adhere to a structure, but also go with the flow when things don't turn out exactly as expected.

 

Preparation is Key

Remember how I said that Shabbat didn't mean sitting in the dark eating cold sandwiches? Well, the secret is in the preparation. Shabbat runs from Friday night to Saturday night, but a common saying goes something like, "Shabbat begins on Wednesday.” This is because you need time to get the laundry done and food prepared so that you are READY when Friday night arrives.

 

An artist friend of mine goes offline for one day each week. I asked him what happens if he gets an idea in the middle of that 24-hour period. He said, "I make an effort all week to exhaust myself creatively, to squeeze out every idea that I can. That way I look at my day off as a real blessing. A day to recharge because I need it."

 

His advice made me re-think how I use my time and how I use work to set up my offline time. I ask myself whether the work I'm doing is the stuff that is going to tear my guts out when I'm offline if it's not done. I also use a variety of tools - from electronic note and to-do systems to physical paper - so that when it's time to drop offline, I have a level of comfort that I'm not forgetting anything, and that I'll be able to dive back in without struggling to find my place.

 

Good preparation includes communicating your intentions. I'm not saying you should broadcast it far and wide, but let key friends, relatives, and coworkers know that you will be “…out of data and cell range.”

 

This is exactly how you need to phrase it. You don’t need to explain that you are taking a day to unplug. That's how the trouble starts. Tell people that you will be out of range. Period.

 

If needed, repeat that phrase slowly and carefully until it sounds natural coming out of your mouth.

 

When you come back online, the opposite applies. Don't tell anyone that you are back online. Trust me, they'll figure it out for themselves.

 

In the next installment, I'll keep digging into the specifics of how to make going offline work for you. Meanwhile, if you have thoughts, suggestions, or questions, let me know in the comments below!
