
Part 2 of a 3-part series, which is itself a longer version of a talk I give at conferences and conventions.

You can find part 1 here.

I'd love to hear your thoughts in the comments below!

 

In the first part of this series, I made a case for why disconnecting at times, and for a significant amount of time, is important to our health and careers. In this segment, I pick up on that idea with specific things you can do to make going offline a successful and positive experience.

 

Don’t Panic!

If you are considering taking time to unplug, you probably have some concerns, such as:

  • how often and for how long should you unplug
  • how do you deal with a workload that is already threatening to overwhelm you
  • how will your boss, coworkers, and friends perceive your decision to unplug
  • how do you maintain your reputation as a miracle worker if you aren’t connected
  • how do you deal with pseudo-medical issues like FOMO
  • what about sev1 emergencies
  • what if you are on-call

 

Just take a deep breath. This isn't as hard as you think.

 

Planning Is Key

"To the well-organized mind, death is but the next great adventure."

- Albus Dumbledore

 

As true as these words might be for Nicolas Flamel as he faces his mortality, they are even truer for those shuffling off the mortal coil of internet connectivity. Because, like almost everything else in IT, the decisions you make in the planning phase will determine the ultimate outcome. Creating a solid plan can make all the difference between experiencing boring, disconnected misery and relaxed rejuvenation.

 

The first thing to plan out is how long you want to unplug, and how often. My advice is that you should disconnect as often, and for as long per session, as you think is wise. Period. It's far more important to develop the habit of disconnecting and experience the benefits than it is to try to stick to some one-size-fits-most specification.

 

That said, be reasonable. Thirty minutes isn't disconnecting. That’s just what happens when you're outside decent cell service. You went offline for an hour? I call that having dinner with Aunt Frieda, the one who admonishes you with a “My sister didn't raise you to have that stupid thing out at the table." Haven't checked Facebook for two or three hours? Amateur. That's a really good movie, or a really, REALLY good date.

 

Personally, I think four hours is a good target. But that's just me. Once again, you have to know your life and your limits.

 

At the other end of the spectrum, unless you are making some kind of statement, dropping off the grid for more than a day or two could leave you so shell-shocked that you'll avoid going offline again for so long you may as well have never done it.

 

One suggestion is to try a no-screens Sunday morning every couple of weeks, and see how it goes. Work out the bugs, and then re-evaluate to see if you could benefit from extending the duration.

 

It's also important to plan ahead to decide what counts as online for you. This is more nuanced than it might seem. Take this seemingly clear-cut example: You plan to avoid anything that connects to the outside world, including TV and radio. There are still choices. Does playing a CD count? If so, can you connect to your favorite music streaming service, since it’s really just the collection of music you bought? What about podcasts?

 

The point here is that you don’t need to have the perfect plan. You just need to start out with some kind of plan and be open-minded and flexible enough to adjust as you go.

 

You also need to plan your return to the land of the connected. If turning back on again means five hours of hacking through email, Twitter feeds, and Facebook messages, then all that hard-won rest and recharging will have gone out the window. Instead, set some specific parameters for how you reconnect. Things like:

  • Limit yourself to no more than 30 minutes of sorting through email and deleting garbage
  • Another 30 to respond to critical social media issues
  • Decide which social media you actually HAVE to look at (Do you really need to catch up on Pinterest and Instagram NOW?)
  • If you have an especially vigorous feed, decide how far back (in hours) you will scroll

 

As I said earlier, any good plan requires flexibility. These plans are more contingencies than tasks: you need to adhere to a structure, but also go with the flow when things don't turn out exactly as expected.

 

Preparation Is Key

Remember how I said that Shabbat didn't mean sitting in the dark eating cold sandwiches? Well, the secret is in the preparation. Shabbat runs from Friday night to Saturday night, but a common saying goes something like, "Shabbat begins on Wednesday.” This is because you need time to get the laundry done and food prepared so that you are READY when Friday night arrives.

 

An artist friend of mine goes offline for one day each week. I asked him what happens if he gets an idea in the middle of that 24-hour period. He said, "I make an effort all week to exhaust myself creatively, to squeeze out every idea that I can. That way I look at my day off as a real blessing. A day to recharge because I need it."

 

His advice made me re-think how I use my time and how I use work to set up my offline time. I ask myself whether the work I'm doing is the stuff that is going to tear my guts out when I'm offline if it's not done. I also use a variety of tools - from electronic note and to-do systems to physical paper - so that when it's time to drop offline, I have a level of comfort that I'm not forgetting anything, and that I'll be able to dive back in without struggling to find my place.

 

Good preparation includes communicating your intentions. I'm not saying you should broadcast it far and wide, but let key friends, relatives, and coworkers know that you will be “…out of data and cell range.”

 

This is exactly how you need to phrase it. You don’t need to explain that you are taking a day to unplug. That's how the trouble starts. Tell people that you will be out of range. Period.

 

If needed, repeat that phrase slowly and carefully until it sounds natural coming out of your mouth.

 

When you come back online, the opposite applies. Don't tell anyone that you are back online. Trust me, they'll figure it out for themselves.

 

In the next installment, I'll keep digging into the specifics of how to make going offline work for you. Meanwhile, if you have thoughts, suggestions, or questions, let me know in the comments below!


(image courtesy of Marvel)

 

...I learned from "Doctor Strange"

(This is part 1 of what will be a 4-part series. Enjoy!)

 

"When the student is ready, the teacher appears," is a well-known phrase, but I was struck recently by the way that sometimes the teacher appears in unexpected forms. It's not always the kindly and unassuming janitor, Mr. Miyagi, or the crazy old hermit, Ben Kenobi. Sometimes the teacher isn’t a person or a character, but an entire movie filled with lessons for ready students. 

 

I found myself in that situation recently, as I sat watching Doctor Strange, the latest installment in the Marvel Cinematic Universe.

 

There, hidden among the special effects, panoramic vistas, and Benedict Cumberbatch's cheekbones were some very real and meaningful IT career lessons, applicable to both acolytes and masters as they walk the halls of your own technological Kamar Taj. In fact, I discovered a series of lessons, more than I can fit into just one essay.

 

So, over the next couple of installments I'm going to share them with you, and I’d like to hear your thoughts and reactions in the comments below.

 

If it needs to be said, there are many spoilers in what follows. If you haven't seen the movie yet, and don't want to know what's coming, bookmark this page to enjoy later.

 

Know the essential tools of the trade

The movie introduces us to the concept of a sling ring, a magical device that allows a sorcerer to open a portal to another location. In the narrative arc of the movie, this appears to be one of the first and most basic skills sorcerers are taught. It was also the key to many of the plot twists and a few sight gags in the movie. In my mind, I equated the concept of the sling ring with the idea that all IT pros need to understand and master basic skills, such as IP subnetting, command line syntax, coding skills, and security.

 

Can you be a solid IT pro without these skills? Sure, but you'll never be a master, and odds are good that you'll find yourself hanging around the lower end of the career ladder far longer than you’d like.

 

Think creatively about how to use the technology you already have

In the movie, immediately after figuring out how to use a sling ring, we see the hero use it in non-standard ways. Instead of opening a portal for his whole body, he opens holes just big enough for his hands, so that he can borrow books from the library and avoid being detected by Wong the librarian. We see this again in the use of the Eye of Agamotto during Doctor Strange's face-off against Dormammu.

 

The great thing about essential IT skills is that they can be used in so many ways. Understanding network routing will allow you to build stronger and more secure environments in the cloud. A grasp of regular expressions will help you in coding, in using various tools, and more. Understanding the command line, rather than being trapped in the GUI all the time, allows you to automate tasks, perform actions more quickly, and extend functionality.
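
To make the value of one of those basics concrete, here is a trivial, hedged example (the log line and pattern are invented purely for illustration, not taken from any particular device or tool): a couple of lines of regular expression work can pull structure out of text that would be tedious to eyeball.

```python
import re

# An invented syslog-style line, purely for illustration.
line = ("Nov 22 10:14:03 edge-router-1 %LINK-3-UPDOWN: "
        "Interface GigabitEthernet0/1, changed state to down")

# Capture the device name, the interface, and the new state.
match = re.search(
    r"^\S+ \d+ [\d:]+ (\S+) .*Interface (\S+), changed state to (\w+)", line
)
if match:
    device, interface, state = match.groups()
    print(f"{device}: {interface} is {state}")  # edge-router-1: GigabitEthernet0/1 is down
```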

 

It's worth noting that here at SolarWinds we place great stock in enabling our users to think outside the box. We even have a SolarWinds User Group (SWUG) session on doing just that – called “Thinking Outside the Box”.

 

Don't let your desire for structure consume you

In the movie, Mordo begins as an ally, and even friend, of Stephen Strange, but displays certain issues throughout. When he claims to have conquered his demons, the Ancient One replies, "We never lose our demons. We only learn to live above them."

 

Mordo’s desire to both protect the natural order and remain steadfastly within its boundaries proves his undoing: he leaves the sorcerers of Kamar Taj when he finds that both the Ancient One and Doctor Strange have bent the rules in order to save the world.

 

I find this relevant when I see seasoned IT pros forcing themselves to operate within constraints that don't exist, except in their own minds. When I hear IT pros proclaim that they would never run (name your operating system, software package, or hardware platform) in their shop, it's usually not for any sound business reason. And when those standards are challenged, I have watched more than a few seasoned veterans break rather than bend. It's not pretty, and it's also not necessary.

 

There are never too many sorcerers in the world

Mordo's reaction is extreme. He begins hunting down other practitioners of the magical arts and taking their power, proclaiming, "There are too many sorcerers in the world!"

 

There are times in IT when it feels like EVERYONE is trying to become a (again, fill in your technology or specialty here) expert. And it's true that when a whole crop of new folks come into a discipline, it can be tiresome watching the same mistakes being made, or having to explain the same concepts over and over.

 

But the truth is that there are never enough sorcerers, or in our case, specialists, in the world. There's plenty of work to go around. And the truth is that not everyone is cut out for some of these specialties, and they soon find themselves overwhelmed and leave – hopefully to find an area of IT that suits them better.

 

While I don't expect that anyone reading this will magically extract the IT power from their peers, I have watched coworkers shoot down or even sabotage the work of others just so they can maintain their own privileged status. I'm happy to say that this tactic rarely works, and never ends well.

 

Persistence often pays off

At one point in the movie, the Ancient One sends Strange on a trip through alternate dimensions, then asks, "Have you seen that at a gift shop?" When Strange begs her to teach him, her response is a firm “no.” Hours later, Strange is wailing at the door, begging to be let in.

 

At some point in your career, you may have an epiphany and realize that your career goals point you toward a certain technology or discipline. And, just your luck, there's a team that specializes in exactly that! So you go to the manager or team lead and ask if you can join up.

 

Your first request to join the team may fall on deaf ears. And your second. You may need to hang, like a sad puppy dog, around them in the lunchroom or around the water cooler for a while. Unlike Doctor Strange, it may take weeks or even months of persistence, rather than a few hours. But that doesn't mean it's not worth it.

 

Did you find your own lesson when watching the movie? Discuss it with me in the comments below. And keep an eye out for parts 2-4, coming in the following weeks.

The series is a general interest piece and is not related to SolarWinds products in any way, nor will it be used to promote SolarWinds products.

 

It will be hosted on THWACK.com, the free, open user community for monitoring experts.

 

Can you

Tomorrow is Thanksgiving here in the USA. I have much to be thankful for but these days I am most thankful that jennebarbour continues to let me write this series each and every week.

 

So, in that spirit, here's a bunch of links I found on the Intertubz that you may find appetizing, enjoy!

 

AOL is laying off 500 employees in a restructuring with focus on mobile, data and video

My first thought to this was "AOL still has employees?"

 

How to eat as much food as humanly possible this Thanksgiving

For those of us in IT that don't already know how to eat way more than necessary, here's a list to help.

 

Nothing Personal but I'm Taking Your Job

I've said it before, and I will say it again: If you aren't trying to automate your job away, someone else will do it for you.

 

6 links that will show you what Google knows about you

If you were curious to see yourself as Google sees you.

 

How To Ask A Question At A Conference

After a busy event season this is a nice reminder on how to be polite when asking questions during a session. Nobody wants to see you strut (HT to datachick).

 

Live Streaming Web Cam Views from Around the World

I'm wondering how many of these webcams are meant to be public, and how many are simply the result of the owner having no idea.

 

Eat, Fry, Love

If you haven't seen this video yet, you should. It's a great PSA about the dangers of deep-frying a turkey.

 

It won't happen this year, but that won't stop me from dreaming about this:

(image: a turbaconducken)

 

Happy Thanksgiving!

Last week we talked about application-aware monitoring. Rather than placing our focus on the devices and interfaces, we discussed getting data that approximates our users' experiences. These users are going to be distributed around the organization at the very least; they may even be scattered around the Internet, depending on the scope of our application. We need to examine application performance from different perspectives to get a complete picture.

Any way we look at it, we're going to need active remote probes/agents to accomplish what we're looking for. Those should be programmable to emulate application behaviour, so that we can get the most relevant data. At the least, having something that can measure basic network performance from any point on the network is necessary. There are a few options.

NetPath

Last week, I was invited to Tech Field Day 12 as a delegate and had the opportunity to sit in on the first session of Networking Field Day 13 as a guest. Coincidentally, SolarWinds was the first presenter. Even more coincidentally, they were showing off the NetPath feature of Network Performance Monitor (NPM) 12. This product, while not yet fully programmable to emulate specific applications, provides detailed hop-by-hop analysis from any point at which an agent/probe can be placed. In addition, it maintains a performance history for those times when we get notification of a problem well after the fact. For those of you working with NPM 12, I'm going to recommend you have a very close look at NetPath as a beginning for this sort of monitoring. One downside of the NetPath probes is the requirement to have a Windows Professional computer running at each agent location. This makes it a heavier and more costly option, but well worth it for the information that it provides. Hopefully, the SolarWinds folks will look into lightweight options for the probe side of NetPath in the future. We're only at 1.0, so there's a lot of room for growth and development.

Looking at lighter, though less full-featured options, we have a few. They're mostly roll-your-own solutions, but this adds flexibility at the cost of ease.

Lightweight VMs and ARM Appliances

If there's a little bit of room on a hypervisor somewhere, that's enough space for a lightweight VM to be installed. Regular application performance probes can be run from these and report directly to a monitoring station via syslog or SNMP traps. These custom probes can even be controlled remotely by executing them via SSH.

In the absence of VM space, the same sort of thing can be run from a small ARM computer, like a Raspberry Pi. The probe device itself can even be powered by the on-board USB port of another networking device nearby.
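
To make the idea tangible, here is a minimal sketch of what such a probe might look like, assuming Python 3 on the VM or Raspberry Pi; the target URL, collector address, and polling interval are placeholders rather than recommendations, and a real deployment would add proper configuration and error handling.

```python
import logging
import logging.handlers
import time
import urllib.request

TARGET_URL = "https://example.com/health"            # placeholder application endpoint
SYSLOG_COLLECTOR = ("monitoring.example.com", 514)   # placeholder monitoring station

logger = logging.getLogger("net-probe")
logger.setLevel(logging.INFO)
logger.addHandler(logging.handlers.SysLogHandler(address=SYSLOG_COLLECTOR))

def probe_once():
    """Time a single HTTP GET and report the latency (or the failure) via syslog."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(TARGET_URL, timeout=10) as resp:
            elapsed_ms = (time.monotonic() - start) * 1000
            logger.info("probe target=%s status=%s latency_ms=%.1f",
                        TARGET_URL, resp.status, elapsed_ms)
    except Exception as exc:
        logger.warning("probe target=%s failed error=%s", TARGET_URL, exc)

if __name__ == "__main__":
    while True:
        probe_once()
        time.sleep(60)  # one measurement per minute
```

Run something like this from cron or systemd at each remote location and the monitoring station only has to parse a syslog stream; the same skeleton could be extended to emulate more application-specific behaviour.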

Going back to NetPath for a moment, one possibility for SolarWinds is to leverage Windows Embedded and/or Windows IoT as a lightweight option for NetPath probes. This is something I think would be worth having a look at.

On-device Containers

A few networking companies (Cisco's ISR 4K line, for example) have opened up the ability to run small custom VMs and containers on the device itself. This extends the availability of agents/probes to locations where there are no local compute resources available.

Built-in Router/Switch Functions

Thwack MVP byrona had a brilliant idea with his implementation of IP SLA in Cisco routers and having Orion collect the statistics, presumably via SNMP. This requires no additional hardware and minimal administrative overhead. Just set up the IP SLA process and read the statistics as they're generated.
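
As a rough sketch of the collection side, assume an IP SLA operation is already configured on the router, the community string is read-only, and the net-snmp command-line tools are installed on the poller. The OID below is the CISCO-RTTMON-MIB latest-completion-time entry as commonly documented; verify it, and your operation index, against the MIB before trusting the numbers.

```python
import subprocess

ROUTER = "192.0.2.1"     # placeholder management address
COMMUNITY = "public"     # placeholder read-only community string
SLA_INDEX = "10"         # the IP SLA operation number configured on the router

# rttMonLatestRttOperCompletionTime (CISCO-RTTMON-MIB) -- verify against your MIB.
OID = "1.3.6.1.4.1.9.9.42.1.2.10.1.1." + SLA_INDEX

def latest_completion_time_ms() -> int:
    """Poll the router for the most recent IP SLA completion time, in milliseconds."""
    output = subprocess.check_output(
        ["snmpget", "-v2c", "-c", COMMUNITY, "-Oqv", ROUTER, OID],
        text=True,
    )
    return int(output.strip())

if __name__ == "__main__":
    print(f"Latest IP SLA completion time: {latest_completion_time_ms()} ms")
```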

The Whisper in the Wires

NetPath is looking like a very promising approach to monitoring from different points of view. For most other solutions, we're unfortunately still mostly at the roll-your-own stage. Still, we're seeing some promising solutions on the horizon.

What are you doing to get a look at your application performance from around the network?


 

I wanted to share some of the things I heard and saw during the incredible two days I spent with 300+ attendees at DevOps Days Ohio.

 

First, I have to admit that after more than a year of attending DevOpsDays around the country, I'm still working on my own definition of what DevOps is, and how it compares and contrasts with some of the more traditional operations. But this event helped gel a number of things for me.

 

What I realized, with the help of this article (which came out while I was at the conference), is that my lack of clarity is okay, because sometimes the DevOps community is also unclear on what they mean.

 

One of the ongoing points of confusion for me is the use of words I think I know, but in a context that tells me it means something else. Case in point: configuration management. In my world, that means network device configurations, specifically for backing up, comparing, auditing, and rolling out. But then I hear a pronouncement that, "Config management is code," and, "If you are working on configs, you are a developer now." And most confusingly, "To do config management right, you need to be on Git."

If this has ever struck you as strange, then you (and I) need to recognize that to the DevOps community, the server (and specifically the virtualized server) is king, and the config management they're talking about is the scripted creation of a new server in on-premises or cloud-based environments.
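
Incidentally, the networking sense of the phrase maps onto Git just as naturally. Here is a minimal sketch under some loud assumptions: the local repository already exists, and fetch_running_config is a stand-in for however you actually retrieve configs (SSH, an API, an NCM export), not a real library call.

```python
import pathlib
import subprocess

REPO = pathlib.Path("/opt/network-configs")   # an existing local Git repository

def fetch_running_config(device: str) -> str:
    """Stand-in: replace with however you already pull configs (SSH, API, NCM export)."""
    raise NotImplementedError

def commit_config(device: str) -> None:
    """Write the device's running config into the repo and commit it if it changed."""
    config_file = REPO / f"{device}.cfg"
    config_file.write_text(fetch_running_config(device))
    subprocess.run(["git", "-C", str(REPO), "add", config_file.name], check=True)
    # 'git diff --cached --quiet' exits non-zero when something is staged to commit.
    staged = subprocess.run(["git", "-C", str(REPO), "diff", "--cached", "--quiet"])
    if staged.returncode != 0:
        subprocess.run(
            ["git", "-C", str(REPO), "commit", "-m", f"{device}: config change"],
            check=True,
        )

if __name__ == "__main__":
    for device in ["edge-router-1", "core-switch-1"]:   # placeholder device names
        commit_config(device)
```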

 

This led to some hilarious interactions for me, including a side conversation where I was talking about on-call emergencies and the other person said, "I don't know why on-call is even a thing any more. I mean, if a system is having a problem, you should just delete it and rebuild it from code, right? Humans don't need to be involved at all."

 

To which I replied, "Interesting idea, but to my knowledge it's very difficult to delete and re-build a router with a bad WIC using nothing but code."

 

The reply? "Oh, well, yeah, there's that."

 

The point of this is not that DevOps-focused IT pros are somehow clueless to the realities of the network, but that their focus is so intensely trained on optimizing the top end of the OSI model, that we monitoring experts need to allow for that, and adjust our dialogue accordingly.

 

I was honestly blown away to learn how far DevOps culture has made in-roads, even into traditionally risk-averse environments, such as banking. I worked at a bank between 2006 and 2009, right in the middle of the home mortgage crisis, and I could never have imagined something like DevOps taking hold. But we heard from folks at Key Bank who spoke openly about the concerns, challenges, and ultimately successes that their shift to DevOps has garnered them, and I saw the value that cloud, hybrid IT, micro-services, and agile development hold for businesses that are willing to consider them within the context of their industry and implement them rationally and thoughtfully.

 

I was also heartened to hear that monitoring isn't being overlooked. One speaker stated flat out that having monitoring in place is table stakes for rolling out micro-services. This shows an appreciation for the skills we monitoring engineers bring to the table, and presages a potential new avenue for people who simply have monitoring as a bullet item on their to-do list to make the leap into a sub-specialization.

 

There is a lot of work to do, in the form of education, for monitoring specialists and enthusiasts. In one-on-one conversations, as well as in OpenSpace discussions, I found experienced DevOps folks conflating monitoring with alerting; complaining about alerts as noise, while demonstrating a lack of awareness that alerts could be tuned, de-duplicated, or made more sophisticated, and therefore more meaningful; and overlooking the solutions of the past simply because they believed new technology was somehow materially different. Case in point, I asked why monitoring containers was any harder or even different from monitoring LPARs on AIX, and got nervous chuckles from the younger folks, and appreciative belly laughs from some of the old timers in the room.

 

However, I came to the realization that DevOps does represent a radical departure for monitoring engineers in its "Cattle, not Pets" mentality. When an entire server can be rebuilt in the blink of an eye, the best response to a poorly behaving service is often not to fix it in place at all, but to destroy it and redeploy from code. That attitude may take some getting used to for those of us mired in biases from the old days of bare-metal hardware and servers we named after the Brady Bunch or Hobbit dwarves.

 

Overall, I am excited for the insights that are finally gelling in my mind, and look forward to learning more and becoming a more fluent member of the DevOps community, especially during my upcoming talk at DevOpsDays Tel Aviv!

 

One final thing: I gave an Ignite talk at this conference and found the format (five minutes, 20 slides that auto-advance every 15 seconds), to be both exhilarating and terrifying. I'm looking forward to my next chance to give one.

Staying one step ahead of hackers trying to infiltrate an IT environment is challenging. It can be nearly impossible if those tasked with protecting that environment don’t have visibility across all of the systems and infrastructure components. Using unified monitoring software gives integrated cross-domain visibility and a solid view of the whole environment.

 

Let’s take a look at an attack scenario

Perhaps a hacker gains access through a web application with a SQL injection attack against a database server. The attack compromises the database and exfiltrates data or gains credentials.
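
For anyone who hasn't seen the mechanics up close, here is a deliberately tiny illustration of that attack class, using Python's built-in sqlite3 module; the table, credentials, and input are invented for the example and have nothing to do with any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

# Attacker-supplied input typical of a SQL injection probe.
user_input = "' OR '1'='1"

# Vulnerable: the input is concatenated straight into the statement,
# so the OR '1'='1' clause matches every row in the table.
vulnerable = conn.execute(
    "SELECT * FROM users WHERE username = '" + user_input + "'"
).fetchall()
print("concatenated query returned:", vulnerable)   # leaks alice's row

# Safer: a parameterized query treats the input as data, not as SQL.
safe = conn.execute(
    "SELECT * FROM users WHERE username = ?", (user_input,)
).fetchall()
print("parameterized query returned:", safe)         # returns nothing
```

The point for monitoring purposes is that even a blocked or successful injection attempt leaves a distinctive query pattern worth watching for.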

 

With access to the local database or server, the attacker can drop malware that could reverse an administrative session and gain access to other parts of the infrastructure, including routers, switches and firewalls. Attack evidence would likely be found in various places within the environment; such evidence might not trigger an alert, but taken together, these events clearly signal a problem.

 

Visibility leads to quick resolution

With comprehensive monitoring tools, clear insight and consistent education throughout the IT team and all agency personnel, the task can seem less daunting.

 

The tools

First, make sure monitoring tools are in place to provide deep visibility. These include the following:

 

  • Endpoints - User device tracking will provide information about where devices are located, how they connect to the network, and who uses them.
  • Data - Make sure you have monitoring in place that will detect and block malicious file transfer activities, and software designed to securely transfer and track files coming into and going out of the agency.
  • Patching - In large environments, something always needs to be updated. Therefore, it is important to use software that automatically patches servers and workstations.
  • Servers and applications - Always monitor server and application performance. This will help you find service degradation that could indicate an intrusion.
  • Databases - Create performance baselines for databases to ensure that any anomalies are registered.
  • Systems - Deep visibility into virtual machines and storage devices can provide insight into the root cause of any performance change.
  • Networks - Traffic analysis, firewall and router monitoring, and configuration compliance and optimization are all critical to ensuring the integrity of a network.

 

The knowledge

Once these tools are monitoring what they should, the resulting data needs to be fed into a consolidated view where it can be correlated and analyzed as a whole. Doing so lets IT pros quickly and decisively identify potential threats and take action where needed.
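
As a toy illustration of what "correlated and analyzed as a whole" can mean in practice (the event sources, timestamps, and field names below are invented), even a short script that lines up events from different tools by time window will surface patterns that no single console shows.

```python
from datetime import datetime, timedelta

# Imagine these came from three different monitoring tools' exports.
events = [
    {"source": "waf",      "time": datetime(2016, 11, 20, 2, 14), "detail": "SQLi pattern blocked"},
    {"source": "database", "time": datetime(2016, 11, 20, 2, 15), "detail": "anomalous query volume"},
    {"source": "firewall", "time": datetime(2016, 11, 20, 2, 17), "detail": "new outbound connection"},
    {"source": "netflow",  "time": datetime(2016, 11, 21, 9, 0),  "detail": "routine backup transfer"},
]

WINDOW = timedelta(minutes=10)

def correlated_clusters(events):
    """Group events from any source that fall within WINDOW of each other."""
    ordered = sorted(events, key=lambda e: e["time"])
    clusters, current = [], [ordered[0]]
    for event in ordered[1:]:
        if event["time"] - current[-1]["time"] <= WINDOW:
            current.append(event)
        else:
            clusters.append(current)
            current = [event]
    clusters.append(current)
    # Only clusters touching more than one source are interesting here.
    return [c for c in clusters if len({e["source"] for e in c}) > 1]

for cluster in correlated_clusters(events):
    print("possible multi-system incident:")
    for e in cluster:
        print(f"  {e['time']:%H:%M} [{e['source']}] {e['detail']}")
```

A real SIEM or unified monitoring platform does this at scale, with deduplication and scoring, but the underlying idea is the same.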

 

The training

Finally, it is important to make sure that the people who work on the network receive detailed security training. Making everyone aware of the seriousness of an attack and the role each worker plays in practicing good cyber hygiene—from the IT team to finance and public affairs—can go a long way in creating a more secure agency.

 

There is no one-size-fits-all solution when it comes to security, and attacks are becoming harder to prevent. That said, implementing the right tools, combining insights across domains and providing in-depth, regular training can improve detection and response capabilities.

 

Find the full article on Signal.

For the last couple of years, the single hottest emerging trend in technology (the topic of conversation, the biggest buzzword, and a key criterion for designing both hardware and application bases) has been the concept of containers.

 

At this point, we have approaches from Docker, Google, Kubernetes (k8s), Mesos, and notably, Project Photon from VMware. While each differs in the details, the concept is quite similar: the container, regardless of the flavor, packages up the complete application, or its component parts, in a migratable form. These containers work as workloads in the cloud, and allow you to take that packaged piece and run it practically anywhere.

 

This is in direct contrast to virtual machines, which can in some ways accomplish the same tasks, but lack the portability to reside as-is on any platform. A VMware-based virtual machine can only reside on a VMware host; likewise, Hyper-V, KVM, and OpenStack-based VMs are limited to their native platforms. Processes for migrating these VMs to alternate platforms do exist, but the procedures are somewhat intensive. Ideally, you’d simply place your workload VMs in their target environment and keep them there.

 

That model is necessary for many older types of application workloads. Many more modern environments, however, pursue a more granular and modular approach to application development, along microservices lines. These approaches allow for the packaging and repackaging of container-based functions, and for deployments to be relocated essentially at will.

 

In a truly “cloud-based” environment, orchestration of all this functionality becomes an issue. As adoption grows, the management of many containers becomes a bit clumsy, or even overwhelming. The tools from Kubernetes (originally a Google project, later donated to the Cloud Native Computing Foundation) make the management of its “pods” (the basic scheduling units) a bit less of a difficulty. Because the project is open source, the tools are regularly expanded and their functionality grows: the community can access the primitives via sites like GitHub and add to, optimize, and enhance them, and those added capabilities are constantly being folded back in.
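
If you want to poke at that pod-level view yourself, the official Kubernetes Python client makes it a few lines. This assumes you have a working kubeconfig for a cluster and have installed the kubernetes package; it's a read-only sketch, not a management tool.

```python
from kubernetes import client, config

# Uses the same ~/.kube/config credentials that kubectl uses.
config.load_kube_config()
v1 = client.CoreV1Api()

# List every pod the cluster is currently scheduling, across all namespaces.
for pod in v1.list_pod_for_all_namespaces(watch=False).items:
    print(f"{pod.metadata.namespace}/{pod.metadata.name} "
          f"on {pod.spec.node_name} ({pod.status.phase})")
```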

 

Open source is a crucial piece of the equation. If your organization is not pursuing the agile approach and the “crowdsourced” model of IT (which, in my opinion, is closed-minded), then this concept is really not for you. But if you have begun delivering your code in parts and pieces, then you owe it to yourself to pursue a container approach. Transitions can present their own challenges, but the cool thing is that these new approaches can be adopted gradually, the learning curve can be tackled, there is no real outlay for the software, and from a business perspective, the potential benefits on the journey to cloud, cloud-native, and agile IT are very real.

 

Do your research. This isn’t necessarily the correct approach for every IT organization, but it may be for yours. Promote the benefits, get yourself on https://github.com, and begin learning how your organization can change its methods to adopt this approach to IT management. You will not be sorry you did.

 

Some considerations must be addressed prior to making the decision to move forward:

  • Storage – Does your storage environment support containers? In the storage world, object-based storage is truly important.
  • Application – Is your app functional in a microservices/container-based form? Many legacy applications are far too monolithic to be supportable; many new DevOps-style applications are far more functional.

I’m sure that there are far more considerations.

This is a longer version of a talk I give at conferences and conventions. I would love to hear your responses, thoughts, and reactions in the comments below.

 

Do You Care About Being Constantly Connected?

For the next few minutes, I dare you to put down your phone, close up your laptop, and set aside your tablet. In fact, I double dog dare you. I've got $20 on the table that says you can't print this, find a quiet corner, and read it, away from any electronic interruptions in the form of beeps, pings, or tweets.

 

I. Triple. Dog. Dare. You.


Separating ourselves from our devices, and more broadly from the internet that feeds the devices that have become a lifeline for most of us, has been a topic of conversation for some time now. Recently, Andrew Sullivan wrote a column for New York Magazine about coming to terms with his self-described addiction to technology. In "My Distraction Sickness: Technology Almost Killed Me", Sullivan provides sobering data for those of us who spend some (or most) of our days online:

  1. In just one minute, YouTube users upload 400 hours of video
  2. Tinder users swipe profiles over a million times
  3. Facebook users generate 1 billion likes every day
  4. In their regular SnapChatting career, a typical teen will post, receive, or re-post between 10,000 and 400,000 snaps
  5. A study published last year found that participants were using their phones for up to five hours a day…
  6. ... 85 separate times
  7. ... with most interactions lasting fewer than 30 seconds
  8. ... where users thought they picked up their phones half as often as they actually did
  9. Forty-six percent of the study subjects said they couldn't live without their phone

 

It’s important to recognize that we've arrived at this point in less than a decade. The venerable iPhone, the smartphone that launched a thousand other smartphones, debuted in 2007. Four years later, one-third of all Americans owned one. Today, that number is up to two-thirds. If you only count young adults, that figure is closer to eighty-five percent.

 

This all probably comes as a surprise to no one reading this (likely on a smartphone, tablet, or laptop). Equally un-surprising is that, intellectually, we know what our screens are pulling us away from:

 

Life.

 

In his essay "7 Important Reasons to Unplug and Find Space", columnist Joshua Becker wrote:

 

"Life, at its best, is happening right in front of you. These experiences will never repeat themselves. These conversations are unfiltered and authentic."

 

In that same article, Mr. Becker quotes the progenitor of smartphones and digital patron saint of many IT pros, Steve Jobs, who said,

 

“We’re born, we live for a brief instant, and we die. It’s been happening for a long time. Technology is not changing it much – if at all.”

 

But it doesn't stop there. We should already understand what studies are showing:

 

I want to be clear though: this article is NOT about how bad it is to be connected. It would be disingenuous for me, someone who spends a majority of his day in front of a screen, to write only about how bad it is to be connected. Not to mention, it wouldn't be particularly helpful.

 

My goal is to make it clear why disconnecting, at times and for a significant amount of time, is measurably important to each of us and can have a very real impact on the quality of our life, both online and off.

 

The Secret Society

You've probably read essays suggesting you take a technology cleanse or a data diet, as if the bits and packets of your network have gotten impacted and are now backing up the colon of your brain.

 

If you have heard such suggestions, you may have responded with, "What kind of crazy wing nut actually does that?"

 

Now I’d like to share a little secret with you: I belong to a whole group of wing nuts who do this every week. We call this crazy idea Shabbat, or the Sabbath in English, and it is observed by Jews across the world.


(image courtesy Yehoshua Sofer)

 

Before I go any further, you should know that Judaism is not big on converting people so I'm not going to try to get anyone to join the tribe. I'm also not going to ask you to sign up for Amway.

 

On Shabbat, which begins at sundown Friday night and ends at sundown Saturday, anything with an ON switch is OFF limits. It can't be touched, moved, or changed. Yes, leaving the television set to SportsChannel 24 and just happening to walk past it every 10 minutes is cheating. And no, you don't sit in the dark and eat cold sandwiches. But I’ll talk more about that later.

 

But Shabbat only comes into play if you are one of the roughly 600,000 Jews in the United States (or 2.2 million worldwide) who fully observe the Sabbath. Which begs the question: if I'm not going to try to get YOU to be Jewish, where am I going with this?

 

In addition to being part of that crazy group of wing nuts, I've also worked in IT for 30 years. For almost a decade now, I've disconnected every single week, rain or shine, regardless of my job title, company, or on-call rotation. That has given me a unique perspective in tips, tricks, workarounds, and pitfalls.

 

So this is less of a you-should-unplug lecture, and more of a here’s HOW to unplug and not lose your job (or your marriage, or your mind) conversation.

 

Remember, this is just part one of a 3-part series. I'm looking forward to hearing your thoughts, suggestions, and ideas in the comments below!

The View From Above: James (CEO)

 

Another week, another network problem. On Tuesday morning I received an angry call from our CFO, Phyllis, who was visiting our Austin, TX site. The whole network is a mess, she told me, nothing is working properly and I can't do my job. I asked for more detail, but she just said the network was a nightmare and she couldn't even send emails. Great start to the day, especially as Austin is our main manufacturing plant, and if the network was as bad as Phyllis said it was, we were in for a bad week with our supply chain getting out of sync, which could negatively impact both our cashflow and our production output.

 

I called our new Senior Network Manager, Amanda, to let her know that the Austin office was down. She sounded surprised; apparently she had just been talking to the Inventory Management team, and they had been telling her that they were quite pleased with the performance of the company's inventory tool, especially given that it is based out of our data center in Raleigh, NC. I put her in touch with Phyllis and told her to figure out what was going on, because clearly things in Austin weren't going as great as she thought they were.

 

The View From The Trenches: Amanda (Sr Network Manager)

 

Two weeks have passed since I installed SolarWinds' Network Performance Monitor (NPM), and so far things have been good. I should have guessed that the quiet wouldn't last long, however. I got a call from James around 10AM on Tuesday, and he was mad. Apparently Phyllis was on site in Austin, TX and told him that the network was broken. I knew it wasn't; I had just been talking to the Inventory Management team about a project to implement handheld (WiFi) scanners, and they've been testing their old wired scanners in parallel to the WiFi scanners, and both have been working just great, so hopefully both the wireless and wired networks are functioning OK. Still, if Phyllis is upset, it's more than my job's worth to ignore her.

 

Phyllis is without question good at her job, but I get the impression that she would be happier using a large paper ledger and a pot of ink (and maybe even a feather quill pen). Computers are, in her eyes, an irritation, and trying to troubleshoot her problems over the phone is challenging, to say the least. However, after a while I did manage to figure out what the problem was. It turns out that "everything is down" actually meant "my email is working intermittently." About 9 months ago we moved our email to Microsoft's Office365, so the mail servers are now accessed via the internet. I confirmed with Phyllis that she was able to access our intranet without issue, which told me that our site network was not the problem (I knew it!), but when she tried accessing the Internet -- including Outlook365 -- she was having problems. It wasn't a total loss of connectivity, but things were slow, and she would sometimes lose her connection to the server altogether. Sounds like an Internet issue, but what - and where?

 

Time to fire up a browser to NPM. I checked the basics, but all the network hardware seemed fine, including our Internet routers and edge firewalls, so maybe it was something on the Internet itself. Unfortunately, I know how these things work; if I can't prove where the problem is, the assumption is still that it's the network at fault. As I stared at the screen, the phone rang; Phyllis was on the line. I don't know why it took so long, she said, but it looks like whatever you did worked. Finally I can get on with my day's work. And she hung up. Had she stayed on the line I'm not sure if I would have admitted that I'd done nothing, but at least the immediate pressure seemed to be off. But what caused the problem? And worse, now that the problem had cleared itself up, there weren't really any tests I could do to troubleshoot it. At this point, I remembered NetPath.

 

When I installed NPM, I installed a bunch of probes and set up some monitoring of a number of services to see what it would look like. My idea was that I'd be able to monitor network performance from a few sites, but I got so consumed with setting up device monitoring I pushed that aside for a bit. In the background however, the probes had been faithfully gathering data for me about their connectivity to a number of key sites including -- by incredible good fortune -- the email service. I started off by checking what the NetPath traffic graph looked like right now, when data was successfully flowing to Office365. NetPath had identified that traffic seemed to pass through one of three potential service providers between our Austin site's internet provider and the Office365 servers on the Internet, with the vast majority (around 80%) likely to be sent through TransitCo, a large provider in Texas and the South Central states. At the bottom of the screen was the Path History bar, and it was clear to see that while everything was now green, there was a large chunk of red showing on the timeline for both availability and latency. Time to wind the clock back.

 

When I clicked on one of the red blocks, the NetPath display updated and ... whoa ... ok, that explains it. TransitCo's router was lit up in red (along with some attached links) and NetPath was reporting 90% packet loss through that path, and extremely high latency. No wonder Phyllis was having problems staying connected! Data in hand, I called up TransitCo to ask them about their service interruption and they confirmed that an interface had gone bad but the routing engine had for some reason kept on pumping traffic down that link. They had completed a reboot and an interface replacement around 30 minutes earlier, and service was restored. Amazing. Our own Internet provider wouldn't have reported this as it wasn't their direct problem, and there's no way we could sign up for alerts from every other provider just to keep abreast of the outages. If we hadn't had this tool, I'd still be scratching my head wondering what on earth had happened this morning. Still, until I figure out a way to get a better handle on upstream provider problems, at least I can now go back and report on the cause and scope of the outage. And maybe I can sell my VP on funding a secondary Internet link out of Austin from another provider, just in case something like this happens again.

 

I've not even had it installed for a month, but SolarWinds NPM saved the day (or my reputation, at least). I think I'll be checking out what other products they have.

A successful help desk seeks to solve incidents quickly, find resolutions to persistent problems, and keep end-users happy. The help desk is the first line of defense triaging tickets and working with end-users directly to fix their technical problems, and this is no easy task.

 

In order to keep ticket queues low and morale high, help desk managers should consider these three key principles:

 

1) Dedicated People

2) Established Processes

3) Centralized Information

 

Dedicated people is the first key principle.

 

A help desk doesn’t necessarily need senior-level engineers with advanced degrees and 10 years’ experience. Instead, a solid first line of defense requires a team of hard workers who know how to locate information on internal information repositories and how to Google solutions to weird Windows and printer issues. The key here is hard work and dedication. I don’t mean dedication to showing up on time, necessarily, though that’s certainly important. What I mean is a dedication to getting the issue at hand resolved.

 

For example, during my first year in IT, I worked on a help desk serving a large government agency. We had hundreds of new tickets in the queue every day. My co-worker, Don, made it his simple goal to close as many tickets per week as he could. Don was already in his 30s and had changed careers from restaurant management, so he didn’t have decades of experience along with advanced computer science degrees and industry certifications. What he did have was a sheer determination to figure out an issue and get the problem fixed. Our end-users loved him and often asked for him specifically. He browsed through our internal wikis and Googled his life away looking for a way to fix an issue, and nearly every time he eventually figured it out.

 

This is what a good help desk needs: people who know how to do basic online research and are dedicated to sticking with an issue until it’s resolved.

 

Having clear, established processes is the second key principle.

 

My friend Don would have had a much more difficult time resolving tickets without the processes in place to enable him to get the job done. For example, a service desk manager must determine how tickets will be logged and organized, how they will be triaged, how they will be escalated, and how to provide quick information to help desk technicians to solve new tickets as they come in.

In my experience this means first finding the right ticket management system. Whether it’s in the cloud or on local servers, a solid ticket management system will make it easy for end-users to submit tickets and for the service desk to organize, triage, and resolve them. I personally prefer a single source of truth in which the ticketing system is not only a way to organize tickets but also an information repository and a method to communicate with end-users. In this way technicians can log into one system and find everything they need to get the job done. Navigating multiple systems and many windows is a sure-fire way to forget (or ignore) tickets and spend way too much time looking up simple information such as license keys or asset locations.

 

Another important part of clear and established help desk processes is accountability. This must be built into the help desk processes and not just assumed. Tickets get lost, and sometimes they’re ignored. This may be because the help desk is dealing with a huge number of tickets with too few people, but I’ve seen many tickets ignored because they were difficult, long-winded, or because the end-user was a well-known jerk.

 

Rather than have tickets come in from end-users into a general queue, consider having them all go first to a help desk manager or team lead to very quickly triage and be assigned to the appropriate technician. I have seen struggling service desks go from zero to hero implementing just this one simple process.

 

A decent ticketing system will have escalation timers, auto-responders, and many other built-in tools to automate workflow, but don’t rely on the software alone to maintain some semblance of order. This is a top-down process beginning with help desk managers and team leads.

 

Maintaining a centralized, updated information repository is the last key principle.

 

Let’s face it, most companies use Windows computers for their end-users. Yes, I know there are exceptions, but even Apple devices and various flavors of Linux are not custom-built operating systems that no one has ever heard of. That means many end-user issues are not unique to any one company. What is unique is the company-specific knowledge.

 

What are the IP addresses of the domain controllers? Where is the installation file for the billing software kept? Does the new branch office use a Windows DHCP server or is it running off their core switch?

 

Having a centralized repository of information is priceless to a helpdesk technician. Better yet is when the repository is also the ticket management system, and even better yet is when it also contains documentation for how to solve recurring issues or how to install weird company software.

 

In my first job as a network engineer, I worked near the service desk, which sat in the next cubicle area. As the number of customers grew, so did the number of technicians, and so did the amount of information needed to resolve tickets. We used a great ticket management system and kept as much information as possible in it. We also used an internal wiki page, but in order to get to it you had to follow a link embedded in the ticketing system.

 

They were able to support several thousand end-users with a help desk of only three technicians and one service desk manager. So important were these principles that if anyone discovered that information that should have been in the database wasn't there, whoever was responsible for getting it in there had to bring in donuts for the entire office. Yes, I brought donuts in a couple times, and so did our service desk manager and even the owner of the company.

 

There are volumes that can be written on how to provide successful end-user support. These three principles may be broad, and I’ve seen them implemented in very different ways. However, so long as you have dedicated people, clear processes, and an updated information repository, the help desk will be the successful first line of defense every CIO and Director of IT dreams of. 

 

You're sitting back at the office getting work done, keeping the ship afloat, living the Ops life under the perception of DevOps, only to have your IT director, VP, or CxO come and demand, "Why aren't we using containers?! It's all the rage at FillInTheBlankCon!" And they start spouting off the container of the week: Kubernetes, Mesos, Docker, CoreOS, Rocket, Photon, Marathon, and an endless list of other container products, accessories, and components. If it hasn't happened to you, that may be a future you'll be looking at. If it has happened to you, or you've already adopted some approach to containers in your environment, even better.

 


Just as a VERY brief primer on the infinite world of containers, for those of you who are not aware, I'll oversimplify it here by comparing containers to virtualization. Typically, virtualization is hardware running a hypervisor that abstracts the hardware; you install an operating system on the VM and then install your applications into that. In most container scenarios, by contrast, you have hardware running some kind of abstraction layer that presents containers into which you install your applications, abstracting out the operating system.

 

That is quite possibly the most oversimplified version of it, because there are MANY moving parts under the covers to make this a reality. However, who cares how it works as much as how you can use it to improve your environment, right?!

 

That’s kind of the key here. Docker, one of the more commonly known container approaches (albeit technically Kubernetes is used more), has some really cool benefits and features. Docker officially supports running Docker containers on Microsoft servers, Azure, and AWS, and they also released Docker for Windows clients and OSX! One particular benefit that I like as a VMware user is that PowerCLI Core is now available on Docker Hub (http://www.virtuallyghetto.com/2016/10/powercli-core-is-now-available-on-docker-hub.html)!

But they don’t really care about how you’re going to use it, because all roads lead to DevOps and how you’re supposed to implement things to make their lives better. In the event that you're forced down the road of learning a particular container approach, for better or worse, it’s probably best to find a way to have it make your life better rather than letting it become just another piece of infrastructure we’re expected to understand even if we don’t. I’m not saying that one container is better than another; I’ll leave it up to you to make that determination in the comments if you have container stories to share. Though I’m partial to Kubernetes when it comes to running cloud services on Google, I really like Docker when it comes to running Docker for OSX (because I run OSX).
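
If you'd rather kick the tires from a script than from the CLI, the Docker SDK for Python talks to the same local engine that Docker for Windows or OSX installs. This assumes you have installed the docker Python package and the engine is running; the alpine image is just a small, commonly available one, not a recommendation.

```python
import docker

# Connect to the local Docker engine using the standard environment settings.
client = docker.from_env()

# Pull a small public image and run a one-off command inside it.
output = client.containers.run("alpine:latest", "echo hello from a container", remove=True)
print(output.decode().strip())

# List whatever is currently running on this host.
for container in client.containers.list():
    print(container.short_id, container.image.tags, container.status)
```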

The applications are endless and continually growing, and the solutions are plentiful; some might say far too plentiful. What are some of the experiences you’ve had with containers, the good, the bad, and the ugly? Or is it an entirely new road you’re looking at pursuing but haven’t yet? We’re definitely in the no-judgement zone!

As always, I appreciate your insight into how y'all use these technologies to better yourselves and your organizations as we all grow together!

 

Love it or hate it, Office 365 is here to stay. Most companies have not made the transition to Office 365 yet, and I say "yet" because it is only a matter of time before the majority of businesses are running email in the cloud. When you are planning to move email to the cloud, there are many considerations to weigh, and one of them is your network.

 

Your network is key when you want to live in the cloud. One of the first things you should do after you’ve made the decision to go to the cloud is review your network and estimate how much bandwidth you will use. Office 365 increases usage because of Outlook synchronization and the downloading of templates. The number of users connecting to the cloud and the type of tasks they do will impact your bandwidth. Network performance is also affected by what users are doing; for instance, if everyone is streaming video or holding multiple video conference calls on your network, that will certainly drive up bandwidth usage, which can impact your connectivity to cloud services.

 

Migrating to Office 365 is not an overnight task, as some may think. It can take weeks to months to be completely migrated to the cloud, and a lot of this depends on your network. It is highly recommended to test and validate your internet bandwidth, as this will impact your migration. Mailbox sizes will also affect how fast or slow the migration to the cloud will be. Let’s say your organization has about 100 TB of email in your on-premises environment and you want to migrate all of that to the cloud. I will tell you it will not happen in days; it will be more like months. Keep in mind that Microsoft throttles how much data you can pump into their network each night. Say you are using a 100 Mbit/s internet connection: even at a theoretical 100% utilization, moving 100 TB takes roughly three months of continuous transfer, and once throttling and other outside factors that affect speed and bandwidth are figured in, your real estimate would likely be closer to 10-12 months.
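
To make the back-of-the-envelope math explicit (the numbers are illustrative, and the 25% effective-utilization figure is just an assumption standing in for throttling, change windows, and retries), here is the raw transfer-time calculation:

```python
# Rough transfer-time estimate for a large mailbox migration.
data_tb = 100                  # total mail data to move, in terabytes
link_mbit_per_s = 100          # internet link speed, in megabits per second
effective_utilization = 0.25   # assumption: throttling/overhead leave ~25% of the link usable

data_bits = data_tb * 1e12 * 8
seconds_at_full_speed = data_bits / (link_mbit_per_s * 1e6)
seconds_realistic = seconds_at_full_speed / effective_utilization

print(f"At 100% utilization: {seconds_at_full_speed / 86400:.0f} days")
print(f"At {effective_utilization:.0%} effective utilization: {seconds_realistic / 86400:.0f} days")
```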

 

Slow internet means a slow migration and possible failures along the way. If you still have slow MPLS sites, these are not ideal for Office 365; however, Microsoft has partnered with a select few providers to offer ExpressRoute, a private connection to Microsoft Office 365. Microsoft also has tools you can use to help estimate your network requirements. One of the important ones to look at is the Exchange Client Network Bandwidth Calculator, which estimates the bandwidth required for Outlook, Outlook Web App, and mobile devices.

 

Once you have made it to the cloud, it does not stop there. Ongoing performance tuning may be needed to ensure that your users are happy and do not experience email "slowness." Given that Microsoft has published best-practices articles on using Office 365 over slow networks, I am pretty sure your network folks will be called a lot to check network performance. Those articles give some recommendations, such as:

 

  • Upgrade to Outlook 2013 SP1 or later for substantial performance improvements over previous versions.
  • Outlook Web App lets you create offline messages, contacts, and calendar events that are uploaded when OWA is next able to connect to Office 365.
  • Outlook also offers an offline mode. To use this, you must first set up cached mode so that information from your account is copied down to your computer. In offline mode, Outlook will only try to connect based on your send/receive settings, or when you manually set it back to work online.
  • If you have a smart phone, you can use it to triage your email and calendar over your phone carrier's network. (Yes, this is a real alternative…)

 

At the end of the day, it comes down to making sure your network is up to snuff when you are making your way to the cloud, or you may have some headaches. Good luck!

Though we can sit around and talk about the threat of Skynet (as we have a little in the comments on my previous posts), it seems the tech world is committed to the pursuit and enhancement of artificial intelligence. In fact, you're almost not a leading tech company right now if googling your name plus "artificial intelligence" returns zero results. AI startups are also in hot demand. So what exactly are the technology leaders planning?

 

Microsoft: CEO Satya Nadella doesn't keep it a secret that AI is key for his company: “AI is at the intersection of our ambitions.” Even with the socially failed Tay chatbot experiment, Microsoft learnt that, at least in the USA, chatbots need to be built to be resilient to attacks. Most recently, Microsoft announced a partnership to support Elon Musk & Co's OpenAI non-profit AI research organization with Azure computing power. Microsoft is staying true to its corporate mission by democratizing AI so it's accessible to every person and every organization on the planet, to help them achieve more.

 

Google: Google's research division has been hard at work on Machine Intelligence for years, with 623 publications to date in their library (which they are happy to share publicly). Parent company Alphabet counts the neural network company DeepMind in its collection, acquired in 2014. Within the last few days, Google has added Jia Li (head of research at Snapchat) and Fei-Fei Li (director of the AI lab at Stanford University) to lead a new group within the Google Cloud division.

 

Facebook: They've got access to your data and already have a reputation for serving you targeted information. Facebook is focusing on how to scale, to deliver on promises like “We're trying to build more than 1.5 billion AI agents—one for every person who uses Facebook or any of its products.” Joaquin Candela, the head of the Applied Machine Learning group, wins the award for my favorite AI quote, though: “We tend to take things like storage, networking, and compute for granted,” he says. “When the video team builds live video, people don't realize the magnitude of this thing. It's insane. The infrastructure team is just there, shipping magic—making the impossible possible. We need to do the same with AI. We need to make AI be so completely part of our engineering fabric that you take it for granted.” As an infrastructure junkie, I like anyone who calls my work ‘magic’.

 

Apple: Jumping on the AI bandwagon, Apple is kind of sad that its AI is so unobtrusive that people don't even realize Apple's in the AI game. And we're not just talking about Siri. Apple doesn't have a dedicated machine learning department, but the capability underpins many of its products. They are certainly quieter than other brands about what they are working on behind the scenes. One interesting development is the enhancement of image-processing software using AI, so physical lens hardware will no longer be the defining factor in camera capability.

 

Cisco: Not wanting to miss out either, Cisco has developed its own virtual assistant called Monica. I'd never heard of her except as the girl from Friends, but it's been a few years since I touched a corporate telepresence system. Monica is restricted to the office right now, but Cisco has plans to increase her usefulness. It could be handy to say, ‘Monica, find me the PowerPoint that Jo presented last Thursday.’ Back at its core business, Cisco has also been smart with AI company acquisitions, snapping up Cognitive Security, which uses AI techniques to detect advanced cyber threats.

 

IBM: The granddaddy of AI, IBM's Watson supercomputer has grown up a little from winning games of Jeopardy. At CES 2016, IBM CEO Ginni Rometty unveiled strategic partnerships with sportswear maker Under Armour, SoftBank Robotics' Pepper, and more. IBM's Cognitive Business Solutions unit is banking on AI as the future of business, with smarts like: “When people ask how Watson is different than a search engine, I tell them to go on Google and type 'anything that's not an elephant.' What do you get? Tons of pictures of elephants. But Watson knows those subtle differences. It understands that when feet and noses run, those are very different things.”

 

Elon Musk: To put this paragraph under just the title of Tesla would do the man a disservice. Musk and his business partners formed the OpenAI research company to stop AI from ruining the world.  

 

There you have it. The tech giants are determined to make this happen, whether we like it or not...

 

-SCuffy

Back home after a week in Redmond for the annual Microsoft MVP Summit. It was weird being there for the days before and after the election. In a way I felt as if we were in a bubble, focused on the MVP sessions all day long and isolated from what was happening elsewhere.

 

It was awesome.

 

Here's a bunch of links I found on the Intertubz that you may find interesting, enjoy!

 

Smart Light Bulb Worm Hops from Lamp to Lamp

As if I needed yet another thing to worry about, now light bulbs may be attacking me. I'm starting to think wearing a hat made from aluminum foil might be a smart fashion choice.

 

How to avoid becoming a part of a DDoS attack?

I suppose I could go live in a shack in Montana, but this list might be worth trying first.

 

More with SQL Server on Linux

Words cannot express how excited I am for SQL Server on Linux. Well, maybe these words: sudo yum install -y mssql-server

 

Employees are faster and more creative when solving other people's problems

Well, this possibly explains why some people feel the need to act as if they are the smartest person in the room, but I'm still going to just think they are being a jerk.

 

Because Every Country Is the Best at Something

How is it that Brazil is not the best at Brazil nuts?

 

The Best Ways to Get Rid of Unwanted Data

Setting aside the philosophical discussion about how information can neither be created nor destroyed, these are good tips for those of us who sometimes need to make an effort to destroy data.

 

By 2020, 92 Percent of All Data Center Traffic Will be Cloud

It's rare these days for me to find someone who still denies the Cloud is a thing, and I suspect that such perceptions will all but disappear by 2020.

 

Found outside my hotel room last Wednesday morning and my first thought was "is this where the Hadoop sessions are taking place?":

[Image: gop-hadoop - 1.jpg]

The View From Above

Being a CEO is not the easy job some people think it is. As a CEO I'm pulled in multiple directions and have to do my best to balance the needs of the business with the needs of the shareholders, deal with crises as they arise, and reassure both our investors and our employees that the company is strong and has a positive future. All of this gets a bit tricky when our websites -- the place where we make 60% of our revenue, by the way -- keep going down and we lose sales.

For the technical teams the problem ends when the website starts working again, but for me the ripples from each outage keep spreading for months by way of missed revenue targets, the impact on our supply chain as our order volume fluctuates, and the requests for interviews from analysts who are concerned that we will not make the numbers we anticipated if our customers can't buy things. A single big outage can make my life a misery for weeks on end, so it's not surprising, perhaps, that I am less than impressed with our network; or the “NOTwork,” as I have come to know it over the last six months.

You know, my home network stays up for months on end without interruption, so with all the money we spend on equipment and employees I'd hope we could do the same, but apparently I'm wrong. If we don't fix this soon, I'm just going to instruct the CTO to move everything to the cloud and we'll dump those useless network idiots. I fired the Senior Network Manager last month, and not a moment too soon if you ask me. I only hope his replacement is better than he was; I'd like to get a good night's sleep once in a while, without worrying about whether the stock price will be plunging tomorrow.

The View From The Trenches

My first few weeks as the Senior Network Manager have been, well, challenging to say the least. My predecessor, Paul, was fired after a shouting match between him and the CEO because of yet another major outage that was being blamed on the network. After our websites had been down for two hours, James (our CEO) stormed down and practically dragged Paul into a conference room and slammed the door behind him. So while I wasn't actually in the room when the showdown occurred, that didn't make much difference because even through the closed door there was no mistaking who the CEO felt was responsible, despite Paul's protestations to the contrary.

Three weeks later we're still not entirely sure how that outage started, and worse, we have no idea how it finally ended. Of course, it may be that somebody actually does know, but doesn't want to admit being the culprit, especially after hearing James losing his mind in that room. The next day, Paul didn't come to work and I was called into my VP's office where I was given the news that I was being promoted to his position effective immediately. Did you ever get a gift you weren't sure you really wanted? Yeah; that.

So, now that you're caught up on how I got into this mess, I need to get back to figuring out what seems to be wrong with the network. I always thought Paul had his finger on the pulse of the network, but once I started spending more time looking at the network management systems, I began to wonder how he figured anything out. It seems that our “availability monitoring” was being accomplished by a folder full of home-grown perl and shell scripts which pinged the network equipment in the data center and sent an email to Paul when a device became unavailable. I mean, that sort of worked, but the scripts weren't logging anything, so there was no historical data we could use to calculate uptime. Plus, a ping could take up to a second to come back before it timed out and was counted as a failure, so even if network or device performance was completely terrible, nobody would have known about it. What I realized is that when Paul was proudly telling the Board that the network had “four-nines uptime,” he must have been pulling that figure out of the air. I can't believe he got away with it for so long. He might have been right, but neither he nor I could prove it, and I refuse to lie about it now that my neck is on the line.
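For the record, even a quick-and-dirty script could have logged enough to calculate real uptime. Here's a rough sketch of the sort of thing I mean, in Python rather than the Perl that Paul's folder was full of; the device addresses and log file name are made up, and the ping flags assume a Linux box.

# Minimal availability check that actually records results, so uptime can be
# calculated later. Device IPs, log path, and Linux-style ping flags are assumptions.
import csv
import subprocess
import time
from datetime import datetime, timezone

DEVICES = ["10.0.0.1", "10.0.0.2"]   # hypothetical core switches
LOGFILE = "availability_log.csv"

def ping_once(host, timeout_s=1):
    """Return round-trip time in ms, or None if the host didn't answer in time."""
    start = time.monotonic()
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    elapsed_ms = (time.monotonic() - start) * 1000
    return elapsed_ms if result.returncode == 0 else None

with open(LOGFILE, "a", newline="") as f:
    writer = csv.writer(f)
    for device in DEVICES:
        rtt = ping_once(device)
        writer.writerow([
            datetime.now(timezone.utc).isoformat(),
            device,
            "up" if rtt is not None else "down",
            f"{rtt:.1f}" if rtt is not None else "",
        ])

Logging a timestamp, a status, and a latency figure for every poll is the difference between quoting “four nines” to the Board and being able to prove it.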

First order of business, then, was to get some proper network management in place. I didn't inherit a huge budget and I was in a hurry, so I used my corporate Amex to grab a copy of SolarWinds NPM. At least now I'm gathering some real data to work with, and if (when!) the next outage occurs, maybe I'll see something happening that will give me a clue about what's going on. The executive team has finally put a woman in charge of the network, and I'm going to show them just what I'm capable of.

 

To Be Continued...
