
Geek Speak

24 Posts authored by: cxi


 

IT professionals are a hardworking group. We carry a lot of weight on our shoulders, a testament to our past and future successes. Yet, sometimes we have to distribute that weight evenly across the backs of others. No, this is not because we don’t want to do something. I’m sure that any of you, while capable of performing a task, would never ask another person to do something you wouldn’t willingly do yourself. No. Delegating activities to someone else is actually something we all struggle with.

 

Trust is a huge part of delegating. You're not only passing the baton of what needs to be done to someone else, but you’re also trusting that they’ll do it as well as you would, as quickly as you would, and -- this is the hard part -- that they'll actually do it.

 

As the world continues to evolve, transition, and hybridize, we are faced with this challenge more often. I’ve found there are some cases where delegation works REALLY well, and other cases where I’ve found myself banging my head against the wall, desk, spiked mace, etc. You know the drill.

 

One particular success story that comes to mind involves the adoption of Office 365. Wow! My internal support staff jumped for joy the day that was adopted. They went from having to deal with weird, awkward, and ridiculous Exchange or Windows server problems on a regular basis to... crickets. Sure, there were and still are some things that have to be dealt with, but it went from daily activity to monthly activity. Obviously, any full-time Exchange admin doesn't want to be replaced by Robot365, but if it's just a small portion of your administrative burden that regularly overwhelms, it's a good bet that delegating is a good idea. In this particular use-case, trust and delegation led to great success.

 

On the other hand, I’ve seen catastrophes rivaled only by the setting of a forest fire just for the experience of putting it out. I won’t name names, but I've had rather lengthy conversations with executives from several cloud service providers we all know and (possibly) love. Because I’m discussing trust and delegation, let’s briefly talk about what we end up trusting and delegating in clouds.

 

  • I trust that you won’t deprecate the binaries, libraries, and capabilities that you offer me
  • I trust that you won’t just up and change the features that I use and my business depends on
  • I trust that when I call and open a support case, you’ll delegate activities responsibly and provide me with regular updates, especially if the ticket is a P1

 

This is where delegating responsibility and trusting someone to act in your best interest versus the interests of themselves or some greater need beyond you can be eye-opening.

 

I’m not saying that all cloud service providers are actively seeking to ruin our lives, but if you talk to some of the folks I do and hear their stories, THEY might be the one to say that. This frightful tale is less about the fear and doubt of what providers will offer you, and more about being aware and educated about the things that could possibly happen, especially if you aren’t fully aware of the bad things that happen on the regular.

 

In terms of trust and delegation, cloud services should provide you with the following guarantees:

  • Trust that they will do EXACTLY what they say they will do, and nothing less. Make sure you are hearing contractual language around that guarantee versus marketing speak. Marketing messages can change, but contracts last until they expire.
  • Trust that things DO and WILL change, so be aware of any deprecation schedules, downtime activities, impacts, overlaps of changes, and dependencies that may lie within your business.
  • Delegate to cloud services only those tasks and support that may not matter to your production business applications. You want to gauge how well they can perform and conform to an SLA. It’s better to be disappointed early on when things don’t matter than to be in a fire-fight and go looking for support that may never come to fruition.

 

This shouldn't be read as an attack or assault on cloud services. Instead, view this as being more about enlightenment. If we don’t help make them better support organizations, they won’t know to and will not improve. They currently function on a build-it-and-they-will-come support model, and if we don’t demand quality support, they have no incentive to give it to us.

 

Wow! I went from an OMG Happy365 scenario to cloudy downer!

 

But what about you? What kinds of experiences with trust and delegation have you had? Successes? Failures? I’ll offer up some more of my failures in the comments if you’re interested. I would love to hear your stories, especially if you've had contrary experiences with cloud service providers. Have they gone to bat for you, or left you longing for more?


 

 

 

 

Hey, guys! This week I’d like to share a very recent experience. I was troubleshooting, and the information I was receiving was great, but it was the context that saved the day! What I want to share is similar to the content in my previous post, Root Cause, When You're Neither the Root nor the Cause, but different enough that I thought I'd pass it along.

 

This tale of woe begins as they all do, with a relatively obscure description of the problem and little foundational evidence. In this particular case it was, “The internet wasn't working on the wireless, but once we rebooted, it worked fine.” How many of us have had to deal with that kind of problem before? Obviously, all answers lead to, “Just reboot and it’ll be fine." While that’s all fine and dandy, it is not acceptable, especially at the enterprise level, because it offers no real solution. Therefore, the digging began.

 

The first step was to figure out if I could reproduce the problem.

 

I had heard that it happened with some arbitrary mobile device, so I set up shop with my MacBook, an iPad, my iPhone and my Surface tablet. Once I was all connected, I started streaming content, particularly the live YouTube stream of The Earth From Space. It had mild audio and continuous video streaming that could not buffer much or for long.

 

The strangest thing happened in this initial wave of troubleshooting. I was able to REPRODUCE THE PROBLEM! That frankly was pretty awesome. I mean, who could ask for more than the ability to reproduce a problem! Though the symptoms were some of the stranger parts, if you want to play along at home, maybe you can try to solve this as I go. Feel free to chime in with something like, “Ha ha! You didn’t know that?" It's okay. I’m all for a resolution.

 

The weirdest part was that devices connecting with the older, slower wireless standards, 802.11a and 802.11n, were working like a champ, or at least seemingly so. They didn't skip a beat. I was able to reproduce the problem best with the MacBook connected at 802.11ac at the highest speeds available. But when it would transfer from an AP on one channel to another AP on another channel, poof, I would lose internet access for five minutes. Later, it was proven to be EXACTLY five minutes (hint).

 

At the time, though, like any problem in need of troubleshooting, there were other issues I had to resolve first because they could have been symptoms of this problem. Support even noted that these symptoms matched a particular known problem, and things were all fine and dandy once adjusted in the direction I preferred. Alas, that didn't solve my overwhelming problem of, "Sometimes, I lose the internet for EXACTLY five minutes." Strange, right?

 

So, I tuned up channel overlap, modified how frequently devices roam to a new access point and find their new neighbor, cleaned up the interference in the area, and got it working like a dream. I could walk through zones, transferring from AP to AP over and over again, and life seemed great. But then, poof, it happened again. The problem would resurface, its signature registering an EXACT five-minute timeout.

 

This is one of those situations where others might say, “Hey, did you check the logs?” That's the strange part. This problem was not in the logs. This problem transcended mere logs.

 

It wasn't until I was having a conversation one day and said, "It's the weirdest thing. The connection, with a full wireless signal, minimal to no interference, and nothing erroneous showing in the logs, would just die, for exactly five minutes." My friend chimed in, "I experienced something similar once at an industrial yard. The problem would surface when transferring from one closet-stack to another closet-stack, and the MAC refresh tables were set to five minutes. You could shorten the MAC refresh timeout, or simply tunnel those particular connections back to the controller."

 

That prompted an A-ha moment (not the band) and I realized, "OMG! That is exactly it." And it made sense. In the earlier phases of troubleshooting, I had noted that this was a condition of the problem occurring, but I had not put all of my stock in that because I had other things to resolve that seemed out of place. It’s not like I didn’t lean on first instincts, but it’s like when there’s a leak in a flooded basement. You see the flooding and tackle that because it’s a huge issue. THEN you start cleaning up the leak because the leak is easily a hidden signal within the noise.

 

In the end, not only did I take care of the major flooding damage, but I also took care of the leaks. It felt like a good day!

 

What makes this story particularly helpful is that not all answers are to be found within an organization and their tribal knowledge. Sometimes you need to run ideas past others, engineers within the same industry, and even people outside the industry. I can’t tell you the number of times I've talked through some arbitrary PBX problem with family members. Just talking about it out loud and explaining why I did certain things caused the resolution to suddenly jump to the surface.

 

What about you guys? Do you have any stories of woe, sacrifice, or success that made you reach deep within yourself to find an answer? Have you had the experience of answers bubbling to the surface while talking with others? Maybe you have other issues to share, or cat photos to share. That would be cool, too.

I look forward to reading your stories!


 

Hey, everybody!  Welcome to this week’s quandary of Root Cause, Correlation Analysis, and having to collaborate across cross-functional teams where you have all the hands but none of the fingers!

 

If that sounds confusing to you, it’s because frankly, it is! I’d like to share a tale of woe and heartbreak driven by frustration in functional and equally dysfunctional IT team dynamics!

 

The story is set in a fairly cross-functional organization. You're probably familiar with the type. While there are clearly defined teams with responsibilities, there are also hard lines in the sand about who does what, where, when, how, and why. Honestly, this story rings so true that I've seen it blur with others. If that isn't excitement, I don't know what is!

 

As the story goes, our team had deployed a series of tools enabling a cross-stack data correlation engine, allowing us to identify and truly correlate events as they happen so that troubleshooting would be better and easier. The problem was the true burden of responsibility: this team had ALL the responsibility for identifying problems, but none of the authority to actually resolve them, let alone the authorization to work on them! What makes this particularly fun is that we were chartered with, and burdened by, the responsibility of being held accountable for the issues until they were resolved. If that sounds like some kind of decision made in a government sector… I wouldn't tell you you're wrong!

 

This is where technical skills, while essential, were not good enough. And frankly, all of the project management skills in the world wouldn't matter here, because a "problem" isn't a "project" per se. No, we had to get everyone on board, every stakeholder at the table, where egos were strong and stubborn. Just as we discussed recently in Better Together - Working Together in Silo Organizations and When Being an Expert Isn't Good Enough: Master of All Trades, Jack of None, merely knowing the answer or the cause of the problem wasn't good enough here. All parties would reject the issue as being theirs, even in light of evidence proving otherwise, and would instead resort to finger-pointing. How we started to navigate these waters was through education on the tools we were using and how they would provide insight into their systems, access to our tools so we weren't just the messenger they were trying to shoot but a helpful informant, and our guidance as IT professionals to help them navigate the errors or problems so they could resolve them more effectively.

 

It sounds so simple, and it is fairly straightforward, but it took months or longer to reach a sense of team parity, and the process would start over whenever new members joined a team or new problems surfaced.

 

It's been an interesting lesson in systems operations: intelligence and knowledge don't mean much of anything unless you have all parties engaged, and even then there is no guarantee that people will agree, let alone do anything about it.

 

Have you faced a similar issue, where you identified a problem that wasn't your problem, and struggled with the challenges of trying to resolve it? Or perhaps you've been held accountable for something that wasn't your responsibility, and felt the woes of trying to get the responsible parties to own it?

 

Or share any other story of problem correlation and root cause analysis, and how you were able to resolve it better or faster than we did!

In my last post, WHEN BEING AN EXPERT ISN'T GOOD ENOUGH: MASTER OF ALL TRADES, JACK OF NONE, you all shared some great insight on how you were able to find ways to be successful as individual SMEs and contributors, and how you navigate the landscape of an organization.

 

This week, I’d like to talk about silo organizations and how we’ve found ways to work better together. (You can share your stories, as well!)

 

 

This is the first thing I imagine when I hear that an organization is silo-ed off:

[Image: old grain silos]

 

The boundaries are clearly defined, the foundation is well set, it’s very aged and well established. It doesn’t mean any of it is particularly good or bad, but it certainly shows the test of time. Navigating in that landscape requires more than tackling a delicate balance of ego and seniority.

 

Once upon a time, we had a very delicate situation we were trying to tackle. It may sound simple and straightforward, but it will soon become clear how far from easy it was. We were faced with deploying a syslog server. Things literally do NOT get any easier than that! When I first found out about this (security) initiative, I was told that it had been a "work in progress" for over two years, and that no syslog servers had been deployed yet. Wait. Two years? Syslog server. None deployed?! This can't be that difficult, can it? Welcome to the silo-ed organization, right?

 

On its surface, it sounds so simple, yet as we started to peel back the onion:

 

  • Security needed syslog servers deployed.
  • The storage team would need to provision the capacity for these servers.
  • The virtualization team would need to deploy the servers.
  • The networking team would need to provide IP addresses and the appropriate VLANs, and advertise the VLANs as appropriate if they did not exist.
  • The virtualization team would then need to configure those VLANs in their networking stack for use.
  • Once all that was accomplished, the networking and security teams would need to work together to configure devices to send syslog data to these servers.

 

All of that is straightforward and easy to do when everyone works together! The disconnected, non-communicating silos prevented it from happening for years, because everyone felt everyone else was responsible for every action, and it's a lot easier not to do things than to work together!

 

Strangely, what probably helped drive this success the most was less the clear separation of silo-by-silo boundary and more the responsibility taken by project managing this as a single project. When things are done within a silo, they’re often done in a bubble and begin and end without notifying others outside of that bubble. It makes sense, like when driving a car we’re all driving on the same road together and our actions may influence each other’s (lane changes, signal changes, and the like), but what music I’m listening to in my car has no influence on any other car.  

 

So, while we all have our own interdependencies within our silos, when we're working ACROSS silos on a shared objective, we can be successful together as long as we recognize the big picture. Whether we recognize that individually, or collectively under some dictated charter, we can still be successful. When I started this piece, I was more focused on the effects and influence we can have as individuals within our silos, and on the interaction and interoperability with others across silos. But I came to realize that when we each individually manage our responsibilities within a "project," we become better together. That said, I'm not implying that formal project management is required for any or all multi-silo interactions. It really comes down to accepting responsibility as individuals, and working together on something larger than ourselves and our organization, not just seeing our actions as a transaction with no effect on the bigger whole.

 

Then again, I could be crazy and this story may not resonate with any of you.   

 

Share your input on what you’ve found helps you work better together, whether it be inter-silo, intra-silo, farming silos, you name it!

Has this situation happened to you? You've dedicated your professional career -- and let's be honest, your life -- to a subject, only to find "that's not good enough." Maybe it comes from having too many irons in the fire, or it could be that there are just too many fires to chase.

 

Ericsson (1990) says that it takes 10,000 hours of deliberate practice (20 hours a week for 50 weeks a year for ten years = 10,000) to become an expert in almost anything.

 

I’m sure you’ve heard that Ericsson figure before, but in any normal field, the expectation is that you will gain and garner that expertise over the course of 10 years. How many of you can attest to spending 20 hours a day for multiple days to even multiple weeks in a row as you tackle whatever catastrophe the business demands, often driven by a lack of planning on their part? (Apparently, a lack of planning IS our emergency when it comes to keeping that paycheck coming in!)

 

I got my start way back in Security and Development (the latter of which I won’t admit if you ask me to code anything :)). As time progressed, the basic underpinnings of security began delving into other spaces. The message became, “If you want to do ANYTHING in security, you need networking skills or you won’t get very far.” To understand the systems you’re working on, you have to have a firm grasp of the underlying Operating Systems and kernels. But if you’re doing that, you better understand the applications. Oh, and in the late 1990s, VMware came out, which made performing most of this significantly easier and more scalable. Meanwhile, understanding what and how people do the things they do only made sense if you understood System Operations. And nearly every task along the way wasn’t a casual few hours here or there, especially if your goal was to immerse yourself in something to truly understand it. Doing so would quickly become a way of life, and before long you'd quickly find yourself striving for and achieving expertise in far too many areas, updating your skill sets along the way.

 

As my career moved on, I found there to be far more overlap between specializations and subject matter expertise than clearly delineated silos. Where this came to a head as a strong positive was when I worked with organizations as an SME in storage, virtualization, networking, and security, finding that the larger the organization, the more these groups would refuse to talk to each other. More specifically, if there was a problem, the normal workflow or blame assignment would look something like this picture. Feel free to provide your own version of the events that you experience.

 

 

Given this all-too-typical approach to support by finger-pointing, having expertise in multiple domains became a strong asset, since security people will only talk to other security people. Okay, not always, but also, yes, very much always. And if you understand what they're saying and where they're coming from, pointing out, "Hey, do you have a firewall here?" means a lot more coming from someone who understands policy than from one of the other silos, which they seemingly hold in nothing but disdain. Often, a simple network question posed by one network person to another could move mountains, because each party respects the ability or premise of the other. Storage and virtualization folks typically take the brunt of the damage, regularly having to prove that problems aren't their fault; they're the easiest point of blame due to storage pool and hardware pool consolidation. Finally, the application guys simply won't talk to us half the time, let alone mention that they made countless changes, without understanding what WE did wrong to make their application suddenly stop working the way it should. (Spoiler alert: It was an application problem.)

 

Have you found yourself pursuing one or more domains of subject matter expertise, either just to get your job done, or to navigate the shark-infested waters of office politics? Share your stories!

Wow, can you believe it? 2016 is almost over, the holidays are here, and I didn't even get you anything! It's been a bit of a wild rollercoaster of a year of consolidation, commoditization, and collaboration!

 

I'm sure you have your own favorite trends or notable events from 2016. Here are a few that were recurring themes throughout the year.

 

 

  • Companies going private, such as SolarWinds (closed in February) and Dell EMC (closed in September)
  • Companies buying other companies and consolidating the industry, like Avago buying Broadcom (closed Q1), Brocade buying Ruckus (closed Q3), and Broadcom buying Brocade (initiated in October)
  • Companies divesting assets, like Dell selling off SonicWall and Quest, and Broadcom selling off Brocade's IP division

 

 

Alright, so that's at least a small snapshot of the rollercoaster. Only time will tell what impact those decisions will have on practitioners like you and me (I promise some of them will be GREAT and some of them, not so much!).

 

But what else, what else?! Some items I’ve very recently discussed include.

 

 

The net-net benefit of all three of these is that we will continue to see better technology, with deeper investment, and ultimately (potentially) lower costs!

 

On the subject of flash, though: if you haven't been tracking it, the density profiles have been insane this year alone, and that trend is only continuing with further adoption and better price economics from technology like NVMe. I particularly love this image because it reflects the shrinking footprint of the data center alongside our inevitable need for more.

 

[Image: Moore's Law of storage density]

 

 

This is hardly everything that happened in 2016, but these are the items that are particularly close to my heart, and to my infrastructure. I will offer hearty congratulations on this being the 16th official "Year of VDI," a title we continue to grant it even as it continues to fail to fulfill its promises.

 

With 2016 closing quickly on our heels, there are a few areas you'll want to watch for in 2017!

 

  • Look for flash storage to get even cheaper, and even denser
  • Look for even more competition in the cloud space among Microsoft Azure, Amazon AWS, and Google GCP
  • Look for containers to become something you MIGHT actually use on a regular basis, and more rationally than the very obscure use-cases promoted within organizations
  • Look for vendors to provide more of their applications and objects as containers (EMC did this with their ESRS (Secure Remote Support))
  • Obviously, 2017 WILL be the Year of VDI… so be sure to bake a cake
  • Strangely, aside from pricing economics finally driving adoption of 10GbE+ and Wave 2 wireless, we'll see a lot more of the same as we saw this year, maybe even some retraction in hardware innovation
  • Oh, and don't forget: more automation, more DevOps, more "better, easier, smarter"

 

But enough about me and my predictions. What were some of your favorite and notable trends of 2016, and what are you looking to see in 2017?

 

And if I don't get the chance later… Happy Holidays and a Happy New Year to y'all!

Well hey, everybody, I hope the Thanksgiving holiday was kind to all of you. I had originally planned to discuss more DevOps with y'all this week; however, a more pressing matter came to mind in my sick and weakened, stomach-flu state!

 

Lately we've been discussing ransomware, but more importantly, lately I've been seeing an even greater incidence of ransomware affecting individuals and businesses. Worse, when it hits a business it tends to do a lot of collateral damage (akin to encrypting the finance share that someone had only cursory access to).

 

KnowBe4 has a pretty decent infographic on ransomware that I'm tossing in here, and I'm curious what y'all have been seeing in this regard.

Do you find this to be true? An increased incidence, a decrease, or roughly the same?

 

[Image: KnowBe4 ransomware threat survey infographic]

 

Some real hard-and-fast takeaways I've seen from those who aspire to mitigate ransomware attacks are to implement:

 

  • Stable and sturdy firewalls
  • Email filtering scanning file contents and blocking attachments
  • Comprehensive antivirus on the workstation
  • Protected Antivirus on the servers

 

Yet all too often I see all of this investment in trying to 'stop' it from happening, without much left over for handling clean-up should it hit the environment anyway. Basically: having some kind of backup/restore mechanism to recover files SHOULD you be infected.
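On that clean-up point, even something as simple as verifying that previous versions/VSS snapshots actually exist on your file servers is worth checking regularly. A minimal sketch (run it on the file server itself, and it assumes VSS/Previous Versions is already enabled on those volumes):

    # List existing VSS shadow copies per volume, so you know a restore point actually exists
    Get-CimInstance -ClassName Win32_ShadowCopy |
        Select-Object VolumeName, InstallDate, ID |
        Sort-Object VolumeName, InstallDate

It won't stop an infection, but it tells you in advance whether the "restore from previous versions" safety net is really there.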

 

Some of the top ways I've personally seen ransomware wreak havoc in an environment:

  • Using a work laptop on an untrusted wireless network
  • Phishing / Ransomware emails which have links instead of files and opening those links
  • Opening a “trusted” file off-net and then having it infect the environment when connected
  • Zero Day Malware through Java/JavaScript/Flash/Wordpress hacks (etc)

 

As IT practitioners, we not only have to do our daily jobs, keep the lights on for the business, focus on innovating the environment, and keep up with the needs of the business; worst of all, when things go bad, and few things are as bad as ransomware attacking and targeting an environment, we have to deal with that on a massive scale! Maybe we're lucky and we DO have backups, and we DO have file redirection so we can restore from a VSS job, and we CAN detect encryption in flight and stop it from taking effect. But that's a lot of "maybe" from end to end in any business, not to mention all of the home devices that may be in play.

 

There was a time when viruses would break out in a network and require time and effort to clean up, but they were usually little more than a minor annoyance. Worms would break out, and as long as we stopped whatever the zero-day trigger was, we could keep them from recurring on the regular. And while APTs and the like are more targeted threats, they were less of a common occurrence, not something that occupied our days as a whole. But ransomware gave thieves a way to monetize their activities, which gives them an incentive to infiltrate and infect our networks. I'm sure you've seen that ransomware now comes with a help desk to assist victims with paying?

 

 

It's definitely a crazy world we live in, one that leaves us with more work to do every day and a constant fight to fend off. This threat has been growing at a constant pace and is spreading to infect Windows, Mac, AND Linux.

 

What about your experiences? Do you have any ransomware attack vectors you'd like to share, or other ways you were able to fend them off?

You're sitting back at the office getting work done, keeping the ship afloat, living the Ops life under the perception of DevOps, only to have your IT Director, VP, or CxO come and demand, "Why aren't we using Containers! It's all the rage at FillInTheBlankCon!" And they start spouting off the container of the week: Kubernetes, Mesos, Docker, CoreOS, Rocket, Photon, Marathon, and an endless list of other container products, accessories, or components of a container. If it hasn't happened to you, it may be the future you're looking at. If it has happened to you, or you've already adopted some approach to containers in your environment, all the better.

 

[Image: virtualization stack vs. container stack comparison]

As a VERY brief primer on the infinite world of containers, for those of you who are not aware, I'll try to oversimplify it here, using the image above as an example and comparing containers to virtualization. Typically, virtualization is hardware running a hypervisor that abstracts the hardware; you install an operating system on the VM and then install your applications into that. In most container scenarios, you have hardware running some kind of abstraction layer that presents containers into which you install your applications, abstracting out the operating system.

 

That is quite possibly the most oversimplified version of it, because there are MANY moving parts under the covers to make this a reality. However, who cares how it works as much as how you can use it to improve your environment, right?!

 

That's kind of the key to things. Docker, one of the more commonly known container approaches (albeit, technically, Kubernetes is used more), has some really cool benefits and features. Docker officially supports running Docker containers on Microsoft servers, Azure, and AWS, and they also released Docker for Windows clients and OSX! One particular benefit that I like as a VMware user is that PowerCLI Core is now available on Docker Hub (http://www.virtuallyghetto.com/2016/10/powercli-core-is-now-available-on-docker-hub.html)!
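If you want to kick the tires on that, it's about as simple as pulling and running the image. A quick sketch from any machine with Docker already installed (the image name below is the one VMware published to Docker Hub at the time; verify it before relying on it):

    # Pull and run the PowerCLI Core container interactively
    docker pull vmware/powerclicore
    docker run --rm -it vmware/powerclicore
    # ...then, from the PowerShell prompt inside the container:
    # Connect-VIServer -Server vcenter.example.com

That's the appeal: a disposable, pre-built PowerCLI environment without installing anything on the host beyond Docker itself.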

But they don't really care about how you're going to use it, because all roads lead to DevOps and how you're supposed to implement things to make their lives better. In the event that you are forced down the road of learning a particular container approach, for better or worse, it's probably best to find a way to make your life better with it, rather than letting it become just another piece of infrastructure we're expected to understand even if we don't. I'm not saying that one container is better than another; I'll leave that determination up to you in the comments if you have container stories to share. Though I'm partial to Kubernetes when it comes to running cloud services on Google, I really like Docker when it comes to running Docker for OSX (because I run OSX).

The applications are endless and continually growing, and the solutions are plentiful; some might say far too plentiful. What are some of the experiences you've had with containers, the good, the bad, and the ugly? Or is this an entirely new road you're looking at pursuing but haven't yet? We're definitely in the no-judgement zone!

As always, I appreciate your insight into how y'all use these technologies to better yourselves and your organizations as we all grow together!

 

That's a good question: what do self-driving vehicles have to do with our infrastructure? Is it that they're untested, untrusted, and unproven, and could result in death and fear mongering? That's true enough, but what is the key distinction between 'autonomous' and merely self-driving vehicles?


 

 

Telemetry is the watchword

 

Today's networks and systems are a huge ball of data: big data, helpful, insightful, useless, and befuddling, endless piles of information. Left to its own devices, that information lives in its respective bubble, waiting for us to 'discover' a problem and then start peeling back the covers to figure out what is going on. The difference between an autonomous vehicle and one that is merely 'self-driving' is that the autonomous vehicle uses data from many continuous, constant streams, correlating events to understand conditions. Even in a primitive state that can be fairly effective. In a networked sense, imagine every vehicle on the road communicating with every other, constantly scanning and analyzing everything in front of you, behind you, and everywhere around you. Compound that data with information collected from external sources such as road sensors, lights, and other conditions, and you have the power to automate traffic management. (Slap weather stations into each of these vehicles and we get closer to predicting even more accurate weather patterns.)
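To ground that in infrastructure terms, here's a trivial sketch of the kind of correlation those streams enable, the sort of thing a real telemetry platform does continuously and at scale. The file and column names here are hypothetical; it just pairs syslog messages with SNMP traps that landed within 30 seconds of each other:

    # Correlate two exported event streams by time window (hypothetical CSV layouts)
    $syslog = Import-Csv .\syslog-export.csv
    $traps  = Import-Csv .\snmp-trap-export.csv

    foreach ($msg in $syslog) {
        $when    = [datetime]$msg.Timestamp
        $related = $traps | Where-Object {
            [math]::Abs((([datetime]$_.Timestamp) - $when).TotalSeconds) -le 30
        }
        if ($related) {
            [pscustomobject]@{
                Time   = $when
                Syslog = $msg.Message
                Traps  = ($related.TrapOID -join ', ')
            }
        }
    }

Crude, but it shows the principle: individual streams become far more useful the moment you line them up against each other.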

 

But hey, whoa, What about my network? My systems?!

 

More and more, we're seeing solutions that have evolved far beyond simple point solutions. SIEMs don't just collect security and event information in a bubble. Syslogs aren't just an endless repository of arbitrary 'event' strings. SNMP need not live caught in its own trap.

 

There are tools, solutions, frameworks, and suites of tools that aim to bring your NOC and SOC into the future, a future wholly unknown. There is no true panacea that ties everything together and acts as the end-all-be-all solution, though as time goes on, evolution and consolidation of products have started to make that possible. There was a time, when I ran a massive enterprise, that we had 'point' tools, each doing an amazing job of keeping up with THEIR data and telemetry, though they were independent and not even remotely interdependent: monitoring VMware with vCOps, monitoring the network with Orion and NPM, collecting some event data with ArcSight while separately collecting syslog information with Kiwi Syslog Server, SNMP traps flowing into SNMPc, and, let's not forget, monitoring Microsoft… that's where System Center came in.

 

On the one hand, that may seem like overkill, yet each 'product' covered and fulfilled its purpose, doing 80% of what it did well, while in the remaining 20% being unable to cover the rest of the spread. (Slight disclaimer: there were some 50+ more tools; those were just the 'big' ones that we've all likely heard of.)

 

So, as each of these solutions, and other products in the industry, continues to evolve, they're taking what has effectively been the 'cruise control' button in our cars (or something only slightly better) and building the ability to provide real data, real analytics, and real telemetry, so that the network and our systems can work for us and with us, versus being little unique snowflakes that we need to care for and feed and figure out when things go wrong.

 

So, what have you been using or looking at to help drive the next generation of infrastructure systems telemetry? Are you running any network packet brokers, sophisticated 'more than SIEM' products, or SolarWinds suites to tie many things together? Has anyone looked at Snap, Intel's open-source telemetry framework?

 

Please share your experiences!


 

The Cloud! The Cloud! Take us to the Cloud, it's cheaper than on-premises! Why? Because someone in marketing told me so! No, but seriously: cloud is a great fit for a lot of organizations, a lot of applications, a lot of a lot of things! But just spitting 'cloud' into the wind doesn't make it happen, nor does it always make it a good idea. I'm not here to put cloud down (I believe that's called fog), nor am I going to tout it unless it's a good fit. I will, however, share some experiences, and hopefully you'll share your own, because this has been a particular area of interest lately, at least for me, but then I'm weird about things like deep tech and cost-benefit models.

 

The example I'll share is one particularly dear to my heart. It's dear because it's about a domain controller! Domain controllers are, for all intents and purposes, machines that typically MUST remain on at all times, yet don't necessarily require a large amount of resources. A domain controller running on-premises, say as a virtual machine in your infrastructure, carries a somewhat arbitrary cost: an aggregated percentage of the cost of your infrastructure, licensing, allocated resources, and O&M costs for power, HVAC, and the rest. So how much does a domain controller running as a virtual machine inside your data center actually cost? If you didn't answer, "It depends," I might be inclined not to believe you, unless you do detailed chargeback for your customers.

 

Yet we've stood up that very same virtual machine inside of Azure; let's say a standard single-core, minimal-memory A1-Standard instance to act as our domain controller. Microsoft Azure pricing for our purposes was pretty much on the button, coming in at around ~$65 per month. That isn't too bad. I always like to look at three years at a minimum as the sustainable life of a VM, just to contrast it with the cost of on-premises assets and depreciation. So while $65 a month sounds pretty sweet, or ~$2,340 over three years, I also have to consider other costs I might not normally look at: egress network bandwidth, and the cost of backup (let's say I use Azure Backup; that adds another $10 a month, so what's another $360 for this one VM).
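As a quick back-of-the-napkin check using those rough figures (illustrative numbers only; your actual compute, backup, and egress pricing will differ):

    # Rough three-year cost for that single A1 domain controller
    $months  = 36
    $compute = 65 * $months     # ~ $2,340 for the VM itself
    $backup  = 10 * $months     # ~ $360 for Azure Backup
    $total   = $compute + $backup
    "Three-year estimate: compute `$$compute + backup `$$backup = `$$total (before egress bandwidth)"

Roughly $2,700 over three years for one small, always-on VM, before bandwidth, and that's the number you have to weigh against your fully loaded on-premises cost.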

 

The cost benefits can absolutely be there if I am under or over a particular threshold, or if my workloads are historically more sporadic and less 'always-on, always-running' kinds of services.

An example of this: we have a workload that normally takes LOTS of resources and LOTS of cores and runs until it finishes. We don't have to run it too often (quarterly), and while allocating those resources and obtaining the assets is great, they're not used every single day. So we spin up a bunch of compute- or GPU-optimized instances, and where it might have taken days or weeks in the past, we can get it done in hours or days. We get our results, dump out our data, and release the resources.

 

Certain workloads are more advantageous than others to keep on-premises or to host exclusively in the cloud, whether sporadically or all the time. That really comes down to what matters to you, your IT, and your support organization.

 

This is where I'm hoping you, my fellow IT pros, can share your experiences (good, bad, and ugly) with workloads you have moved to the cloud. I'm partial to Azure, Google, or Amazon, as they've really driven things down to commoditized goods and battle amongst themselves, whereas an AT&T, Rackspace, or other 'hosted' facility-type cloud can skew the costs or benefits when contrasted with the "Big Three."

 

So what has worked well for you? What have you loved and hated about it? How much has it cost you? Have you done a full shift, taking ALL of your workloads to a particular cloud or clouds? Have you said 'no more!' and taken workloads OFF the cloud and back on-premises? Share your experiences so that we may all learn!

 

P.S. We had a set of workloads hosted off-premises in Azure that were brought wholly back in-house, as the high-performance yet persistent, always-on nature of the workloads was costing 3x-4x more than if we had simply bought the infrastructure and hosted it internally. (Not every workload will be a winner.)

 

Thanks guys and look forward to hearing your stories!

[Image: an automat-style vending machine of the future]

 

Automation is the future. Automation is coming. It will eliminate all of our jobs and make us obsolete. Or at least that is the story which is often being told.  Isn’t it true though?

I mean, who remembers these vending machines of the future, which were set to eliminate the need for cooks, chefs, and cafeterias? That sounds remarkably like the same tools we're using on a regular basis, built, designed, and streamlined to make us all unnecessary to the business! And then profit, right?

 

Well, if automation isn't intended to eliminate us, what IS it for? Some might say that automation makes the things we're already doing today easier and makes us better at our jobs. That can be true, to a point. Some might also say that automation takes things we cannot do today and makes them possible, so we can be better at our jobs. That can also be true, to a point.

 

How many of you recall, over the course of your network operations and management lives, long before Wireshark and Ethereal, having to bring in or hire a company to help troubleshoot a problem with a "sniffer laptop"? It wasn't anything special, and it's something we all likely take for granted today, yet it was a specialized piece of equipment with specialized software that let us gain insight into our environment, to dig into the weeds and see what was going on!


 

These days, though, with network monitoring tools, SNMP, NetFlow, sFlow, syslog servers, and real-time telemetry from our network devices, that visibility is not only attainable, it's downright expected.

 

With the exception of the specialized 'sniffer' engineer, I don't see that this automation has eliminated people. It has only made us all more valuable, even as the expectation of what we're able to derive from the network has grown. This kind of data has made us more informed, but it hasn't exactly made us smarter. The ability to read the Rosetta Stone of networking information and interpret it is what has separated the engineers from the administrators in some organizations. Often, tools have been the key to taking that data and making it not only readable, but also actionable.

 

Automation can rear its beautiful or ugly head in many different incarnations in the business, from making the deployment of workstations or servers easier than it used to be with software suites, tools, or scripting, to, taking a dated analogy, eliminating the need for manual switchboard operators at telcos by replacing them with switching stations that transfer calls automatically based on the characteristics of dialing. Contrast taking something we were already doing and making it better with taking something people were already doing and eliminating those people. And only in this latest generation, thanks to technology and automation, can credit companies generate 'one-time credit card numbers' tied back to a very specific amount of money to withdraw from your account, a capability that was not only unheard of in the past, but would have been fundamentally impossible to implement, let alone police, without our current generation of data, analytics, and automation abilities.

 

 

As this era of explosive knowledge growth, big data analytics, and automation continues, what have you been seeing as the big differentiators? What kind of automation is making your job easier or more possible, which aspects of automation are creating capabilities that fundamentally didn't exist before, and which parts of it are partially or wholly eliminating the way we do things today?

The other day I was having a conversation with someone new to IT (they had chosen to pursue the programming track of IT, which can be an arduous path for the uninitiated!). The topic of teaching, education, and learning to program came up, and I'd like to share an analogy that not only works for programming, but that I've also found pretty relevant to all aspects of an IT management ecosystem.

 

The act of programming, like any process-driven methodology, is an iterative series of steps. You will learn tasks one day that remain relevant to future tasks. And then there are tasks you'll need to perform that are not only absolutely nothing like what you did the previous day, they're unlike anything you've ever seen or imagined in your whole career, whether you're just starting out or you've been banging at the keys all your life. The analogy I chose for this is cooking. I know, I know, but the relevance should hopefully resonate with you!

 

When you're getting started in cooking, correct me if I'm wrong, but you're usually not expected to prepare a perfectly rising, never-falling soufflé. No, not at all. That would be poor teaching practice and would set someone up for failure. Instead, you start somewhere simple, like boiling water. You can mess up boiling water, but once you understand the basic premise, you can use it for all kinds of applications: sterilizing water, cooking pasta or potatoes, the sky is the limit! Chances are, once you learn how to boil water, you're not going to forget how to do it, and you'll probably get even better at it or find even more applications. The same is true once you start navigating PowerShell, Bash, Python, or basic batch scripts: what you did once, you'll continue to do over and over again, because you understand it and you've got it down pat.
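To make the "boiling water" example concrete, here's roughly what that kind of one-liner looks like (a minimal sketch that assumes the ActiveDirectory module from RSAT is available; adjust the filter for your environment):

    # "Boiling water": dump every user's last login
    Get-ADUser -Filter * -Properties LastLogonDate |
        Select-Object Name, SamAccountName, LastLogonDate |
        Sort-Object LastLogonDate

Simple, repeatable, and something you'll keep reusing once you've got it down.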

 

The day will come, however, when you're asked to do something you hadn't even thought about the day prior. No longer are you asked for the basic PowerShell one-liner that dumps users' last logins (boiling water). Instead, you're asked to parse an XLS or CSV file and make a series of iterative changes throughout your Active Directory. Or, for a practical use-case: query Active Directory for workstations that haven't logged in or authenticated in the past 60 days, dump that into a CSV file, compare it against a defined whitelist you keep in a separate CSV, omit specific OUs, perform a mass disable of those computer accounts while also moving them into a temporary "Purge in 30 days" OU, and generate a report to review. Oh, and we also want this script to run daily, but it can't have any errors or impact any production machines that don't meet these criteria. Let's, for the sake of argument… call this our soufflé…

 

Needless to say, that's a pretty massive undertaking for anyone who was great at scripting the things they'd already done a million times before. That is what's great about IT and cooking, though: everything is possible, as long as you have the right ingredients and a recipe to work from. In the scenario above, performing every one of those steps at once might seem like reaching for the moon, but if you can break it down into a series of steps (a recipe) and perform each of those individually, you'll find it much more consumable to solve the bigger problem and tie it all together, as the sketch below illustrates.
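To illustrate the recipe idea, here's a rough sketch of how that soufflé might break down into individual steps. The OU names, file paths, and CSV layout are hypothetical, it assumes the ActiveDirectory module and appropriate rights, and the -WhatIf switches stay on until you trust the results:

    # A sketch only -- hypothetical names and paths, test before trusting it
    Import-Module ActiveDirectory

    $cutoff     = (Get-Date).AddDays(-60)
    $whitelist  = (Import-Csv .\whitelist.csv).Name      # step 1: load the whitelist
    $excludeOUs = @('OU=Servers,DC=corp,DC=example,DC=com',
                    'OU=Kiosks,DC=corp,DC=example,DC=com')
    $purgeOU    = 'OU=Purge in 30 days,DC=corp,DC=example,DC=com'

    # Step 2: find computers that haven't authenticated in 60 days,
    # minus the whitelist and the excluded OUs
    $stale = Get-ADComputer -Filter 'LastLogonTimeStamp -lt $cutoff' -Properties LastLogonTimeStamp |
        Where-Object { $whitelist -notcontains $_.Name } |
        Where-Object { $dn = $_.DistinguishedName; -not ($excludeOUs | Where-Object { $dn -like "*$_" }) }

    # Step 3: report first, then disable and move
    $stale | Select-Object Name, DistinguishedName | Export-Csv .\stale-report.csv -NoTypeInformation
    $stale | ForEach-Object {
        Disable-ADAccount -Identity $_ -WhatIf
        Move-ADObject -Identity $_ -TargetPath $purgeOU -WhatIf
    }

Wrap it in error handling, drop the -WhatIf switches once you trust it, schedule it daily, and suddenly each individual "ingredient" is something you already know how to do.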

 

What is great about IT as a community is just that: we are a community of people who have either done these things before, or have done portions of them before, and are often willing to share. Double down on that with the fact that we're a sharing-is-caring kind of community that will often share answers to complex problems, or work actively to help solve a particular one. I'm really proud to be a part of IT and of how much we want each other to succeed, as we all continually fight the same problems irrespective of the size of our organization or where we are in the world.

 

I'm sure every single one of us who has a kitchen keeps a cookbook or two on the shelves, with recipes and 'answers' to the inevitable question of how to make certain foods or meals. What are some of the recipes you've discovered or solved over the course of your careers to help bring about success? I've personally always enjoyed taking complex scripts for managing VMware and converting them into 'one-liners' that are easy to understand and manipulate, both so others could learn how to shift and change them, and so I could run reports that were VERY specific to my needs at the moment while managing hundreds and thousands of data centers.
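For instance, one of those one-liners might look something like this (a sketch that assumes PowerCLI is installed and you've already run Connect-VIServer against your vCenter):

    # Quick report: every powered-off VM, where it lives, and how much space it's still holding
    Get-VM | Where-Object { $_.PowerState -eq 'PoweredOff' } |
        Select-Object Name, VMHost, Folder, ProvisionedSpaceGB |
        Sort-Object ProvisionedSpaceGB -Descending

Short enough to tweak on the fly, and specific enough to answer the exact question you're being asked at that moment.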

 

I'd love it if you'd share some of your own stories, recipes, or solutions, and whether this analogy has been helpful, if not for explaining what we do in IT to family members who may not understand, then maybe for cracking the code on your next systems management or process challenge!

 

(For a link to my one-liners post, check out: PowerCLI One-Liners to make your VMware environment rock out!)

Does anyone else remember January 7th, 2000? I do. That was when Microsoft announced the Secure Windows Initiative: an initiative dedicated to making Microsoft products secure. Secure from malicious attack. Where secure code was at the forefront of all design practices, so that practitioners like us wouldn't have to worry about patching our systems every Tuesday, or about an onslaught of viruses and malware and you name it! That is not to say that prior to 2000 they were intentionally writing bad code, but it is to say that they made a hearty and conscious decision to ensure that code IS written securely. So was born the era of SecOps.

 

Sixteen years have passed, and not a year has gone by in which I haven't heard organizations (Microsoft included) say, "We need to write our applications securely!" as if it were some new idea they had discovered for the first time. Does this sound familiar to your organization, and to the businesses and processes you have to work with? Buzzwords come out in the marketplace and make it into a magazine. Perhaps new leadership or new individuals come in and say, "We need to be doing xyzOps! Time to change everything up!"

 

But to what end do we go through that? There was a time when people adopted good, consistent, solid practices; educated and trained their employees; maintained well-refined processes that aligned with the business; and ran technology that wasn't 10 years out of date and could actually handle their requirements. But we could toss all of that out the window to adopt the flavor of the week, right? May as well.

 

That said though, some organizations, businesses or processes receive nothing but the highest accolades and benefits by adopting a different or strict regime for how they handle things.  DevOps for the right applications or organization may be the missing piece of the puzzle which could enable agility or abilities which truly were foreign prior to that point.

 

Please share which tools brought about your success, or proved less than useful, in realizing the dream state of xyzOps. I've personally found that having buy-in and commitment throughout the organization was the first step to success when adopting anything that touches every element of a transformation.

 

What are some of your experiences with organizational shifts, waxing and waning across technologies? Where were they successful, and where were they wrought with failure, like adopting ITIL in full without considering what it takes to be successful? Your experiences are never more important than now, to show others what the pitfalls are, how to overcome challenges, and where things tend to work out well or fall short.

 

I look forward to reading about your successes and failures so that we can all learn together!

[Image: corporate training dilemma comic]

 

 

Have you ever faced this problem before? Whether you are brand new to the industry, a technology, or a position, or you have been fighting issues and troubleshooting in the trenches for decades, this problem often rears its ugly head, usually in the form of "We don't have the budget to train our employees," or, as the comic captioned above so eloquently puts it, "If we train our employees they might leave."

 

This is a quandary I have personally experienced in my lifetime, and one I know people in every role, from technician to engineer to architect, have equally faced. That is not to say that all organizations suffer this fate. Technology vendors, often the ones with their own certification programs, tend to support education, training, and certification. Partners and resellers also tend to support and embrace education in the workplace. Then there are organizations that dictate requirements like, "Your position must have 'x' education/certification in order to advance within the organization." It would be awkward to have those requirements yet no support from the organization to achieve them.

 

Yet even given those few scenarios where I have seen organizations be supportive of training, I could pull a dozen administrators into a room, system, network, SysOps, and DevOps, and probably only 2 or 3 out of each of those groups would say their organization supports education, whether financially, by providing time, or with training and resources to pursue it.

 

I can think of more admins than not who spend countless hours educating themselves, searching out and researching problems, constantly staying up on tomorrow's technology while supporting the systems of yesterday, including those who regularly read and comment on forums like this one in the Thwack community. You are the heroes, the rock stars, the people who, whether or not your organization supports your actions, continue to pursue your own evolution.

 

I've published countless resources in the past for others to educate themselves, free or discounted certification programs, including one I'll mention here: SolarWinds has a free training and certification program to become a SolarWinds Certified Professional (SCP). See: Interested in becoming a SolarWinds Certified Professional FOR FREE?!?!?

(I just checked and it seems to still work [someone let me know if it's no longer free!], two years on from when I wrote that blog post, so it's worth adding to your arsenal if you're cert-limited and want to know what it feels like to have the resources to learn something and then earn a cert around it.) But enough about me!

 

What are some of the ways you are able to keep up to date, to continually grow and educate yourselves? Share your experiences of how you cope with organizations that do not support your advancement for fear you may 'jump ship' with your new knowledge. Or maybe you are one of the few with a supportive organization, whether company, partner, or vendor, that gives you the fuel to light the fire of your mind, or just to continue to support and maintain your existing environments. Or anywhere in between.

The other day we were discussing the fine points of running an IT organization and the influence of people, process, and technology on systems management and administration, and someone brought up one of their experiences. Management was frustrated that snapshots on their storage and virtualization platform took days, and was looking to replace the storage platform to solve this problem. Since this was clearly a technology problem, they sought out a solution that would tackle it and address the technology needs of the organization! Chances are one or more of us has been in this situation before, so they did the proper thing and evaluated the options. Vendors were brought in, solutions were spec'd, technical requirements were established, and features were vetted. Every vendor was given the hard-and-fast requirement: "must be able to take snapshots in seconds and present them to the operating system in a writable fashion." Once all of the options were reviewed, confirmed, demo'd, and validated, they had selected a solid solution!

 

Months followed as they migrated off of their existing storage platform onto the new one. The light at the end of the tunnel was there; the panacea for all of their problems was in sight! And finally, they were done. The old storage system was decommissioned and the new storage system was in place. Management patted themselves on the back and went about dealing with their next project, first and foremost of which was the instantiation of a new dev environment based on their production SAP data. This being a pretty reasonable request, they followed their standard protocol to get it stood up, with snapshots taken and presented. Several days later, the snapshot was presented, as requested, to the SAP team so they could stand up the dev landscape. And management was up in arms!

 

What exactly went wrong here? Clearly a technology problem had existed for the organization, and a technology solution was delivered to meet those requirements. Yet had they taken a step back for a moment and looked at the problem for its cause rather than its symptoms, they would have noticed that their internal SLAs and processes were really at fault, not the choice of technology. Don't get me wrong, some technology truly is at fault and a new technology can solve it, but to say that is the answer to every problem would be untrue, and some issues need to be looked at in the big picture. The true cause of their problem, since the original storage platform COULD have met the requirements, was that their ticketing process required multiple sign-offs for change advisory board management, approval, and authorization, and the SLAs given to the storage team allowed a 48-hour response time. In this particular scenario, the storage admins were actually pretty excited to present the snapshot, so instead of waiting until the 48th hour to deliver, they provided it within seconds of the ticket hitting their queue.

 

Does this story sound familiar to you or your organization? Feel free to share some of your own experiences where one aspect of people, process, or technology was blamed for an organization's lack of agility, and how you (hopefully) were able to overcome it. I'll do my best to share some other examples, stories, and morals over the coming weeks!

 

I look forward to hearing your stories!
