December Writing Challenge Week 5 Recap

 

And with these last four words, the 2019 writing challenge comes to a close. I know I’ve said it a couple of times, but even so, I cannot express enough my gratitude and appreciation for everyone who took part in the challenge this year—from the folks who run the THWACK® community, to our design team, to the lead writers, to the editing team, and, of course, to everyone who took time out of their busy day (and nights, in some cases) to thoughtfully comment and contribute.

 

Because we work in IT, and therefore have an ongoing love affair with data, here are some numbers for you:

 

The 2019 December Writing Challenge

  • 31 days, 31 words
  • 29 authors
    • 13 MVPs
  • 14,000 Views
  • 960 Comments

 

It’s been an incredible way to mentally pivot from the previous year, and set ourselves up for success, health, and joy in 2020.

 

Thank you again.

- Leon

 

Day 28. Software Defined Network (SDN)

THWACK MVP Mike Ashton-Moore returned to the ELI5 roots by crafting his explanation in the xkcd Simple Writer online tool. Despite limiting himself to the 1,000 most common (or "ten-hundred," to use the xkcd term) words, Mike created a simple and compelling explanation.

 

George Sutherland Dec 29, 2019 8:49 PM

SDN is the network version of the post office. You put your package in the system and let the post office figure out the best and fastest mode of delivery.

 

Ravi Khanchandani  Dec 30, 2019 12:58 AM

Life comes full circle with SDN, from centralized routing to distributed routing and back to centralized routing.

The SDN controller is like the Traffic Police HQ that sends out instructions to the crossings or edge devices to control the traffic: how much traffic gets diverted to what paths, what kind of traffic goes which path, who gets priority over the others. Ambulances are accorded the highest priority, trucks get diverted to the wider paths, car pools & public transport get dedicated lanes, and other cars get a best-effort path.

 

Juan Bourn Dec 30, 2019 10:37 AM

I gotta admit, I had to read this twice. Not being very familiar with SDN prior to this, I didn’t understand the special boxes and bypassing them lol. I couldn’t make the relationship tangible. But after a second read through, it made sense. Good job on making it easy to understand. You can’t do anything about your audience, so no knock for my inability to understand the first time around!

 

Day 29. Anomaly detection

As product marketing manager for our security portfolio, Kathleen Walker is extremely well versed in the idea of anomaly detection. But her explanation today puts it in terms even non-InfoSec folks can understand.

 

Vinay BY  Dec 30, 2019 5:06 AM

As the standard definition states -> “anomaly detection is the identification of rare items, events or observations which raise suspicions by differing significantly from the majority of the data”

 

Something unusual from the data pattern you see on a regular basis; this also helps you dig down further to understand what happened exactly and why it was so. Anomaly detection can be performed in several areas, and is basically performed before aggregating the data into your system or application.

 


Thomas Iannelli  Dec 30, 2019 5:27 AM

We don’t have kids living with us, but we do the same thing for our dog, Alba, and she for us. We watch her to make sure she eats, drinks, and performs her biological functions. When one of those things is off, we either change the diet or take her to the vet. She watches us. She even got used to my wife, who works at home, going into her office at certain times, taking a break at certain times to put her feet up, or watching TV during lunch. So much so that Alba will put herself in the rooms of the house before my wife. She does it just so casually. But when my wife doesn’t show up, she frantically goes through the house looking for her. Why isn’t she where she is supposed to be? Alba does the same thing when I go to work. It is fine that I am leaving during the week. She will not fuss and sometimes will greet me at the door. But if I get my keys on the weekend or in the evening, she is all over me wanting to go for the ride. There is a trip happening out of the ordinary. When we have house guests, as we did over the holiday, she gets very excited when they arrive, and even the next morning will try to go to the guest bedroom and check to make sure they are still here. But after a day or two it is just the new normal. Nothing to get too excited about. The anomaly has become the norm.

 

I guess the trick is to detect the anomaly and assess quickly if it is an outlier, if it is going to be the new normal, or if it is a bad thing that needs to be corrected.

 

Paul Guido  Dec 30, 2019 9:32 AM (in response to Charles Hunt)

cahunt As soon as I saw the subject, I thought of this song. “One of these things is not like the other” is one of my primary trouble-shooting methods to this very day.

 

Once I used the phrase that “The systems were nominal” and people did not understand the way I used “nominal.” I was using it in the same way that NASA uses it in space systems that are running within specifications.

 

In my brain, an anomaly is outside the tolerance of nominal.

 

Day 30. AIOps

Melanie Achard returns for one more simple explanation, this time of a term heavily obscured by buzzwords, vendor-speak, and confusion.

 

Vinay BY  Dec 30, 2019 10:21 AM

AIOps, or artificial intelligence for IT operations, includes the below attributes in one way or another; to me, they are all interlinked:

Proactive Monitoring

Data pattern detection, anomaly detection, self-understanding, and machine learning, which improves the entire automation flow

Events, Logs, Ticket Dump

Bots & Automation

Reduction of Human effort, cost reduction, time reduction and service availability

 

Thomas Iannelli  Dec 30, 2019 5:41 AM

Then the computer is watching and, based on either machine learning or anthropogenic algorithms, processes the data for anomaly detection and then takes some action, in the form of an automated response to remediate the situation or an alert telling a human that something needs their attention. Am I understanding correctly?

 

Jake Muszynski Dec 30, 2019 12:13 PM

Computers don't lose focus. My biggest issue with people reviewing the hordes of data that various monitors create is that they get distracted; they only focus on the latest thing. AI helps by looking at all the things, then surfacing what might need attention. In a busy place, it can really make a difference.

 

 

Day 31. Ransomware

On our final day of the challenge, THWACK MVP Jeremy Mayfield spins a story bringing into sharp clarity both the meaning and the risk of the word of the day.

 

Faz f Dec 31, 2019 5:57 AM

This is when your older Sibling/friend has your toy and will not give it back unless you do something for them. It's always good to keep your toys safe.

 

Michael Perkins Dec 31, 2019 11:36 AM

Ransomware lets a crook lock you out of your own stuff, then make you pay whatever the crook wants for the key. This is why you keep copies of your stuff. It takes away the crook's leverage and lets you go "Phbbbbbbbbbbt!" in the crook's face.

 

Brian Jarchow Dec 31, 2019 12:38 PM

Ransomware reminds me of the elementary school bully who is kind enough to make sure you won't get beaten up if you give him all of your lunch money.

The year is winding down, and—while it’s not something I do every year—I thought I’d take a moment to look ahead and make a few educated guesses about what the coming months have in store for us nerds, geeks, techies, and web-heads (OK, the last category is for people from the Spider-verse, but I’m still keeping them in the mix.)

 

As with any forward-looking set of statements, decisions made based on this information may range from “wow, lucky break” to “are you out of your damn mind?” And, while I could make many predictions about the national (regardless of which nation you live in) and/or global landscape as it relates to economy, politics, entertainment, cuisine, alpaca farming, etc., I’m going to keep my predictions to tech.

 

Prediction 1: The Ever-Falling Price of Compute

This one is a no-brainer, honestly. The cost of compute workloads is going to drop in 2020. This is due to the increased efficiencies of hardware and the rising demand for computer resources—especially in the cloud.

 

I can also make this prediction because it’s basically been true for the last 30 years.

 

With that said, it’s worth noting—according to some sources (https://www.quantumrun.com/future-timeline/2020/future-timeline-subpost-technology)—the following milestones/benchmarks will be reached:

  • (Moore’s Law) Calculations per second, per $1,000, will reach 10^13 (equivalent to one mouse brain)
  • Average number of connected devices, per person, is 6.5
  • Global number of internet-connected devices reaches 50,050,000,000
  • Predicted global mobile web traffic equals 24 exabytes
  • Global internet traffic grows to 188 exabytes

  • Share of global car sales taken by autonomous vehicles will be about 5%
  • World sales of electric vehicles will reach 6,600,000

In addition, in 2017, Elon Musk posited it would take a 100-mile-by-100-mile area of solar panels to provide all the electricity used in the U.S. on an average day. https://www.inverse.com/article/34239-how-many-solar-panels-to-power-the-usa. In 2018, freeingenergy.com took another swipe at it and figured the number somewhat higher: 21,500 sq. miles. But that’s still about 0.5% of the total available land in the U.S. and amounts to (if you put it all in one place, which you would not) a single square of solar panels roughly 145 miles on each side. https://www.freeingenergy.com/how-much-solar-would-it-take-to-power-the-u-s/.

 

What I’m getting at is that the impending climate crisis and the improving state of large-storage batteries and renewable energy sources may push the use of environmentally friendly transportation options even further than expected. If nothing else, these data points will provide background to continue to educate everyone across the globe about ways to make economically AND ecologically healthy energy choices.

 

*Ra’s Al Ghul to Bruce Wayne, “Batman Begins”

 

Prediction 4: Say “Blockchain” One. More. Time.

Here’s a non-prediction prediction: People (mostly vendors and dudes desperate to impress the laydeez) are going to keep throwing buzzwords around, making life miserable for the rest of us.

 

HOWEVER, eventually enough of us diligent IT folks nail down the definitions, and the hype cycle quiets down. In 2020, I think at least a few buzzwords will get a little less buzz-y.

 

One of those is "AI" (artificial intelligence). IT professionals and even business leaders are finally coming to grips with what AI ISN'T (androids like Data, moderately complex algorithms, or low-paid offshore workers doing a lot of work without credit) and will more clearly be able to understand when true AI is both relevant and necessary.

 

Closely related, machine learning (the “ML” in the near-ubiquitous “AI/ML” buzzword combo) will also reach a state of clarity, and businesses wanting to leverage sophisticated automation and behavioral responses in their products will avoid being caught up (and swindled) by vendors hawking cybernetic snake oil.

 

Finally, the term 5G is going to get nailed down and stop being seen as "better because it's got one more G than what I have today." This is more out of necessity than anything else, because carriers are building out their 5G infrastructure and selling it, and the best cure for buzzword hype is vendor contracts clearly limiting what they're legally obligated to provide.

 

Prediction 5: Data As A Service

While this effort was well under way in 2019 from the major cloud vendors, I believe 2020 is when businesses will, en masse, take up the challenge of building both data collection and data use features into their systems. From the early identification of trends and fads; to flagging public health patterns; to data-based supply chain decisions—the name of the game is to use massive data sets to analyze complex financial behaviors and allow businesses to react more accurately and effectively.

 

Again, this isn't so much the invention of something new as it is the adoption of capabilities providers like AWS and Azure have made available in various forms since 2018, and putting them to actual use.

 

Prediction N: We’re So Screwed

Security? Privacy? Protection of personal information? Everything I described above—plus the countless other predictions which will turn out to be true in the coming year—is going to come at the cost of your information. Not only that, but the primary motivator in each of those innovations and trends is profit, not privacy. Expect a healthy helping of hacks, breaches, and data dumps in 2020.

 

Just like last year.

Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering

 

Here’s an interesting article by my colleague Jim Hansen where he provides tips on leveraging automation to improve your cybersecurity, including deciding what to automate and what tools to deploy to help.

 

Automation can reduce the need to perform mundane tasks, improve efficiency, and create a more agile response to threats. For example, administrators can use artificial intelligence and machine learning to ascertain the severity of potential threats and remediate them through the appropriate automated responses. They can also automate scripts, so they don’t have to repeat the same configuration process every time a new device is added to their networks.
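To make that concrete, here's a minimal PowerShell sketch of the kind of scripted device onboarding the article alludes to; the CSV file name, its columns, and the placeholder configuration step are illustrative assumptions, not part of the original article or of any specific SolarWinds tool.

```powershell
# Hypothetical sketch: run the same baseline check against every new device
# listed in new-devices.csv (columns: Name, IPAddress) instead of doing it by hand.
$devices = Import-Csv -Path '.\new-devices.csv'

foreach ($device in $devices) {
    # Confirm the device is reachable before any further configuration steps.
    if (Test-Connection -ComputerName $device.IPAddress -Count 2 -Quiet) {
        Write-Output "$($device.Name) is reachable; running baseline configuration..."
        # Placeholder for the real configuration work (SNMP settings, agent install, etc.)
    }
    else {
        Write-Warning "$($device.Name) ($($device.IPAddress)) is unreachable; skipping."
    }
}
```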

 

But while automation can save enormous amounts of time, increase productivity, and bolster security, it’s not necessarily appropriate for every task, nor can it operate unchecked. Here are four strategies for effectively automating network security within government agencies.

 

1. Earmark What Should—And Shouldn’t—Be Automated.

 

Setting up automation can take time, so it may not be worth the effort to automate smaller jobs requiring only a handful of resources or a small amount of time to manage. IT staff should also conduct application testing themselves and must always have the final say on security policies.

 

Security itself, however, is ripe for automation. With the number of global cyberattacks rising, the challenge has become too vast and complex for manual threat management. Administrators need systems capable of continually policing their networks, automatically updating threat intelligence, and monitoring and responding to potential threats.

 

2. Identify the Right Tools.

 

Once the strategy is in place, it's time to consider which tools to deploy. There are several security automation tools available, and they all have different feature sets. Begin by researching vendors that have a track record of government certifications, such as Common Criteria, or are compliant with Defense Information Systems Agency requirements.

 

Continuous network monitoring for potential intrusions and suspicious activity is a necessity. Being able to automatically monitor log files and analyze them against multiple sources of threat intelligence is critical to discovering and, if necessary, denying access to questionable network traffic. The system should also be able to automatically implement predetermined security policies and remediate threats.
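As a rough illustration (not taken from the article itself), here's one hedged way to sketch that log-versus-threat-intelligence check in PowerShell; the log and blocklist file names and formats are hypothetical.

```powershell
# Hypothetical sketch: flag any source IP in a firewall log that appears on a
# locally maintained blocklist. File names and formats are placeholders.
$blocklist = Get-Content -Path '.\known-bad-ips.txt'

Select-String -Path '.\firewall.log' -Pattern '\b(?:\d{1,3}\.){3}\d{1,3}\b' -AllMatches |
    ForEach-Object { $_.Matches.Value } |
    Where-Object { $blocklist -contains $_ } |
    Sort-Object -Unique |
    ForEach-Object { Write-Warning "Blocklisted IP $_ found in firewall.log" }
```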

 

3. Augment Security Intelligence.

 

Artificial intelligence and machine learning should also be considered indispensable, especially as IT managers struggle to keep up with the changing threat landscape. Through machine learning, security systems can absorb and analyze data retrieved from past intrusions to automatically and dynamically implement appropriate responses to the latest threats, helping keep administrators one step ahead of hackers.

 

4. Remember Automation Isn’t Automatic.

 

The old saying “trust but verify” applies to computers as much as people. Despite the move toward automation, people are and will always be an important part of the process.

 

Network administrators must conduct the appropriate due diligence and continually audit, monitor and maintain their automated tasks to ensure they’re performing as expected. Updates and patches should be applied as they become available, for example.

 

Automating an agency’s security measures can be a truly freeing experience for time- and resource-challenged IT managers. They’ll no longer have to spend time tracking down false red flags, rewriting scripts, or manually attempting to remediate every potential threat. Meanwhile, they’ll be able to rest easy knowing the automated system has their backs and their agencies’ security postures have been improved.

 

Find the full article on Government Computer News.

 

The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

One of the things I like most about the writing challenge is we’ve set it at a time when many of us are either “off” (because how many of us in tech are ever REALLY “off”?) or at least find ourselves with a few extra compute cycles to devote to something fun. This week, more than any so far, has shown this to be true.

 

Despite a conspicuous absence of references to brightly colored interlocking plastic blocks, our ELI5 imaginations ran wild, from tin can telephones to poetry (with and without illustration) to libraries.

 

I’m thrilled with how the challenge has gone so far, and excited to see what other examples are yet to come as we finish strong next week.

 

21. Routing

Kevin Sparenberg—former SolarWinds customer, master of SWUG ceremonies, semi-official SolarWinds DM, and owner of many titles both official and fictitious—takes the idea of routing back to its most fundamental and builds it back up from there.

 

Mike Ashton-Moore  Dec 22, 2019 10:43 AM

I love this challenge—all these definitions that I can now use when a non-geek asks me what I do.

 

For adults (so, not five-year-olds), I always reverted to sending a letter through the post office, which seems to cover it.

 

Jeremy Mayfield  Dec 23, 2019 9:21 AM

Holly Baxley Dec 23, 2019 10:37 AM (in response to Kevin M. Sparenberg)

I use that same analogy when I explain to my unfortunate Sales Agents who work in neighborhoods with a shared DSL line. I always get calls around the holidays that the internet in their model has suddenly started crawling. It’s too hard for me to explain shared DSL lines, that we have no control over what’s put under our feet when we build houses, and that ISPs with older cable lines will “store” up a certain amount of data per neighborhood—and depending on how heavily it’s used during the day—it can make all the difference with those “…speeds UP TO 75 Mbps.”

 

If I tried to explain how the old neighborhood DSL’s route and “borrow” data when it’s not being used by someone else, their heads would explode.

 

So, I use our highway as an example.

 

“You know how the internet’s called an Information Highway? Well, in your neighborhood it works a lot like that. During the day, your speeds are okay because most people are at work and school. They’re not on your “information highway.” But when the holidays hit, you got kids at home streaming and gaming and suddenly your own internet’s gonna drop because now many people are on your “highway.” Just like when you get on the highway to go home—if you don’t have many people on the road, you can go the 60 – 70 mph that you’re allowed on the posted signs. But if it’s rush hour—and cars are jammed for miles—it doesn’t matter if the posted signs say “60 mph”—you’re gonna go the same crawling 30 mph that everyone else is, because it’s jammed.

 

Right now—you got a heavy “rush hour” on your DSL line because there’s a lot of people in your neighborhood on it.”

 

What I wouldn’t give to have us all on fiber.

 

But such is the life of a home builder.

 

  1. Speedtest.net ... you’re my only friend.

I also think about it as getting to work. I know the preferred path, but due to insane drivers outside of my control, I’m sometimes forced to take alternate paths to get to the same place. If there’s an accident on the main road I take, US 301, then I might have to take Interstate 75, which is often flowing smoothly; if that gets backed up, then I might need to take the turnpike. Luckily for me, there’s almost no time difference in getting from home to work and vice versa, but at the end of the day, I’m only able to measure the difference in distance traveled. It’s more miles to use I-75 and/or the turnpike. So, my routes are within minutes of each other, but the distance traveled to get there is much greater when I don’t get to use my preferred path.

 

22. Ping

When a few folks here at SolarWinds began talking about “NetFlowetry”—mostly as a silly idea—we had no idea how it would take off. THWACK MVP Thomas Iannelli’s entry shows how much the idea has caught on, and how well it can be used to make a challenging concept seem accessible.

 

Shmi Ka  Dec 23, 2019 6:30 AM

This is so wonderful! This is so great for non-experts in this subject. Your poem is full of visual words for a visual learner like me. Thank you!

 

Rick Schroeder  Dec 26, 2019 12:52 PM

We rely on ping for a lot, but we as Network Analysts understand much about pings that many other folks may not. For example, a switch or router may be humming along, working perfectly, forwarding and routing packets for users without a single issue. But pinging that switch or router may not be the best way to discover latency between that switch and any other device. This is because ICMP isn’t as important to forward or respond to as TCP traffic.

 

A switch or router “knows” its primary job is to forward data, and replying to pings as fast as possible just isn’t as important as moving TCP packets between users. So, a perfectly good network and set of hardware may serve users quite well, but might simultaneously show varying amounts of latency. It’s because we may be monitoring a switch that’s busy doing other things; when it gets a free microsecond, it might reply to our pings. Or it might not. And users aren’t experiencing slowness or outages when the switch starts showing higher latency than it did when there was very little traffic going through it.

 

It’s important to not place excessive reliance on pings “to” routers or switches for this very reason.

 

However, you might just find pings more valuable if you ping from endpoint to endpoint instead of from monitoring station to switch or routers. The switch or router will forward the ICMP traffic nicely, and may do so much better than it will REPLY to the pings.

 

So, ping from a workstation to another workstation, or to a server, or server to server, instead of from a workstation to a router or switch that might have better things to do with its processing resources than reply to your ping quickly.
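If you want to try the endpoint-to-endpoint approach Rick describes, PowerShell's built-in Test-Connection is one simple way to do it. The computer name below is a placeholder, and on PowerShell 7 the latency property is named Latency rather than ResponseTime.

```powershell
# Ping another endpoint ten times and summarize round-trip time,
# rather than pinging the switch or router in the middle.
Test-Connection -ComputerName 'file-server-01' -Count 10 |
    Measure-Object -Property ResponseTime -Average -Maximum |
    Select-Object -Property Average, Maximum
```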

 

Greg Palgrave Dec 22, 2019 9:56 PM

Give me a ping, Vasili. One ping only, please.

 

23. IOPS

When explaining the speed of reads and writes, most people’s minds wouldn’t think about libraries. But THWACK MVP Jake Muszynski isn’t like most people, and his example was brilliantly, elegantly simple.

 

Tregg Hartley Dec 23, 2019 10:33 AM

Reads and writes per second

Is a metric measured here?

Where is the system bottleneck

Of our data we hold so dear?

 

Vinay BY  Dec 23, 2019 11:26 AM

IOPS: read and write without any latency. Most of us want the data on our screen in a split second, and IOPS contributes to that, so we would always love to keep this as healthy as possible. With data pouring in, we need to keep these things at scale: IOPS, data storage, data retrieval, and throughput.

 

George Sutherland Dec 23, 2019 12:14 PM

Well said sir... I love the book analogy.

 

It’s also like a puzzle... except that you get the same picture but the number of pieces in the box changes... sometimes 50, others 100, others 500, and some even 1,000 pieces. Same view, just more to consider.

 

Or when I mentioned your post to an accountant friend of mine.... debits=credits!!!!

 

24. Virtual Private Network (VPN)

THWACK MVP Matthew Reingold finds what is perhaps the most amazing, most simple, and most accurate ELI5 explanation for virtual private networks I’ve ever seen. You can bet I will be adding it to my mental toolbox.

 

Beth Slovick Dec 24, 2019 4:16 AM

We use VPNs for everything from connecting to the office to protecting our torrent downloads from nosy ISPs. Everyone uses a VPN these days to encrypt and protect their information from prying eyes.

 

Kelsey Wimmer Dec 24, 2019 10:59 AM

You could also describe it as using a water hose through a pool. You get to go through the pool, but the hose hides your data and what comes out of the pool is only what has gone through your hose.

 

Tregg Hartley Dec 24, 2019 11:17 AM

Open a connection between me and you

Encrypt the data before it goes through,

Then the only people who can see

The flowing data is you and me.

 

 

25. Telemetry

"The word telemetry is still obscured by a healthy dose of 'hand wavium' from companies and individuals who don't understand it but want to sound impressive." – Josh Biggley, who has devoted a good portion of his career both to building systems to gather and present telemetry data and to clarifying what the word means.

 

Vinay BY  Dec 26, 2019 9:13 AM

To me, telemetry is to reach to a point/milestone where normal/generic process/procedure can’t -> be it collecting data, be it monitoring, be it inducing instructions or any other possible thing.

 

Juan Bourn Dec 26, 2019 11:05 AM

If telemetry can tell a pit crew in NASCAR exactly how the race car is behaving, it can do the same (and possibly more) for us. The idea is, as mentioned by the author, to remove the noise. What do we care about? What matters? What is measurable vs. what is observable? Finally, how do we put that into a dashboard that we can use to have an overview of everything at once? That’s where telemetry really is useful, the combined overview of all our metrics.

 

Brian Jarchow Dec 26, 2019 4:44 PM

I’ve known people who worked on Boeing’s Delta program and the SpaceX Falcon 9 program. In rocketry, a lot of telemetry data is the difference between “it exploded” and “here’s what went wrong.”

 

26. Key Performance Indicator (KPI)

If Senior UX researcher Rashmi Kakde ever thought about a second career, I’d suggest writing and illustrating tech books for kids. Her poetic story about KPI is something I plan to print and use often.

 

Jake Muszynski  Dec 26, 2019 10:24 AM

I have started working with KPIs that I track for the Orion® Platform. As I delegate work to others or if I get distracted (when) I need an easy way to verify that the Orion Platform is doing what I expect it to. I have overall system health from App monitors and the “my Orion deployment” page, but what about all those things that are more like house cleaning? Things like custom properties. Unknown devices. Nodes missing polls. I build out dashboards and reports to let me know how the processes I have in place (both automated and human) are getting things done. I pull them into a PowerShell monitor from SAM via SWQL queries.

Did I have a spike in unmanaged devices? Do I need to find out why?

Do all my Windows servers have at least one disk?
Are there disks that need to be removed?

Not all of them are important, at least not right now. But once I gather stats on what we need to clean up to be current, then I choose a few significant metrics to improve. Those are my KPIs. I look at the numbers for a quarter and try to improve the process and the automation to make sure stuff doesn’t fall between the cracks. And having stats over time means I can see if things change and need my attention. If I make a few things better, and other stuff suffers, I change my KPIs.

 

Mike Ashton-Moore  Dec 26, 2019 12:18 PM

I love, love, love this one, especially the pictures

 

For me the most important part of KPIs is to try to refer to them by their full name rather than the TLA.

Key

Performance

Indicator

 

I’ve seen several “service desk” systems that try to label ticket close rates as a KPI.

Something is measured because it was easy to measure, not because it indicates how well the service desk is being run.

That isn’t a Key Performance Indicator any more than MPG is a KPI for how comfortable a car is.

It’s just an interesting statistic, not a KPI.

 

We need to remember what Performance our Indicator is Key for highlighting to us and why it is important enough to make it “Key.”

 

Michael Perkins Dec 26, 2019 2:55 PM

The trick with KPIs is figuring out what is actually “key” and observable to system performance. Of course, one must begin by asking what it means that the system is performing well.

 

I was laid off years ago because someone “upstairs” decided to change what was key without telling anyone. I was laid off for handling fewer tickets than my colleagues. For months, if not a couple of years, I had been an unofficial escalation point—working high-priority tickets and customers. That took—with explicit approval from managers with whom I shared space—more time than ordinary tickets, so I handled fewer overall. I also would help colleagues if they had questions.

 

Well above those folks, it was decided that my group would have one KPI—number of tickets processed. On the Friday going into Labor Day weekend that year, I was working with a customer, who thanked me profusely, when I heard my manager (two levels above me) getting rather upset. I found out later that was when higher-ups told him I was getting laid off. I found out about 20 – 30 minutes later.

 

So, was processing tickets quickly the KPI? Should it have been combined with, say, customer satisfaction, perhaps measured via survey? What about some sort of metric in which the severity or difficulty of the tickets was taken into account? What was really key to the support desk’s performance?

 

27. Root Cause Analysis

Principal UX researcher Kellie Mecham is trying to inspire an entire new generation of UX/UI folks with her explanation, by pointing out that the ability to ask questions is a core skill. By way of example, she shows how enough “why” questions can uncover the root cause of any situation.

 

Richard Phillips  Dec 27, 2019 9:12 AM

Root cause analysis is critical to understanding the past and why things happened. Along with RCA, I like to include the “how” questions: How can we prevent this in the future? How can we use this information to make things better, faster, more resilient? When asked for the root cause, I like to provide not just the answer, but the value obtained from that answer.

 


Tregg Hartley  Dec 27, 2019 10:37 AM

Getting to the bottom of things

Is what we are looking for,

Diagnose the disease

Leave the symptoms at the door.

 

 

Brian Jarchow Dec 27, 2019 11:03 AM

Unfortunately, I have worked with people who would then take it to the level of: “Why do we need to pay? Why can’t we just have?”

 

A root cause analysis can only go so far, and some people have difficulty with reasonable limits.

 

It's been a few weeks since VMworld Europe, and that's given Sascha and me a chance to digest both the information and the vast quantities of pastries, paella, and tapas we consumed.

 

VMworld was held again in Barcelona this year and came two months after the U.S. convention, meaning there were fewer big, jaw-dropping, spoiler-filled announcements, but more detail-driven, fill-in-the-gaps statements to clarify VMware's direction and plans.

 

As a refresher, at the U.S. event, some of the announcements included:

  • VMware Tanzu – a combination of products and services leveraging Kubernetes at the enterprise level.
  • Project Pacific – related to Tanzu, this will turn vSphere into a Kubernetes native platform.
  • Tanzu Mission Control – will allow customers to manage Kubernetes clusters regardless of where in the enterprise they're running.
  • CloudHealth Hybrid – will let organizations update, migrate, and consolidate applications from multiple points in the enterprise (data centers, alternate locations, and even different cloud providers) as part of an overall cloud optimization and consolidation strategy
  • The intent to acquire Pivotal
  • The intent to acquire Carbon Black

 

Going into the European VMworld, one could logically wonder what else there was to say. It turns out there were many questions left hanging in the air after the booths were packed and the carpet pulled up in San Francisco.

 

Executive Summary

VMware, since selling vCloud to OVH, started looking into other ways to diversify their business and embrace the cloud. The latest acquisitions show there’s a vision, and their earnings calls show it’s a successful one. (https://ir.vmware.com/overview/press-releases/press-release-details/2019/VMware-Reports-Fiscal-Year-2020-Third-Quarter-Results/default.aspx)

 

Tanzu

At both the U.S. and Europe conventions, Tanzu was clearly the linchpin initiative around which VMware's new vision for itself revolves. While the high-level sketch of Tanzu products and services was delivered in San Francisco, in Barcelona we also heard:

  • Tanzu Mission Control will allow operators to set policies for access, backup, security, and more to clusters (either individual or groups) across the environment.
  • Developers will be able to access Kubernetes resources via APIs enabled by Tanzu Mission Control.
  • Project Pacific does more than merge vSphere and Kubernetes. It allows vSphere administrators to use tools they’re already familiar with to deploy and manage container infrastructures anywhere vSphere is running—on-prem, in hybrid cloud, or on hyperscalers.
  • Conversely, developers familiar with Kubernetes tools and processes can continue to roll out apps and services using the tools THEY know best and extend their abilities to provision to things like vSphere-supported storage on-demand.

 

The upshot is that Tanzu, and the goal of enabling complete Kubernetes functionality, is more than a one-trick-pony idea. This is a broad and deep range of tools, techniques, and technologies.

 

Carbon Black

In September we had little more than the announcement of VMware's "intent to acquire" Carbon Black. By November the ink had dried on that acquisition and we found out a little more.

  • Carbon Black Cloud will be the preferred endpoint security solution for Dell customers.
  • VMware AppDefense and Vulnerability Management products will merge with several modules acquired through the Carbon Black acquisition.

 

While a lot more still needs to be clarified (in the minds of customers and analysts alike), this is a good start in helping us understand how this acquisition fits into VMware's stated intent of disrupting the endpoint security space.

 

NSX

The week before VMworld US, VMware announced its Q2 earnings, which included news that NSX adoption had increased more than 30% year over year. This growth explains the VMworld Europe announcement of new NSX distributed IDS and IPS services, as well as "NSX Federation," which lets customers deploy policies across multiple data centers and sites.

 

In fact, NSX has come a long way. VMware offers two flavors of NSX: the well-known version, now called NSX Data Center for vSphere, and the younger sibling, NSX-T Data Center.

The vSphere version has continuously improved in the two areas that were preventing larger adoption (the user experience and security) and is nowadays a mature and reliable technology.

NSX-T has been around for two years or so, but realistically it was always behind in features and not as valuable. As it turns out, things have changed, and NSX-T fits well into the greater scheme of things and is ready to play with the other guys in the park, including Tanzu and HCX.

 

Pivotal

Pivotal was initially acquired by EMC, and EMC combined it with assets from another acquisition: VMware. Next, Dell acquired EMC, and a little later both VMware and Pivotal became individual publicly traded companies with DellEMC remaining as the major shareholder. And now, in 2019, VMware acquired Pivotal.

 

One could call that an on/off relationship, similar to the one cats have with their owners (sorry, servants). It’s complicated.

 

Pivotal offers a SaaS solution to create other SaaS solutions, a concept which comes dangerously close to Skynet, minus the self-awareness and murder-bots.

 

But the acquisition does make sense, as Pivotal Cloud Foundry (PCF) runs on most major cloud platforms, on vSphere, and (to no one's surprise) on Kubernetes.

 

PCF allows developers to ignore the underlying infrastructure and is therefore completely independent from the type of deployment. It will help companies in their multi-cloud travels, while still allowing them to remain a VMware customer.

 

New Announcements

With all of that said, we don't want you to think there was nothing new under the unseasonably warm Spanish sun. In addition to the expanded information above, we also heard about a few new twists in the VMware roadmap:

  • Project Galleon will see the speedy delivery of an app catalog with greater security being key.
  • VMware Cloud Director service was announced, giving customers multi-tenant capabilities in VMware Cloud on AWS. This will allow Managed Service Providers (MSPs) to share the instances (and costs) of VMware Cloud on AWS across multiple tenants.
  • Project Path was previewed.
  • Project Maestro was also previewed—a telco cloud orchestrator designed to deliver a unified approach to modelling, onboarding, orchestrating, and managing virtual network functions and services for Cloud Service Providers.
  • Project Magna, another SaaS-based solution, was unveiled. This will help customers build a “self-driving data center” by collecting data to drive self-tuning automations.

 

Antes Hasta Tardes

Before we wrap up this summary, we wanted to add a bit of local color for those who live vicariously through our travels.

 

Sascha loved the “meat with meat” tapas variations and great Spanish wine. Even more so, as someone who lives in rainy Ireland, he enjoyed the Catalan sun. It was fun for him to walk through the city in a t-shirt while all the locals considered the November temperature barely acceptable.

Similarly, Leon (who arrived in Barcelona three days after it had started snowing back home) basked in the warmth of the region and of the locals willing to indulge his rudimentary Spanish skills, and basked equally in the joy of kosher paella and sangria.

 

Until next time!

Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering

 

Here’s an interesting article by my colleague Jim Hansen where he discusses some ideas on improving agency security, including helping your staff develop cyberskills and giving them the tools to successfully prevent and mitigate cyberattacks.

 

Data from the Center for Strategic and International Studies paints a sobering picture of the modern cybersecurity landscape. The CSIS, which has been compiling data on cyberattacks against government agencies since 2006, found the United States has been far and away the top victim of cyber espionage and cyber warfare.

 

These statistics are behind the Defense Department’s cybersecurity strategy for component agencies, which details how they can better fortify their networks and protect information.

 

DoD’s strategy is built on five pillars: building a more lethal force, competing and deterring in cyberspace, strengthening alliances and attracting new partnerships, reforming the department, and cultivating talent.

 

While aspects of the strategy don’t apply to all agencies, three of the tactics can help all government offices improve the nation’s defenses against malicious threats.

 

Build a Cyber-Savvy Team

 

Establishing a top-tier cybersecurity defense should always start with a team of highly trained cyber specialists. There are two ways to do this.

 

First, agencies can look within and identify individuals who could be retrained as cybersecurity specialists. Prospects may include employees whose current responsibilities feature some form of security analysis and even those whose current roles are outside IT. For example, the CIO Council’s Federal Cybersecurity Reskilling Academy trains non-IT personnel in the art and science of cybersecurity. Agencies may also explore creating a DevSecOps culture intertwining development, security, and operations teams to ensure application development processes remain secure and free of vulnerabilities.

 

Second, agencies should place an emphasis on cultivating new and future cybersecurity talent. To attract new talent, agencies can offer potential employees the opportunity for unparalleled cybersecurity skills training, exceptional benefits, and even work with the private sector. The recently established Cybersecurity Talent Initiative is an excellent example of this strategy in action.

 

Establish Alliances and Partnerships

 

The Cybersecurity Talent Initiative reflects the private sector’s willingness to support federal government cybersecurity initiatives and represents an important milestone in agencies’ relationship with corporations. Just recently, several prominent organizations endured what some called the cybersecurity week from hell when multiple serious vulnerabilities were uncovered. They’ve been through it all, so it makes sense for federal agencies to turn to these companies to learn how to build up their own defenses.

 

In addition to partnering with private-sector organizations, agencies can protect against threats by sharing information with other departments, which will help bolster everyone’s defenses.

 

Arm Your Team With the Right Tools

 

It’s also important to have the right tools to successfully prevent and mitigate cyberattacks. Continuous monitoring solutions, for example, can effectively police government networks and alert managers to potential anomalies and threats. Access rights management tools can ensure only the right people have access to certain types of priority data, while advanced threat monitoring can keep managers apprised of security threats in real-time.

 

Of course, IT staff will need continuous training and education. A good practice is implementing monthly, or at least bimonthly, training covering the latest viruses, social engineering scams, agency security protocols, and more.

 

The DoD’s five-pillared strategy is a good starting point for reducing the nation’s risk. Agencies can follow its lead by focusing their efforts on cultivating their staff, creating stronger bonds with outside partners, and supporting this solid foundation with the tools and training necessary to win the cybersecurity war.

 

Find the full article on Government Computer News.

 

The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

A few years back, I was working at a SaaS provider when we had an internal hackathon. The guidelines were simple: as part of your project you had to learn something, and you had to choose a framework/methodology beneficial to the organization. I was a Windows and VI admin, but my developer friend and I wrote an inventory tool that was put into operational practice immediately. I learned a ton over those two days, but little did I know I’d discovered a POwerful SHortcut to advancing my career as well as immediately making my organization’s operations more efficient. What POtent SHell could have such an effect? The framework I chose in the hackathon: PowerShell.

 

A Brief History of PowerShell

Windows has always had some form of command line utility. Unfortunately, those tools never really kept pace, and by the mid-2000s a change was badly needed. Jeffrey Snover led the charge that eventually became PowerShell. The goal was to produce a management framework for Windows environments, and as such it was originally used to control Windows components like Active Directory. The Microsoft Exchange UIs were even built on top of PowerShell, but over time it evolved into way more.

 

Today, one of the largest contributors to the PowerShell ecosystem is VMware, which competes with Microsoft in multiple spaces. Speaking of Microsoft, it’s shed its legacy of being walled off from the world and is now a prolific open-source contributor, one of its biggest contributions being to make PowerShell open-source in 2016. Since being open-sourced, PowerShell runs on Mac and Linux computers, and can be used to manage the big three cloud providers (AWS, Azure, Google Cloud).

 

Lots of People Are on Board With PowerShell, But Why Do YOU Care?

In an IT operations role, with no prior experience with PowerShell, I was able to create a basic inventory system leveraging WMI, SNMP, the Windows Registry, and PowerCLI, amongst others. I mention this example again because it demonstrates two of the most compelling reasons to use PowerShell: its low barrier to entry and its breadth and depth.

 

Low Barrier to Entry

We already determined you can run PowerShell on anything, but it’s also been included in all Windows OSs since Windows 7. If you work in an environment with Windows, you already have access to PowerShell. You can type powershell.exe to launch the basic PowerShell command window, but I’d recommend powershell_ise.exe for most folks who are learning, as the lightweight ISE (Integrated Scripting Environment) will give you some basic tools to troubleshoot and debug your scripts.

 

Once you’re in PowerShell, it’s time to get busy. The things performing work in PowerShell are called cmdlets (pronounced command-lets). Think of them as tiny little programs, or functions, that do a unit of work. If you retain nothing else from this post, please remember this next point, and I promise you can become effective in PowerShell: all cmdlets take the form of verb-noun and, if properly formed, their names describe what they do. If you’re trying to figure something out, as long as you can remember Get-Help, you’ll be OK.
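To make the verb-noun idea concrete, here are a few built-in cmdlets showing the pattern, plus the two discovery commands worth memorizing:

```powershell
# Verb-noun cmdlets read like tiny sentences describing what they do.
Get-Process                           # list running processes
Get-Service -Name 'Spooler'           # fetch a single service by name
Stop-Service -Name 'Spooler' -WhatIf  # -WhatIf previews the action without doing it

# When you can't remember a cmdlet, ask PowerShell itself.
Get-Help Get-Service -Examples            # worked examples for a cmdlet
Get-Command -Verb Get -Noun '*Service*'   # discover cmdlets by verb and noun
```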

 

Here’s the bottom line on having a rapid learning curve: there are a lot of IT folks who don’t have a background or experience in writing code. We’re in a period where automation is vitally important to organizations. Having a tool you can pick up and start using on day one means you can easily increase your skill set and your value to the organization. Now if only you had a tool that could grow with you as those skills evolved…

 

Depth and Breadth

At its most fundamental level, automation is about removing inefficiencies. A solution doesn’t need to be complex to be effective. When writing code in PowerShell, you can string together multiple commands, where the output of one cmdlet is passed along as the input of the next, via a process called the pipeline. Chaining commands together can be a simple and efficient way to get things done more quickly. Keep in mind PowerShell is a full-fledged object-oriented language, so you can write functions, modules, and thousands of lines of code as your skills expand.
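Here's a small taste of that chaining using only built-in cmdlets; each cmdlet hands its output objects to the next one down the pipeline:

```powershell
# Show the five processes using the most memory, with a calculated MB column.
Get-Process |
    Sort-Object -Property WorkingSet64 -Descending |
    Select-Object -First 5 -Property Name, Id,
        @{ Name = 'MemoryMB'; Expression = { [math]::Round($_.WorkingSet64 / 1MB, 1) } }
```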

 

So, you can go deep on the tool, but you can go wide as well. We already mentioned you can manage several operating systems, but software across the spectrum is increasingly enabling management via PowerShell snap-ins or modules. This includes backup, monitoring, and networking tools of all shapes and sizes. But you’re not limited to the tools vendors provide you—you can write your own. That’s the point! If you need some ideas on how you can jumpstart your automation practice, here’s a sampling of some fun things I’ve written in PowerShell: a network mapper, a port scanner, environment provisioning, an ETL engine, and web monitors. The only boundary to what you can accomplish is defined by the limits of your imagination.
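To give a flavor of the "write your own" point, here's a stripped-down sketch along the lines of that port scanner; the target host and port list are placeholders, and Test-NetConnection assumes a reasonably current Windows box:

```powershell
# Minimal port check: the host name and port list are placeholders.
$target = 'server-01'
$ports  = 22, 80, 443, 3389

foreach ($port in $ports) {
    $result = Test-NetConnection -ComputerName $target -Port $port -WarningAction SilentlyContinue
    if ($result.TcpTestSucceeded) {
        Write-Output ("{0}:{1} is open" -f $target, $port)
    }
    else {
        Write-Output ("{0}:{1} is closed or filtered" -f $target, $port)
    }
}
```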

 

What’s Next

For some people, PowerShell may be all they need and as far as they go. If this PowerShell exploration just whets your appetite, though, remember you can go deeper. Much of automation is going toward APIs, and PowerShell gives you a couple of ways to begin exploring them. Invoke-WebRequest and Invoke-RestMethod will allow you to take your skills to the next step and build your familiarity with APIs and their constructs within the friendly confines of the PowerShell command shell.
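As a tiny first step down that path, here's Invoke-RestMethod against a public API (GitHub's, used here purely as a convenient example endpoint):

```powershell
# Call a REST endpoint; Invoke-RestMethod converts the JSON response into objects.
$repo = Invoke-RestMethod -Uri 'https://api.github.com/repos/PowerShell/PowerShell' -Method Get

# Work with the response like any other PowerShell object.
$repo | Select-Object -Property full_name, stargazers_count, open_issues_count
```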

 

No matter how far you take your automation practice, I hope you can use some of these tips to kickstart your automation journey.

We’re more than halfway through the challenge now, and I’m simply blown away by the quality of the responses. While I’ve excerpted a few for each day, you really need to walk through the comments to get a sense of the breadth and depth. You’ll probably hear me say it every week, but thank you to everyone who has taken time out of their day (or night) to read, reply, and contribute.

 

14. Event Correlation

Correlating events—from making a cup of coffee to guessing at the contents of a package arriving at the house—is something we as humans do naturally. THWACK MVP Mark Roberts uses those two examples to help explain the idea that, honestly, stymies a lot of us in tech.

 

Beth Slovick Dec 16, 2019 9:46 AM

Event Correlation is automagical in some systems and manual in others. If you can set it up properly, you can get your system to provide a Root Cause Analysis and find out what the real problem is. Putting all those pieces together to set it up can be difficult in an ever-changing network environment. It is a full-time job in some companies with all the changes that go on. The big problem there is getting the information in a timely manner.

 

Richard Phillips  Dec 17, 2019 1:02 PM

She’s a “box shaker!” So am I.

 

Flash back 20 years—a box arrives just before Christmas. The wife and I were both box shakers, and we proceeded to spend the next several days, leading up to Christmas, periodically shaking the box and trying to determine the contents. Clues: 1) it’s light, 2) it doesn’t seem to move a lot in the box, 3) the only noise it makes is a bit of a scratchy sound when shaken.

 

Finally Christmas arrives and we anxiously open the package to find a (what was previously a very nice) dried flower arrangement—can you imagine what happens to a dried flower arrangement after a week of shaking . . .

Matt R  Dec 18, 2019 12:57 PM

I think of event correlation like weather. Some people understand that dark clouds = rain. Some people check the radar. Some people have no idea unless the weather channel tells them what the weather will be.

 

15. Application Programming Interface (API)

There’s nobody I’d trust more to explain the concept of APIs than my fellow Head Geek Patrick Hubbard—and he did not disappoint. Fully embracing the “Thing Explainer” concept—one of the sources of inspiration for the challenge this year—Patrick’s explanation of “Computer-Telling Laws” is a thing of beauty.

 

Tregg Hartley Dec 15, 2019 11:37 AM

I click on an icon

You take what I ask,

Deliver it to

The one performing the task.

 

When the task is done

And ready for me,

You deliver it back

In a way I can see.

 

You make my life easier

I can point, click and go,

You’re the unsung hero

and the star of the show.

 

Vinay BY  Dec 16, 2019 5:45 AM

API to me is a way to talk to a system or an application/software running on it. While we invest a lot of time in building that, we should also make sure it’s built with standards and rules/laws in mind. Basically, we shouldn’t be investing a lot of time on something that can’t be used.

 

Dale Fanning Dec 16, 2019 9:36 AM

In many ways an API is a lot like human languages. Each computer/application usually only speaks one language. If you speak in that language, it understands what you want and will do that. If you don’t, it won’t. Just like in the human world, there are translators for computers that know both languages and can translate back and forth between the two so each can understand the other.

 

16. SNMP

Even though “simple” is part of its name, understanding SNMP can be anything but. THWACK MVP Craig Norborg does a great job of breaking it down to its most essential ideas.

 

Jake Muszynski  Dec 16, 2019 7:16 AM

SNMP still is relevant after all these years because the basics are the same on any device with it. Most places don’t have just one vendor in house. They have different companies. SNMP gets out core monitoring data with very little effort. Can you get more from SNMP with more effort? Probably. Can other technologies get you real time data for specialty systems? Yup, there is lots of stuff companies don’t put in SNMP. But that’s OK. Right up there with ping, SNMP is still a fundamental resource.

 

scott driver Dec 16, 2019 1:38 PM

SNMP: Analogous to a phone banking system (these are still very much a thing btw).

 

You have a Financial Institution (device)

You call in to an 800# (an oid)

If you know the right path you can get your balance (individual metric)

 

However when things go wrong, the fraud department will reach out to you (Trap)

 

Tregg Hartley Dec 17, 2019 12:10 PM

Sending notes all of the time

For everything under the sun,

The task is never ending

And the Job is never done.

 

I can report on every condition

I send and never look back,

My messages are UDP

I don’t wait around for the ACK.

 

17. Syslog

What does brushing your teeth have to do with an almost 30-year-old messaging protocol? Only a true teacher—in this case the inimitable “RadioTeacher” (THWACK MVP Paul Guido)—could make something so clear and simple.

 

Faz f Dec 17, 2019 4:54 AM

Like a diary for your computer

 

Juan Bourn Dec 17, 2019 11:24 AM

A way for your computer/server/application to tell you what it was doing at an exact moment in time. It’s up to you to determine why, but the computer is honest and will tell you what and when.

 

18. Parent-Child

For almost 20 days, we’ve seen some incredible explanations for complex technical concepts. But for day 18, THWACK MVP Jez Marsh takes advantage of the concept of “Parent-Child” to remind us our technical questions and challenges often extend to the home, but at the end of the day we can’t lose sight of what’s important in that equation.

 

Jeremy Mayfield  Dec 18, 2019 7:41 AM

Thank you, this is great. I think of the parent-child relationship as one being present with the other. As the child changes, the parent becomes more full, and eventually, when the time is right, the child becomes a parent and the original parent may be no more.

 

The parent-child relationship is one that nurtures the physical, emotional, and social development of the child. It is a unique bond that every child and parent can enjoy and nurture. ... A child who has a secure relationship with a parent learns to regulate emotions under stress and in difficult situations.

 

Dale Fanning Dec 18, 2019 8:36 AM

I’m a bit further down the road than you, having launched my two kids a few years ago, but I will say that the parent-child relationship doesn’t change even then, although I count them more as peers than children now. I’m about to become a grandparent for the first time, and our new role is helping them down the path of parenthood without meddling too much hopefully. It’s only much later that you realize how little you knew when you started out on the parent journey.

 

Chris Parker Dec 18, 2019 9:37 AM

In keeping with the IT world:

 

This is the relationship much like you and your parent.

 

You need your parents/guardians in order to bring you up in this world and without them you might be ‘orphaned’

Information on systems sometimes needs a ‘Parent’ in order for the child to belong

You can get some information from the Child but you would need to go to the Parent to know where the child came from

One parent might have many children who then might have more children but you can follow the line all the way to the beginning or first ‘parent’

 

19. Tracing

I’ve mentioned before how LEGO bricks just lend themselves to these ELI5 explanations of technical terms, especially as it relates to cloud concepts. In this case, Product Marketing Manager Peter Di Stefano walks through the way tracing concepts would help troubleshoot a failure many of us may encounter this month—when a beloved new toy isn’t operating as expected.

 

Chris Parker Dec 19, 2019 4:57 AM

Take apart the puzzle until you find the piece that is out of place

 

Duston Smith Dec 19, 2019 9:26 AM

I think you highlight an important piece of tracing—documentation! Just like building LEGOs, you need a map to tell you what the process should be. That way when the trace shows a different path you know where the problem sits.

 

Holly Baxley Dec 19, 2019 10:15 AM

Hey Five-year-old me,

 

Remember when I talked about Event Correlation a while back and told you that it was like dot to dot, because all the events were dots and if you connected them together you can see a clearer “picture” of what’s going on?

 

Well, today we’re going to talk about Tracing, which “seems” like the same thing, but it isn’t.

 

See in Event Correlation you have no clue what the picture is. Event Correlation’s job is to connect events together, so it can create as clear a picture as it can of the events to give you an outcome. Just remember, Event Correlation is only as good as the information that’s provided. If events (dots) are left out—the picture is still incomplete, and it takes a little work to get to the bottom of what’s going on.

 

In tracing—you already know what the picture is supposed to look like.

 

Let’s say you wanted to draw a picture of a sunflower.

 

Your mom finds a picture of the sunflower on the internet and she prints it off for you.

 

Then she gives you a piece of special paper called “vellum” that’s just see-through enough (the fancy word for that is “translucent”), so you can still see the picture of the sunflower underneath it. She gives you a pencil so you can start tracing.

 

Now in tracing does it matter where you start to create your picture?

 

No it doesn’t.

 

You can start tracing from anywhere.

 

In dot-to-dot, you can kinda do the same thing if you want to challenge yourself. It’s not always necessary to start at dot 1, and if you’re like me (wait...you are me)...you rarely find dot 1 the first time anyway. You can count up and down to connect the dots and eventually get there.

 

Just remember—in this case, you still don’t know what the picture is and that’s the point of dot to dot—to figure out what the picture is going to be.

 

In tracing—we already know what the picture either is, or at least is supposed to look like.

 

And just like in tracing, once you lift your paper off the picture, you get to see—did it make the picture that you traced below?

 

If it didn’t—you can either a) get a new sheet and try again or b) start with where things got off track and erase it and try again.

 

To understand tracing in IT, I want you to think about an idea you’ve imagined. Close your eyes. Imagine your picture in your mind. Do you see it?

 

Good.

 

We sometimes say that we can “picture” a solution, or we “see” the problem, when in reality, a problem can be something that we can’t really physically see. It’s an issue we know is out there: e.g., the network is running slow and we see a “picture” of how to fix it in our mind; a spreadsheet doesn’t add up right like it used to, and we have a “picture” in our mind of how it’s supposed to behave and give the results we need.

 

But we can’t physically take a piece of paper and trace the problem.

 

We have programs that trace our “pictures” for us and help us see what went right and what went wrong.

 

Tracing in IT is a way to see if your program, network, spreadsheet, document, well...really anything traceable did what it was supposed to do and made the “picture” you wanted to see in the end.

 

It’s a way to fix issues and get the end result you really want.

 

Sometimes we get our equipment and software to do what it’s supposed to, but then we realize—it could be even BETTER, and so we use tracing to figure out the best “path” to take to get us there.

 

That would be like deciding you want a butterfly on your Sunflower, so your mom prints off a butterfly for you and you put your traced Sunflower over the butterfly and then decide what’s the best route to take to make your butterfly fit on your sunflower the way you want it.

 

And just like tracing—sometimes you don’t have to start at the beginning to get to where you want to be.

 

If you know that things worked up to a certain point but then stopped working the way you want, you can start tracing right at the place where things aren’t working the way you want. You don’t always have to start from a beginning point. This saves time.

 

There’s lots of different types of tracing in IT. Some people trace data problems on their network, some people trace phone problems on their network, some trace document and spreadsheet changes on their files, some trace database changes. There’s all sorts of things that people can trace in IT to either fix a problem or make something better.

 

But the end question of tracing is always the same.

 

Did I get what I “pictured?”

 

And if the answer is “yes” - we stop and do the tech dance of joy.

 

It’s a secret dance.

 

Someday, I’ll teach you.
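To ground the analogy in something runnable, here's a minimal sketch of tracing in code: each step of a request records its name and how long it took, and the finished trace is the "picture" you compare against what you expected. The step names and timings are invented for illustration and aren't tied to any particular tracing tool.

```python
# A minimal, hand-rolled trace: record each step a request takes and its duration.
import time

trace = []   # the "picture" of what actually happened

def traced(step_name):
    """Decorator that records the name and duration of each traced step."""
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                trace.append((step_name, time.perf_counter() - start))
        return inner
    return wrap

@traced("validate order")
def validate(order): time.sleep(0.01)

@traced("charge card")
def charge(order): time.sleep(0.05)

@traced("ship")
def ship(order): time.sleep(0.02)

order = {"id": 42}
for step in (validate, charge, ship):
    step(order)

for name, seconds in trace:   # did it make the picture we expected?
    print(f"{name:15s} {seconds * 1000:6.1f} ms")
```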

 

20. Information Security

THWACK MVP Peter Monaghan takes a moment to simply and clearly break down the essence of what InfoSec professionals do, and to put it into terms that parents would be well-advised to use with their own kids.

 

(while I don’t normally comment on the comments, I’ll make an exception here)

In the comments, a discussion quickly started about whether using this space to actually explain infosec (along with the associated risks) TO a child was the correct use of the challenge. While the debate was passionate and opinionated, it was also respectful and I appreciated that. Thank you for making THWACK the incredible community that it has grown to be!

 

Holly Baxley Dec 20, 2019 3:18 PM (in response to Jeremy Mayfield)

I think Daddy's been reading my diary

He asks if I'm okay

Wants to know if I want to take walks with him

Or go outside and play

 

He tells Mommy that he's worried

There's something wrong with me

Probably from reading things in the diary

Things he thinks he shouldn't see

 

But I'll tell you a little secret

That diary isn't real

I scribbled nonsense in that journal

And locked away the one he can't steal

 

If Daddy was smart he woulda noticed

Something he's clearly forgot

Never read a diary covered with Winnie the Pooh

Whose head is stuck in the Honeypot.

 

Jon Faldmo Dec 20, 2019 1:27 PM

I haven't thought of how information security applies or is in the same category as privacy and being secure online. I have always thought of Information Security in the context of running a business. It is the same thing, just usually referenced differently. Thanks for the write up.

 

Tregg Hartley Dec 20, 2019 3:33 PM

The OSI model

Has seven layers,

But it leaves off

The biggest players.

 

The house is protected

By the people inside,

We are all on watch

As such we abide.

 

To protect our house

As the newly hired,

All the way

To the nearly retired.

Introduction

OK, so the title is hardly original, apologies. But it does highlight that the buzz around Kubernetes is still out there, showing no signs of going away anytime soon.

 

Let’s start with a description of what Kubernetes is:

 

Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available¹

 

Let me add my disclaimer here. I’ve never used Kubernetes or personally had a use case for it. I have an idea of what it is and its origins (do I need a Borg meme as well?) but not much more.

 

A bit of background on myself: I’m predominantly an IT operations person, working for a Value-Added Reseller (VAR) designing and deploying VMware-based infrastructures. The organizations I work with are typically 100 to 1,000 seats in size, across many vertical markets. It would be naive of me to think none of those organizations are considering containerization and orchestration technology, but genuinely, none of them currently are.

 

Is It Really a Thing?

In the past 24 months, I’ve attended many trade shows and events, usually focused around VMware technology, and the question is always asked: “Who is using containers?” The percentage of hands going up is always less than 10%. Is it just the audience type, or is this a true representation of container adoption?

 

Flip it around and when I go to an AWS or Azure user group event, it’s the focus and topic of conversation: containers and Kubernetes. So, who are the people at these user groups? Predominantly the developers! Different audiences, different answers.

 

I work with one of the biggest Tier-1 Microsoft CSP distributors in the U.K. Their statistics on Azure consumption by resource type are enlightening. 49% of billable Azure resources are virtual machines, and another 30-odd% is object storage consumption. A small slice of the pie, 7%, covers miscellaneous services, including AKS (Azure Kubernetes Service). This figure aligns with my first observation at trade events, where less than 10% of people in the room were using containers. I don’t know if those virtual machines are running container workloads.

 

Is There a Right Way?

This brings us to the question and part of the reason I wrote this article: is there a right way to deploy containers and Kubernetes? Every public cloud has its own interpretation—Azure Kubernetes Service, Amazon EKS, Google Kubernetes Engine, you get the idea. Each one has its own little nuances capable of breaking the inherent idea behind containers: portability. Moving from one cloud to another, the application stack isn’t necessarily going to work right away.
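As an aside for the curious, here's a minimal sketch of what a declarative, cluster-agnostic deployment looks like through the official Kubernetes Python client: you describe the desired state and, in principle, apply the same description to any conformant cluster, whichever cloud hosts it. The kubeconfig, names, and image below are placeholders rather than anything from a real environment.

```python
# A minimal sketch using the Kubernetes Python client (pip install kubernetes).
# Assumes a working kubeconfig (e.g., a local minikube); all names are illustrative.
from kubernetes import client, config

config.load_kube_config()           # reads ~/.kube/config
apps = client.AppsV1Api()

# Declarative configuration: we state the desired outcome (three replicas of an
# nginx container) and the cluster works to make it so.
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="hello-web"),
    spec=client.V1DeploymentSpec(
        replicas=3,
        selector=client.V1LabelSelector(match_labels={"app": "hello-web"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "hello-web"}),
            spec=client.V1PodSpec(
                containers=[client.V1Container(name="web", image="nginx:1.17")]
            ),
        ),
    ),
)
apps.create_namespaced_deployment(namespace="default", body=deployment)
```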

 

Anyway, the interesting piece for me, because of my VMware background, is Project Pacific. Essentially, VMware has gone all-in on Kubernetes by making it part of the vSphere control plane. IT Ops can manage a Kubernetes application container the same way they manage a virtual machine, and developers can consume Kubernetes the same way they would elsewhere. It’s a win/win situation. In another step toward VMware becoming the management plane for everyone (think public cloud, on-premises infrastructure, and the software-defined data center), Kubernetes moves ever closer to my wheelhouse.

 

No matter where you move the workload, if VMware is part of the management and control plane, then user experience should be the same, allowing for true workload mobility.

 

Conclusion

Two things for me.

 

1. Now more than ever seems like the right time to look at Kubernetes, containerization, and everything they bring.

2. I’d love to know if my observations on containers and Kubernetes adoption are a common theme or if I’m living with my head buried in the sand. Please comment below.

 

¹ Kubernetes Description - https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/

I visited the Austin office this past week, my last trip to SolarWinds HQ for 2019. It’s always fun to visit Austin and eat my weight in pork products, but this week was better than most. I took part in deep conversations around our recent acquisition of VividCortex.

 

I can’t begin to tell you how excited I am for the opportunity to work with the VividCortex team.

 

Well, maybe I can begin to tell you. Let’s review two data points.

 

In 2013, SolarWinds purchased Confio Software, makers of Ignite (now known as Database Performance Analyzer, or DPA), for $103 million. That’s where my SolarWinds story begins, as I was included with the Confio purchase. I had been with Confio since 2010, working in sales engineering, customer support, product development, and corporate marketing. We made Ignite into a best-of-breed monitoring solution that’s now the award-winning, on-prem and cloud-hosted DPA loved by DBAs globally.

 

The second data point is from last week, when SolarWinds bought VividCortex for $117.5 million. One thing I want to make clear is SolarWinds just doubled down on our investment in database performance monitoring. Anyone suggesting anything otherwise is spreading misinformation.

 

Through all my conversations last week with members of both product teams, one theme was clear. We are committed to providing customers with the tools necessary to achieve success in their careers. We want happy customers. We know customer success is our success.

 

Another point made clear is that the VividCortex product will complement, not replace, DPA, expanding our database performance monitoring portfolio in a meaningful way. Sure, there is some overlap with MySQL, as both tools offer support for that platform. But the tools have some key differences in functionality. Currently, VividCortex is a SaaS monitoring solution for popular open-source platforms (PostgreSQL, MySQL, MongoDB, Amazon Aurora, and Redis). DPA provides both monitoring and query performance insights for traditional relational database management systems and is not yet available as a SaaS solution.

 

This is why we view VividCortex as a product to enhance what SolarWinds already offers for database performance monitoring. We’re now stronger this week than we were just two weeks ago. And we’re now poised to grow stronger in the coming months.

 

This is an exciting time to be in the database performance monitoring space, with 80% of workloads still Earthed. If you want to know about our efforts regarding database performance monitoring products, just AMA.

 

I can't wait to get started on helping build next-gen database performance monitoring tools. That’s what VividCortex represents, the future for database performance monitoring, and why this acquisition is so full of goodness. Expect more content in the coming weeks from me regarding our efforts behind the scenes with both VividCortex and DPA.

As we head into the new year, people will once again start quoting a popular list describing the things kids starting college in 2020 will never personally experience. Examples of these are things like “They’re the first generation for whom a ‘phone’ has been primarily a video game, direction finder, electronic telegraph, and research library.” And “Electronic signatures have always been as legally binding as the pen-on-paper kind.” Or most horrifying, “Peanuts comic strips have always been repeats.”

 

That said, it’s also interesting to note the things that have fallen into obsolescence over the last few decades. In this post, I’m going to list and categorize them, and add some of my personal thoughts about why they’ve fallen out of vogue, if not out of use.

 

It’s important to note many of these technologies can still be found “in the wild”—whether because some too-big-to-fail, mission-critical system depends on them (cf. the New York Subway MetroCard system running on the OS/2 operating system—https://www.vice.com/en_us/article/zmp8gy/the-forgotten-operating-system-that-keeps-the-nyc-subway-system-alive); or because devotees of the technology keep using it even though newer, and ostensibly better, tech has supplanted it (such as laserdiscs and the Betamax tape format*).

 

Magnetic Storage

This includes everything from floppy disks (whether 8”, 5.25”, or 3.5”), video tapes (VHS or the doubly obsolete** Betamax), DAT, cassette tapes and their progenitor reel-to-reel, and so on.

 

These technologies are gone because they weren’t as good as what came after. Magnetic storage was slow, prone to corruption, and often delicate and/or difficult to work with. Once a superior technology was introduced, people abandoned these formats as fast as they could.

 

Disks for Storage

This category includes the previously mentioned floppy disks, but extends to CDs, DVDs, and the short-lived MiniDisc. All have—by and large—fallen by the wayside.

 

The reason for this is less because these technologies were bad and/or hard to use, per se (floppies notwithstanding), and more because what came after—flash drives, chip-based storage, SSDs, and cloud storage, to name a few—was so much better.

 

Mobile Communications

Since the introduction of the original cordless phone in 1980, mobile tech has become ubiquitous and has been an engine of societal and technological change. But not everything invented has remained with us. Those cordless phones I mentioned are a good example, as are pagers and mobile phones that are JUST phones and nothing else.

 

It’s hard to tell how much of this is because the modern smartphone was superior to its predecessors, and how much was because the newest tech is so engaging—both in terms of the features it contains and the social cachet it brings.

 

Portable Entertainment

Once juggernauts of the consumer electronics sector, the Walkman, the Discman, and the portable DVD player have largely had their day.

 

In one of the best examples of the concept of “convergence,” smartphone features have encompassed and made obsolete the capabilities once performed by any and all of those mobile entertainment systems.

 

School Tech

There was a range of systems which were staples in the classroom until relatively recently: if the screen in the classroom came down, students might turn their attention to information emanating from an overhead projector, a set of slides, a filmstrip, or even an actual film.

 

Smartboards, in-school media servers, and computer screen sharing all swooped in to make lessons far more dynamic, interactive, and (most importantly) simple for the teacher to prepare. And no wonder, since no teacher in their right mind would go back to the long hours drawing overhead cells in multiple marker colors, only to have that work destroyed by a wayward splash of coffee.

 

A Short List of More Tech We Don’t See (Much) Any More:

  • CRT displays
  • Typewriters
  • Fax machines (won’t die, but still)
  • Public phones
  • Folding maps
  • Answering machines

What other tech or modern conveniences of a bygone era do you miss—or at least notice is missing? Talk about it in the comments below.

 

* Ed. note: Betamax was far superior, especially for TV usage, until digital recording became commercially acceptable from a budget perspective, thankyouverymuch. Plus, erasing them on the magnet thingy was fun.

** Ed. note: Rude.

I hope this edition of the Actuator finds you and yours in the middle of a healthy and happy holiday season. With Christmas and New Year's falling on Wednesday, I'll pick this up again in 2020. Until then, stay safe and warm.

 

As always, here's a bunch of stuff I found on the internet I thought you might enjoy.

 

Why Car-Free Streets Will Soon Be the Norm

I'm a huge fan of having fewer cars in the middle of any downtown city. I travel frequently enough to European cities and I enjoy the ability to walk and bike in areas with little worry of automobiles.

 

Microsoft and Warner Bros trap Superman on glass slide for 1,000 years

Right now, one of you is reading this and wondering how to monitor glass storage and if an API will be available. OK, maybe it's just me.

 

The trolls are organizing—and platforms aren't stopping them

This has been a problem with online communities since they first started; it's not a new problem.

 

New Orleans declares state of emergency following cyberattack

Coming to a city near you, sooner than you may think.

 

Facebook workers' payroll data was on stolen hard drives

"Employee wasn’t supposed to take hard drives outside the office..." Security is hard because people are dumb.

 

A Sobering Message About the Future at AI's Biggest Party

The key takeaway here is the discussion around how narrow the focus is for specific tasks. Beware the AI snake oil salesman promising you their algorithms and models work for everyone. They don't.

 

12 Family Tech Support Tips for the Holidays

Not a bad checklist for you to consider when your relatives ask for help over the holidays.

 

Yes, I do read books about bacon. Merry Christmas, Happy Holidays, and best wishes.

Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering

 

Here’s an interesting article by Jim Hansen about leveraging access rights management to reduce insider threats and help improve security.

 

According to the SolarWinds 2019 Cybersecurity Survey, cybersecurity threats are increasing—particularly the threat of accidental data exposure from people inside the agency.

 

According to the survey, 56% of respondents said the greatest source of security threats to federal agencies is careless and/or untrained agency insiders; 36% cited malicious insiders as the greatest source of security threats. Nearly half of the respondents—42%—say the problem has gotten worse or has remained a constant battle.

 

According to the survey, federal IT pros who have successfully decreased their agency’s risk from insider threats have done so through improved strategy and processes to apply security best practices.

 

While 47% of respondents cited end-user security awareness training as the primary reason insider threats have improved or remained in control, nearly the same amount—45%—cited network access control as the primary reason for improvement, and 42% cited intrusion detection and prevention tools.

 

The lesson here is good cyberhygiene in the form of access management can go a long way toward enhancing an agency’s security posture. Certain aspects of access management provide more protection than others and are worth considering.

 

Visibility, Collaboration, and Compliance

 

Every federal IT security pro should be able to view permissions on file servers to help identify unauthorized access or unauthorized changes to more effectively prevent data leaks. Federal IT pros should also be able to monitor, analyze, and audit Active Directory and Group Policy to see what changes have been made, by whom, and when those changes occurred.

 

One more thing: be sure the federal IT team can analyze user access to services and file servers with visibility into privileged accounts and group memberships from Active Directory and file servers.
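As one hedged illustration of what that visibility can look like in practice (not a description of any particular product), here's a minimal sketch that pulls the membership of a privileged Active Directory group with the ldap3 library; the server, service account, and DNs are placeholders.

```python
# A minimal sketch of reading group membership from Active Directory with ldap3
# (pip install ldap3). Server, credentials, and DNs below are hypothetical.
from ldap3 import Server, Connection, ALL, SUBTREE

server = Server("dc01.example.gov", get_info=ALL)
conn = Connection(server, user="EXAMPLE\\svc_audit", password="********", auto_bind=True)

# Who is in the privileged "Domain Admins" group?
conn.search(
    search_base="dc=example,dc=gov",
    search_filter="(&(objectClass=group)(cn=Domain Admins))",
    search_scope=SUBTREE,
    attributes=["member"],
)
for entry in conn.entries:
    for member_dn in entry.member:
        print(member_dn)
```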

 

Collaboration tools—including SharePoint and Microsoft Exchange—can be a unique source of frustration when it comes to security and, in particular, insider threats. One of the most efficient ways to analyze and administer SharePoint access rights is to view SharePoint permissions in a tree structure, easily allowing the user to see who has authorized access to any given SharePoint resource at any given time.

 

To analyze and administer Exchange access rights, start by setting up new user accounts with standardized role-specific templates to provide access to file servers and Exchange. Continue managing Exchange access by tracking changes to mailboxes, mailbox folders, calendars, and public folders.

 

Finally, federal IT pros know while managing insider threats is of critical importance, so is meeting federal compliance requirements. Choose a solution with the ability to create and generate management and auditor-ready compliance reports showing user access rights, as well as the ability to log activities in Active Directory and file servers by user.

 

Conclusion

 

There are options available to dramatically help the federal IT security pro get a better handle on insider threats and go a long way toward mitigating risks and keeping agency data safe.

 

Find the full article on our partner DLT’s blog Government Technology Insider.

 

The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

I was in the pub recently for the local quiz and afterwards, I got talking to someone I hadn’t seen for a while. After a few minutes, we started discussing a certain app he loves on his new phone, but he wished the creators would fix a problem with the way it displayed information, so it looks like it does when he logs in on a web browser.

 

“It’s all got to do with technical debt,” I blurted out.

 

“What?” he replied.

 

“When they programmed the app, the programmers took an easier method rather than figuring out how to display the details the same way as your browser, so they could ship it to you, the consumer, more quickly, and they have yet to repay the debt. It’s like a credit card.”

 

It’s fine to have some technical debt, like having an outstanding balance on a credit card, and sometimes you can pay off just the interest, i.e., apply a patch; but there comes a point when you need to pay off the balance. This is when you need to revisit the code and implement a section properly, and hence pay off the debt.

 

There are several reasons you accrue technical debt, one of which is a lack of experience or skills on the coding team. If the team doesn’t have the right understanding or skills to solve the problem, the debt will only get worse.

 

How can you help solve this? I’m a strong proponent of the education you can glean from attending a conference, whether it be Kubecon, Next, DEFCON, or AWS re:Invent, which I just attended. These are great places to sit down and discuss things with your peers, make new friends, discover fresh GitHub repositories, learn from experts, and hear about new developments in the field, possibly ahead of their release, which may either give you a new idea or help solve an existing problem. Another key use case for attending is the ability to provide feedback. Feedback loops are a huge source of information for developers. Getting actual customer feedback, good or bad, helps shape the short-term to long-term goals of a project and can help you understand if you’re on the right path for your target audience.

 

So, how do you get around these accrued debts? First, you need a project owner whose goal is to make sure the overall design and architecture are adhered to. It should also be their job to make sure coding standards are followed and documentation is created to accompany the project. Then, with the help of regression testing and refactoring over time, you’ll find problems and defects in your code and be able to fix them. Any rework from refactoring needs to be planned and assigned correctly.

 

There are other ways to deal with debt, like bug fix days and code reviews, and preventative methods like regular clear communication between business and developer teams, to ensure the vision is implemented correctly and it delivers on time to customers.

 

Another key part of dealing with technical debt is taking responsibility: everyone involved with the project should be aware of where they may have to address issues. By being open rather than hiding the problem, the debt can be planned for and dealt with. Remember, accruing some technical debt is always going to happen—just like credit card spending.

Scenario: a mission-critical application is having performance issues during peak business hours. App developers blame the storage. The storage team blames the network. The network admin blames the infrastructure. The cycle of blame continues until finally someone shouts, “Why don’t we just put it in the cloud?” Certainly, putting the application into the public cloud will solve all these issues, right? Right?! While this might sound like a tempting solution, simply installing an application on a server in the public cloud may not resolve the problem—it might open the company up to more unforeseen issues.

 

Failure to Plan Is Planning to Fail

 

The above adage is one of the biggest roadblocks to successful cloud migrations. Often when an application is considered for a move to the cloud, the scope of its interactions with servers, networks, and databases isn’t fully understood. What appears to be a Windows Server 2016 box with four vCPUs and 16 GB of RAM running an application turns out to be an interconnected series of SQL Server instances, Apache web servers, load balancers, application servers, and underlying data storage. If this configuration is giving your team performance issues on your on-premises hardware, why would moving it to hardware in a different data center resolve the problem?

 

If moving to the cloud is a viable option at this juncture of your IT strategy, it’s also time to consider refactoring the application into a more cloud-native format. What is cloud-native? Per the Cloud Native Computing Foundation (CNCF), the definition of cloud-native is:

 

“(Cloud-native) technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.

 

These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.”

 

Cloud-native applications have been developed or refactored to use heavy automation, use containers for application execution, are freed from operating system dependencies, and present elastic scalability traditional persistent virtual servers cannot provide. Applications become efficient not only in performance, but in cost as well with this model. However, refactoring an application to a cloud-native state can take lots of time and money to make the transition.

 

The Risks of Shadow IT

 

If you’ve taken the time to understand the application dependencies, a traditional application architecture can be placed in a public cloud while the app is refactored, which helps alleviate some issues. But again, the process can be time-consuming. Administrators can grow impatient during these periods, or, if their requests for additional resources have been denied, can grow frustrated. The beautiful thing about public clouds is the relative ease of entry into services. Any Joe Developer with a credit card can fire up an AWS or Azure account on their own and have a server up and running within a matter of minutes by following a wizard.

 

Cool, my application is in the cloud and I don’t have to wait for the infrastructure teams to figure out the issues. Problem solved!

 

Until an audit finds customers’ credit card data in an AWS S3 bucket open to the public. Or when the source of a ransomware outbreak is traced back to an unsecured Azure server linked to your internal networks. Oh, and let’s not even discuss the fact an employee is paying for these services outside of the purview of the accounting department or departmental budgets (which is a topic for another blog post later in this series).

 

Security and compliance can be achieved in the cloud, but much like before, it comes down to planning. By default, many public cloud services aren’t locked down to corporate or compliance standards. Unfortunately, this information isn’t widely known or advertised by the cloud vendors. It’s on the tenant to make sure their deployments are secure and the data is backed up. Proper cloud migration planning involves all teams of the business’s IT department, including the security team. Everyone should work together to make sure the cloud architecture is designed in a way allowing for performance, scalability, and keeping all data secure.

 

Throwing an application at the cloud isn’t a fix for poor architecture or aging technologies. It can be a valuable tool to help in the changing world of IT, but without proper planning it can burn your company in the end. In the next post in the “Battle of the Clouds” series, we’ll look at determining the best location for the workload and how to plan for these moves.

Week 2 of the challenge has brought even more insights and wisdom than I imagined - although I should have expected it, given how incredible the THWACK community is day after day, year in and year out. As a reminder, you can find all the posts here: December Writing Challenge 2019.

 

I also wanted to take a moment to talk about the flexibility of the ELI5 concept. If you have a child, or have been around a child, or ever were a child, you’re probably acutely aware no two kids are exactly alike. Therefore, “Explain Like I’m Five” (ELI5) implicitly allows for a range of styles, vocabularies, and modalities. Like some of the best ideas in IT (or at least the ones making the most impact), there’s not a single, correct way to “do” explain-it-simply. ELI5 is not a single standard, it’s a framework, a way of approaching a task. Explanations can use simple words; or present simple concepts using more sophisticated words; or use examples familiar to a child; or even be presented in pictures instead of words. Because the best thing about explaining something simply is there are many ways to do it.

 

With that said, here are the featured words and lead writers for this week, and some of the notable comments from each day.

 

 

7. Troubleshoot

Kicking off the second week of the challenge, THWACK MVP Nick Zourdos tackles one of the most common tasks in IT—one of the things we most hate to do, and yet also one of the skills we take most pride in.

 

Jake Muszynski  Dec 7, 2019 6:53 PM

In IT the ability to troubleshoot problems will set you apart. So many people I have worked with go in circles or have no idea how to move forward to resolve issues. Starting with ruling out the things that are right, and listing what you don’t know goes a long way to a resolution.

 

Tregg Hartley Dec 8, 2019 4:33 PM

Understanding how things work

Is at the very core,

Of knowing how to troubleshoot

And doing well, this chore.

 

 

Knowing which tools to use

Will also help with this,

To localize the issue

And return to cyber bliss.

 

Thomas Iannelli  Dec 10, 2019 11:46 AM

SUZIE: Uncle Tom?

TOM: Yes Suzie.

SUZIE: Mom says you troubleshoot computers. What’s troubleshoot?

TOM: Well Suzie, see Alba over there?

SUZIE: uh huh

TOM: See how she is just laying there?

SUZIE: uh huh

TOM: Is she sleeping or dead?

SUZIE: UNCLE TOM! Alba is NOT DEAD!

TOM: How can you tell she is not dead?

SUZIE: I can see her chest moving.

TOM: What else?

SUZIE: When I squeak this toy her head will pop up, watch.

[Suzie squeaks the toy, but Alba doesn’t move.]

TOM: Oh, no Suzie Alba didn’t move. What next?

SUZIE: I’ll give her a treat.

[Suzie repeatedly says Alba’s name and offers a treat, but Alba is not interested.]

TOM: Oh, no Suzie Alba didn’t move again! I think I know a good way to test if she is still alive.

TOM: Hey, Alba do you want to go for a ride?

[At which point Alba jumps up, almost knocking Suzie over, and heads toward the garage door.]

TOM: You see Suzie, troubleshooting is like trying to answer the question whether Alba was alive or dead. It is a problem to be solved. You did very good things to find out if she was alive and kept trying. Sometimes it just takes someone with a little more experience who knows the right question to ask or thing to do in order to solve a problem. That is the same thing I do when I troubleshoot computers. But see next time you will know to simply ask if Alba wants to go for a ride, we all learn from each other.

SUZIE: Uncle Tom, I also learned not to get between Alba and the garage door when you ask her if she wants to go for a ride!

[They both laugh and go take Alba for a ride around the neighborhood. Otherwise she will stand by the garage door barking for the next 30 minutes, definitely letting everyone know she is alive.]

 

8. Virtualization

The second word of the week has—as many of the commenters said—completely changed the nature of IT for many of us. SolarWinds SE Colin Baird gives a simple, but not simplistic, explanation of what and how this technology has been so transformative.

 

Faz f Dec 9, 2019 4:10 AM

I have a very big Cardboard box, too big for me, cardboard, Scissors and sellotape.

My friend comes and also wants a box,

I get the cardboard, scissors and sellotape and make my friend a box inside my box,

My box is still too big for me.

 

Another friend comes who wants a box.

I get the cardboard, scissors and sellotape and make my friend another box inside my box,

Next to my first box.

My box is still too big for me.

 

Another friend comes who wants a box.

I get the cardboard, scissors and sellotape and make my friend another box inside my box,

Next to the other boxes.

I think my box is now just right for me,

My friends are having fun in their Virtualisation of a box.

 

George Sutherland Dec 9, 2019 8:29 AM

The pie analogy is perfect. Virtualization is the natural progression of computing....

I also think that virtualization is “divide and conquer” a large box can support a number of smaller boxes, each solving a needed business problem.

 

scott driver Dec 9, 2019 12:01 PM

Thank you for getting back to the ELI5 approach.

 

Virtualization: Computers running inside other computers

 

9. Cloud Migration

THWACK MVP Holger Mundt kicks off a series of days focusing not only on cloud-based technologies and techniques, but also featuring those little plastic blocks kids (of all ages) love to play with to build new things, worlds, and dreams.

 

Chris Parker Dec 9, 2019 3:29 AM

All your precious items

Saved at home

Under your care, in your hands

 

But in time there are too many

Not enough space

A single collection

A risk, danger

 

A solution, though not always best

Someone else to take care for you

The burden lifted from your hands

A Gringotts in the sky

 

A cost attached

But sometimes needed

Safest option to suit most needs

 

But be warned

The goblins can be tricky

The cloud unmanaged

A cost too big

 

Control passed over

Hard to return

 

Sascha Giese  Dec 9, 2019 3:51 AM

Not gonna migrate my LEGO Super Star Destroyer!

 


Michael Perkins
Dec 9, 2019 8:50 AM

I am old enough (barely) to remember when computers were usually big machines in central locations accessed via dumb terminals. The big machine’s owner or administrator sold or doled out resources to you: storage, processor time, etc. I grew up through the PC revolution—the first box on which I actually worked was a 6k Commodore PET (one for the whole school), followed quickly by an Apple IIe (one in each classroom). My first home PC was the 128k Mac—the same one advertised on the ‘1984’ Super Bowl ad. I’ve used various flavors of DOS, Linux/UNIX, MacOS, and Windows through grade school, high school, undergraduate and graduate work, home and employment.

 

Now, everyone is migrating to the cloud. The big machine at the other end is a lot more complex: more redundant, better connected, faster. It offers more services than the old ones did, at least if you purchase the right ‘aaS.’ At its heart though, we are going back to paying for processor cycles, storage, and connectivity to it.

 

Everything old is new again.

 

10. Container

David Wagner is one of the product managers for the team building and supporting SolarWinds solutions for the cloud, so it makes sense for him to tackle this word.

 

Kelsey Wimmer Dec 10, 2019 12:21 PM

In some ways, it’s like keeping the forks, knives, and spoons in one drawer that has dividers rather than keeping forks, knives, and spoons in different drawers. That last part sounds silly, but that’s exactly what people who developed containers thought.

 

Rick Schroeder  Dec 10, 2019 4:52 PM

Some containers let us manage many smaller items that are put into groups, and it’s a huge time-saver, and very powerful. Rather than contacting 100,000 soldiers individually, one might contact “The army” container. Or one of several Corps, Divisions, Brigades or Regiments, Battalions, Companies, Platoons, right down to squads. Managing by containers, or by groups, is part of what makes Active Directory powerful—or ridiculously complex and inefficient, depending on one’s great planning and experience—or the lack thereof.

 

Other containers are computer environments that are isolated from other systems, and that allow us to execute commands without impacting resources that should NOT be disturbed. Containers can make installing/running apps on a Linux server simpler and more uniform. And that makes for faster deployment and better security.
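A minimal sketch of that isolation in practice, using the Docker SDK for Python: the command runs in its own environment without touching the host. It assumes a local Docker daemon is running, and the image and command are purely illustrative.

```python
# A minimal sketch with the Docker SDK for Python (pip install docker).
# Assumes a local Docker daemon; image and command are illustrative only.
import docker

client = docker.from_env()

# The container gets its own isolated filesystem and process space, so the
# command runs without disturbing anything on the host.
output = client.containers.run(
    "alpine:3.10",
    ["echo", "hello from inside a container"],
    remove=True,
)
print(output.decode())
```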

 

Matt R  Dec 11, 2019 10:31 AM

Ha, this is perfect. My child has a specific definition of containers, as well. We had this conversation last year:

 

(daughter): Mommy, will you sit in the trash can (next to potty) while I go potty?

(mom): People don’t sit in the trash

(daughter): Except for when they die, then we throw them in the trash

(mom): We don’t throw dead people away

(daughter): Oh, only animals. What do we do with dead people?

 

So, be careful what you define as a container or it may end up with some...unwanted results.

 

Laura Desrosiers Dec 11, 2019 11:49 AM

I keep everything as neat, clean and simple as possible. I don’t like to over complicate things and everything has its place.

 

11. Orchestration

Another day of cloud-based topics, and product manager Dave Wagner is back to explain how yesterday’s word and today’s fit together to create a more automated environment.

 

Anthony Hoelscher Dec 11, 2019 12:22 PM

Another way to imagine this is baking a cake. It’s awfully hard to find a substitute for an egg when you are out. All the ingredients must be added within a certain time to be effective. There are certain sub tasks that must be completed to achieve a delicious cake, you beat the egg before you add it to your working recipe, and you always crack it open, careful not to lose any shell in the batter.

Everything has its place, and recipes help achieve the same result, don’t leave out the eggs.

 

Holly Baxley Dec 11, 2019 12:59 PM (in response to Dave Wagner)

Workflow: Mom’s before-bed-to-do-list

Orchestration: Mom directing all of us to do our tasks before bed

 

Jason Scobbie Dec 11, 2019 12:46 PM

Automation is a great thing... Combining these tasks and processes through orchestration is the difference between fixing things for an engineer or small team and turning it into an enterprise-wide improvement. When you can automate a change, but also the change ticket, taking the device in/out of monitoring, pre/post change verification, and NOC notification, all started by a single click, that is a key to greatness.

 

12. Microservices

For this cloud-centric term, SolarWinds product manager Melanie Achard once again invoked the (practically) holy LEGO concept, to great effect.

 

Jeremy Mayfield  Dec 12, 2019 8:33 AM

Of course I am a fan of the Lego analogies. Great way to explain this. Just to be different today, right from Google: The honeycomb is an ideal analogy for representing the evolutionary microservices architecture. In the real world, bees build a honeycomb by aligning hexagonal wax cells. They start small, using different materials to build the cells. Construction is based on what is available at the time of building. Repetitive cells form a pattern and result in a strong fabric structure. Each cell in the honeycomb is independent but also integrated with other cells. By adding new cells, the honeycomb grows organically to a big, solid structure. The content inside each cell is abstracted and not visible outside.

 

Kelsey Wimmer Dec 12, 2019 9:27 AM

A microservice is a small program that does one job but does it really well. It doesn’t try to do everything. Just its job. It needs to communicate with other programs but it doesn’t do their jobs. You can put a bunch of microservices together and do even bigger things.
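That description maps neatly onto code. Here's a minimal sketch of a microservice using Flask: one tiny program, one job, reachable over HTTP so bigger things can be built from it. The endpoint, conversion, and port are made up for illustration.

```python
# A minimal microservice sketch in Flask (pip install flask): it does exactly one
# job. Endpoint name and port are illustrative.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/convert/<int:celsius>")
def convert(celsius):
    # This service's single job: convert Celsius to Fahrenheit.
    return jsonify({"celsius": celsius, "fahrenheit": celsius * 9 / 5 + 32})

if __name__ == "__main__":
    app.run(port=5001)   # a larger application would call this over HTTP
```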

 

Holly Baxley Dec 12, 2019 10:54 AM

Hey Five-year-old me,

 

Do you remember the Power Rangers? How cool they are? Remember how you always wished you were the Pink Ranger, even though you were told the Green Ranger was always the strongest? You thought gymnastic skills could kick butt over raw brawn any day.

 

Well, keep that in your mind, as we talk about IT Microservices.

 

Just like each Power Ranger can stand on its own and have its own cool robot technology without affecting anyone else, each Ranger can take their powers and robots and add it to each other to make one HUGE super cool mega Ranger that can fight any beast.

 

Sometimes the Rangers had to work independently to root out the bad guys, and sometimes it takes a very big robot as a unified team to really tackle some big battles.

 

Microservices work like that in IT.

 

Each Microservice can stand on its own, like each Power Ranger. It can have its own skills, be upgraded independently, and get some really cool features—without affecting anyone else.

 

Each Microservice is very specific, just like a Power Ranger has very specific powers and skills it brings to the team.

 

But what’s cool is if you take several of these microservices and connect them together, they morph into a bigger application—just like the Power Rangers could morph into one unified giant robot ranger. This bigger application can tackle some giants that other applications and software on its own can’t.

 

Maybe that’s why giants such as Amazon and Netflix use Microservices in their IT architecture.

 

Maybe they should really call microservices: “Mighty Morphin’ Microservices!”

 

Yes, I suppose the nano-bots on Tony Stark’s Iron Man suit are microservices too. Maybe Tony uses microservices to create the nano-bots to do what they do to form Iron Man’s suit. You think?

 

13. Alert

For the last word of the week, THWACK MVP Adam Timberley gave us what amounts to D&D character cards, explaining the different personas that you may meet when working with alerts.

 

Faz f Dec 13, 2019 6:54 AM

Alerts you know,

Your Alarm clock in the Morning (this could be Mum)

When Dad is cooking and the oven beeps and dinner is ready!

At School when the dinner bell rings and you can play outside.

These are all Alerts you know

 

Mike Ashton-Moore Dec 13, 2019 9:24 AM

holy smokes, I read that and kept expecting a truncated post message

Love the detail and the archetypes - and recognize many of them, I have examples of most of them in my team.

My problem with alerts is what the intended use is.

I would advise going to the googles and searching "Red Dwarf Blue Alert"

I love my Trek/Wars etc, but Red Dwarf is aimed squarely at grown ups

 

George Sutherland Dec 13, 2019 1:00 PM

Alert: SHIELDS UP!!!!!

 

Seriously... Instinctively, it's the fight-or-flight dilemma we face when confronted with the barrage of atomic particle pieces of information.

 

(great graphics and analysis of the people types involved.... WELL DONE!)

 

I use the STEP technique

Survey the situation

Take the appropriate action based on what is presented

Evaluate your response

Prepare for the next situation

Introduction

In a roundabout continuation of one of my previous blog posts, Did Microsoft Kill Perpetual Licensing, I’m going to look at the basic steps required for setting up an Office 365 tenant. There are a few ways you can purchase Office 365. There’s the good old credit card on a monthly subscription, bought directly from Microsoft. This is the most expensive way to buy Office 365, as you will be paying Recommended Retail Price (RRP). Then there are purchasing models typically bought through an IT reseller. The reseller either helps add the subscription to an existing Microsoft Enterprise Agreement, with licenses available on a Microsoft volume license portal, or, more likely, the reseller will be what’s known as a cloud solution provider (CSP). CSP licensing can be bought on monthly or yearly commitments, with prices lower than RRP. The CSP model offers great flexibility, as it’s easy to increase or decrease license consumption on a monthly basis, so you’re never overpaying for your Office 365 investment.

 

Now you may be reading this wondering what on earth is a Microsoft Office 365 Tenant? Don’t worry—you’re not alone. Although Office 365 adoption is high, I saw a statistic that something like one in five IT users use Office 365. That’s still only 20% market saturation.

 

In basic terms, an Office 365 tenant is the focal point from where you manage all the services and features of the Office 365 package. When creating an Office 365 tenant portal, you need a name for the tenant, which will make up a domain name with something like yourcompanyname.onmicrosoft.com. At a minimum, the portal will incorporate a version of Azure Active Directory for user management. Once licenses have been assigned to the users, options are open for using services like Exchange Online for email, SharePoint Online for Intranet and document library-type services, OneDrive for users’ personal document storage, and one of Microsoft’s key plays at the moment, Teams. Teams is the central communication and content collaboration platform bringing much of the Office 365 components together into one place.

 

Cool, now what?

Setting up your own Office 365 portal may seem like a daunting task, but it doesn’t have to be. I’ll walk you through the basics below.

 

At this point, I must point out I work for a cloud solution provider, so I’ve already taken the first step in creating the Microsoft tenant. You can do this any way you want—I outlined the methods of payment above.

 

However you arrive at this point, you’ll end up with a login name like admin@yourcompanyname.onmicrosoft.com. You need this to access the O365 portal at https://portal.microsoft.com.

 

When you first log in, you’ll see something like this. Click on Admin.

 

The admin panel looks like this when you first log in.

 

Domain Verification

Select Setup and add a domain. Here we’ll associate your mail domain with the O365 portal.

 

I’m going to add and verify the domain snurf.uk. This O365 portal will be deleted before this article is published.

 

 

At this point, you must prove you own the domain. My domain is hosted with 123-reg. I could at this point log in to 123-reg from the O365 portal and it would sort out the domain verification for me. I’ll manually add a TXT record to show what’s involved.

 

To verify the domain, I have to add a TXT DNS record with the value specified below.

 

On the 123-reg portal, it looks like this. It’ll be similar for your DNS hosting provider.
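If you'd rather not click Verify repeatedly while DNS propagates, here's a minimal sketch that checks for the TXT record with the dnspython library. The domain and expected value are placeholders; use whatever value the O365 portal gives you.

```python
# A minimal sketch for checking that the verification TXT record has propagated,
# using dnspython 2.x (pip install dnspython). Domain and value are placeholders.
import dns.resolver

expected = "MS=ms12345678"          # hypothetical value from the O365 portal
answers = dns.resolver.resolve("snurf.uk", "TXT")

found = any(expected in rdata.to_text() for rdata in answers)
print("Record found" if found else "Not visible yet -- wait for DNS to propagate")
```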

 

Once the DNS record has been added, click Verify the Domain, and with any luck, you should see a congratulations message.

We now have a new verified domain.

 

User Creation

There are a couple of ways to create users in an O365 portal. They can be synchronized from an external source like an on-premises Active Directory, or they can be manually created on the O365 portal. I’ll show the latter here.

 

Click Users, Active users, Add a user. Notice snurf.uk is now an option for the domain suffix.

 

Fill in the user's name details.

If you have any licenses available, assign them here. I didn’t have any licenses available for this environment.

 

Assign user permissions.

 

And click Finish.

You now have a new user.
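The same user creation can also be scripted. Here's a minimal sketch against the Microsoft Graph API as one alternative to the portal; it assumes you've already obtained an access token with the User.ReadWrite.All permission, and the token, names, and password shown are placeholders.

```python
# A minimal sketch of creating a user via Microsoft Graph. Assumes an OAuth access
# token with User.ReadWrite.All; token, user details, and password are placeholders.
import requests

token = "eyJ0eXAi..."               # placeholder access token
user = {
    "accountEnabled": True,
    "displayName": "Test User",
    "mailNickname": "test.user",
    "userPrincipalName": "test.user@snurf.uk",
    "passwordProfile": {
        "forceChangePasswordNextSignIn": True,
        "password": "TempP@ssw0rd!",   # placeholder initial password
    },
}
resp = requests.post(
    "https://graph.microsoft.com/v1.0/users",
    headers={"Authorization": f"Bearer {token}"},
    json=user,
)
resp.raise_for_status()
print(resp.json()["id"])             # the new user's object ID
```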

 

 

Further Config

I’ve shared some screenshots from my organization’s setup below. Once you have some licenses available in your O365 portal, including Exchange Online, there’s some further DNS configuration to put in place. It’s the same idea as above when verifying the domain, but the settings below are used to configure your domain to route mail via O365, amongst other things.

 

Once all that’s in place, you’ll start to see some usage statistics.

 

Conclusion

In this how-to guide, we’ve created a basic Office 365 portal, assigned a domain to it, and created some users. We’ve also seen how to configure DNS to allow mail to route to your Office 365 portal.

 

Although this will get you to a point of having a functioning Office 365 portal with email, I would stress that you continue to configure and lock down the portal. Security and data protection are of paramount importance. Look at security offerings from Microsoft or other third-party solutions.

 

If you’d like any further clarification on any aspect of this article, please comment below and I’ll aim to get back to you.

Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering

 

Here’s an interesting article by my colleague Mav Turner with ideas for improving the management of school networks by analyzing performance and leveraging alerts and capacity planning.

 

Forty-eight percent of students currently use a computer in school, while 42% use a smartphone, according to a recent report by Cambridge International. These technologies provide students with the ability to interact and engage with content both inside and outside the classroom, and teachers with a means to provide personalized instruction.

 

Yet technology poses significant challenges for school IT administrators, particularly with regard to managing network performance, bandwidth, and cybersecurity requirements. Many educational applications are bandwidth-intensive and can lead to network slowdowns, potentially affecting students’ abilities to learn. And when myriad devices are tapping into a school’s network, it can pose security challenges and open the doors to potential hackers.

 

School IT administrators must ensure their networks are optimized and can accommodate increasing user demands driven by more connected devices. Simultaneously, they must take steps to lock down network security without compromising the use of technology for education. And they must do it all as efficiently as possible.

 

Here are a few strategies they can adopt to make their networks both speedy and safe.

 

Analyze Network Performance

 

Finding the root cause of performance issues can be difficult. Is it the application or the network?

 

Answering this question correctly requires the ability to visualize all the applications, networks, devices, and other factors affecting network performance. Administrators should be able to view all the critical network paths connecting these items, so they can pinpoint and immediately target potential issues whenever they arise.

 

Unfortunately, cloud applications like Google Classroom or Office 365 Education can make identifying errors challenging because they aren’t on the school’s network. Administrators should be able to monitor the performance of hosted applications as they would on-premises apps. They can then have the confidence to contact their cloud provider and work with them to resolve the issue.

 

Rely on Alerts

 

Automated network performance monitoring can save huge amounts of time. Alerts can quickly and accurately notify administrators of points of failure, so they don’t have to spend time hunting; the system can direct them to the issue. Alerts can be configured so only truly critical items are flagged.

 

Alerts serve other functions beyond network performance monitoring. For example, administrators can receive an alert when a suspicious device connects to the network or when a device poses a potential security threat.

 

Plan for Capacity

 

A recent report by The Consortium for School Networking indicates within the next few years, 38% of students will use, on average, two devices. Those devices, combined with the tools teachers are using, can heavily tax network bandwidth, which is already in demand thanks to broadband growth in K-12 classrooms.

 

It’s important for administrators to monitor application usage to determine which apps are consuming the most bandwidth and address problem areas accordingly. This can be done in real-time, so issues can be rectified before they have an adverse impact on everyone using the network.

 

They should also prepare for and optimize their networks to accommodate spikes in usage. These could occur during planned testing periods, for example, but they also may happen at random. Administrators should build in bandwidth to accommodate all users—and then add a small percentage to account for any unexpected peaks.

 

Tracking bandwidth usage over time can help administrators accurately plan their bandwidth needs. Past data can help indicate when to expect bandwidth spikes.
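As a back-of-the-envelope illustration of that planning (the numbers below are invented), here's a minimal sketch that takes recent daily peaks and adds a headroom percentage, per the guidance above.

```python
# A minimal capacity-planning sketch: observed peak plus a headroom percentage.
# All figures are made up for illustration.
daily_peak_mbps = [620, 710, 680, 940, 730, 760, 905]   # e.g., last seven school days

observed_peak = max(daily_peak_mbps)
headroom = 0.15                       # "a small percentage" for unexpected spikes
recommended_capacity = observed_peak * (1 + headroom)

print(f"Observed peak: {observed_peak} Mbps")
print(f"Plan for at least: {recommended_capacity:.0f} Mbps")
```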

 

Indeed, time itself is a common thread among these strategies. Automating the performance and optimization of a school network can save administrators from having to do all the maintenance themselves, thereby freeing them up to focus on more value-added tasks. It can also save schools from having to hire additional technical staff, which may not fit in their budgets. Instead, they can put their money toward facilities, supplies, salaries, and other line items with a direct and positive impact on students’ education.

 

Find the full article on Today’s Modern Educator.

 

The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

Garry Schmidt first got involved in IT Service Management almost 20 years ago. Since becoming the manager of the IT Operations Center at SaskPower, ITSM has become one of his main focuses. Here are part 1 and part 2 of our conversation.

 

Bruno: What have been your biggest challenges with adopting ITSM and structuring it to fit SaskPower?

Garry: One of the biggest challenges is the limited resources available. Everybody is working hard to take care of their area of responsibility. Often you introduce new things, like pushing people to invest time in problem management, for example. The grid is full. It’s always a matter of trying to get the right priority. There are so many demands on everybody all day long that even though you think investing time in improving the disciplines is worth it, you still have to figure out how to convince people the priority should be placed there. So, the cultural change aspect, the organizational change, is always the difficult part. “We’ve always done it this way, and it’s always worked fine. So, what are you talking about, Schmidt?”

 

Taking one step at a time and having a plan of where you want to get to. Taking those bite-sized pieces and dealing with it that way. You just can’t get approval to add a whole bunch of resources to do anything. It’s a matter of molding how we do things to shift towards the ideal instead of making the big steps.

 

It’s more of an evolution than a revolution. Mind you a big tool change or something similar gives you a platform to be able to do a fair amount of the revolution at the same time. You’ve got enough funding and dedicated resources to be able to focus on it. Most of the time, you’re not going to have that sort of thing to leverage.

 

Bruno: You’ve mentioned automation a few times in addition to problem resolution and being more predictive and proactive. When you say automation, what else are you specifically talking about?

Garry: Even things like chatbots need to be able to respond to requests and service desk contacts. I think there’s more and more capability available. The real challenge I’ve seen with AI tools is it’s hard to differentiate between those that just have a new marketing spin on an old tool versus the ones with some substance to them. And they’re expensive.

 

We need to find some automation capability to leverage the tools we’ve already got, or make an incremental investment rather than a wholesale replacement. After our enterprise monitoring and alerting experience, I’m not willing to go down the path of replacing all our monitoring tools and bringing everything into a big, jumbo AI engine again. I’m skittish of that kind of stuff.

 

Our typical pattern has been we buy a big complicated tool, implement, and then use only a tenth of the capability. And then we go look for another tool.

 

Bruno: What recommendations would you make to someone who is about to introduce ITSM to an organization?

Garry: Don’t underestimate the amount of time it takes to handle the organizational change part of it.

 

You can’t just put together job aids and deliver training for a major change and then expect it to just catch on and go. It takes constant tending to make it grow.

 

It’s human nature: you’re not going to get everything the first time you see new processes. So, having a solid support structure in place to be able to continually coach and even evolve the things you do. We’ve changed and improved upon lots of things based on the feedback we’ve gotten from the folks within the different operational teams. Our general approach was to try and get the input and involvement of the operational teams as much as we could. But there were cases where we had to make decisions on how it was going to go and then teach people about it later. In both cases, you get smarter as you go forward.

 

This group operates a little differently than all the rest, and there are valid reasons for it, so we need to change our processes and our tools to make it work better for them. You need to make it work for them.

 

Have the mentality that our job is to make them successful.

 

You just need to have an integrated ITSM solution. We were able to make progress without one, but we hit a ceiling.

 

Bruno: Any parting words on ITSM, or your role, or the future of ITSM as you see it?

Garry: I really like the role my team and I have. Being able to influence how we do things. Being able to help the operational teams be more successful while also improving the results we provide to our users, to our customers. I’m happy with the progress we’ve made, especially over the last couple of years. Being able to dedicate time towards reviewing and improving our processes and hooking it together with the tools.

 

I think we’re going to continue to make more good progress as we further automate and evolve the way we’re doing things.

In a recent episode of a popular tech podcast, I heard the hosts say, “If you can’t automate, you and your career are going to get left behind.” Some form of this phrase has been uttered on nearly every tech podcast over the last few years. It’s a bit of hyperbole, but has some obvious truth to it. IT isn’t slowing down. It’s going faster and faster. How can you possibly keep up? By automating all the things.

 

It sounds great in principle, but what if you don’t have any experience with automation, coding, or scripting? Where do you get started? Here are three things you can do to start automating your career.

 

1. Pick-Tool

As an IT practitioner, your needs are going to be different from the people in your development org, and different needs (may) require different tools. Start by asking those around you who are already writing code. If a Dev team in your organization is leveraging Ruby in Jenkins, it makes sense to learn Ruby over something like Java. Taking this approach has a couple of benefits. First, you’re aligning with your organization. Second, and arguably most importantly, you now have built-in resources to help you learn. It’s always refreshing when people are willing to cross the aisle to help each other become more effective. I mean, isn’t that ultimately what the whole DevOps movement is about—breaking down silos? Even if you don’t make a personal connection, by using the tools your organization is already leveraging, you’ll have code examples to study and maybe even libraries to work from.

 

What if you’re the one trying to introduce automation to your org or there are no inherent preferences built in? Well, you have a plethora of automation tools and languages to choose from. So, how do you decide? Look around your organization. Where do you run your apps? What are your operating systems? Cloud, hybrid, or on-premises, where do you run your infrastructure? These clues can help you. If your shop runs Windows or VMware, PowerShell would be a safe bet as an effective tool to start automating operations. Are you in information security or are you looking to leverage a deployment tool like Ansible? Then Python might be a better choice.

 

2. Choose-Task

Time after time, talk after talk, the most effective advice I’ve seen for how to get started automating is: pick something. Seriously, look around and pick a project. Got a repetitious task with any sort of regularity? Do you have any tasks frequently introducing human error to your environment, where scripting might help standardize? What monotony drives you crazy, and you just want to automate it out of your life? Whatever the case may be, by picking an actionable task, you can start learning your new scripting skill while doing your job. Also, the first step is often the hardest one. If you pick a task and set it as a goal, you’re much more likely to succeed vs. some sort of nebulous “someday I want to learn x.”
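To show how small that first project can be, here’s a hypothetical starter script in Python: it sweeps log files older than 30 days into an archive folder, the kind of weekly chore that makes a good first automation target. The directory path is a placeholder for whatever monotony you picked.

```python
# Archive log files older than MAX_AGE_DAYS into a subfolder.
import shutil
import time
from pathlib import Path

LOG_DIR = Path("/var/log/myapp")       # placeholder; point at your own pile of files
ARCHIVE_DIR = LOG_DIR / "archive"
MAX_AGE_DAYS = 30

def archive_old_logs():
    if not LOG_DIR.is_dir():
        print(f"{LOG_DIR} does not exist; nothing to do")
        return
    ARCHIVE_DIR.mkdir(exist_ok=True)
    cutoff = time.time() - MAX_AGE_DAYS * 86400
    for log_file in LOG_DIR.glob("*.log"):
        if log_file.stat().st_mtime < cutoff:
            shutil.move(str(log_file), str(ARCHIVE_DIR / log_file.name))
            print(f"archived {log_file.name}")

if __name__ == "__main__":
    archive_old_logs()
```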

 

3. Invoke-Community

The people in your organization can be a great resource. If you can’t get help inside your org for whatever reason, never fear—there are lots of great resources available to you. It would be impossible to highlight the community and resources for each framework, but communities are vital to the vibrancy of a language. Communities take many forms, including user groups, meetups, forums, and conferences. The people out there sharing in the community are there because they’re passionate, and they want to share. These people and communities want to be there for you. Use them!

 

First Things Last

When I was first getting started in my career, a mentor told me, “When I’m hiring for a position and I see scripting (automation) in their experience, I take that resume and I move it to the top of my stack.” Being the young, curious, and dumb kid I was, I asked “Why?” In hindsight, his answer was somewhat prophetic about the DevOps movement to come: “If you can script, you can make things more efficient. If you make things more efficient, you can help more people. If you can help more people, you’re more valuable to the organization.”

 

The hardest and most important step you’ll make is to get started. If reading this article is part of your journey to automating your career, I hope you’ve found it helpful.

 

PS: If you’re still struggling to pick a language/framework, in my next post I’ll offer up my opinions on a “POssible SHortcut" to start automating effectively right away.

PPS: 12 December update. I decided it was hypocritical of me to offer up this advice and not follow it myself. I've written a companion piece for my blog where I detail how I follow my own advice to tackle a challenge in my day job. Automate The Auditors Away - Veeam Backup Report v2 just went live this morning. I hope you find it useful.

The 2019 Writing Challenge got off to an amazing start and I’m grateful to everyone who contributed their time, energy, and talent both as the lead writers and commenters. The summary below offers up just a sample of the amazing and insightful ways in which IT pros break down difficult concepts and relate them—not just to five-year-olds, but to folks of any age who need to understand something simply and clearly.

 

Day 1: Monitoring—Leon Adato

I had the privilege of kicking off the challenge this year, and I felt there was no word more appropriate to do so with than “monitoring.”

 

Jeremy Mayfield  Dec 1, 2019 8:27 AM

 

Great way to start the month. I was able to understand it. You spoke to my inner child, or maybe just me as I am now...... Being monitored is like when the kids are at Grandma’s house playing in the yard, and she pretends to be doing dishes watching everything out the kitchen window.

 

Rick Schroeder  Dec 2, 2019 2:30 AM

Are there cases when it’s better NOT to know? When might one NOT monitor and thereby provide an improvement?

 

I’m not talking about not over-monitoring, nor about monitoring unactionable items, nor alerting inappropriately.

 

Sometimes standing something on its head can provide new insight, new perspective, that can move one towards success.

 

Being able to monitor doesn’t necessarily mean one should monitor—or does it?

 

When is it “good” to not know the current conditions? Or is there ever a time for that, assuming one has not over-monitored?

 

Mathew Plunkett Dec 2, 2019 8:35 AM

rschroeder asked a question I have been thinking about since THWACKcamp. It started with the question “Am I monitoring elements just because I can or is it providing a useful metric?” The answer is that I was monitoring some elements just because it was available and those were removed. The next step was to ask “Am I monitoring something I shouldn’t?” This question started with looking for monitored elements that were not under contract but evolved into an interesting thought experiment. Are there situations in which we should not be monitoring an element? I have yet to come up with a scenario in which this is the case, but it has helped me to look at monitoring from a different perspective.

 

Day 2: Latency –Thomas LaRock

Tom’s style and wry wit are on full display in this post, where he shows he can explain a technical concept not only to five-year-olds, but to preteens as well.

 

Thomas Iannelli  Dec 2, 2019 12:02 PM

In graduate school we had an exercise in our technical writing class where we took the definition and started replacing words with their definitions. This can make things simpler or it can cause quite a bit of latency in transferring thoughts to your reader.

 

Latency -

  • The delay before a transfer of data begins following an instruction for its transfer.
  • The period of time by which something is late or postponed before a transfer of data begins following an instruction for its transfer.
  • The period of time by which something causes or arranges for something to take place at a time later than that first scheduled before a transfer of data begins following an instruction for its transfer.
  • The period of time by which something causes or arranges for something to take place at a time later than that first arranged or planned to take place at a particular time before a transfer of data begins following an instruction for its transfer.
  • The period of time by which something causes or arranges for something to take place at a time later than that first arranged or planned to take place at a particular time before an act of moving data to another place begins following an instruction for its moving of data to another place.
  • The period of time by which something causes or arranges for something to take place at a time later than that first arranged or planned to take place at a particular time before an act of moving the quantities, characters, or symbols on which operations are performed by a computer, being stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media to another place begins following an instruction for its moving of the quantities, characters, or symbols on which operations are performed by a computer, being stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media to another place.
  • The period of time by which something causes or arranges for something to take place at a time later than that first arranged or planned to take place at a particular time before an act of moving the quantities, characters, or symbols on which operations are performed by a computer, being stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media to another place begins following a code or sequence in a computer program that defines an operation and puts it into effect for its moving of the quantities, characters, or symbols on which operations are performed by a computer, being stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media to another place.

 

.....and so on

 

Juan Bourn Dec 2, 2019 1:39 PM

I think I am going to enjoy these discussions this month. I have a hard time explaining things without using technical terms sometimes. Not because I don’t understand them (i.e., Einstein’s comment), but because I sometimes think only in technical terms. It’s honestly what I understand easiest. For me, latency is usually associated as a negative concept. It’s refreshing to hear it discussed in general terms, as in there’s latency in everything. Like many things in IT, it’s usually only brought to light or talked about if there’s a problem with it. So latency gets a bad rep. But it’s everywhere, in everything.

 

Jake Muszynski  Dec 2, 2019 10:42 AM

Hold on, I will reply to this when I get a chance.

 

Day 3: Metrics–Sascha Giese

One of my fellow Head Geek’s passions is food. Of course he uses this context to explain something simply.

 

Mark Roberts  Dec 4, 2019 9:49 AM

The most important fact in the first line is that to make a dough that will perform well for a pizza base a known amount of flour is necessary. This is the baseline, 1 pizza = 3.5 cups. If you needed to make 25 pizzas you now know how to determine how much flour you need 25 x 3.5 = A LOT OF PIZZA

 

Dale Fanning Dec 4, 2019 11:54 AM

Why metrics are important—those who fail to learn from history are doomed to repeat it. How can you possibly know what you need to be able to do in the future if you don’t know what you’ve done in the past?

 

Ravi Khanchandani  Dec 4, 2019 8:08 AM

Are these my School Grades

Metrics are like your Report cards—giving you grades for the past, present & future (predictive grades).

Compare the present ratings with your past and also maybe the future

Different subjects rated and measured according to the topics in the subjects

 

Day 4: NetFlow—Joe Reves

What is remarkable about the Day 4 entry is not Joe’s mastery of everything having to do with NetFlow, it’s how he encouraged everyone who commented to help contribute to the growing body of work known as “NetFlowetry.”

 

Dale Fanning Dec 5, 2019 9:27 AM

I think the hardest thing to explain about NetFlow is that all it does is tell you who has been talking to who (whom? I always forget), or not, as the case may be, and *not* what was actually said. Sadly, when you explain that, they don’t understand that it’s still quite useful to know and can help identify where you may need to look more deeply. If it was actual packet capture, you’d be buried in data in seconds.

 

Farhood Nishat Dec 5, 2019 8:43 AM

They say go with the flow

but how can we get to know what is the current flow

for that we pray to god to lead us towards the correct flow

but when it comes to networks and tech

we use the netflow to get into that flow

cause a flow can be misleading

and we cant just go with the flow

 

George Sutherland Dec 4, 2019 12:23 PM

NetFlow is like watching the tides. The EBB and flow, the high and low.

 

External events such as the moon phases and storms in tides are replaced by application interactions, data transfers, bandwidth contention and so on.

 

Know what is happening is great, but the real skill is creating methods that deal with the anomalies as they occur.

 

Just another example of why our work is never boring!

 

Day 5: Logging–Mario Gomez

Mario is one of our top engineers, and every day he finds himself explaining technically complex ideas to customers of all stripes. This post shows he’s able to do it with humor as well.

 

Mike Ashton-Moore  Dec 6, 2019 10:08 AM

I always loved the Star Trek logging model.

If the plot needed it then the logs had all the excruciating detail needed to answer the question.

But security was so lax (what was Lt Worf doing all this time?)

So if the plot needed it, Lt Worf and his security detail were on vacation and the logs contained no useful information.

 

However, the common thread was that logs only ever contained what happened, never why.

 

Michael Perkins Dec 5, 2019 5:14 PM

What’s Logging? Paul Bunyan and Babe the Blue Ox, lumberjacks and sawmills, but that’s not important now.

 

What do we do with the heaps of logs generated by all the devices and servers on our networks? So much data. What do we need to log to confirm attribution, show performance, check for anomalies, etc., and what can we let go? How do we balance keeping logs around long enough to be helpful (security and performance analyses) with not allowing them to occupy too much space or make our tools slow to unusability?

 

George Sutherland Dec 5, 2019 3:18 PM

In the land of internal audit

The edict came down to record it

 

Fast or slow good or bad

It was the information that was had

 

Some reason we knew most we did not

We collected in a folder, most times to rot

 

The volume was large, it grew and grew

Sometimes to exclusion of everything new

 

Aggregation was needed and to get some quick wins

Thank heavens we have SolarWinds

 

Day 6: Observability—Zack Mutchler (MVP)

THWACK MVP Zack Mutchler delivers a one-two punch for this post—offering an ELI5-appropriate explanation, then diving deep into the details for those who craved a bit more.

 

Holly Baxley Dec 6, 2019 10:36 AM

To me—monitoring and observability can seem like they do the same thing, but they’re not.

 

Monitoring -

“What’s happening?”
Observability -

“Why is this happening?”

“Should this be happening?”

“How can we stop this from happening?”

“How can we make this happen?”

 

The question is this...can we build an intelligent AI that can actually predict behavior and get to the real need behind the behavior, so we can stop chasing rabbits and having our customers say, “It’s what I asked for, but it’s not what I want.”

 

If we can do that—then we’ll have mastered observability.

 

Mike Ashton-Moore  Dec 6, 2019 10:15 AM

so—alerting on what matters, but monitor as much as you’re able—and don’t collect a metric just because it’s easy, collect it because it matters

 

Juan Bourn Dec 6, 2019 9:16 AM

So observability is only tangible from an experience stand point (what is seen by observing its behavior)? Or will there always be metrics (like Disney+ not loading)? If there are always metrics, then are observability and metrics two sides of the same coin?

 

Garry Schmidt first got involved in IT Service Management almost 20 years ago. Since becoming the manager of the IT Operations Center at SaskPower, ITSM has become one of his main focuses. Here's part 1 of our conversation.

 

Bruno: How has the support been from senior leadership, your peers, and the technical resources that have to follow these processes?

Garry: It varies from one group to the next, from one person to the next, as far as how well they receive the changes we’ve been making to processes and standards. There have been additional steps and additional discipline in a lot of cases. There are certainly those entrenched in the way they’ve always done things, so there’s always the organizational change challenges. But the formula we use gets the operational people involved in the conversations as much as possible—getting their opinions and input on how we define the standards and processes and details, so it works for them. That’s not always possible. We were under some tight timelines with the ITSM tool implementation project, so often, we had to make some decisions on things and then bring people up to speed later on, unfortunately. It’s a compromise.

 

The leadership team absolutely supports and appreciates the discipline. We made hundreds of decisions to tweak the way we were doing things and the leadership team absolutely supports the improvement in maturity and discipline we’re driving towards with our ITSM program overall.

 

Often, it’s the people on the ground floor executing these things on a day-to-day basis that you need to work a little bit more with, to show them this is going to work and help them in the long run rather than just adding extra steps and more visibility.

 

People can get a little nervous if you have more visibility into what’s going on, more reporting, more tracking. It exposes what’s going on within the day-to-day activities within a lot of teams, which can make them nervous sometimes, until they realize the approach we always try to take is, we’re here to help them be successful. It’s not to find blame or point out issues. It’s about trying to improve the stability and reputation of IT with our business and make everybody more successful.

 

Bruno: How do you define procedures so they align with the business and your end customers? How do you know what processes to tweak?

Garry: A lot of it would come through feedback mechanisms like customer surveys for incidents. We have some primary customers we deal with on a day-to-day basis. Some of the 24/7 groups (i.e., grid control center, distribution center, outage center) rely heavily on IT throughout their day, 24/7. We’ve worked to establish a relationship with some of those groups and get involved with them, talk to them on a regular basis about how they experience our services.

 

We also get hooked in with our account managers and some of the things they hear from people. The leadership team within Technology & Security (T&S) talks to people all the time and gets a sense of how they’re doing. It’s a whole bunch of different channels to try and hear the voice of the people consuming our services.

 

Metrics are super important to us. Before our current ITSM solution, we only had our monitoring tools, which gave us some indication of when things were going bad. We could track volumes of alarms but it’s difficult to convert that into something meaningful to our users. With the new ITSM tool, we’ve got a good platform for developing all kinds of dashboards and reports. We spend a fair amount of time putting together reports reflective of the business impact or their interests. It seems to work well. We still have some room to grow in giving the people at the ground level enough visibility into how things are working. But we’ve developed reports available to all the managers. Anybody can go look at them.

 

Within our team here, we spend a fair amount of time analyzing the information we get from those reports. Our approach is to not only provide the information or the data, but to look at the picture across all these different pieces of information and reports we’re getting across all the operational processes. Then we think about what this means first of all and then what should we do to improve our performance. We send a weekly report out to all the managers and directors to summarize the results we had last week and the things we should focus on to improve.

 

Bruno: Have these reports identified any opportunities for improvement that have surprised you?

Garry: One of the main thrusts we have right now is to improve our problem management effectiveness. Those little incidents that happen all over the place, I think they have a huge effect on our company overall. What we’ve been doing is equating that to a dollar value. As an example, we’re now tracking the amount of time we spend on incident management versus problem management. You can calculate the cost of incidents to the company beyond the cost of the IT support people.

 

I remember the number for August: we spent around 3,000 hours on incident management. If you use an average loaded labor rate of $75 per person, it works out to roughly $220,000 for the month. That’s a lot of money. We’re starting to track the amount of time we’re spending on problem management. In August, two hours were logged against problem management. So, $220,000 versus 150 bucks we spent trying to prevent those incidents. When you point it out that way, it gets people’s attention.

Good morning! By the time you read this post, the first full day of Black Hat in London will be complete. I share this with you because I'm in London! I haven't been here in over three years, but it feels as if I never left. I'm heading to watch Arsenal play tomorrow night, come on you gunners!

 

As always, here's a bunch of links I hope you find interesting. Cheers!

 

Hacker’s paradise: Louisiana’s ransomware disaster far from over

The scary part is that the State of Louisiana was more prepared than 90% of other government agencies (HELLO BALTIMORE!). Just something to think about as ransomware intensifies.

 

How to recognize AI snake oil

Slides from a presentation I wish I'd created.

 

Now even the FBI is warning about your smart TV’s security

Better late than never, I suppose. But yeah, your TV is one of many security holes found in your home. Take the time to help family and friends understand the risks.

 

A Billion People’s Data Left Unprotected on Google Cloud Server

To be fair, it was data curated from websites. In other words, no secrets were exposed. It was an aggregated list of information about people. So, the real questions should now focus on who created such a list, and why.

 

Victims lose $4.4B to cryptocurrency crime in first 9 months of 2019

Crypto remains a scam, offering an easy way for you to lose real money.

 

Why “Always use UTC” is bad advice

Time zones remain hard.

 

You Should Know These Industry Secrets

Saw this thread in the past week and many of the answers surprised me. I thought you might enjoy them as well.

 

You never forget your new Jeep's first snow.

Omar Rafik, SolarWinds Senior Manager, Federal Sales Engineering

 

Here’s an interesting article by my colleague Jim Hansen about improving security by leveraging the phases of the CDM program and enhancing data protection by taking one step at a time.

 

The Continuous Diagnostics and Mitigation (CDM) Program, issued by the Department of Homeland Security (DHS), goes a long way toward helping agencies identify and prioritize risks and secure vulnerable endpoints.

 

How can a federal IT pro more effectively improve an agency’s endpoint and data security? The answer is twofold. First, incorporate the guidance provided by CDM into your cybersecurity strategy. Second, and in addition to CDM, develop a data protection strategy for an Internet of Things (IoT) world.

 

Discovery Through CDM

 

According to Cybersecurity and Infrastructure Security Agency (CISA), the DHS sub-agency that has released CDM, the program “provides…Federal Agencies with capabilities and tools to identify cybersecurity risks on an ongoing basis, prioritize these risks based on potential impacts, and enable cybersecurity personnel to mitigate the most significant problems first.”

 

CDM takes federal IT pros through four phases of discovery:

 

1. What’s on the network? Here, federal IT pros discover devices, software, security configuration settings, and software vulnerabilities.
2. Who’s on the network? Here, the goal is to discover and manage account access and privileges; trust determination for users granted access; credentials and authentication; and security-related behavioral training.
3. What’s happening on the network? This phase discovers network and perimeter components; host and device components; data at rest and in transit; and user behavior and activities.
4. How is data protected? The goal of this phase is to identify cybersecurity risks on an ongoing basis, prioritize these risks based upon potential impacts, and enable cybersecurity personnel to mitigate the most significant problems first.

 

Enhanced Data Protection

 

A lot of information is available about IoT-based environments and how best to secure that type of infrastructure. In fact, there’s so much information it can be overwhelming. The best course of action is to stick to three basic concepts to lay the groundwork for future improvements.

 

First, make sure security is built in from the start as opposed to making security an afterthought or an add-on. This should include the deployment of automated tools to scan for and alert staffers to threats as they occur. This type of round-the-clock monitoring and real-time notifications help the team react more quickly to potential threats and more effectively mitigate damage.

 

Next, assess every application for potential security risks. A seemingly inordinate number of external applications track and collect data. It requires vigilance to ensure these applications are safe before they’re connected, rather than finding vulnerabilities after the fact.

 

Finally, assess every device for potential security risks. In an IoT world, there’s a whole new realm of non-standard devices and tools trying to connect. Make sure every device meets security standards; don’t allow untested or non-essential devices to connect. And, to be sure agency data is safe, set up a system to track devices by MAC and IP address, and monitor the ports and switches those devices use.
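As a toy sketch of that tracking idea (not any particular product's feature), the snippet below compares devices observed on the network against an approved inventory keyed by MAC address and flags anything unknown. The inventory and observations are invented for the example.

```python
# Flag devices that aren't in the approved MAC-address inventory.
approved = {
    "00:1a:2b:3c:4d:5e": "staff-laptop-01",
    "00:1a:2b:3c:4d:5f": "hvac-controller",
}

observed = [
    ("00:1a:2b:3c:4d:5e", "10.10.4.21"),
    ("de:ad:be:ef:00:01", "10.10.4.77"),   # not in the approved inventory
]

for mac, ip in observed:
    if mac in approved:
        print(f"{ip} ({mac}) -> known device: {approved[mac]}")
    else:
        print(f"ALERT: unknown device {mac} at {ip}; investigate before allowing access")
```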

 

Conclusion

 

Security isn’t getting any easier, but there are an increasing number of steps federal IT pros can take to enhance an agency’s security posture and better protect agency data. Follow CDM guidelines, prepare for a wave of IoT devices, and get a good night’s sleep.

 

Find the full article on Government Technology Insider.

 

The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

Garry Schmidt first got involved in IT service management almost 20 years ago. Since becoming the manager of the IT Operations Center at SaskPower, ITSM has become one of his main focuses.

 

Bruno: What are your thoughts on ITSM frameworks?

Garry: Most frameworks are based on best practices developed by large organizations with experience managing IT systems. The main idea is they’re based on common practice, things that work in the real world.

 

It’s still a matter of understanding the pros and cons of each of the items the framework includes, and then making pragmatic decisions about how to leverage the framework in a way that makes sense to the business. For another organization, it might be different depending on the maturity level and the tools you use to automate.

 

Bruno: Did SaskPower use an ITSM framework before you?

Garry: There was already a flavor of ITSM in practice. Most organizations, even before ITIL was invented, did incident management, problem management, and change management in some way.

 

The methods were brought in by previous service providers who were responsible for a lot of the operational processes. A large portion of IT was outsourced, so they just followed their own best practices.

 

Bruno: How has ITSM evolved under your leadership?

Garry: We were able to make progress on the maturity of our processes. But by putting focused effort into reviewing and updating our processes and some of the standards we use, and then automating those standards in our ITSM tool, we were able to take a big step forward in the maturity of our ITSM processes.

 

We do periodic maturity assessments to measure our progress. Hopefully, we’ll perform another assessment not too far down the road. Our maturity has improved since implementing the ITSM tool because of the greater integration and the improvements we’ve made to our processes and standards.

 

We’ve focused mostly on the core processes: incident management, problem management, change, event, knowledge, configuration management. Some of these, especially configuration management, rely on the tools to be able to go down that path. One of the things I’ve certainly learned through this evolution is it’s super important to have a good ITSM tool to hook everything together.

 

Over the years, we’ve had a conglomerate of different tools but couldn’t link anything together. There was essentially no configuration management database (CMDB), so you have stores of information distributed out all over the place. Problem management was essentially run out of a spreadsheet. It makes it really tough to make any significant improvements when you don’t have a centralized tool.

 

Bruno: What kind of evolution do you see in the future?

Garry: There’s a huge opportunity for automation, AI Ops, and machine learning—tools we can feed all our events into to proactively identify things that are about to fail.

 

More automation is going to help a lot in five years or so.

 

Bruno: What kind of problems are you hoping a system like that would identify?

Garry: We’ve got some solid practices in place as far as how we handle major incidents. If we’ve got an outage of a significant system, we stand up our incident command here within the ITOC. We get all the stakeholder teams involved in the discussion. We make sure we’re hitting it hard. We’ve got a plan of how we’re going to address things, we’ve got all the right people involved, we handle the communications. It’s a well-oiled machine for dealing with those big issues, the severe incidents. Including the severe incident review at the end.

 

We have the biggest opportunity for improvement in those little incidents taking place all over the place all the time. Trying to find the common linkages and the root cause of those little things that keep annoying people all the time. I think it’s a huge area of opportunity for us, and that’s one of the places where I think artificial intelligence or machine learning technology would be a big advantage. Just being able to find those bits of information that might be related, where it’s difficult for people to do so.

 

Bruno: When did you start focusing on a dedicated ITSM tool versus ad hoc tools?

Garry: In 2014 to 2015 we tried to put in an enterprise monitoring and alerting solution. We thought we could find a solution with monitoring and alerting as well as the ITSM components. We put out an RFP and selected a vendor with the intent of having the first phase be to implement their monitoring and alerting solutions, so we could get a consistent view of all the events and alarms across all the technology domains. The second phase would be implementing the ITSM solution that could ingest all the information and handle all our operational processes. It turned out the vision for a single, integrated monitoring and alerting solution didn’t work. At least not with their technology stack. We went through a pretty significant project over eighteen months. At the end of the project, we decided to divest from the technology because it wasn’t working.

 

We eventually took another run at just ITSM with a different vendor. We lost four or five years going down the first trail.

 

Bruno: Was it ITIL all the way through?

Garry: It was definitely ITIL practices or definitions as far as the process separation, but my approach to using ITIL is you don’t want to call it that necessarily. It’s just one of the things you use to figure out what the right approach is. You often run into people who are zealots of one framework, but the frameworks all overlap to a certain extent. As soon as you start going at it with the mindset that ITIL will solve all our problems, you’re going to run into trouble. It’s a matter of taking the things that work from all those different frameworks and making pragmatic decisions about how it will work in your organization.
