
Three Ways to Become Data-Centric

Level 17

The conservation of quantum information is a theory that information can neither be created nor destroyed. Stephen Hawking used this theory to explain how a black hole does not consume photons like a giant cosmic eraser. It is clear to me that neither Stephen Hawking, nor any quantum physicist, has ever worked in IT.

Outside the realm of quantum mechanics, in the physical world of corporate offices, information is generated, curated, and consumed at a faster pace with each passing year. The similarity between the physical corporate world and the quantum mechanics realm is that this data is never destroyed.

We are now a nation, and a world, of data hoarders.

Thanks to popular processes such as DevOps, we are obsessed with telemetry and observability. System administrators are keen to collect as much diagnostic information as possible to help troubleshoot servers and applications when they fail. And the Internet of Things has a billion devices broadcasting data to be easily consumed into Azure and AWS.

All of this data hoarding is leading to a rapidly growing pile of ROT (Redundant, Outdated, Trivial information).

Stop the madness.

It’s time to shift our way of thinking about how we collect data. We need to become more data-centric and do less data-hoarding.

Becoming data-centric means that you define goals and problems to be solved BEFORE you collect or analyze data. Once these goals or problems are defined, you can begin the process of collecting the necessary data. You want to collect the right data to help you make informed decisions about what actions are necessary.

Here are three ways for you to get started on becoming more data-centric in your current role.

Start with the question you want answered. This doesn’t have to be a complicated question. Something as simple as, “How many times was this server rebooted?” is a fine question to ask. You could also ask, “How long does it take for a server to reboot?” These examples may seem like simple questions, but you may be surprised to find that your current data collections do not allow for an easy answer without a bit of data wrangling.
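As a minimal sketch of what answering those two reboot questions might look like, assume a hypothetical event log of (server, event, timestamp) records. The server names, event labels, and timestamps below are invented for illustration and do not come from any real monitoring product.

```python
from datetime import datetime

# Hypothetical event log: (server, event, ISO timestamp).
# The event labels "shutdown" and "boot_complete" are assumptions.
EVENTS = [
    ("web01", "shutdown", "2024-03-01T02:00:00"),
    ("web01", "boot_complete", "2024-03-01T02:04:30"),
    ("web01", "shutdown", "2024-03-08T02:00:00"),
    ("web01", "boot_complete", "2024-03-08T02:03:30"),
    ("db01", "shutdown", "2024-03-02T01:00:00"),
    ("db01", "boot_complete", "2024-03-02T01:10:00"),
]


def reboot_durations(events, server):
    """Return reboot durations in seconds for one server.

    Each shutdown is paired with the next boot_complete for that server.
    """
    durations = []
    pending = None
    # ISO timestamps in the same format sort correctly as strings.
    for name, event, stamp in sorted(events, key=lambda e: e[2]):
        if name != server:
            continue
        when = datetime.fromisoformat(stamp)
        if event == "shutdown":
            pending = when
        elif event == "boot_complete" and pending is not None:
            durations.append((when - pending).total_seconds())
            pending = None
    return durations


web01 = reboot_durations(EVENTS, "web01")
reboot_count = len(web01)
average_seconds = sum(web01) / reboot_count if reboot_count else 0.0
print(f"web01 rebooted {reboot_count} times, averaging {average_seconds:.0f} seconds")
```

If your existing collection does not record both the shutdown and the completed boot, the second question cannot be answered without extra wrangling, which is exactly the gap that starting with the question exposes.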

Have an end-goal statement in mind. Once you have your question(s) and you have settled on the correct data to be collected, you should think about the desired output. For example, perhaps you want to put the information into a simple slide deck, or maybe you want to build a real-time dashboard in Power BI. Knowing the end goal may influence how you collect your data.

Learn to ask good questions. Questions should help to uncover facts, not opinions. Don’t let your opinions affect how you collect or analyze your data. It is important to understand that every question is based upon assumptions. It’s up to you to decide if those assumptions are safe, and an assumption is considered safe if it is something that can be measured. For example, your gut may tell you that server reboots are a result of O/S patches being applied too frequently. Instead of asking, “How frequently are patches applied?” a better question would be, “How many patches require a reboot?” and compare that number to the overall number of server reboots.
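The patch comparison described above can be sketched as follows. This is only an illustration: the patch records, the `requires_reboot` field, and the reboot total are hypothetical, and the sketch assumes each reboot-requiring patch triggers at most one reboot.

```python
# Hypothetical patch records; the field names are assumptions.
PATCHES = [
    {"id": "P-101", "requires_reboot": True},
    {"id": "P-102", "requires_reboot": False},
    {"id": "P-103", "requires_reboot": True},
    {"id": "P-104", "requires_reboot": False},
]
TOTAL_REBOOTS = 8  # assumed count of all reboots over the same period


def patch_reboot_share(patches, total_reboots):
    """Return (count of reboot-requiring patches, share of all reboots they could explain)."""
    needing = sum(1 for p in patches if p["requires_reboot"])
    if total_reboots == 0:
        return needing, 0.0
    # Cap at 1.0: patches cannot explain more reboots than actually happened.
    return needing, min(needing / total_reboots, 1.0)


needing, share = patch_reboot_share(PATCHES, TOTAL_REBOOTS)
print(f"{needing} of {TOTAL_REBOOTS} reboots ({share:.0%}) could be explained by patches")
```

With these made-up numbers, patches could account for at most a quarter of the reboots, so the gut feeling about patching would not hold up and the remaining reboots would need another explanation.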

Summary

When it comes to data, no one is perfect. These days, data is easy to come by, making it a cheap commodity. When data is cheap, attention becomes a commodity. By shifting to a data-centric mindset, you can avoid data hoarding and reduce the amount of ROT in your enterprise. With just a little bit of effort, you can make things better for yourself and your company, and help set the example for everyone else.

12 Comments
Level 14

Thanks for the article.  Good, simple advice.  I'd like to add a final step of reviewing what could be improved upon from the 3 steps you provided.  Oftentimes after questions are answered, the avenue to improve things is also opened up.

Level 13

Thanks, good article.

Level 15

Good article.  It would seem that we could apply our LEAN principles of continuous improvement and, as part of reviewing and asking questions about the data, actually modify what is stored and/or collected.  That way we are not overloading what is collected but are using the data that is collected more efficiently.

Level 13

Great post sqlrockstar.  Hadn't heard the ROT term before.  Love it.  We run into this all the time.  Somehow folks have the notion that all data is good and valuable even when it's clearly garbage.  Trying to weed this stuff out seems like a Sisyphean task at times.

There are MANY good concepts in this article!

  • I.T. MIGHT just be like a giant vacuum for dollars instead of a black hole for data.  Frankly, do we care if data is lost if it's not important to someone?  Correctly identifying what's important, and to whom, is the trick there.  I have a huge ability to recall events from my very-early childhood, but are they useful to anyone?  If I discarded many of them would I function less efficiently, less happily?  Maybe.  Maybe not.  Certainly if I.T. discarded useless data its systems might function more quickly and cost less to operate.
  • We've always been data hoarders haven't we?  Going back in time it was simpler things like:
    • Where can water be found?
    • What's safe or unsafe to eat?
    • What is a predator and how do we avoid or defeat them?

     But today we might be hoarding other practical types of data:

    • What are my passwords?
    • What route is quickest to drive to work?
    • Where are the good parking places?
    • Where can I find (enter your favorite hobbies--big fish to catch, beautiful places to photograph anything, reliable auroras (Iceland!), special ingredients for your best recipes, good companions, the best deals on anything you might purchase, etc.)?

But there are practical limits to storage space & fast recall of these things, no matter if they're items in your mind (every boat launch I've ever used, every State Park I've visited, etc.) or if they're useless / outdated bits & bytes of data (twenty-five-year-old e-mailed yes or no answers to friends' queries about whether I'd be available to toss a Frisbee after work).

Recognizing and purging useless data can be good for a computer system, and for a mind.

But being "useless" is another description of "garbage", and we all know one thing about garbage:  Something is "garbage" if you throw it away and then discover you need it after the sanitary services truck has hauled it away.

Level 16

Thanks for the great write up!

My work has a definite connection between the physical corporate realm and quantum mechanics. It's called the architecture department, or otherwise commonly known as a black hole.

If your project gets sucked into it.... it will never escape 

MVP

Nice write up

Level 13

You're right - no one is perfect...  Thanks for the article.

Level 20

We do collect and save data that really isn't needed any longer.

MVP

Don't forget ROT occurs at an accelerated rate when data is aggregated and the high points and low points disappear.

MVP

At some point, we’ll view digital storage the way we view natural resources today. Some of us will want to preserve it for future generations, and others will want to consume it all for themselves. Unless we start sending our stored data into space, we’ll simply run out of disk on earth.

Level 17

Yes, that's exactly how the process should work!

About the Author
Thomas LaRock is a Head Geek at SolarWinds and a Microsoft® Certified Master, SQL Server® MVP, VMware® vExpert, and a Microsoft Certified Trainer. He has over 20 years of experience in the IT industry in roles including programmer, developer, analyst, and database administrator.