Are we not using data to drive decisions? [Unscientific Survey Results]

I'm starting on my next phase in education: a master's degree in information technology.  Because the master's program at my university wants a "concentration," I conducted a completely unscientific research survey to ask fellow IT professionals where they would concentrate if they were on the same journey.

Horrible Pie Chart

An overwhelming majority went with Enterprise Technology Management and (the reason for posting here) Data Analytics.  Beyond the numbers, some of the comments on this really stuck out.  Allow me to paraphrase a few together.

The data companies acquire can help steer data-driven choices, but there is frequently a gap in the way companies use their data.

This gave me pause because I've always been taught that in business, your "gut" isn't a good enough reason.  I just assumed all organizations were relying heavily on data to drive their decisions.  I, personally, haven't seen this gap, but I'm only one geek.  Which makes this a great way to start a discussion.

  • Is your organization using data driven decision making?
  • If it's not 100% data driven, what percentage is based on "gut" feelings?
  • If there are gut feelings, where do you think this comes from?

I've got some thoughts from previous companies, but I'd prefer to hear from my fellow THWACKsters.  I'm really looking forward to this discussion.

Parents
  • Anecdotally I have seen companies where data was not really leveraged well for a variety of reasons.

    Gaps in originating the data - You built a hot new product, but the devs were focused on just getting the features ready to ship; they didn't think ahead and implement the right kind of telemetry to capture critical parts of the user experience and behaviors.  Now we have to choose between having devs go back and rework old code to add the necessary instrumentation versus shipping new features.  Better internal instrumentation rarely gets the sales org and potential customers hyped.  It's even more challenging with non-software businesses, where getting the data you want usually means interrupting a human's workflow so they can record some data point that they feel is entirely meaningless to their personal responsibilities.  When I worked in restaurants, cooks were supposed to routinely track the temps of cold items and record them for health inspectors, but you can rest assured that in a lot of places those checks do not happen at all, and the cook just pencil-whips random numbers onto a piece of paper at the start of the shift before getting busy with the "real job" of actual cooking.  Most employees struggle to tell you how much time they spent on individual tasks, even in consulting, where they have to bill customers based on the time spent working for them.  Often they don't record time until Friday, and by then you get very loose approximations of how long their gut told them they were working on that thing, if not completely made-up numbers to ensure that they hit the even 40 their manager wants to see.

    Pain when trying to access the data - Is the data still on hand-filled paperwork that then needs to be processed into some kind of computerized system to ever be useful?  Is the data processing error-free and fast enough to drive real-time decisions?  I can't tell you how many times I walked into data that was just an ocean of free-text boxes where people put any number of special characters or inconsistent inputs that break when I try to turn them into structured data.  Is the database available to all the people who might need it to drive their decisions?  Do those people even know how to construct queries, or do they have to pitch requests to a business analyst who has the query language skills but likely doesn't understand the business context?  As that analyst explores the available tables and columns, they don't know whether it would be helpful to include certain attributes.  Maybe the fields that the business person would naturally assume should be captured aren't in the raw data, so now we have to circle back to the previous problem and improve our data collection.  But the decisions need to be made now, not in 2 quarters when the changes get implemented and we have had time to collect samples.  Also, is it performant and cost effective to even query the data set?  With the shift to -aaS, paying by seat count is very common, and companies often decide to ration out who has access to begin with.  Then, if you don't know what you are doing, you can write a query that literally costs hundreds of dollars to execute, realize you forgot to include a field you need, and get to modify the query and incur that cost again.  With big data sets and labor costs factored in, sometimes a good report legitimately costs tens of thousands of dollars to produce.
    I worked with a company where raw data would be stored natively in our software, then a massaged version of that was shipped to Snowflake, then yet another layer of massaging was applied as it was turned into leadership dashboards in Looker.  At each transformation point people were making decisions about what data they thought the layer above would be looking for.  Since I was one of the few people with the access and expertise to cross-reference raw data against the exec-level versions, I would often be in meetings and have to call out that the Looker version was misleading or incomplete in some critical ways, leading the audience to conclusions that didn't align with the on-the-ground reality I was seeing when I worked directly with our users.  Maybe we have time to let me reassemble the queries to show another story, maybe we need an answer by 2pm.  So in the end business leaders, especially those who are more executive and strategy oriented than technical and detail oriented, end up having to just make what they think is the best decision, hoping the quality of what they have been given is up to par.

    Personalities - At that level there is also the risk of some serious egos driving what they want to see happen.  If the data conflicts with their priors and they have any reason to question the data, then it's easy to just disregard it, or to apply some sort of mitigating excuse for why the numbers look like this while still insisting they know what we need to do.  I once worked with a very technical leader who was just the grand master BS artist.  He knew how to manipulate the people above and below him to a level where the company was willing to redirect hundreds of millions of dollars in payroll toward him and the teams he led, and away from any potential competing work managed outside his sphere of influence.  This happened despite some other people in the background raising very legit doubts that any of his projects were actually profitable to the organization.  He was the kind of guy where I knew I had to record everything he said in meetings, because he would promise the world in person and make it seem like his vision was the perfect solution.  Our exec team fell for it over and over for several years, until we finally hit a critical mass of failures to deliver results and they cut him loose.  On the power of his charisma he led the execs to make several decisions in spite of contradictory or incomplete data, and essentially derailed years of development because he was so certain that his vision was the best.
