The three universal questions companies ask about big data

You've surely been hearing about Big Data. It's all the rage right now, but many companies don't understand how it could affect them. The fact is that disruption is having an impact on every organization today, and it touches not just the business itself but how you do business, including the way you make decisions. Data can help you make better ones.

Andrew McAfee, who studies and lectures on the influence of IT on business at MIT, talked about some of the trends he's seeing in big data at the Alfresco Summit in Boston this week. He explained that most companies don't make decisions based on data. Instead, they follow whatever is the highest paid person's opinion, regardless of whether that makes sense or not. 

He says once he convinces an executive team that big data is a real phenomenon that every company needs to be looking at, they all have the same three questions.

Question 1: How much data are we talking about?

Most companies believe they've been dealing with copious amounts of information for years, so when people like McAfee walk in talking about big data, they are skeptical. McAfee says the best way to understand just how much data we are talking about is to trace the history of the terms we've used to describe a large amount of data.

"The volume of data [today] is outstripping our ability to describe what's going on," McAfee told the audience in Boston. "Big is not just marketing hype, it really does mean something," he added.

He explained that in 1980, a company launched and called itself TeraData. It chose that name because a terabyte (10^12 bytes, or a 1 followed by 12 zeros) was about as big as data got, and that was great…for 28 years. But by 2008, 12 zeros wasn't enough and we moved to the Petabyte era (10^15). That was good for four years, at which point we entered the Exabyte era (10^18). That was good for just six months, at which point we entered the Yottabyte era (10^24).
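To make those jumps concrete, here is a minimal Python sketch that expresses each prefix as a power of ten. The exponents are the standard SI decimal definitions; the function names are illustrative choices, not anything from McAfee's talk. Note that the standard sequence also includes zetta (10^21), which sits between exa and yotta.

```python
# SI decimal prefixes mapped to their power-of-ten exponents
# (standard definitions; zetta sits between exa and yotta).
SI_PREFIXES = {
    "tera": 12,
    "peta": 15,
    "exa": 18,
    "zetta": 21,
    "yotta": 24,
}


def bytes_for(prefix: str) -> int:
    """Return the number of bytes in one <prefix>byte (decimal meaning)."""
    return 10 ** SI_PREFIXES[prefix]


def ratio(larger: str, smaller: str) -> int:
    """Return how many <smaller>bytes fit in one <larger>byte."""
    return bytes_for(larger) // bytes_for(smaller)


if __name__ == "__main__":
    for name, exponent in SI_PREFIXES.items():
        print(f"1 {name}byte = 10^{exponent} bytes = {bytes_for(name):,} bytes")
    print(f"1 yottabyte = {ratio('yotta', 'tera'):,} terabytes")
```

Each step up the ladder is a factor of 1,000, so a yottabyte is a trillion terabytes.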

McAfee explained that there is a group of astrophysicists and astronomers who measure really, really, really big numbers. They last met in 1991. They thought we would never need a number bigger than 10^24, but we will soon use that up and the group now has to come up with some new ways of describing large numbers.

In the next few years, there will be ever larger amounts of data coming from sensors (also known as the Internet of Things), from our smartphones, from our social networks, and from just about every data-enabled device and online service you can imagine, and some you can't. That's why unless you're Google or Facebook or perhaps a financial services company, you haven't even sniffed the amount of data we're talking about yet.

Question 2: So what? What does this have to do with my company?

After companies understand the scope of the data we're talking about, the next question is why it should matter to them. If they aren't a data-driven organization, what kind of impact could this trend possibly have on them?

For this, McAfee gave a few areas where just about any business could benefit from using data for predicting and forecasting. When you follow the data, you'll find the answers to your business issues.

He used the example of his colleague Erik Brynjolfsson. Brynjolfsson and Lynn Wu set out to prove they could predict housing prices as well as or better than the experts at the National Association of Realtors. The NAR had been doing it for years and was considered the gold standard for predicting housing prices, using a mix of economic indicators, GDP growth, interest rates and so forth to build a statistical model.

Brynjolfsson and Wu decided to try it a different way. Using public access to the Google search API, they looked at searches, theorizing that when someone was about to move they would be conducting certain types of searches around housing prices, schools, neighborhood quality, and so forth. They postulated that a higher number of searches would translate into a better housing market.

As it turned out, they were right. "They learned on average that their model was on average 23.6% better than NAR gold star standard," McAfee told the audience. He added, "Even though data is low quality and the signal is messy there is great insight from that torrent of data."
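The general idea, treating search volume as a leading indicator, can be illustrated with a simple least-squares line. This is only a sketch under stated assumptions: it is not Brynjolfsson and Wu's actual model, the function names are hypothetical, and the numbers are invented purely for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Apply the fitted line to a new search-index value."""
    return slope * x + intercept


# Invented toy data, NOT real figures: a housing-related search index
# for each month, paired with home sales in the following month.
search_index = [80, 90, 100, 110, 120]
next_month_sales = [410, 455, 500, 560, 590]

slope, intercept = fit_line(search_index, next_month_sales)
forecast = predict(slope, intercept, 130)  # forecast for a new index reading
```

A real model would need many more observations, additional variables, and out-of-sample testing before anyone compared it with an established forecast.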

The second area he discussed was talent management. Google, the most data-driven company around, had taken it as an article of faith that their employee interview process, in which they quizzed applicants with pithy brain teasers, was the best way to choose the strongest candidates. As it turned out, when Google applied data analysis to this belief, they found no correlation between the quality of employees and their performance on these impromptu brain quizzes. "They had to rethink their talent management practices because the data told them it wasn't working," McAfee explained.

Finally, McAfee explained, data could be used for troubleshooting a problem. Imagine the data told you exactly what happened, so that the politics and the blame game could be taken out of the equation. To illustrate this, McAfee told the story of a cholera outbreak in Haiti after the 2010 earthquake. As you can imagine, with countries from around the world on the ground, trying to blame one of them for the outbreak of a horrible disease was fraught with political peril. But using data from Twitter about the outbreak, officials were able to determine it started in a particular camp, which was run by aid workers from a country that itself had experienced a cholera outbreak. After doing genetic testing, they determined the outbreak came from this country, and there was no denying it. The data didn't lie.

These three examples show companies what happens when you apply data to a problem instead of relying on instinct, gut, politics, or any other method of decision making.

Question 3: How can my company take advantage of it?

The final question companies ask is what they need to do to take advantage of big data, and what skills their employees need.

The biggest challenge, however, is the way that decisions get made today in most companies. In most cases that's HiPPO: the Highest Paid Person's Opinion. This person has generally risen through the ranks and holds the most power, so their opinion is taken as mattering more than everybody else's.

Speaking as the HiPPO, McAfee said, "I make that call out of my gut, my intuition, my expertise, my track record, the stuff I have built up over several decades. That has prepared me to make tough calls. That's why you'll listen to me when it comes time to make decisions."

He added that this person doesn't ignore the data, but it's just one input in the decision-making process.

He contrasts the HiPPO with the hip new data-driven decision maker. This leader says, "I'm going where the data takes me and when we see where data takes us I'm going to shut up and follow the data."

McAfee pointed to the 2012 presidential election as an example. In that race, you may recall the HiPPOs said election forecasting was about fundamentals. They were experts and understood the subtleties and fundamentals of running elections better than any data geek.

You may recall that in the days before the election, some of these HiPPOs were confidently calling for a Romney victory (to the point that Romney was completely stunned when he lost, and lost badly). But one person, Nate Silver from the FiveThirtyEight blog on the New York Times website, didn't listen to the noise. He knew nothing about swing states. He didn't care about soccer moms. All he did was crunch the data every day.

The data told him for months before the election that Obama was going to win, and the data grew more certain as the election approached. And as it turned out, Silver was exactly right. He called all 50 states.

But data only gets you so far, McAfee explained. You do have to ask the right questions, and that takes intelligent humans with expertise. As he pointed out, Picasso once said that computers were useless because they only give you answers. You need to know what the questions are.
