4. Data Science Will Belong to the Economists
"We will start to see data science (to the extent that it operates as a coherent entity) increasingly rely on the domain expertise of economists. The early days of data science were very math, statistics and programming oriented. Then there was the rise of the “computational social scientist,” which added sociology to the mix. Many trend setting data science places are finding that sociology, and similar disciplines, tend to be retrospective, while other fields, like economics, offer simulation and auction modeling and other techniques to get more proactive and predictive with data. Of course, most economists don’t have the programming chops to land most data science jobs, but I think we’ll see that start to change significantly."
I think coding is important, and it would be nice if students interested in a career in data science could get more classroom exposure to coding (SQL, R, SAS, Python, etc.) as well as to algorithmic approaches (decision trees, neural networks, etc.). However, I think it's more important to have the analytical thinking skills and grounding in statistical inference that they get from an economics program, both undergraduate and graduate. That's the skill set I think will differentiate the data scientists of the future from the very technical, tools-focused ones in demand today.
Recently on EconTalk, Russ Roberts and Cathy O'Neil discussed her book Weapons of Math Destruction, taking on issues of explaining vs. predicting and causality vs. fitting the data (see also the earlier EconTalk episode with Susan Athey). The discussion emphasized the role of quasi-experimental methods and rigorous identification, as well as theory. Having spent the better part of my career focusing on empirical methods (both causal inference and machine learning), I had not given theory enough thought until recently. But the more I think about it, the more I realize it is necessary. Can big data and algorithms deliver tighter, unbiased, and more truthful insights? This excerpt from the 10th edition of Heyne, Boettke, and Prychitko's The Economic Way of Thinking leads me to think the opposite:
"We can observe facts, but it takes a theory to explain the causes. It takes a theory to weed out the irrelevant facts from the relevant ones."
And they give an anecdote:
"although the facts clearly show that most pot smokers were former milk drinkers, milk drinking probably is not a relevant fact in explaining pot smoking; similarly, the Superbowl is likely irrelevant when explaining Wall Street Interactions" (even if the data does show that the Dow does well when an NFC team does well).
And more about theory:
"Our observations of the world are in fact drenched with theory, which is why we can usually make sense out of the buzzing confusion that assaults our eyes and ears. Actually we observe only a small fraction of what we "know," a hint here and a suggestion there. The rest we fill in from the theories we hold: small and broad, vague and precise..."
Big data in many ways is buzzing confusion, and yes, algorithmic approaches (i.e., machine learning) can help us find patterns and relationships that can be useful. But depending on the questions we are trying to answer, relying totally on a data-driven process devoid of theory is more likely to lead us down the wrong path. Economics is a way of thinking, and economic theory can help us make sense of what we find: it can help us ask better or more important questions, and it can guide us in understanding the answers to those questions. It is forward-looking, as the article above states.
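To make the point about theory-free pattern hunting concrete, here is a minimal simulation sketch. It is my own illustration, not something taken from the book or the podcast, and the function names and parameter values (30 observations, 1,000 candidate variables) are assumptions chosen for the example. Every series is pure random noise, so any relationship the search "finds" is spurious by construction:

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_predictor(n_obs=30, n_candidates=1000, seed=0):
    """Search unrelated noise series for the one most correlated with a noise target.

    Hypothetical setup: the target and all candidates are independent draws
    from a standard normal, so there is no true relationship to discover.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best_r = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = pearson(candidate, target)
        # Keep whichever candidate "explains" the target best, with no theory
        # to tell us whether the candidate is even a relevant fact.
        if abs(r) > abs(best_r):
            best_r = r
    return best_r


if __name__ == "__main__":
    r = best_spurious_predictor()
    print(f"Strongest correlation found among pure noise: {r:.2f}")
```

With this many candidates and so few observations, the winning correlation will typically look impressive even though it means nothing, which is the milk-drinker and Super Bowl problem in miniature. A theory about which variables could plausibly matter is what narrows the search before the data gets a vote.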
Of course, we can test theories using data, through some clever identification strategy or even by employing methods from machine learning in conjunction with conventional econometric approaches. This brings me full circle to my previous post about what the most important skill sets for data scientists may be going forward, and how economics training, and in fact economic theory, can help fill that niche.
Are Data Scientists Going Extinct
To Explain or Predict
Economists as Data Scientists
Why Study Agricultural and Applied Economics
Analytics vs Causal Inference
Culture War: Inferential Statistics vs Machine Learning
Big Data: Don't throw the baby out with the bathwater
Causal Inference and Quasi-Experimental Design Roundup
Big Data: Causality and Local Expertise are Key in Agronomic Applications
Data Scientists vs Algorithms vs Real Solutions to Real Problems