Saturday, June 3, 2017

In Praise of The Citizen Data Scientist

I read a really good article over at Data Science Central titled "The Data Science Delusion." Here is an interesting slice:

"This democratization of algorithms and platforms, paradoxically, has a downside: the signaling properties of such skills have more or less been lost. Where earlier you needed to read and understand a technical paper or a book to implement a model, now you can just use an off-the-shelf model as a black-box. While this phenomenon affects many disciplines, the vague and multidisciplinary definition of data science certainly exacerbates the problem."

It is true that there is some loss of signal. However, companies may need to look for new signals as technological change progresses and new forms of capital complement labor. It's this new labor-complementing role of capital (in the form of open source statistical computing packages and computing power) that is creating demand for those who can leverage these tools competently, without knowing all "the nitty-gritty mathematical academic formulas to everything about support vector machines or Kernels and stuff like that to apply it properly and get results."

Sure, as a result there are a lot of analytics programs popping up to take advantage of these advances, but it's also the reason programs like applied economics are becoming so popular. In fact, in promoting its program, Johns Hopkins University almost seems to echo some of the sentiment in the quote above, but puts a positive spin on it:

"Economic analysis is no longer relegated to academicians and a small number of PhD-trained specialists. Instead, economics has become an increasingly ubiquitous as well as rapidly changing line of inquiry that requires people who are skilled in analyzing and interpreting economic data, and then using it to effect decisions about national and global markets and policy, involving everything from health care to fiscal policy, from foreign aid to the environment, and from financial risk to real risk." 

In fact, I admit that for a while I was a little disappointed that my alma mater did not embrace the data science/analytics degree trend, offer more courses in applied programming, or incorporate languages like R into more courses. Now, however, while I think these things are great, I realize the more important data science skills are related to the analytical thinking and firm theoretical, statistical, and quantitative foundations that programs in economics and finance already offer at the undergraduate and master's level. While formal data science training might be the way of the future, I would venture to say that the vast majority of today's 'data scientists' were academically trained in a quantitative discipline like the above and self-trained (perhaps via Coursera, etc.) on the skills and tools most people think of when they think of data science. As I have said before, sometimes you don't need someone with a PhD in computer science or astrophysics. Sometimes you really just need a good MBA who understands regression and the basics of a left join.
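For anyone wondering what "regression and the basics of a left join" amount to in practice, here is a minimal sketch in plain Python. Everything in it is hypothetical and for illustration only: the toy customer and spend records, the column names, and the helper functions left_join and ols_fit. In real work you would more likely reach for an off-the-shelf package, which is exactly the point of the post.

```python
from statistics import mean

# Hypothetical toy data: a customer list and a spend/sales table.
customers = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Cy"},
    {"id": 4, "name": "Dee"},  # no matching row in `activity`
]
activity = [
    {"id": 1, "ad_spend": 1.0, "sales": 3.0},
    {"id": 2, "ad_spend": 2.0, "sales": 5.0},
    {"id": 3, "ad_spend": 3.0, "sales": 7.0},
]


def left_join(left, right, key):
    """Keep every row of `left`; fill right-hand columns with None when
    there is no match. Assumes `key` is unique in `right`."""
    index = {row[key]: row for row in right}
    right_cols = sorted({col for row in right for col in row if col != key})
    joined = []
    for row in left:
        match = index.get(row[key], {})
        merged = dict(row)
        for col in right_cols:
            merged[col] = match.get(col)
        joined.append(merged)
    return joined


def ols_fit(x, y):
    """Simple one-variable least squares: returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return y_bar - slope * x_bar, slope


joined = left_join(customers, activity, "id")

# Unmatched customers carry None, so drop them before fitting.
complete = [row for row in joined if row["sales"] is not None]
intercept, slope = ols_fit(
    [row["ad_spend"] for row in complete],
    [row["sales"] for row in complete],
)
```

The thing to notice is that the left join keeps all four customers, including the one with no activity, and that the analyst has to decide what to do with that unmatched row before running the regression. That judgment call, more than the code, is the skill.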

The DSC article above concludes with a little jab at data science that I agree with wholeheartedly:

"Great data science work is being done in various places by people who go by other names (analyst, software engineer, product head, or just plain old scientist). It is not necessary to be a card-carrying data scientist to do good data science work. Blasphemy it may be to say so, but only time will tell whether the label itself has value, or is only helping create a delusion." 

See also:

What you really need to know to be a data scientist
Super Data Science podcast - credit scoring
How to think like a data scientist to become one
What makes a great data scientist
Are data scientists going extinct
More on data science from actual data scientists
