I came across a tweet by @YvesMulkers recently pointing to the following TechCrunch article:
It introduces an interesting perspective about the future work of data scientists:
Why we will continue to need data scientists:
"you need to intimately understand the problems that can be solved by data science first, which involves a very human process of interacting with the business. Crafting models will always require the subtle translation of real-world phenomena into mathematical expressions. And there is a human element to interpreting and presenting results that would be difficult to automate."
However, they discuss how some of the very technical aspects of data science will become more routine, automated, or modularized:
"Consider how the work of the software engineer has changed fundamentally in the last 20 years. They no longer need to write their own logging module or database access layer or UI widget. And agile methods have brought the “customer” more immediately into the development process. More and more, the job of the engineer is to stitch together higher-level components and collaborate with product managers and UX designers....Similarly, the job of the data scientist will be to take advantage of pre-built components in order to solve a greater variety of business problems. Instead of a few six-month analytics projects that focus on model accuracy and algorithmic niceties, business and analytics teams will be able to work on hundreds of projects that emphasize making concrete changes in the way business is done. And as the software available for analytics becomes more powerful, the result should be a continued steady demand for data scientists, playing a different but more prominent role in the day-to-day working of an organization."
I have always thought about this in terms of the dot-com boom of the 1990s and the role of HTML programmers. This blog is a case in point. Instead of focusing on HTML tags (OK, I know a lot of HTML has since been replaced by JavaScript and other languages I am not aware of, but that is the point), I can focus on content, analysis, and design rather than whatever script is behind this page. Will R and Python coding go the same way? I'm not sure. I'm a tried and true devotee of scripting my analysis work, regardless of the language.
But this discussion makes me think of an earlier article in Deloitte Press related to the role of Analytical Translators:
"Data scientists...can make Hadoop jump through hoops,....dream in SAS or R, ...extract two years of data from a medical device that normally dumps it after 20 minutes (a true request)....A “light quant” is someone who knows something about analytical and data management methods, and who also knows a lot about specific business problems. The value of the role comes, of course, from connecting the two."
And this is what I was getting at to some degree in a recent post:
"Sometimes you might need a PhD computer scientist or Engineer that can meet the strictest of data science thresholds, but lots of times what you really may need is a statistician, econometrician, biometrician, or just a good MBA or business analyst that understands predictive modeling, causal inference, and the basics of a left join."
Implementation of the solution is where the technical work comes into play...and lots of code. Currently for me it's SAS and R. Maybe Python down the road. However, the crux of my work is understanding the context of the problem, and then scoping out the data requirements and methodology for solving it.
So should an experienced data scientist really sweat learning the latest and newest language, or focus more on the analytical thinking and analysis skills that ultimately drive the solution? What about an aspiring data scientist? How many languages and tools should they master? At the margin, is more time spent adding a new tool to the toolbox worth more than experience solving a problem with an older or existing tool? My thinking is: learn enough to solve problems, and then become a more polished problem solver and analytical translator. Languages come and go, but questions begging for solutions, and the demand for people who can provide them regardless of the tools used, are never ending.
Data Scientists vs Algorithms vs Real Solutions to Real Problems http://econometricsense.blogspot.com/2016/05/data-scientists-vs-algorithms-vs.html
Economists as Data Scientists