The death of data science

Data science is dead. Long live data science.

In this article from July 2018, Matt Tucker makes some fascinating arguments that the data scientist of the first decade of the 21st century is dead:

  • While data science is still a hot job, the role is going the way of the travel agent and bowling pinsetter
  • The first decade of the 21st century for data scientists was dominated by the PhDs and academics because the data science focus was on the science associated with "gaining insights from the massive influx of big data"
  • Overkill is a key problem: applying the most complicated methods, devised by the most academically rigorous people, can overshadow or neglect a simpler solution that someone less rigorously trained might find

 I agree. With all of it.

But it's February 2020, and it's time for a slightly different spin on the changing role of the data scientist as well as the skillset of the successful data scientist. I've got some predictions for data science in the 2020s. Before I get there, though, here's an analogy for the trajectory of data science growth that I lived through as an enterprise architect in the late 1990s:

Information technology (IT) departments ruled business. Network admins and software developers would tell the business, during strategic planning sessions, what they would and would not do. I vividly remember a development VP looking across the large conference room table at our executives and product teams, who had just laid out their strategic vision for moving software online, and saying, "We'll think about it." I was shocked by the audacity. But IT had that kind of power at the time. It was sexy, it ran the world, and the specter of the Year 2000 issue made IT the kind of "business partner" you couldn't do without. Or at least you thought you couldn't. IT's grip on business has never lessened (arguably it's only grown stronger), but its role and responsibilities have changed. Today you can't do your job without IT, but much of what dominated the landscape of IT has been made available to just about anyone with an interest. Software may eat the world, but today's chefs certainly don't all graduate from the Culinary Institute of America.

Data science doesn't necessarily dominate business that way, but it's really close. The idea that problems can only be solved by the most academically rigorous and credentialed people, with the most complicated (and "correct") solutions and the most data, can still be found in posts and job descriptions all over the Internet, as though this were the norm and a business without such solutions were failing. That is, with a sincere nod to the age of the term I'm about to drop, poppycock.

In the next decade the interpreter, the liberal arts and music major, and the curious analyst will come to dominate the execution and integration of data science, leveraging tools that make developing those solutions much more accessible. The highly academic will return to the shadows; they will become critical to developing the user interfaces and tools that make data science relevant and available to everyone. "There will be no more data scientist roles, just roles that use data science."

As the decade unwinds I see the role of data science broadening, becoming less and less specific to the science and more focused on solving business problems across roles less defined as data scientist. The complexities of the role will become less about academia and more about curiosity and communication. 

The Interpreter

In the next three to five years a unique leadership role will become popular: the interpreter. Job posts already point to this: data science leadership posts include "interpret and explain business results", "partner with cross-functional teams", and "manage, coach, and develop a team of data scientists". The interpreter, like a bridge, will be technically capable of holding their own with data scientists when discussing algorithmic solutions, and equally capable of explaining how those solutions address and solve the problems of the business. The interpreter must be a communicator with empathy for and an understanding of both the business and the data science, so both sides get what they want from the exchange. These leaders could still be heavily academic but will need to be comfortable with the language of infrastructure, storage, tools, and the integration of data science-driven solutions into wherever the business needs them. This role cannot hyper-focus on accuracy (unless it is critical to success, such as when reviewing an X-ray for tumor growth), documentation, or the benefits and drawbacks of coding languages.

The Liberal Arts and Music Major

Logical thinking--breaking down a problem into its component parts--is critical to programming and data science. Liberal arts and music majors tend to foster these skills. Liberal arts, criticized as it is and nowhere near as sexy as the push for science, technology, engineering, and math (STEM) majors, will see a resurgence in value in the middle of the decade as its graduates become the data scientists, the executors. Music majors and musicians have some of the best skills in pattern recognition and in logical and abstract thinking. Even the harshest critics of liberal arts and music majors have changed their minds about what makes a great programmer or data scientist.

The interpreter, who has spent the first few years of the decade translating between the business and the data science team, will expand into mentorship as the integration of data science into other areas of the organization becomes more common. As the interpreter continues to grow within the organization, so too will the executive leadership role of Chief Innovation Officer, or something similarly named. The Innovation Officer will be a hybrid of the Chief Data Officer, Chief Information Officer, and even the Chief Technology Officer, or will be a very tight partner with the IT department in the development of data storage solutions, infrastructure, and integration across information technology systems as well as departments like finance, operations, and marketing.

The Curious

By the end of the decade the dissolution of the current "data scientist" specialist will be complete. The technologies for data science will be as prevalent as Microsoft Excel, making data science-driven solutions simple to develop for anyone with an interest in building them. This does not mean that everyone will be developing target lists, segmentation, or natural language or image processing. It does mean that those who want to learn what data science is, how it can be done, and how it helps solve the problems at hand will be able to, just as anyone who wants to use a spreadsheet to manage finances, track sales, or calculate simple interest can. Consider this, though: availability does not correlate with capability. How many co-workers do you know who don't leverage the full power of a tool like Excel? Who even make what seem to you like the simplest mistakes, or ask the simplest questions about how to use it? We're not all spreadsheet masters. And we don't have to be.

"There will be no more data scientist roles, just roles that use data science." While that may be true in the coming decade (or decades) the availability and capability of those for whom data science will become an expectation will still be limited to those who take the time to understand its ability to help solve problems within the context of their and their organizational or departmental roles. Much as Excel is a general spreadsheet tool that can be customized to a problem so too will data science continue its generalization as a tool that can and will be specialized to the problem at hand, encapsulating the complexities of data, infrastructure, and interpretation until a far broader set of people can leverage its output.

 Do you agree with my assessment of the shifting role of data science? What am I missing? What are your predictions about the future of data science? What technologies, tools, software or other innovations will speed up or hinder these changes? I always appreciate comments here as well as at sam.johnson@intouchsol.com.

That was excellent. As a liberal arts graduate I may be biased, but this was an informative discussion that wouldn't boil down to a pithy tweet, which is what I often get from LinkedIn publications. As an applied statistics PhD and recent (meaning late) adopter of open-source analysis tools and their application to business, I appreciated your experience in the late 90's tech bubble as compared to the rising analytics hype-cycle and jargon dump we find ourselves in today.