Demand for Data Engineers Exceeds Data Scientists – An Analysis

Dr M Maruf Hossain, PhD, GAICD

Published Jan 26, 2021

Multiple recent recruitment surveys have revealed that demand for data engineers now exceeds the earlier demand for data scientists. The Dice 2020 Tech Job Report said data engineer was the fastest-growing job in technology, with 50% year-over-year growth in the number of open positions. Many mentees have asked me how I perceive this shift in demand, and which area they should pursue.

I see a course correction happening in the industry. Most organisations that hired data scientists to gain a competitive advantage from data and advanced analytics took the role out of context, driven by the hype created by the media. Many of them hired data scientists and then handed them the work of data engineers. In many other organisations, data scientists face the challenge of having no data foundations on which to build analytics.

Data scientists spend months cleaning data and generating insights. However, to reproduce the same insight month after month, they have to repeat the same cleaning and analysis on newly arrived data. The time this takes often renders the insight useless, because it arrives after the point at which it was needed for an important business decision. Ergo, businesses have realised the importance of the role of data engineers.
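To make the point concrete, here is a minimal sketch of what it means to turn a one-off cleaning exercise into a repeatable step. Everything in it is hypothetical and for illustration only: the field names (`region`, `amount`), the cleaning rules and the sample data are assumptions, not taken from any real system.

```python
from statistics import mean


def clean_records(raw_rows):
    """Drop rows with a missing amount and normalise region names."""
    cleaned = []
    for row in raw_rows:
        amount = row.get("amount")
        if amount in (None, ""):
            continue  # skip incomplete rows
        cleaned.append({
            "region": row["region"].strip().lower(),
            "amount": float(amount),
        })
    return cleaned


def monthly_insight(raw_rows):
    """Return the average amount per region for one month's raw data."""
    by_region = {}
    for row in clean_records(raw_rows):
        by_region.setdefault(row["region"], []).append(row["amount"])
    return {region: mean(values) for region, values in by_region.items()}


# Hypothetical raw extracts for two consecutive months.
january = [
    {"region": " North ", "amount": "100"},
    {"region": "north", "amount": "300"},
    {"region": "South", "amount": ""},
]
february = [
    {"region": "NORTH", "amount": "50"},
    {"region": "south ", "amount": "70"},
]

# The same cleaning logic runs unchanged on each new month's data.
print(monthly_insight(january))
print(monthly_insight(february))
```

Once the cleaning rules live in one reusable function, each new month's data goes through the same code path instead of being cleaned again by hand, which is the kind of foundation the article argues data engineers provide.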

For many years, data scientists were thought to be the only role dealing with data. Data management and governance, data modelling and data engineering were all assumed to fall within the data scientist’s remit, which they do not. Fortunately for data scientists, this misconception is now clearing. To understand the difference between the work of data engineers, data scientists and other associated roles, it is necessary to understand the data science value chain.

Numerous blog posts claim that these roles have overlapping responsibilities, but that is far from the truth. Data engineers focus on building the infrastructure and architecture for data generation, whereas data scientists focus on applying advanced mathematics and statistical analysis to that data to discover actionable insights. This positions data engineering as an IT role, while data science is predominantly a business role. And like all business roles that depend on IT to enable the technologies they need to do their job, data scientists depend on data engineers to provide the data and technology needed to generate insights.

Given the growing demand for data engineers, it is easy to see that organisations are now looking to recruit them directly.

Modern data engineers shouldn’t be writing ETLs

If data engineers are asked to build pipelines, they will think their job is to build pipelines, and will see tested off-the-shelf tools as threats to their existence instead of tremendous force multipliers. They’ll find reasons why off-the-shelf pipelines won’t suit the organisation’s very custom data needs, and reasons why analysts shouldn’t be building their own data transformations. They’ll write code that is fragile, hard to maintain, and non-performant. And the organisation will come to rely on this code, because it sits underneath everything else the data team does.

Avoid this situation like the plague. The pace of innovation on the data team will plummet, and the organisation will spend all of its time thinking about infrastructure issues that don’t generate revenue for the business.

Data engineers are still a critical part of any high-functioning data team. Instead of building ingestion pipelines that are available off the shelf and implementing SQL-based data transformations, here’s what data engineers should really be focused on:

managing and optimising data infrastructure,
building and maintaining custom ingestion pipelines,
supporting data team resources with design and performance optimisation, and
building non-SQL transformation pipelines.

Organisations need more engineers than data scientists

In a 2016 blog post, Jeff Magnusson described some fundamental friction between data scientists, data engineers and infrastructure engineers.

Data scientists (the thinkers) are often frustrated that engineers are slow to put their ideas into production and that work cycles, road maps, and motivations are not aligned. By the time version 1 of their ideas are put into an A/B test, they already have versions 2 and 3 queued up. Their frustration is completely justified.

Data engineers (the doers) are often frustrated that data scientists produce inefficient and poorly written code, have little consideration for the maintenance cost of productionising ideas, demand unrealistic features that skew implementation effort for little gain… The list goes on, but you get the point.

Infrastructure engineers (the plumbers) get frustrated with everyone for overloading the clusters and filling up disk space. They are kept at arm’s length from the scientists and engineers, which means they never gain a solid context into how the infrastructure is being used, or the business and technical problems that it needs to be used to solve. This makes them feel powerless to improve the situation. Instead, they react by making the infrastructure more restrictive. In turn, everyone becomes frustrated with them.

Because data scientists need time and space to think about novel solutions, they need to be freed from monotonous engineering work. High-value insight cannot be generated without solid data foundations. That’s why organisations need more data engineers than data scientists to mine value from their data.

Going forward

The future doesn’t stop here. Along with data engineers, demand for machine learning engineers and automation engineers will rise too. Machine learning engineers will be in greater demand, if they are not already, both to address data engineers’ frustration with data scientists and to implement the ideas data scientists develop. Just as in the early days there were many misconceptions about the definition and responsibilities of data scientists, the responsibilities of machine learning engineers are also quite clouded. While some think of them as a hybrid of data engineer and data scientist, I place them at a different end of the data science value chain and consider them to have a very different skillset from both.

Although automation engineers have been around for a long time, the rise of data products means their skills will be in high demand. They will need to learn how to integrate data products into their solutions, but their core skills will remain mostly the same.

Concluding remarks

Even though multiple engineering roles will be in higher demand, data scientists and data storytellers will stay at the centre of the data revolution, as they’re positioned closer to the business and relate more directly to business needs.

When recruiting for data-related positions, it is important for all parties to establish what the position actually needs and where the skills and interests of the candidate lie. If machine learning engineer or automation engineer roles are filled with data engineers, or vice versa, it is highly likely that people will keep leaving, and recruiters will always be kept busy looking for the right candidates.


Abhishek Nayak 5y

Well said. A data foundation is the key to moving towards analytics. Management often needs people who can give them a roadmap to move towards analytics, a roadmap to convert legacy systems into strategic ones. Having proper infrastructure to address these needs requires data engineers, architects and others who can set up the infrastructure and automate regular data cleaning, thus reducing turnaround time. Once you have proper infrastructure in place, you can then use it for machine learning, AI, etc. Management has realized that with proper data and analytics they can improve profits, make better decisions, etc. Companies are looking for people who can guide them on this journey.
