The Evolving Data Engineer: Moving Further Up the Value Chain
There’s a growing narrative that AI agents will significantly reduce the need for engineers.
It goes something like this: if AI can generate code, build pipelines, write tests, automate documentation and troubleshoot issues, then surely organisations will need fewer engineers.
I think that view misses what’s actually happening.
Yes, AI is rapidly reducing the amount of manual engineering effort required to produce code. Tasks that once took days can now be completed in hours. Engineers can already use AI tools to generate ingestion pipelines, write SQL transformations, create infrastructure templates, produce test cases and accelerate migration work. That capability will only improve.
While AI is making code creation faster, it’s also exposing a much bigger reality: writing code was never the highest-value part of engineering.
Data engineering teams continue to spend a significant portion of their time doing repetitive implementation work. Writing ingestion pipelines, maintaining integrations, troubleshooting failed jobs, fixing broken transformations and delivering one-off requests consume much of their engineering capacity.
In many organisations, engineers are trapped in a cycle of constant delivery pressure, where success is measured by how quickly something can get built. This model made sense when manual coding was the primary bottleneck.
It increasingly won’t.
As AI agents take on more of the repetitive engineering work, engineers will naturally move further up the value chain. Less time will be spent manually writing code, and more time will be spent ensuring what gets built is actually trustworthy, scalable and fit for enterprise use.
This is where the role becomes far more interesting.
AI can generate code quickly, but enterprise delivery is rarely constrained by code alone. The real complexity often sits elsewhere: security requirements, compliance obligations, platform standards, operational support models, data ownership, metadata requirements, access controls and cost management.
AI can generate a pipeline, but it can’t inherently determine whether that pipeline should exist in the first place, whether it aligns to platform standards, whether it introduces operational risk, whether the right data owners are assigned, or whether downstream consumers can trust the output.
Those decisions still require engineering judgement, and that judgement is becoming more valuable.
The future data engineer will spend more time validating AI-generated outputs, ensuring architecture patterns are appropriate, testing quality and performance, reviewing security controls and confirming that solutions align to enterprise standards.
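To make that concrete, here is a minimal sketch of what one piece of that validation work could look like in practice: a Python script that runs automated assurance checks over the output of an AI-generated pipeline before it is trusted downstream. The DuckDB warehouse, the orders_raw and orders_clean table names, the column contract and the thresholds are all hypothetical, illustrative assumptions rather than a prescribed standard.

```python
# Hypothetical assurance checks over the output of an AI-generated pipeline.
# The database file, table names, schema and thresholds are illustrative only.
import duckdb

EXPECTED_COLUMNS = {"customer_id", "order_date", "order_total"}

def validate_orders_output(db_path: str = "warehouse.duckdb") -> None:
    con = duckdb.connect(db_path, read_only=True)

    # 1. Contract check: the pipeline must produce the agreed output columns.
    cols = {
        row[0]
        for row in con.execute(
            "SELECT column_name FROM information_schema.columns "
            "WHERE table_name = 'orders_clean'"
        ).fetchall()
    }
    assert EXPECTED_COLUMNS <= cols, f"Missing columns: {EXPECTED_COLUMNS - cols}"

    # 2. Quality check: key identifiers must never be null.
    nulls = con.execute(
        "SELECT COUNT(*) FROM orders_clean WHERE customer_id IS NULL"
    ).fetchone()[0]
    assert nulls == 0, f"{nulls} rows have a null customer_id"

    # 3. Sanity check: the transformation should not silently inflate row counts.
    source = con.execute("SELECT COUNT(*) FROM orders_raw").fetchone()[0]
    target = con.execute("SELECT COUNT(*) FROM orders_clean").fetchone()[0]
    assert target <= source, "Clean table has more rows than the raw source"

if __name__ == "__main__":
    validate_orders_output()
    print("All assurance checks passed")
```

Checks like these can sit in a CI gate so that no pipeline, human-written or agent-generated, reaches production without passing them, which is exactly the shift from producing code to assuring it.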
They will increasingly become responsible for making sure data products are not just technically functional, but governed, supportable and trusted. This shift becomes even more pronounced in enterprise environments.
I believe this is why we’ll see the role of the data engineer continue to evolve beyond traditional definitions. Titles such as Data Product Engineer, Data Platform Engineer, Data Reliability Engineer and Data Assurance Engineer will become increasingly common because they better reflect where the work is moving.
These roles will still require strong technical depth because engineers will absolutely continue coding. But their value will increasingly come from assurance rather than pure production.
They will review agent-generated code rather than writing everything manually. They will focus on observability rather than reactive troubleshooting. They will spend more time improving platform reliability, enforcing standards, reducing risk and accelerating trusted delivery.
In many ways, engineering is becoming more strategic, not less.
The biggest risk organisations face is not that AI will replace engineers. It’s that AI will be used to produce poor engineering outcomes faster.
Without strong controls, AI can accelerate technical debt, inconsistent patterns, weak testing, security issues and fragile pipelines at scale. Speed without governance simply creates larger problems later.
The organisations that succeed will be the ones that recognise this shift early. They will invest in engineers who can combine technical depth with architectural thinking, testing discipline, governance awareness, operational maturity and AI fluency.
Those engineers will be incredibly valuable.
They’ll still write code.
They just won’t be defined by it.
The real value of data engineers in an AI world will come from ensuring organisations can build trusted data and AI products faster, without compromising quality, governance or operational resilience.
And in my view, that’s a far more important role than many people realise.
Great points. The shift you're describing requires intentional investment in the right skills — not just in tooling, but in the engineers who know how to govern what AI produces. Companies that treat this as a cost-cutting exercise first will learn the hard way that speed without assurance is just faster failure.