The consequences of implementing strictly Shift-left architectures

TL;DR:

The Core Concept "Shift Left" moves the responsibility for data quality and governance from downstream analytics teams (the "cleanup crew") to the upstream source systems where data is created. Instead of fixing data later, it is treated as a high-quality product at the moment of its birth.

Key Architectural Changes

• The "Bronze" Layer Fades: Raw data dumps become obsolete. Data enters the ecosystem already validated against a contract, effectively starting at the "Silver" layer.

• Decentralised Semantics: Business logic is hard-coded at the source by domain experts, creating a "Corporate Ontology" where every system publishes its own reliable definitions.

• Trustworthy Deltas: Backend systems must provide reliable mechanisms to track changes (deltas) on complex business entities to support large datasets.

Organisational Shifts

• New Roles: Backend developers transition into "Data Providers," responsible for schema stability and fixing data bugs at the source.

• Data Engineers: Their focus shifts from building ingestion pipelines to managing governance tools and architecture.

Implementation Strategies range from adopting vendor standards, such as SAP's "Business Data Fabric," to custom "hybrid" approaches that balance upstream investment with legacy system management.


Over the last year or so, the dialogue around Shift Left architectures as an enabler for Data Products has intensified across the industry. What was once a niche architectural preference in the modern data stack is now increasingly a topic for enterprises aiming to scale their data operations. For decades, most data architecture followed a "Shift Right" pattern, built mostly on table-oriented replication, in which backend applications were not held responsible for defining and implementing extraction logic, and centralized data teams were left to act as "data janitors": cleaning, modeling, and fixing data in a downstream lakehouse.

The idea of Shift Left is an architectural evolution for modern data management and platforms: data is treated as a governed, high-quality extract at the level of a curated, well-defined business entity, or even as a full Data Product, at the moment of its birth. By moving responsibility for quality and semantic definition to the "left", orienting it around real business entities and closer to where the data is actually generated, the burden of data preparation is addressed where the context is strongest. Under this model, the downstream "cleanup" cycle becomes unnecessary, or at least significantly streamlined.

Several key benefits have fuelled this recent momentum:

  • Faster Time-to-Insight: Data is born ready for consumption, removing the weeks or months spent on logical (re-)definition and build, as well as on creating ingestion and cleaning pipelines.
  • Semantic Fidelity: Business logic is defined by those who understand the context of the data best: the source domain experts.
  • Improved Data Quality: Automated validation at the source prevents structural and logical "garbage" from entering the analytical ecosystem (a minimal sketch follows below).
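
To make the last point concrete, here is a minimal sketch of what contract validation at the source could look like, in plain Python. The "Material" entity, its fields, and the contract rules are illustrative assumptions, not any vendor's standard:

```python
from typing import Any

# Illustrative data contract for a "Material" business entity.
# Field names and rules are assumptions, not an SAP or vendor standard.
MATERIAL_CONTRACT = {
    "material_id": {"type": str, "required": True},
    "description": {"type": str, "required": True},
    "base_unit":   {"type": str, "required": True},
    "net_weight":  {"type": float, "required": False},
}

def validate(record: dict[str, Any], contract: dict) -> list[str]:
    """Return a list of contract violations; an empty list means valid."""
    errors = []
    for name, rule in contract.items():
        if name not in record or record[name] is None:
            if rule["required"]:
                errors.append(f"missing required field: {name}")
            continue
        if not isinstance(record[name], rule["type"]):
            errors.append(f"{name}: expected {rule['type'].__name__}, "
                          f"got {type(record[name]).__name__}")
    return errors

def publish(record: dict[str, Any]) -> None:
    """Source-side gate: a record violating the contract never leaves the system."""
    violations = validate(record, MATERIAL_CONTRACT)
    if violations:
        # In a real backend this would fail the transaction or raise an alert.
        raise ValueError(f"contract violation: {violations}")
    print(f"published: {record['material_id']}")

publish({"material_id": "MAT-001", "description": "Hex bolt M8", "base_unit": "EA"})
```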

SAP has made Data Products a central pillar of its analytics and data management portfolio, particularly through SAP Business Data Fabric. The promise is that enterprises can quickly leverage pre-defined, semantically rich products directly from the source, using the capabilities and rich semantics of the so-called CDS views.

1. Previously in this theatre: Extractors as the Original Data Products

While the term "Data Product" feels modern, the principle has existed for decades within the SAP ecosystem. Standard Extractors (in SAP BW) were the original proto-data products. They abstracted technical table and join complexities, managed reliable deltas, and preserved business logic (semantics) during extraction. Although a vendor-specific approach, they paved the way for SAP BW as the central DW solution for SAP-centric data environments. Can 0MATERIAL also be seen as a data product, with (most of) its extraction logic encapsulated in the 0MATERIAL* extractors?

The Shift Left movement is essentially an extension of this principle. We can now imagine, also for the "modern data stack", that these "extractors" move from proprietary silos into a world where every system, SAP or otherwise, is expected to publish its data via an open, governed data contract at the business-entity level.

2. The System-agnostic Catalog as a consequence

Since the definition of a data product thus resides in the source system, the Data Catalog in a Shift Left world should no longer be perceived as a component exclusive to the analytics environment. It becomes a globally valid, cross-system registry, anchored and nurtured where the business logic actually lives: applied consistently, that means in the ERP and backend systems.

In this model, the backend systems act as the Semantic Anchor. Definitions, business rules, and schemas are registered at the source. The Enterprise Catalog then acts as a discovery layer, allowing the analytical team to "subscribe" to these definitions rather than attempting to recreate them through guesswork.
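
As a purely hypothetical illustration of this subscribe model, the sketch below registers a source-owned definition in a cross-system catalog. The EnterpriseCatalog class and its register/subscribe methods are invented for this example and do not correspond to any specific product API:

```python
from dataclasses import dataclass, field

@dataclass
class ProductDefinition:
    """Semantic definition owned and registered by the source system."""
    name: str
    owner_system: str
    schema: dict                      # field name -> type name
    business_rules: list[str] = field(default_factory=list)

class EnterpriseCatalog:
    """Cross-system registry: sources register, consumers subscribe."""
    def __init__(self) -> None:
        self._registry: dict[str, ProductDefinition] = {}

    def register(self, definition: ProductDefinition) -> None:
        # Only the owning backend registers or updates its definition.
        self._registry[definition.name] = definition

    def subscribe(self, name: str) -> ProductDefinition:
        # Consumers receive the source-owned definition instead of
        # recreating it through guesswork.
        return self._registry[name]

catalog = EnterpriseCatalog()
catalog.register(ProductDefinition(
    name="Material",
    owner_system="ERP-EU",
    schema={"material_id": "str", "description": "str", "base_unit": "str"},
    business_rules=["base_unit must be a valid ISO unit of measure"],
))

definition = catalog.subscribe("Material")
print(definition.owner_system, list(definition.schema))
```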

While a Corporate Knowledge Graph can technically be built using traditional "Shift Right" methods, by retroactively mapping data in a central hub, the Shift Left approach acts as a fundamental accelerator. By mapping business entities and their relationships directly at the source, the organization moves toward a "Data Network" or Corporate Ontology. Again, SAP is moving in this direction.

3. Effects on the Medallion Architecture

A consistent Shift Left implementation matures the traditional Medallion Architecture (Bronze -> Silver -> Gold) into a Readiness Hierarchy:

  • The Fading of Bronze: The raw, messy landing zone becomes obsolete. Because data is validated against a contract at the source, it bypasses the "dump" phase.
  • Silver as the "Data Product Entry": The Silver layer is no longer a processing stage for data engineers. It is the published interface of the backend team—a collection of high-quality, documented and understandable Data Products.
  • Gold as the space of synergy and joint interpretation: Gold remains the space for interpretation and combination. Here, products from different domains (e.g., Finance and Sales) are joined to create holistic, cross-functional metrics, and analytical applications of many kinds are brought to life (a minimal join sketch follows below).
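
The join sketch referenced above: a minimal pandas example combining two illustrative Silver products (a sales and a finance product with assumed columns) into a cross-functional Gold metric:

```python
import pandas as pd

# Two published "Silver" data products, already contract-validated at the
# source. Product names and columns are illustrative assumptions.
sales = pd.DataFrame({
    "material_id": ["MAT-001", "MAT-002"],
    "units_sold": [120, 45],
})
finance = pd.DataFrame({
    "material_id": ["MAT-001", "MAT-002"],
    "unit_cost": [2.50, 7.80],
})

# Gold layer: interpretation and combination across domains.
gold = sales.merge(finance, on="material_id")
gold["cost_of_goods_sold"] = gold["units_sold"] * gold["unit_cost"]
print(gold)
```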

4. Shift Left architectures also imply a significant shift in roles and responsibilities

The most significant challenge in consistently implementing a shift-left architecture is the migration of accountability. This requires a fundamental change in team dynamics.


Specifically for SAP environments, it is important to note that SAP has effectively taken on this responsibility. As the owner and architect of the underlying backend application logic, SAP is increasingly expected to deliver corresponding data products and semantic definitions directly. This shifts the burden of initial structural definition from the customer's internal data team back to the software vendor, who must deliver accordingly to support the modern data fabric.

However, for customer-owned data producers (custom-built applications and microservices), the consequences are more demanding. These teams can no longer view data as a byproduct of their application. They must take on the role of a data provider, which implies:

  • Increased risk of Technical Debt: Development teams must build and maintain the infrastructure for data contracts and delta determination.
  • Skill Gap Challenges: Backend developers must acquire data modeling and semantic layering skills that were previously the domain of data engineers.
  • Operational Overhead: Data quality incidents are no longer "downstream problems" but production bugs that must be resolved by the source team.

A major strategic advantage of this shift is that Data Quality (DQ) and DQ fulfillment metrics are now located exactly where the data is born. By addressing DQ problems at the source, organizations can identify and resolve the root causes of errors in real-time, rather than applying reactive patches downstream. This proximity inherently leads to higher data quality and more accurate business reporting.
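
As a minimal sketch, assuming a simple completeness check per contract field with an illustrative threshold, source-side DQ metrics could look like this:

```python
# Source-side data quality metrics: completeness per contract field,
# computed where the data is born. Records and threshold are illustrative.
records = [
    {"material_id": "MAT-001", "description": "Hex bolt M8", "net_weight": 0.012},
    {"material_id": "MAT-002", "description": None,          "net_weight": 0.200},
]

def completeness(rows: list[dict], attr: str) -> float:
    """Share of records where the attribute is populated."""
    filled = sum(1 for r in rows if r.get(attr) is not None)
    return filled / len(rows)

for attr in ("material_id", "description", "net_weight"):
    score = completeness(records, attr)
    # An alert here goes to the source team as a production bug,
    # not to a downstream cleanup crew.
    status = "OK" if score >= 0.95 else "ALERT"
    print(f"{attr}: {score:.0%} complete [{status}]")
```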

How other software vendors will act, and how quickly internal development teams can pivot to this new standard of accountability, remains to be seen.

5. A critical technical requirement: Trustworthy Deltas

For Shift Left to sustain larger datasets, it relies on reliable delta determination, often involving cross-table dependencies in rather complex relations. Table-oriented delta mechanisms introduce complexity on the receiver side when it comes to rebuilding the correct delta image of the business entity. Hence, the backend must provide a stream of changes at the business-entity level, not just a dump of data. Whether through transaction-like Log-Based CDC or the Transactional Outbox Pattern, the backend must guarantee that updates to complex, multi-table business objects are captured correctly. SAP's multi-table CDC mechanism is right on track, even if more basic techniques could become necessary for very large scenarios.
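
To illustrate the Transactional Outbox Pattern named above, here is a minimal, runnable sketch using SQLite. The table layout, the entity, and the relay step are illustrative assumptions, not a description of any vendor's CDC mechanism:

```python
import json
import sqlite3

# Transactional Outbox Pattern: the business update and the change event
# are committed in ONE transaction, so the published delta can never
# diverge from the entity state.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE material (material_id TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                         entity TEXT, payload TEXT);
""")

def update_material(material_id: str, description: str) -> None:
    with db:  # one atomic transaction for the state change and the event
        db.execute(
            "INSERT OR REPLACE INTO material VALUES (?, ?)",
            (material_id, description),
        )
        db.execute(
            "INSERT INTO outbox (entity, payload) VALUES (?, ?)",
            ("Material", json.dumps({"material_id": material_id,
                                     "description": description})),
        )

update_material("MAT-001", "Hex bolt M8, zinc plated")

# A relay process would read the outbox in order and publish the changes
# as entity-level deltas to downstream consumers.
for seq, entity, payload in db.execute("SELECT * FROM outbox ORDER BY seq"):
    print(seq, entity, payload)
```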

6. Dealing with Fragmented Sources

A common real-world problem occurs when multiple systems describe the same global entity. For example, three different regional SAP instances may each manage their own "Material" data. In a strict Shift-Left model, you cannot force a single backend to own the "Global Material" if it does not actually manage the other regions' data. Instead of defaulting to a centralized "Gold Layer" cleanup, consider these alternatives:

  • Regional Responsibility: Each regional system stays responsible for its own data product (the "Silver" layer). North America and Europe each publish their own high-quality Material products according to a shared contract.
  • Federated Identity Resolution: The challenge of unification is addressed by establishing a shared Semantic Standard across contracts. By mandating that each regional source provides a common global identifier (e.g., a GTIN or an MDM-supplied Global ID) as a mandatory field in its data contract, the technical "merger" is downgraded from a complex transformation to a simple join or union (see the sketch after this list). This virtualization allows for a global view where source-specific attributes are treated as extensions of a core, harmonized entity, preserving local ownership while enabling global interpretability.
  • MDM Products: A dedicated Master Data Management (MDM) team publishes a "Linkage Product." Consumers then "subscribe" to both the regional Data Product and the Linkage Product to build their own global view, keeping ownership decentralized. This is a solution approach for many customers with distributed but logically equivalent master data objects.
  • Analytical Extensions (ML-driven Attributes): Not all master data attributes are generated in the operational source. Advanced attributes like "Customer Lifetime Value" or "Churn Risk Clustering" are often the result of analytical processes or ML models. These should be treated as Data Product Extensions: they exist as specialized "Sidecar Products" that share the same global identifier. Consumers combine the operational "Silver" product with the analytical "Extension" product via a join. This is not contradictory to so-called "Closed loop scenarios"; rather, it provides a clean architectural separation. While the analytical insight is consumed as a product, the actual write-back of results into the ERP remains an operational task, ensuring the semantic definition is enriched without burdening the operational backend with analytical overhead.
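
The sketch referenced in the Federated Identity Resolution item above: assuming each regional contract mandates a shared global identifier (here a GTIN, an illustrative choice), the global view reduces to a simple union of the regional products, shown with pandas:

```python
import pandas as pd

# Two regional "Material" data products; columns and values are illustrative.
# Both contracts mandate the same global identifier ("gtin").
material_na = pd.DataFrame({
    "gtin": ["04012345678901", "04098765432109"],
    "local_id": ["NA-100", "NA-200"],
    "plant": ["Chicago", "Dallas"],
})
material_eu = pd.DataFrame({
    "gtin": ["04012345678901"],
    "local_id": ["EU-7731"],
    "plant": ["Hamburg"],
})

# The "merger" is a plain union: no heavy central transformation needed.
global_view = pd.concat(
    [material_na.assign(region="NA"), material_eu.assign(region="EU")],
    ignore_index=True,
)

# One GTIN can appear in several regions: local attributes stay regional,
# while the shared identifier enables cross-region interpretation.
print(global_view.sort_values("gtin"))
```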

7. Strategic Alignment: Adapting to Pre-Defined Semantics

SAP has taken the lead by pre-defining data products in its SAP Business Data Fabric. While this provides instant maturity (starting at "Silver"), it implies that SAP provides the core business semantics and definitions. For the customer, the choice is no longer about building the definition from scratch, but about accepting and adapting to these standardized SAP semantics. This shifts the organizational responsibility from "designing the truth" to "adhering to the standard" provided by the software vendor.

Conclusion

Consistently applied, "Shift Left" is a clear organizational direction, but it is rarely a binary switch. It demands that backend teams stop viewing data as a byproduct and start treating it as a first-order deliverable. For many organizations, the trade-off remains a strategic choice: either invest in upstream discipline and source-side engineering to achieve high-velocity scalability, or rely on the proven, defensive strengths of a traditional Shift Right model to manage fragmented legacy landscapes.

The path forward will usually be a hybrid one, shaped by the constraints of each environment. While Shift Left provides the semantic foundation and quality required for AI-ready platforms in the place where data is generated, it succeeds only when the burden of stewardship is either automated by vendors like SAP or embraced as a core technical competency by internal development teams. Ultimately, the "chaos" of modern data is not solved by shifting it from one team to another, but by establishing a shared culture of accountability across the entire data value chain.

 

Great analysis, Philipp, it sparked a few additional thoughts. I’d argue that Shift Left is less an architectural evolution and more a return to a core principle of integrated systems, especially in SAP-centric landscapes. SAP already embodied this with R/2 and R/3: a unified model where data is born semantically clean. External data warehousing largely emerged as a workaround for technical limitations, rather than as an architectural ideal. In that context, SAP Business Data Fabric’s real differentiator isn’t just the platform, but the enterprise model SAP has built over decades across industries. The same applies to Knowledge Graphs: in SAP-aligned environments, the corporate ontology already exists, the KG primarily makes it explicit and accessible. That said, in heterogeneous landscapes with multiple ERPs or vendors, Data Products become essential as semantic contracts to compensate for the lack of a single integrated model. The key is recognising where integration already exists — and leveraging it.
