Methodological Framework for Knowledge Graph Development
Overcoming Pipeline Approach Limitations through Conceptual-Operational Integration
Executive Summary
Traditional pipeline approaches to knowledge graph development (Controlled Vocabularies → Standard Metadata → Taxonomies → Thesauri → Ontologies → Knowledge Graphs) are effective when guided by deep understanding of the technologies and explicit governance of their underlying principles.
However, when the rules connecting each stage remain implicit rather than formalized, early conceptual choices can become progressively more difficult to examine and adjust as they cascade into downstream artifacts.
This framework strengthens pipeline approaches by making explicit the conceptual foundations, derivation rules, and discipline-specific principles that, when formalized, enable reliable, scalable, and auditable knowledge graph development:
rigorous conceptual modeling integrated with iterative operational materialization, governed by explicit derivation rules, and managed through a DCAT-based artifact repository that remains semantically and structurally traceable.
1. Problem Statement: Tacit Risks in Pipeline Approaches
1.1 Nature of the Risk
Pipeline approaches (Controlled Vocabularies → Standard Metadata → Taxonomies → Thesauri → Ontologies → Knowledge Graphs) are effective methodological frameworks when applied with deep understanding of the technologies involved and explicit governance of their underlying principles. However, when deployed without this awareness—or when their implicit rules and semantic assumptions remain tacit rather than formalized—several risks emerge:
1.2 Potential Issues When Implicit Rules Remain Unexamined
Tacit Assumptions About Conceptual Alignment
Semantic Slippage Across Layers
Discipline-Specific Tacit Knowledge
Reversibility and Iteration Complexity
Risk Amplification at Scale
1.3 The Framework's Role
Rather than rejecting pipeline approaches, this framework makes explicit what effective pipeline practice requires: the conceptual foundations, derivation rules, and governance principles that—when tacit—create risks but—when formalized—make pipelines powerful and reliable.
2. Proposed Framework: Conceptual-Operational Integration
2.1 Core Principles
Principle 1: Conceptual Foundation First
Rigorous philosophical and domain-specific conceptualization precedes artifact creation. This is not a preliminary phase but an ongoing practice that remains active throughout the lifecycle.
Principle 2: Iterative Maturation
The framework embraces iteration: conceptual models are refined through cycles of formalization, implementation, confrontation with reality, and conceptual re-elaboration. Maturity is achieved progressively, not presumed.
Principle 3: Governed Derivation
Every downstream artifact is derived from upstream conceptual choices through explicit, auditable derivation rules. These rules are not implicit conventions but formalized relationships that can be verified, traced, and—when necessary—reversed.
Principle 4: Bidirectional Traceability
The system maintains mappings between artifacts and their conceptual foundations in both directions: from conceptualization to materialization (forward derivation) and from artifacts back to their justifications (reverse tracing).
Principle 5: Semantic and Structural Consistency
At every level, artifacts are validated for consistency both with their conceptual foundations and with each other. Inconsistencies trigger re-examination rather than being papered over with additional formalization.
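To make Principles 3 and 4 concrete, the following minimal Python sketch (all identifiers hypothetical) shows one way a bidirectional traceability index could pair forward derivation with reverse tracing:

from collections import defaultdict

class TraceabilityIndex:
    """Bidirectional map between conceptual decisions and derived artifacts."""

    def __init__(self):
        self._forward = defaultdict(set)  # decision id -> derived artifact ids
        self._reverse = defaultdict(set)  # artifact id -> justifying decision ids

    def record_derivation(self, decision_id: str, artifact_id: str) -> None:
        """Forward derivation: a conceptual choice materialized as an artifact."""
        self._forward[decision_id].add(artifact_id)
        self._reverse[artifact_id].add(decision_id)

    def derived_from(self, decision_id: str) -> set:
        """Forward tracing: which artifacts does this decision justify?"""
        return self._forward[decision_id]

    def justified_by(self, artifact_id: str) -> set:
        """Reverse tracing: which decisions justify this artifact?"""
        return self._reverse[artifact_id]

trace = TraceabilityIndex()
trace.record_derivation("decision:vehicle-scope", "vocab:automobile")
print(trace.justified_by("vocab:automobile"))  # {'decision:vehicle-scope'}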
2.2 Operational Architecture
┌──────────────────────────────────────────────────────────┐
│ CONCEPTUAL FOUNDATION (Iterative Practice)               │
│ - Domain ontology (philosophical and domain-specific)    │
│ - Conceptual decisions and their justifications          │
│ - Semantic clarifications and boundary definitions       │
│ - Explicit acknowledgment of limitations and ambiguities │
└─────────────────────────────┬────────────────────────────┘
                              │
                ┌─────────────┴─────────────┐
                │   DERIVATION GOVERNANCE   │
                │ - Derivation rules        │
                │ - Transformation rules    │
                │ - Consistency validators  │
                └─────────────┬─────────────┘
                              │
                ┌─────────────┴───────────────────────┐
                │ DCAT-BASED ARTIFACT REPOSITORY      │
                │ (Managed by Methodological Rules)   │
                │                                     │
                │ - Controlled Vocabularies           │
                │ - Standard Metadata Schemas         │
                │ - Taxonomies                        │
                │ - Thesauri                          │
                │ - Ontologies (RDF, OWL)             │
                │ - Knowledge Graphs (RDF, Property   │
                │   Graphs, Embeddings)               │
                │                                     │
                │ With explicit versioning, lineage,  │
                │ and derivation provenance           │
                └─────────────┬───────────────────────┘
                              │
                ┌─────────────┴─────────────┐
                │     ACTIVATION LAYER      │
                │ - Query governance        │
                │ - Consistency checking    │
                │ - Change impact analysis  │
                │ - Iterative refinement    │
                └───────────────────────────┘
3. Detailed Components
3.1 Conceptual Foundation
Definition: Explicit, documented understanding of what is being modeled and why.
Comprises:
Practice:
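As one illustration of what such documentation could look like in practice, here is a minimal sketch of a conceptual decision record; the field names are illustrative assumptions, not a prescribed schema:

from dataclasses import dataclass, field

@dataclass
class ConceptualDecision:
    """A documented conceptual choice: what is modeled, why, within what limits."""
    identifier: str
    question: str                 # what is being modeled
    resolution: str               # the choice made
    justification: str            # why this choice rather than alternatives
    boundaries: list = field(default_factory=list)          # explicit scope limits
    known_ambiguities: list = field(default_factory=list)   # acknowledged gaps

decision = ConceptualDecision(
    identifier="decision:automobile-vs-vehicle",
    question="Is 'automobile' a subclass of 'vehicle' or a role played by one?",
    resolution="Subclass: an automobile is a kind of vehicle.",
    justification="Domain experts treat the distinction as rigid, not contextual.",
    boundaries=["Covers road vehicles only"],
    known_ambiguities=["Autonomous shuttles not yet classified"],
)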
3.2 Derivation Governance
Definition: Formalized rules that define how upstream conceptual choices generate downstream artifacts.
Illustrative Examples of Derivation Rules:
Rule 1: Vocabulary Derivation
Rule 2: Ontology-from-Taxonomy Derivation
Rule 3: Knowledge Graph Population Consistency
Implementation:
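Since the rules are only named above, the following sketch gives one plausible reading of Rule 1 as an executable check; the artifact shapes and identifiers are assumptions made for illustration:

def check_vocabulary_derivation(vocabulary: dict, ontology_classes: set) -> list:
    """One plausible reading of Rule 1: every controlled-vocabulary term
    must trace to a known ontology class. Returns violation messages."""
    violations = []
    for term, source_class in vocabulary.items():
        if source_class is None:
            violations.append(f"'{term}' has no ontological source")
        elif source_class not in ontology_classes:
            violations.append(f"'{term}' maps to unknown class '{source_class}'")
    return violations

# Hypothetical artifacts for illustration.
ontology_classes = {"ex:Automobile", "ex:Vehicle"}
vocabulary = {"automobile": "ex:Automobile", "car": None}
print(check_vocabulary_derivation(vocabulary, ontology_classes))
# ["'car' has no ontological source"]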
3.3 DCAT-Based Artifact Repository
Definition: Centralized, structured repository of all knowledge artifacts, managed by derivation governance rules.
Artifact Types:
DCAT Extensions:
Standard DCAT properties enhanced with:
Repository Capabilities:
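A minimal sketch of how one repository entry might be expressed with rdflib, recording a vocabulary as a dcat:Dataset with an explicit version and derivation provenance pointing back to its source ontology (all URIs hypothetical):

from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import DCAT, DCTERMS, PROV

EX = Namespace("http://example.org/artifacts/")  # hypothetical namespace

g = Graph()
vocab = EX["controlled-vocabulary/v2"]
ontology = EX["domain-ontology/v5"]

g.add((vocab, RDF.type, DCAT.Dataset))
g.add((vocab, DCTERMS.title, Literal("Vehicle Controlled Vocabulary")))
g.add((vocab, DCTERMS.hasVersion, Literal("2.0")))
g.add((vocab, PROV.wasDerivedFrom, ontology))  # lineage back to the source ontology
g.add((vocab, DCTERMS.conformsTo, EX["rules/ontology-to-vocabulary"]))  # rule applied

print(g.serialize(format="turtle"))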
3.4 Activation Layer
Definition: Dynamic governance practices that use the repository to maintain semantic and structural consistency through iterations.
Activation Mechanisms:
1. Consistency Checking
ON artifact_modification:
    FOR EACH downstream_artifact IN get_dependents(modified_artifact):
        validation_results = apply_derivation_rules(modified_artifact, downstream_artifact)
        IF inconsistency_detected:
            FLAG for_review(downstream_artifact, validation_results)
            ALERT stakeholders with_justification(what_changed, why_inconsistent)
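One way the pseudocode above could be realized in Python; the dependency index and rule functions are hypothetical stand-ins for the repository's derivation governance, and the artifact identifiers are invented for illustration:

from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: a dependency index and, per (upstream, downstream)
# pair, a rule function returning violation messages (empty list = consistent).
DEPENDENTS: Dict[str, List[str]] = {"ontology:v5": ["vocab:v2", "kg:main"]}
RULES: Dict[Tuple[str, str], Callable[[], List[str]]] = {
    ("ontology:v5", "vocab:v2"): lambda: [],
    ("ontology:v5", "kg:main"): lambda: ["instance ex:n123 violates new cardinality"],
}

def flag_for_review(artifact: str, violations: List[str]) -> None:
    # A real implementation would open a review item and alert stakeholders
    # with the justification: what changed and why it is now inconsistent.
    print(f"REVIEW {artifact}: {violations}")

def on_artifact_modification(modified: str) -> None:
    for downstream in DEPENDENTS.get(modified, []):
        violations = RULES[(modified, downstream)]()
        if violations:
            flag_for_review(downstream, violations)

on_artifact_modification("ontology:v5")
# REVIEW kg:main: ['instance ex:n123 violates new cardinality']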
2. Change Impact Analysis
3. Query Governance
4. Iterative Refinement Protocol
CYCLE:
    1. Identify inconsistency or requirement
    2. IF conceptual foundation requires revision:
        Update conceptual model with justification
        Apply derivation rules to propagate changes
        Validate all downstream artifacts
        Document decision and rationale
    3. IF only operational artifact requires revision:
        Check against derivation rules
        If compliant, update artifact
        If non-compliant, escalate to conceptual review
    4. Test against real-world usage
    5. Feed learnings back into conceptual foundation
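Step 3 of the cycle lends itself to a small executable sketch as well; the rule objects below are hypothetical stand-ins for the repository's derivation rules:

from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class DerivationRule:
    """Hypothetical rule object: a named predicate over a proposed change."""
    name: str
    permits: Callable[[Dict], bool]

def revise_operational_artifact(change: Dict, rules: List[DerivationRule]) -> str:
    """Accept an operational-only change when it stays within the envelope
    defined by the derivation rules; otherwise escalate to conceptual review."""
    violated = [r.name for r in rules if not r.permits(change)]
    if not violated:
        return "updated"  # compliant: apply the change in place
    print(f"Escalating to conceptual review: {violated}")
    return "escalated"

rules = [DerivationRule("term-has-source-class", lambda c: "source_class" in c)]
print(revise_operational_artifact({"term": "car"}, rules))  # escalated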
4. Addressing Pipeline Approach Problems
4.1 Problem: Cumulative Error Propagation
Pipeline Approach: Errors introduced early persist through all layers, becoming increasingly difficult to correct.
This Framework:
Mechanism: When a knowledge graph instance violates an ontology axiom, the framework traces back: Is this a data quality issue, an ontology error, or a conceptual confusion? The lineage metadata answers this question.
4.2 Problem: Semantic Opacity
Pipeline Approach: Formal appearance masks unresolved conceptual confusion.
This Framework:
Mechanism: An ontology axiom linked to its conceptual justification reveals whether the formalism represents genuine semantic understanding or merely syntactic standardization.
4.3 Problem: Irreversibility and Path Dependency
Pipeline Approach: Downstream dependencies on upstream errors make correction prohibitively expensive.
This Framework:
Mechanism: Modifying an ontology class definition automatically flags which knowledge graph assertions depend on the old definition, enabling staged migration.
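A minimal rdflib sketch of that flagging step, under the simplifying assumption that a knowledge graph assertion "depends on" a class definition when it types an instance with that class (namespace hypothetical):

from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/")  # hypothetical namespace

kg = Graph()
kg.add((EX.n1, RDF.type, EX.Automobile))
kg.add((EX.n2, RDF.type, EX.Bicycle))

def assertions_depending_on(kg: Graph, modified_class) -> list:
    """Flag typing assertions that rely on the modified class definition,
    as candidates for staged migration."""
    return list(kg.subjects(RDF.type, modified_class))

print(assertions_depending_on(kg, EX.Automobile))  # [...n1]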
4.4 Problem: Institutional Embedding of Fallacies
Pipeline Approach: Logical errors become formalized as axioms, appearing legitimate through their formal representation.
This Framework:
Mechanism: An axiom implementing a logical fallacy would fail to pass conceptual justification review before being derived into downstream artifacts.
5. Implementation Approach
5.1 Minimum Viable Implementation
Phase 1: Foundation
Note: This is illustrative; specific implementation will vary based on domain and organizational context.
Phase 2: Activation
Phase 3: Maturation
5.2 Technical Enablers
DCAT Profile Extensions
Tooling
Organizational
5.3 Integration with Existing Standards
6. Key Distinctions from Pipeline Approaches
7. Conclusion
This framework complements rather than replaces established pipeline methodologies. Its purpose is to formalize the implicit assumptions, tacit rules, and disciplinary expertise that make pipeline approaches effective when applied with deep understanding.
By making explicit:
...this framework enables organizations to:
The DCAT-based repository activated through derivation governance becomes the mechanism for managing these explicit rules—making pipeline practices transparent, auditable, and resilient to change.
8. Invitation for Feedback and Refinement
This draft framework raises questions rather than providing definitive answers. Critical engagement is welcome with:
Conceptual Issues
Practical Feasibility
Disciplinary Adaptation
Technical Realization
Critique and Alternatives
Feedback, critique, and proposals for refinement through discussion, pilot implementations, and cross-disciplinary dialogue are welcome.
Annex: From Controlled Vocabulary to Ontology — Epistemic Foundations
Understanding the Critical Distinction
The framework presented in the main article assumes a clear distinction between different artifact types in the knowledge graph pipeline. However, practitioners often attempt to evolve a controlled vocabulary directly into an ontology, expecting the progression to be continuous. This annex clarifies why this approach fails and explains the fundamental epistemic differences that underpin the framework's derivation governance principles.
Two Distinct Objects, Not Two Stages
A controlled vocabulary (including thesauri and term lists) is fundamentally a prescriptive, flat resource: a collection of standardized terms with simple relationships—synonymy, generic hierarchy (generalization/specialization), thematic associations. Its objective is pragmatic standardization: ensuring consistency in indexing and information retrieval. We say "automobile" rather than "car" or "auto"; we use "economic depression" rather than "crisis" or "recession." A controlled vocabulary is governed by conventional agreement: "we use these terms in this way."
An ontology, by contrast, is a structured representation of reality itself. It does not catalog terms; it models concepts, their properties, their complex relationships, and the logical rules that govern them. An ontology asks fundamentally different questions: What is an "automobile" in relation to a "vehicle"? What are its constitutive parts? What logical relations bind it to other entities? How do we distinguish an automobile from similar entities? An ontology is not flat but multidimensional, formally structured, and—critically—logically coherent.
This is not a difference of degree or complexity. It reflects an epistemic gulf: controlled vocabularies are tools for managing agreement on terminology; ontologies are models of conceptual structure grounded in understanding of the domain itself.
The Epistemic Gap
Three systemic differences explain why vocabulary-to-ontology progression fails:
Polysemy and Granularity: A controlled vocabulary tolerates semantic ambiguity managed through convention. A term can hover between multiple interpretations as long as practitioners understand how to apply it. An ontology, however, demands radical clarification: it must distinguish the separate concepts hiding behind a single term. It must answer: are these genuinely distinct entities, or merely different applications of one concept? This question cannot be answered by extending the vocabulary—it requires reconceptualizing the domain itself.
Formalization of Logical Structure: Relations in a controlled vocabulary are declarative and flat—"X is narrower than Y," "A is related to B." These are annotations, useful but not computationally meaningful in a strong sense. An ontology requires formal logical structure: relations have precise semantics that enable inference, inheritance, constraint propagation. An axiom in an ontology is not merely a labeled edge; it is a logically valid statement that machines can reason over. This transformation cannot be achieved by adding layers of complexity to a vocabulary; it requires reconstituting the representation from the ground up in a logical framework.
Specification of Properties and Constraints: A controlled vocabulary never specifies what can be a property of a concept, or under what constraints properties apply. An ontology must formalize this explicitly: domain and range constraints, cardinality restrictions, property inheritance hierarchies. Moving from vocabulary to ontology is not an extension but a categorical shift from terminological standardization to conceptual formalization.
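The contrast can be made concrete with a short rdflib sketch (namespace and terms hypothetical): the vocabulary layer records flat, declarative relations, while the ontology layer carries formal semantics that machines can reason over:

from rdflib import Graph, Literal, Namespace, RDF, RDFS
from rdflib.namespace import OWL, SKOS

EX = Namespace("http://example.org/")  # hypothetical namespace
g = Graph()

# Vocabulary layer: a standardized term with flat, declarative relations.
g.add((EX.automobile, RDF.type, SKOS.Concept))
g.add((EX.automobile, SKOS.prefLabel, Literal("automobile", lang="en")))
g.add((EX.automobile, SKOS.broader, EX.vehicle))  # an annotation, not an inference rule

# Ontology layer: a class with logically meaningful structure.
g.add((EX.Automobile, RDF.type, OWL.Class))
g.add((EX.Automobile, RDFS.subClassOf, EX.Vehicle))  # enables inheritance
g.add((EX.hasEngine, RDF.type, OWL.ObjectProperty))
g.add((EX.hasEngine, RDFS.domain, EX.Automobile))  # constrains property usage
g.add((EX.hasEngine, RDFS.range, EX.Engine))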
Why Direct Progression Fails
Attempting to "upgrade" a controlled vocabulary into an ontology by adding detail and structure creates what might be called a pseudo-ontology: formally elaborate but logically fragile, because it lacks the deep conceptual clarity that should ground an ontology.
The problems are systematic:
Accumulated Ambiguity: A vocabulary that was deliberately tolerant of semantic ambiguity becomes an ontology in which that same ambiguity is now formalized and unexamined. What was managed as pragmatic flexibility becomes embedded as logical inconsistency.
Layer Collapse: The vocabulary may conflate distinct concepts (for pragmatic terminological reasons). When formalized as an ontology, these conflations appear as logical axioms—and now it becomes costly and organizationally disruptive to separate them, since downstream applications depend on their conflation.
Missing Conceptual Grounding: An ontology derived from a vocabulary inherits no understanding of why the concepts are structured as they are. It has form without foundation. When inconsistencies emerge (and they will), there is no conceptual basis for resolving them—only the inertia of prior choices.
False Rigor: The formal appearance of ontological structure can mask the absence of genuine ontological clarity. An axiom represented in OWL is no more meaningful than the same statement in plain language if it reflects unexamined conceptual confusion. Formal notation creates an illusion of rigor that can suppress the critical examination needed to detect the confusion.
The Inverse Approach: Conceptually Grounded
The evidence—both from the framework presented in the main article and from practice—suggests that a reverse approach is far more robust: begin with rigorous conceptual modeling that clarifies what exists in the domain and how it is organized, then derive from this ontology a controlled vocabulary that reflects the conceptual structure clearly.
This inverted approach works because it respects the epistemic order:
Integration with Derivation Governance
This epistemic inversion aligns directly with the framework's Derivation Governance principle. A derivation rule from ontology to controlled vocabulary might read:
Rule: For each class C in the formal ontology with scope S and distinguishing properties P1...Pn, the controlled vocabulary includes a term T such that:
This rule formalizes what should be intuitive: vocabulary terms are derived from ontological clarity, not the other way around. Changes flow downward from concept to term, not upward from term to concept.
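Under the assumption that labels and subclass links are the relevant projections (the rule's full conditions are summarized rather than spelled out above), such a derivation might be sketched as follows:

from rdflib import Graph, Literal, Namespace, RDF, RDFS
from rdflib.namespace import OWL, SKOS

def derive_vocabulary(ontology: Graph, ns: Namespace) -> Graph:
    """For each ontology class, emit one vocabulary term whose preferred
    label and hierarchy come from the class, never the other way around."""
    vocab = Graph()
    for cls in ontology.subjects(RDF.type, OWL.Class):
        term = ns[f"term/{cls.split('/')[-1].lower()}"]
        vocab.add((term, RDF.type, SKOS.Concept))
        label = ontology.value(cls, RDFS.label) or Literal(cls.split("/")[-1])
        vocab.add((term, SKOS.prefLabel, label))
        for parent in ontology.objects(cls, RDFS.subClassOf):
            parent_term = ns[f"term/{parent.split('/')[-1].lower()}"]
            vocab.add((term, SKOS.broader, parent_term))
    return vocab

EX = Namespace("http://example.org/")  # hypothetical namespace
onto = Graph()
onto.add((EX.Automobile, RDF.type, OWL.Class))
onto.add((EX.Automobile, RDFS.subClassOf, EX.Vehicle))
print(derive_vocabulary(onto, EX).serialize(format="turtle"))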
Practical Implications for Pipeline Practice
For organizations using the pipeline approach described in the main article:
When You Have an Existing Controlled Vocabulary: Treat it as a data point about current practice, not as canonical. Extract the conceptual insights it embodies (often vocabulary terms reveal important distinctions), but do not assume its structure is optimal. Use it to inform conceptual modeling, not to constrain it.
When Building an Ontology: Invest heavily in conceptual work before formalizing. Document the domain model philosophically—what entities exist, why they are distinguished, what relationships hold between them. Only then move to formal representation. This is expensive upfront but prevents the accumulation of unfounded axioms.
When Standardizing Terminology: Derive your controlled vocabulary from a clear ontology (even if partial or preliminary). This ensures that vocabulary choices reflect genuine conceptual distinctions, making the vocabulary more robust and more useful for knowledge graph population and querying.
For Iteration and Refinement: When conceptual errors surface (and they will), the framework's bidirectional traceability allows you to trace from vocabulary term back to ontological axiom back to conceptual justification. You can then correct at the appropriate level—whether that is correcting a misconception in the conceptual foundation or simply adjusting terminology to better reflect a sound concept.
Conclusion
The progression from controlled vocabulary to ontology is not a pipeline but a conceptual leap. Attempting to make that leap by elaborating and formalizing the vocabulary fails because it conflates terminological standardization with conceptual modeling. The reverse—beginning with rigorous conceptual clarity and deriving vocabulary from it—respects the epistemic order and produces more robust, auditable, and maintainable knowledge structures.
Within the framework of the main article, this distinction explains why derivation governance must flow from conceptual foundation through formal ontology to downstream artifacts (including controlled vocabularies). Reversing that flow—attempting to derive conceptual clarity from vocabularies—creates the accumulation of tacit assumptions and semantic ambiguities that the framework is designed to prevent.
Responses to Some Questions Raised About the Article
Question 1: Artifact Necessity, Business Value, and Scoping
A critical question emerges when reviewing this framework: How many of these artifacts are actually needed for a given knowledge graph development? Is there real business value in developing separate controlled vocabularies, taxonomies, thesauri, and ontologies? Having developed them through to ontology, should each be separately maintained when changes are needed? Most importantly: How should the work be scoped?
There is a legitimate concern that attempting to model an entire business or domain risks "modeling for its own sake" and, proverbially, boiling the ocean. An alternative approach advocates starting with specific business use cases delivering clear value, with competency questions expressed in business language—letting those questions define the vocabulary and scope needed in the KG. Value is delivered first, then the system expands incrementally through further use cases.
My response: Context-Driven Application and Strategic Starting Points
Different tactics and strategies exist, always driven by specific needs and contexts. The framework presented here does not prescribe a universal approach but rather formalizes principles that apply across different strategic choices.
Consider a specific application domain: preparing governance and building architecture for continuous operational interoperability between partners and domains working on complex products (such as aircraft development). In such contexts, the starting point is often not a blank slate but rather legacy open and de facto standards agreed upon by communities of international experts. The challenge becomes deriving useful and relevant subsets to cover specific collaboration cases. Think of the open standard as a dictionary, and collaboration cases as sentences—you pick what you need rather than reinventing generic concepts each time.
This strategy offers significant advantages. It prevents costly alignment work that would be required if partners independently developed their own models and then tried to reconcile them. It makes explicit what is generic (drawn from standards) versus context-specific (particular to your collaboration). It provides a shared conceptual foundation from which to derive artifacts as needed.
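The "dictionary and sentences" strategy can itself be sketched: given a standard ontology and the classes a collaboration case actually uses, extract only those classes plus their direct superclass links so the subset stays anchored in the source standard (identifiers hypothetical):

from rdflib import Graph, Namespace, RDF, RDFS
from rdflib.namespace import OWL

def extract_subset(standard: Graph, needed_classes: set) -> Graph:
    """Pick from the standard 'dictionary' only what the collaboration
    case (the 'sentence') needs, keeping direct superclass links."""
    subset = Graph()
    for cls in needed_classes:
        subset.add((cls, RDF.type, OWL.Class))
        for parent in standard.objects(cls, RDFS.subClassOf):
            subset.add((cls, RDFS.subClassOf, parent))
            subset.add((parent, RDF.type, OWL.Class))
    return subset

EX = Namespace("http://example.org/standard/")  # hypothetical namespace
standard = Graph()
standard.add((EX.Aircraft, RDF.type, OWL.Class))
standard.add((EX.Aircraft, RDFS.subClassOf, EX.Vehicle))
print(len(extract_subset(standard, {EX.Aircraft})))  # 3 triples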
Producing any given artifact—vocabulary, taxonomy, thesaurus, ontology—is not mandatory. It is entirely value-driven. However, if multiple artifacts address the same topic, they must be aligned for global consistency. Without this alignment, you risk inconsistent representations of the same knowledge across different layers of formalization. This is precisely where explicit derivation governance becomes critical: it ensures that when artifacts are created, they remain semantically and structurally consistent with each other and with their conceptual foundations.
The "conceptual foundation first" principle should be understood as: be rigorous about what you're modeling within your defined scope—not "model everything comprehensively before building anything." The framework supports starting from established standards or use case-driven competency questions, rigorous conceptualization for the bounded scope you've defined, explicit derivation rules only for artifacts that deliver value in your context, and iterative expansion guided by new use cases or collaboration requirements, not abstract completeness.
Two complementary strategies emerge. A use case-driven approach starts with specific business use cases and competency questions, builds minimal artifacts to deliver immediate value, and expands incrementally as new use cases emerge. This is appropriate for greenfield projects, exploratory domains, and rapid value delivery. A standards-driven approach starts from established domain standards (the "dictionary"), derives relevant subsets for specific collaboration contexts (the "sentences"), and builds artifacts only where alignment value justifies the cost. This is appropriate for regulated domains, multi-partner interoperability, and leveraging existing consensus.
Both strategies benefit from explicit derivation governance. Use case-driven approaches need it to maintain consistency as the system grows incrementally. Standards-driven approaches need it to ensure derived subsets remain aligned with source standards and with each other.
The framework's core message should be clarified: This framework is not about mandating artifacts or comprehensive modeling. It is about formalizing the principles that ensure semantic consistency when artifacts are created—whatever the strategic approach, whatever the scope. Whether you start from use cases or standards, create minimal artifacts or richer taxonomies, model narrowly or broadly, the framework provides explicit documentation of why each artifact exists (business value justification), clear rules for how artifacts derive from conceptual foundations or source standards, mechanisms for verifying that multiple artifacts remain consistent, and traceability that enables iteration and refinement without breaking existing work.
The framework enables rigorous execution within whatever scope your context demands—it does not dictate what that scope should be.