The Noosphere as Knowledge Graph

This is the fifth article in a series about BFS-QL, a protocol for connecting language models to knowledge graphs. The earlier articles covered why SPARQL generation fails, what the five-tool interface looks like, how to build a backend in eight methods, and why a fifth tool was added after real-world use revealed a gap. This one is about what the whole thing adds up to.


An Unintended Consequence

The canonical ID authorities -- MeSH for diseases, RxNorm for drugs, HGNC for genes, UniProt for proteins, ChEBI for small molecules -- have existed for decades. They were built for their own internal purposes: literature indexing, regulatory compliance, compound tracking, clinical coding. Nobody built them to be an interoperability layer for machine reasoning.

But that is what they are.

When two knowledge graphs both use RxNorm identifiers for drugs, they can be traversed as a single logical graph for any query that involves drugs. The LLM queries the first graph, finds RxNorm:3251, uses that ID as a seed in the second graph, and crosses the boundary. No mapping table. No federation protocol. No bilateral coordination between the graph owners. The shared identifier is the bridge, and the bridge was built decades before either graph existed.

This is not a BFS-QL design decision. It is an emergent property of the decision to build on shared canonical infrastructure. Every knowledge graph that anchors its entities to canonical ID authorities becomes automatically composable with every other graph that does the same. The degree of composability is proportional to the overlap in ID schemes. It is not a property of the protocol. It is a property of the graphs.


What Multi-Graph Reasoning Looks Like

The mechanics are simple. Start a Claude Code session. Add two MCP servers: one for a domain literature graph (biomedical papers, drug trials, clinical findings), one for DBpedia. The model sees ten tools: five BFS-QL tools prefixed with one server's name, five prefixed with the other's. The tool signatures are identical. The session workflow is identical.

A query that begins in the literature graph -- orient, resolve desmopressin to RxNorm:3251, traverse 2 hops, find connected genes and diseases -- can continue in DBpedia by using RxNorm:3251 as the seed for dbpedia.search_entities. The model carries the identifier across sources the same way a human researcher carries a known accession number across databases.
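The bridge can be sketched in a few lines. The two "graphs" below are stand-in dicts, and every identifier other than RxNorm:3251 is illustrative; a real session would answer these lookups through BFS-QL tool calls rather than dictionary access.

```python
# Cross-graph bridging via a shared canonical ID (a toy sketch).
literature_graph = {
    "RxNorm:3251": {  # desmopressin
        "neighbors": [("studied_in", "PMID:12345"), ("associated_gene", "HGNC:646")],
    },
}

dbpedia_graph = {
    "RxNorm:3251": {
        "neighbors": [("pharmacological_class", "Vasopressin_analogue")],
    },
}

def bfs_neighbors(graph: dict, seed: str) -> list:
    """One-hop traversal: return (predicate, target) pairs for a seed ID."""
    return graph.get(seed, {}).get("neighbors", [])

# Step 1: traverse the literature graph from the resolved drug ID.
lit_edges = bfs_neighbors(literature_graph, "RxNorm:3251")

# Step 2: reuse the same canonical ID as the seed in the second graph.
# No mapping table, no federation protocol; the shared ID is the bridge.
dbp_edges = bfs_neighbors(dbpedia_graph, "RxNorm:3251")

combined = lit_edges + dbp_edges
```

The point of the sketch is the absence of any translation step between the two lookups.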

The literature graph knows what papers say about desmopressin: which studies, which findings, which patient populations, which confidence scores. The encyclopedic backbone knows what desmopressin is: its pharmacological class, its mechanism of action, its related compounds, its place in the drug taxonomy. Together they give the LLM both the frontier and the foundation. Neither graph has both. The composition does.

When graphs use different ID schemes, bridging requires an extra step: take the entity's name from the first graph, call search_entities in the second with that name, inspect the results, pick the right match. This is the same disambiguation step the LLM performs at the start of any session. The difference is that it is now cross-graph. It works, but it is slower and more ambiguous than shared-ID bridging. Composability is proportional to shared canonical identity.
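The name-based fallback looks roughly like this. The index, entity IDs, and the exact-label tie-break rule are all hypothetical; a real destination graph would return richer candidates and the LLM, not a one-line rule, would do the disambiguation.

```python
# Name-based bridging when two graphs use different ID schemes (a sketch).
def search_entities(index: dict, name: str) -> list:
    """Toy stand-in for a search_entities call: substring match on labels."""
    return [e for e in index.values() if name.lower() in e["label"].lower()]

destination_index = {
    "dbr:Desmopressin": {"id": "dbr:Desmopressin", "label": "Desmopressin"},
    "dbr:Desmopressin_acetate": {"id": "dbr:Desmopressin_acetate",
                                 "label": "Desmopressin acetate"},
}

# Carry the entity's label (not its ID) from the source graph...
label_from_source = "desmopressin"
candidates = search_entities(destination_index, label_from_source)

# ...then disambiguate. Here a trivial exact-label rule picks the match;
# in practice this judgment call is the slow, ambiguous part.
best = next(c for c in candidates if c["label"].lower() == label_from_source)
```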


It's Only As Good As The Data

It is easy, when building infrastructure, to mistake the infrastructure for the product. The MCP server starts. The tools register. The LLM connects. The system works. It is tempting to call this the achievement. It is not. The achievement is what happens next: a language model reasoning over a knowledge graph and reaching conclusions it could not reach from any single document.

The server exists to make the graph accessible. If the graph is not worth serving -- if its entities are poorly extracted, its relationships are hallucinated, its canonical IDs are inconsistent -- a flawless MCP server delivers nothing. The interface contract is only as valuable as what it connects to.

This is the correct division of labor. BFS-QL exposes whatever the graph contains. It does not validate, filter, or improve graph content. A relationship extracted with low confidence appears in results the same way a high-confidence relationship does -- the confidence score is metadata, not a gate. An interface that tries to compensate for graph quality issues would be doing the wrong work at the wrong layer.
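The metadata-not-a-gate distinction is small enough to show directly. The result shape and threshold below are invented for illustration; the claim is only about which layer applies the filter.

```python
# Confidence as pass-through metadata, not a server-side gate (a sketch).
results = [
    {"predicate": "treats", "target": "MeSH:D003693", "confidence": 0.94},
    {"predicate": "treats", "target": "MeSH:D014555", "confidence": 0.31},
]

def serve(edges: list) -> list:
    # The interface layer returns everything the graph contains.
    # It does not judge quality; that would be the wrong layer.
    return edges

served = serve(results)

# Any thresholding is the caller's decision, made with the metadata visible.
trusted = [e for e in served if e["confidence"] >= 0.5]
```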

The upstream question -- "is this graph worth serving?" -- belongs to the extraction and curation pipeline. It should be answered before the MCP server is provisioned, not after.


Active Contract, Not Passive Pipe

BFS-QL is an active contract, not a passive data pipe. A passive pipe is indifferent to how its output is used. It has no opinion about query order, session workflow, or what the caller should do with stubs. It just serves data.

BFS-QL has opinions. The tool descriptions guide the LLM toward a specific workflow: orient first, resolve names before traversing, start with topology before requesting full metadata, use describe_entity for expansion rather than re-querying. The server instructions warn about non-canonical (provisional) IDs. The topology_only flag exists because the server anticipates that full metadata is often unnecessary for the first traversal.
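The intended order can be written down as a script. The function bodies below are stubs that only record the call sequence; the real tools do real work, and the names here mirror the series' vocabulary rather than quoting the protocol verbatim.

```python
# The intended session workflow, reduced to its ordering (a sketch).
def run_session(name: str) -> dict:
    trace = []

    def orient():
        trace.append("orient")

    def resolve(n):
        trace.append("resolve")
        return "RxNorm:3251"  # illustrative resolved canonical ID

    def bfs(seed, topology_only=True):
        # topology_only defaults on: full metadata is often unnecessary
        # for the first traversal.
        trace.append(f"bfs(topology_only={topology_only})")
        return {"nodes": [seed]}

    def describe(entity_id):
        trace.append("describe")

    orient()                      # 1. learn the graph's shape first
    seed = resolve(name)          # 2. resolve names before traversing
    result = bfs(seed)            # 3. topology before full metadata
    describe(result["nodes"][0])  # 4. expand via describe, not re-query
    return {"trace": trace}
```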

These are not protocol features added for completeness. They are epistemic scaffolding -- choices that encode knowledge about how LLMs reason over graphs and what patterns of use produce good outcomes. An LLM that follows the intended workflow reaches better conclusions faster, with less context waste, than one that queries arbitrarily.

The contract is active because the server is not neutral about outcomes.


Three Properties That Make It Work

Minimal, predictable, and describable are not independent virtues. They constrain each other.

A larger surface area is harder to describe accurately. An interface whose behavior depends on state, ordering, or undocumented invariants is harder to reason about. An interface that is both large and unpredictable is practically unusable by a language model -- which is why SPARQL fails as an LLM interface despite being powerful and well-designed.

BFS-QL has five tools. Each does one thing. Each has the same behavior every time it is called with the same arguments. Each is described in a tool docstring that fits in a few sentences. The model can hold the entire interface in working context simultaneously. It does not need to reason about which tool to use; the session workflow makes the order explicit. It does not need to consult documentation mid-session; the tool descriptions are self-contained.
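To make "fits in working context" concrete, here is a toy rendering of such a surface. search_entities, describe_entity, and bfs_query are named in this series; the other two entries are placeholders, not the protocol's real tool names.

```python
# A five-tool surface small enough to hold in context at once (a sketch).
TOOL_DOCS = {
    "get_overview":    "Summarize the graph: entity types, predicates, counts.",
    "search_entities": "Resolve a free-text name to candidate canonical IDs.",
    "bfs_query":       "Traverse N hops from seeds; topology_only skips metadata.",
    "describe_entity": "Return full metadata for a single entity ID.",
    "fifth_tool":      "Placeholder for the operation added in article four.",
}

# Each description is one short sentence, so the whole interface is cheap
# to keep in a model's working context simultaneously.
one_liners = all(doc.count(".") <= 1 and len(doc) < 80
                 for doc in TOOL_DOCS.values())
```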

These properties are not accidents of the implementation. They are design constraints that shaped every decision in the protocol. The tool surface emerged from asking which operations are truly distinct. The stub/full model emerged from asking how to keep response size predictable. The topology_only mode emerged from asking what the minimum useful response is. Minimal, predictable, and describable are the criteria by which each design choice was evaluated.


What Comes Next

Several directions are visible from here.

Multi-graph federation is the natural next step. The composition model described above is manual: the LLM navigates across graphs by carrying identifiers and calling search_entities in the destination graph. A federation layer would make this automatic: given a set of registered BFS-QL graphs and a query, it would identify which graphs are relevant, issue parallel BFS queries, and merge the results on shared canonical IDs. The LLM would see one set of five tools, not N sets. The technical foundation exists: canonical IDs provide the merge key, and the GraphDbInterface ABC provides the contract every backend already implements. What does not yet exist is the federation engine itself.
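The merge step is the easy part, which is why the foundation claim is credible. A hypothetical sketch, with toy dicts standing in for registered graphs and provenance kept per edge:

```python
# Hypothetical federation merge: union per-graph BFS results, keyed on
# shared canonical IDs, with each edge tagged by its source graph.
from collections import defaultdict

def federated_bfs(graphs: dict, seed: str) -> dict:
    """Merge one-hop results from every registered graph for one seed."""
    merged = defaultdict(set)
    for graph_name, graph in graphs.items():
        for predicate, target in graph.get(seed, []):
            merged[seed].add((predicate, target, graph_name))
    return {node: sorted(edges) for node, edges in merged.items()}

graphs = {
    "literature": {"RxNorm:3251": [("associated_gene", "HGNC:646")]},
    "dbpedia":    {"RxNorm:3251": [("class", "Vasopressin_analogue")]},
}
merged = federated_bfs(graphs, "RxNorm:3251")
```

The hard parts the engine still needs are upstream of this: deciding which graphs are relevant and issuing the queries in parallel.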

Schema-aware query optimization would use the predicates filter to prune traversal before it happens -- pushing the filter to the backend's query rather than applying it after retrieval. For a Wikidata subgraph or a large pharmaceutical compound graph, this could reduce traversal time by an order of magnitude. For a small domain graph, it is irrelevant.
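The difference between the two strategies is where the filter runs, not what it returns. A toy edge list makes that visible; the function names and counting are illustrative, not the protocol's API.

```python
# Predicate pushdown vs post-retrieval filtering (a sketch).
EDGES = [
    ("RxNorm:3251", "associated_gene", "HGNC:646"),
    ("RxNorm:3251", "mentioned_in", "PMID:1"),
    ("RxNorm:3251", "mentioned_in", "PMID:2"),
]

def neighbors_post_filter(seed, predicates):
    # Naive: fetch every edge for the seed, then filter in the server layer.
    fetched = [e for e in EDGES if e[0] == seed]
    return [e for e in fetched if e[1] in predicates], len(fetched)

def neighbors_pushdown(seed, predicates):
    # Pushed down: the filter is part of the backend scan itself,
    # so nothing extra is ever retrieved.
    fetched = [e for e in EDGES if e[0] == seed and e[1] in predicates]
    return fetched, len(fetched)

same_result = (neighbors_post_filter("RxNorm:3251", {"associated_gene"})[0]
               == neighbors_pushdown("RxNorm:3251", {"associated_gene"})[0])
```

On three edges the saving is trivial; on a Wikidata-scale backend the same move is where the order-of-magnitude claim comes from.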

The stable layer question: backends and LLMs will both evolve. The eight-method ABC is designed to absorb those changes without propagating them. A new LLM with a larger context window gets larger BFS results from the same bfs_query tool; the interface does not change. A new traversal primitive is added as a sixth tool on the MCP server; existing backends continue to work unchanged because the new tool is implemented in the server layer in terms of the existing eight methods. This is what the design principle -- all intelligence in the server layer, all backend-specific logic behind the ABC -- actually buys. The boundary between server and backend is not just an organizational choice. It is the line along which the system can evolve without breaking the parts that work.
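The "sixth tool, zero backend changes" claim can be shown in miniature. The two-method interface and the tool below are stand-ins for the real eight-method ABC, but the structural point is the same: the new tool is pure composition over methods every backend already has.

```python
# A new server-layer tool built only from existing backend methods (a sketch).
from abc import ABC, abstractmethod

class GraphDbInterface(ABC):
    """Stand-in for the real eight-method ABC."""
    @abstractmethod
    def get_neighbors(self, entity_id: str) -> list: ...
    @abstractmethod
    def get_entity(self, entity_id: str) -> dict: ...

class ToyBackend(GraphDbInterface):
    def get_neighbors(self, entity_id):
        return {"A": ["B", "C"]}.get(entity_id, [])
    def get_entity(self, entity_id):
        return {"id": entity_id}

# A hypothetical new tool, implemented entirely in the server layer:
# metadata for every neighbor in one call. No backend changes required.
def describe_neighborhood(db: GraphDbInterface, entity_id: str) -> list:
    return [db.get_entity(n) for n in db.get_neighbors(entity_id)]
```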


The Larger Argument

The biomedical, legal, chemistry, and geography communities built their identifier infrastructures -- MeSH, RxNorm, HGNC, ChEBI, PubChem, Wikidata, GeoNames -- over decades for internal purposes. They were not building an interoperability layer for LLM reasoning across knowledge domains. But that is what they built, as a side effect of building a shared commons.

The LLM is the reasoner. BFS-QL is the interface. Shared canonical IDs are the bridges between graphs. All three pieces are available right now.

That combination -- a reasoner that can follow instructions, an interface that exposes graph structure without requiring the reasoner to write queries, and an identifier infrastructure that connects graphs that never knew about each other -- is something genuinely new. It was not designed as a system. It assembled from components that were each built for something else. The emergent property was always latent in the infrastructure.

The code is at https://github.com/wware/bfs-ql.
