Exploring DBpedia Through SPARQL

A Practical Guide to Understanding Ontology Structure and Data Patterns in the Wild

DBpedia remains one of the richest publicly available Knowledge Graphs derived from Wikipedia content. Its structure gives a unique window into the shape of real-world data on the Web: entity types, properties, hierarchies, and semantic relationships.

This article explores DBpedia using a sequence of SPARQL queries, each designed to highlight a specific pattern or semantic capability. Every query includes:

A clickable link to run it directly against DBpedia.
An explanation of what the query uncovers and why it matters.

1. Entity Types and Representative Instances

This query lists classes (types) in DBpedia, shows a sample instance for each, and counts how many entities belong to that type.

Query

SELECT ?entityType (SAMPLE(?entity) AS ?sampleEntity) (COUNT(*) AS ?count)
WHERE {
  ?entity a ?entityType .
}
GROUP BY ?entityType
ORDER BY DESC(?count)

Run it

Type–Counts-and-Samples

Why It’s Useful

This is the fastest way to understand what types DBpedia actually contains and how many instances each type has.

It highlights:

Dominant classes (e.g., “Person”, “Place”, “Work”)
Niche or sparsely populated types
Unexpected type proliferation due to Wikipedia infobox diversity

It also provides a compact sanity check before doing deeper ontology or property exploration.
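
One way to cut through the type proliferation is to restrict the listing to a single namespace. The sketch below is a variant of the query above, not part of the original set. It assumes you only want classes from the DBpedia ontology namespace (http://dbpedia.org/ontology/) and adds an arbitrary LIMIT of 25 to keep the result small.

```sparql
SELECT ?entityType (SAMPLE(?entity) AS ?sampleEntity) (COUNT(*) AS ?count)
WHERE {
  ?entity a ?entityType .
  # Assumption: keep only classes defined in the DBpedia ontology namespace
  FILTER (STRSTARTS(STR(?entityType), "http://dbpedia.org/ontology/"))
}
GROUP BY ?entityType
ORDER BY DESC(?count)
LIMIT 25   # arbitrary cap for readability
```

Swapping the namespace string for another prefix, such as http://schema.org/, should show how the same entities are typed under a different vocabulary.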

2. SubProperty/SuperProperty Exploration (Random Representative Start Points)

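This query picks the five super-properties with the most declared sub-properties, samples one representative sub-property for each, and confirms each pair with the zero-or-more path rdfs:subPropertyOf*. Note that ?usageCount here counts matching sub-property declarations, not how often a property appears in the data.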

Query

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT *
WHERE {
  {
    SELECT ?superProperty ?subProperty
    WHERE {
      {
        SELECT (?c AS ?superProperty) (SAMPLE(?a) AS ?subProperty) (COUNT(*) AS ?usageCount)
        WHERE {
          ?a rdfs:subPropertyOf ?c .
          ?a a ?type .
          FILTER (?type IN (owl:ObjectProperty, rdf:Property))
        }
        GROUP BY ?c
        ORDER BY DESC(?usageCount)
        LIMIT 5
      }
    }
  }
  ?subProperty rdfs:subPropertyOf* ?superProperty .
}
LIMIT 100

Run it

Random-Subproperty-Transitive-Closure

Why It’s Useful

This demonstrates:

How DBpedia’s property hierarchy is structured
Which super-properties have the most declared sub-properties
How the zero-or-more path operator (*) follows rdfs:subPropertyOf chains of any length, including the property itself

This is particularly useful when mapping DBpedia’s ontology to external ontologies or evaluating property alignment for integration tasks.
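
In the query above, both ?superProperty and ?subProperty are already bound by the inner subquery when the path is evaluated, so the * path mostly acts as a check on each sampled pair. A possible variant, sketched here with the hypothetical variable names ?descendant and ?subPropertyCount, leaves the subject of the path unbound so that the closure is actually enumerated.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?superProperty ?descendant
WHERE {
  {
    # Five super-properties with the most declared sub-properties
    SELECT (?c AS ?superProperty) (COUNT(*) AS ?subPropertyCount)
    WHERE {
      ?a rdfs:subPropertyOf ?c .
      ?a a ?type .
      FILTER (?type IN (owl:ObjectProperty, rdf:Property))
    }
    GROUP BY ?c
    ORDER BY DESC(?subPropertyCount)
    LIMIT 5
  }
  # ?descendant is unbound, so * returns the super-property itself (zero steps)
  # plus every sub-property reachable at any depth.
  ?descendant rdfs:subPropertyOf* ?superProperty .
}
LIMIT 100
```

Because * allows zero steps, each super-property should appear once as its own descendant; that row is a quick way to confirm the reflexive behaviour.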

3. SubProperties Using the {+} Property Path Operator (Strictly Descendant Only)

This query selects the super-property with the most direct sub-properties and retrieves all of its sub-properties at any depth, keeping only those reachable through at least one rdfs:subPropertyOf step.

Query

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT (?c AS ?superProperty) (?a AS ?subProperty)
WHERE {
  {
    SELECT ?c
    WHERE {
      ?a rdfs:subPropertyOf ?c .
      ?a a ?type .
      FILTER (?type IN (owl:ObjectProperty, rdf:Property))
    }
    GROUP BY ?c
    ORDER BY DESC(COUNT(?a))
    LIMIT 1
  }
  ?a rdfs:subPropertyOf+ ?c .
}
LIMIT 500

Run it

Subproperty-Strict-Descendants

Why It’s Useful

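The + operator requires at least one rdfs:subPropertyOf step, so only proper descendants are returned and the super-property itself is excluded (unless the hierarchy contains a cycle back to it).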

This is ideal for:

Auditing ontology depth
Creating visual property hierarchies
Identifying redundant or overly specific properties
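
For the depth audit in particular, it can help to separate direct children from deeper descendants. The following sketch reuses the query above and adds a FILTER NOT EXISTS clause; the output name ?indirectSubProperty is a hypothetical label chosen for illustration.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT (?c AS ?superProperty) (?a AS ?indirectSubProperty)
WHERE {
  {
    SELECT ?c
    WHERE {
      ?a rdfs:subPropertyOf ?c .
      ?a a ?type .
      FILTER (?type IN (owl:ObjectProperty, rdf:Property))
    }
    GROUP BY ?c
    ORDER BY DESC(COUNT(?a))
    LIMIT 1
  }
  # All proper descendants of the chosen super-property ...
  ?a rdfs:subPropertyOf+ ?c .
  # ... minus those that are also declared as direct sub-properties
  FILTER NOT EXISTS { ?a rdfs:subPropertyOf ?c }
}
LIMIT 500
```

An empty result would suggest that the hierarchy under this super-property is flat, with every sub-property declared directly beneath it.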

4. SubProperties Using the {2} Property Path Operator

This focuses on properties that are exactly two steps below a frequently used super-property.

Query

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT (?c AS ?superProperty) (?a AS ?subProperty)
WHERE {
  {
    SELECT ?c
    WHERE {
      ?a rdfs:subPropertyOf ?c .
      ?a a ?type .
      FILTER (?type IN (owl:ObjectProperty, rdf:Property))
    }
    GROUP BY ?c
    ORDER BY DESC(COUNT(?a))
    LIMIT 1
  }
  ?a rdfs:subPropertyOf{2} ?c .
}
LIMIT 500

Run it

Two-Hop-Subproperties

Why It’s Useful

The {2} quantifier matches paths of exactly two rdfs:subPropertyOf steps, which gives you a controlled look at mid-depth ontology structure.

This is helpful for:

Ontology debugging
Identifying second-order refinements of major properties
Extracting property layers for tools that require bounded depth
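
One portability caveat: the {n} path quantifier appeared in SPARQL 1.1 working drafts but is not part of the final W3C Recommendation, so engines other than the one behind the DBpedia endpoint may reject it. A standard-syntax sketch of the same two-hop pattern spells the hops out as two triple patterns; the variable ?mid and the output name ?intermediateProperty are introduced here for illustration.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT (?c AS ?superProperty) (?mid AS ?intermediateProperty) (?a AS ?subProperty)
WHERE {
  {
    SELECT ?c
    WHERE {
      ?a rdfs:subPropertyOf ?c .
      ?a a ?type .
      FILTER (?type IN (owl:ObjectProperty, rdf:Property))
    }
    GROUP BY ?c
    ORDER BY DESC(COUNT(?a))
    LIMIT 1
  }
  # Hop 1: sub-property to its direct parent
  ?a rdfs:subPropertyOf ?mid .
  # Hop 2: that parent to the chosen super-property
  ?mid rdfs:subPropertyOf ?c .
}
LIMIT 500
```

If the intermediate property is not needed, the two patterns can be collapsed into the sequence path rdfs:subPropertyOf/rdfs:subPropertyOf. Exposing ?mid means a sub-property with several two-hop routes appears once per route.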

5. Two-Hop SubProperty Exploration for the Top 10 Super-Properties

This expands the previous pattern to explore multiple major super-properties simultaneously.

Query

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT (?c AS ?superProperty) (?a AS ?subProperty)
WHERE {
  {
    SELECT ?c
    WHERE {
      ?a rdfs:subPropertyOf ?c .
      ?a a ?type .
      FILTER (?type IN (owl:ObjectProperty, rdf:Property))
    }
    GROUP BY ?c
    ORDER BY DESC(COUNT(?a))
    LIMIT 10
  }
  ?a rdfs:subPropertyOf{2} ?c .
}
LIMIT 100

Run it

Two-Hop-Subproperties-Top10

Why It’s Useful

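Running the two-hop pattern across the ten super-properties with the most sub-properties makes it easy to compare how different property families are modeled. Note that LIMIT 100 caps the combined result, so some families may be truncated.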

You can quickly spot:

Consistent modeling patterns
Inconsistencies across similar property families
Opportunities for ontology normalization
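
To compare families at a glance rather than row by row, the two-hop matches can be aggregated per super-property. This is a sketch rather than part of the original query set: it uses the standard sequence path in place of {2}, and ?twoHopCount is a hypothetical name.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT (?c AS ?superProperty) (COUNT(DISTINCT ?a) AS ?twoHopCount)
WHERE {
  {
    SELECT ?c
    WHERE {
      ?a rdfs:subPropertyOf ?c .
      ?a a ?type .
      FILTER (?type IN (owl:ObjectProperty, rdf:Property))
    }
    GROUP BY ?c
    ORDER BY DESC(COUNT(?a))
    LIMIT 10
  }
  # Exactly two rdfs:subPropertyOf steps, written as a sequence path
  ?a rdfs:subPropertyOf/rdfs:subPropertyOf ?c .
}
GROUP BY ?c
ORDER BY DESC(?twoHopCount)
```

Super-properties with no two-hop descendants drop out of the result entirely, which is itself a hint that their hierarchy is only one level deep.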

6. Property Usage and Dominance

This query counts the usage of every property (predicate) in the entire knowledge graph. It is the most direct way to discover which relationships form the backbone of DBpedia.

Query

SELECT ?p (COUNT(*) AS ?usageCount)
WHERE { ?s ?p ?o }
GROUP BY ?p
ORDER BY DESC(?usageCount)

Run it

Property-Usage-Counts

Why It’s Useful

This query provides a high-level statistical overview of the graph's structure. It answers the question: "What are the most common facts stored in DBpedia?"

It helps you immediately identify:

Core RDF/RDFS properties: rdf:type, rdfs:label, rdfs:comment.
Dominant link properties: dbo:wikiPageWikiLink, dct:subject.
Metadata vs. factual properties: Distinguishing between properties about an entity (like prov:wasDerivedFrom) and properties stating a fact about the entity (like dbo:birthPlace).
The most promising properties to explore in more detail with subsequent queries.
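
To push page-level metadata out of the ranking, one option is to filter predicates by namespace. The sketch below assumes that factual properties of interest live under http://dbpedia.org/ontology/ and that the dbo:wikiPage* predicates (such as dbo:wikiPageWikiLink) count as metadata; both choices, and the LIMIT of 50, are illustrative rather than anything the original query prescribes.

```sparql
SELECT ?p (COUNT(*) AS ?usageCount)
WHERE {
  ?s ?p ?o .
  # Assumption: keep DBpedia ontology properties only
  FILTER (STRSTARTS(STR(?p), "http://dbpedia.org/ontology/"))
  # Assumption: treat dbo:wikiPage* predicates as page metadata and drop them
  FILTER (!STRSTARTS(STR(?p), "http://dbpedia.org/ontology/wikiPage"))
}
GROUP BY ?p
ORDER BY DESC(?usageCount)
LIMIT 50
```

Like the unfiltered version, this scans every triple, so a public endpoint may time out or return partial results.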

7. Top-5 Property Hierarchies by Usage and Transitive Closure

This advanced query combines statistical analysis with ontology traversal. It first identifies the five most-used properties in the entire graph, calculates their usage count and percentage, and then finds all of their respective sub-properties at any depth.

Query

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?startProperty ?usageCount ?usagePercent ?subProperty ?superProperty
WHERE {

  #####################################################################
  # 1. Determine the 5 most-used properties (global property ranking)
  #####################################################################

  {
    SELECT ?startProperty ?usageCount ?usagePercent
    WHERE {
      # Compute usage count per property
      {
        SELECT ?p (COUNT(*) AS ?usageCount)
        WHERE { ?s ?p ?o }
        GROUP BY ?p
      }

      # Compute the total number of triples (denominator for the percentage)
      {
        SELECT (COUNT(*) AS ?totalCount)
        WHERE { ?s ?p ?o }
      }

      BIND(?p AS ?startProperty)
      # 100.0 forces decimal arithmetic on engines that truncate integer division
      BIND((100.0 * ?usageCount / ?totalCount) AS ?usagePercent)
    }
    ORDER BY DESC(?usageCount)
    LIMIT 5
  }

  #####################################################################
  # 2. Use the ranked properties as starting points of closure
  #####################################################################

  ?subProperty rdfs:subPropertyOf* ?startProperty .
  BIND(?startProperty AS ?superProperty)
}
LIMIT 200

Run it

Top-5-Property-Hierarchies-by-Usage

Why It’s Useful

This is the ultimate "high-impact" exploration query. It directly connects the statistical backbone of the knowledge graph (the most used properties) with its semantic structure (the property hierarchies).

This allows you to:

Prioritize analysis: Immediately focus on the ontologies of the properties that matter most in practice.
Understand semantic depth: See if a heavily used property like dct:subject is a standalone predicate or the root of a deeper hierarchy.
Discover the "semantic backbone": The results show the main pillars of the graph (rdf:type, dbo:wikiPageWikiLink, etc.) and the full scaffolding that supports them.
Guide data integration: When mapping an external schema to DBpedia, this query tells you exactly which property families are the most important to align with.
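
The "standalone predicate or root of a deeper hierarchy" question can also be answered with a single number per property. The following sketch is an assumption-laden variant of the query above: it drops the percentage, counts proper descendants with +, and uses OPTIONAL so that properties without any sub-properties still appear. ?descendantCount is a hypothetical name.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?startProperty ?usageCount (COUNT(DISTINCT ?subProperty) AS ?descendantCount)
WHERE {
  {
    # Five most-used properties across the whole graph
    SELECT (?p AS ?startProperty) (COUNT(*) AS ?usageCount)
    WHERE { ?s ?p ?o }
    GROUP BY ?p
    ORDER BY DESC(?usageCount)
    LIMIT 5
  }
  # OPTIONAL keeps properties that have no sub-properties at all
  OPTIONAL { ?subProperty rdfs:subPropertyOf+ ?startProperty }
}
GROUP BY ?startProperty ?usageCount
ORDER BY DESC(?usageCount)
```

A ?descendantCount of 0 marks a standalone predicate; anything higher marks the root of a hierarchy worth expanding with the full closure query.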
