Benchmarking Topological Deep Learning
While TDA and Deep Learning rely on a wide range of mathematical structures and lifting/transforming algorithms, TopoBench simplifies the entire research cycle. It automates the design and evaluation process by providing a ready-to-use pipeline for configuring and training various graph and topological models.
What you will learn: The basics of TopoBench architecture, configuration, training and evaluation with a simple dataset.
👉 The full article, featuring design principles, detailed implementation, in-depth analysis, and Q&A, is available on the Substack article Benchmarking Topological Deep Learning
🎯 Overview
Topological Data Analysis (TDA) and Deep Learning encompass a diverse array of mathematical frameworks, higher-order structures, and sophisticated lifting and transformation algorithms. TopoBench streamlines this complexity by automating the design and assessment of topological architectures through an integrated pipeline for the configuration, training, and evaluation of Graph and Topological Neural Networks.
🎨 Modeling & Design Principles
TopoBench is a modular Python framework built to standardize benchmarks and streamline research within Topological Deep Learning (TDL). It enables the seamless training and comparative analysis of various Topological Neural Networks (TNNs) across multiple domains, including graphs, simplicial complexes, cellular complexes, and hypergraphs [ref 1].
Topological Domains
Topological Data Analysis (TDA) is a methodology that applies concepts from algebraic topology and computational geometry to analyze and extract meaningful patterns from complex datasets. It provides a geometric and topological perspective to study the shape and structure of data.
The most common topological domains supported in TopoX and TopoBench are the following.
A simplicial complex is, informally, a graph with faces. It generalizes graphs by modeling higher-order relationships among data elements: not just pairwise edges, but also triplets, quadruplets, and beyond (0-simplex: node, 1-simplex: edge, 2-simplex: triangle, 3-simplex: tetrahedron) [ref 2].
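To make this concrete, here is a minimal plain-Python sketch (not the TopoNetX or TopoBench API): a simplicial complex can be stored as the downward closure of its maximal simplices, since every subset of a simplex must itself be a simplex.

```python
from itertools import combinations

def downward_closure(maximal_simplices):
    """Build every simplex (as a frozenset) from the maximal ones."""
    simplices = set()
    for simplex in maximal_simplices:
        for k in range(1, len(simplex) + 1):
            for face in combinations(sorted(simplex), k):
                simplices.add(frozenset(face))
    return simplices

# Two triangles (2-simplices) sharing the edge {1, 2}
complex_ = downward_closure([(0, 1, 2), (1, 2, 3)])
nodes = [s for s in complex_ if len(s) == 1]      # 0-simplices
edges = [s for s in complex_ if len(s) == 2]      # 1-simplices
triangles = [s for s in complex_ if len(s) == 3]  # 2-simplices
print(len(nodes), len(edges), len(triangles))     # 4 5 2
```

Note how the shared edge {1, 2} appears only once: the complex is a set of simplices, not a multiset of faces.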
Cell complexes (or CW complexes) represent objects of flexible shape which are built out of basic ball-shaped building blocks (cells) of arbitrary dimension. Cells of different dimensions are rigidly related. For example, an area is enclosed by lines, which in turn are enclosed by points. This rigid structure describes the underlying topology. These complexes can be used as an alternative to Graph Neural Networks when data modeling requires high-order relationships [ref 3].
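The rigid dimensional hierarchy described above can be sketched in plain Python (an illustrative data layout, not the TopoNetX CellComplex API): each cell records its boundary, and every boundary cell lives exactly one dimension lower.

```python
# A square: one 2-cell (face) bounded by four 1-cells (edges),
# each bounded by 0-cells (vertices). Keys are dimensions.
cell_complex = {
    0: {"v0": [], "v1": [], "v2": [], "v3": []},
    1: {"e0": ["v0", "v1"], "e1": ["v1", "v2"],
        "e2": ["v2", "v3"], "e3": ["v3", "v0"]},
    2: {"f0": ["e0", "e1", "e2", "e3"]},
}

def is_well_formed(cc):
    """Every cell's boundary must consist of cells one dimension lower."""
    return all(
        all(b in cc[dim - 1] for b in boundary)
        for dim in cc if dim > 0
        for boundary in cc[dim].values()
    )

print(is_well_formed(cell_complex))  # True
```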
A hypergraph is a generalization of a graph in which a hyperedge can connect any number of nodes or vertices. As with cell and simplicial complexes, a node and a hyperedge are said to be incident if the vertex is a member of the hyperedge. Hypergraphs are well suited to encoding relationships that are not strictly pairwise, such as a chemical reaction involving multiple molecules or a conference call with many participants [ref 4].
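The incidence relation just described is often materialized as a node-by-hyperedge matrix. A minimal stdlib sketch (illustrative only, not the TopoBench representation):

```python
def incidence_matrix(num_nodes, hyperedges):
    """Rows index nodes, columns index hyperedges;
    entry is 1 if the node is incident to the hyperedge."""
    return [[1 if v in he else 0 for he in hyperedges]
            for v in range(num_nodes)]

# A 'conference call' hyperedge {0, 1, 2, 3} and a pairwise edge {2, 4}
H = incidence_matrix(5, [{0, 1, 2, 3}, {2, 4}])
print(H[2])  # [1, 1] -> node 2 is incident to both hyperedges
```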
Combinatorial complexes are structures used to represent topological spaces—like surfaces or multi-dimensional shapes—by breaking them down into discrete pieces like points, line segments, triangles, and their higher-dimensional counterparts. They are more flexible than Simplicial and Cell complexes.
📌 This section emphasizes complexes, assuming they are less familiar to the general reader. Note, however, that TopoBench also natively supports traditional topological domains, including graphs and point clouds. Graphs (nodes and edges) and point clouds (nodes only) are, after all, the simplest forms of complexes.
Topological Lifting
Lifting is the mathematical and architectural process of transforming a graph (composed of nodes and edges) into a higher-order topological structure supporting cells, faces, volumes or hyperedges.
The most common scenario is the topological lifting of a graph to a simplicial complex, as described and illustrated, along with the resulting components, in a previous article [ref 5].
There are many lifting techniques; the most common include clique-based and neighborhood-based (k-hop) constructions.
While graph-to-topological-complex transformations are the standard entry point for TDL, the lifting process is not limited to graphs; one topological complex can be further lifted into another, more sophisticated higher-order structure.
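As a concrete example of a graph-to-hypergraph lifting, here is a stdlib sketch of the idea behind k-hop lifting: each node generates one hyperedge containing itself and every vertex within k hops. This is an illustration of the concept, not TopoBench's HypergraphKHopLifting implementation.

```python
from collections import deque

def k_hop_lifting(adjacency, k):
    """One hyperedge per node: the node plus all vertices within k hops."""
    def k_hop_ball(start):
        dist = {start: 0}
        queue = deque([start])
        while queue:                      # breadth-first search up to depth k
            u = queue.popleft()
            if dist[u] == k:
                continue
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return frozenset(dist)
    return {v: k_hop_ball(v) for v in adjacency}

# Path graph 0-1-2-3
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
hyperedges = k_hop_lifting(adj, 1)
print(sorted(hyperedges[1]))  # [0, 1, 2]
```

Increasing k grows each hyperedge: with k=2, node 0's hyperedge already covers {0, 1, 2}.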
📌 This article focuses on graphs and topological complexes. However, a simpler topological structure, such as a point set or point cloud, can also be lifted into a graph or any of the complexes.
TopoBench
TopoBench provides a unified benchmarking infrastructure for Topological Deep Learning (TDL) by integrating and expanding upon current software tools. It combines NetworkX for graph processing with the TopoX suite: TopoNetX [ref 6] for building complex structures and TopoModelX for model implementation. Additionally, it supports PyTorch Geometric (PyG) models [ref 7] and original research code out of the box.
At its core, TopoBench is a unified and flexible workflow that supports a variety of datasets, data transformations, and preprocessing methods, along with deep learning models (e.g., Graph Neural Networks) and customizable metrics.
The key components of this pipeline are illustrated below.
A standout capability of TopoBench is its support for 'lifting,' which allows users to transform basic graph data into higher-order topological structures. This process enables the elevation of both raw features and the underlying connectivity, with the primary transformation routes detailed in the diagram below.
⚙️ Hands‑on with Python
Environment
# 1. Setup uv if not installed
wget -qO- https://astral.sh/uv/install.sh | sh
or
pip install uv
or
brew install uv # MacOS
# 2. Load the source code
git clone git@github.com:geometric-intelligence/TopoBench.git
or
git clone https://github.com/geometric-intelligence/TopoBench.git
cd TopoBench
# 3. Setup the virtual environment
uv venv --python 3.11
source .venv/bin/activate
# 4. Sync dependencies with suitable version (torch ...)
uv sync --all-extras
Wrapper
The configuration of TopoBench for evaluating graphs or topological structures such as Simplicial Complexes can be tedious. Let’s automate and componentize the TopoBench functionality by wrapping it into a class, TopoBenchWrapper.
The default constructor takes two arguments, a transform descriptor and an evaluator descriptor. Example of descriptors:
transform_desc = {
    "khop_lifting": {
        "transform_type": "lifting",
        "transform_name": "HypergraphKHopLifting",
        "k_value": k_value
    }
}
evaluator_desc = {
    "task": "classification",
    "num_classes": 2,
    "metrics": ["accuracy", "precision", "recall", "f1"]
}
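The evaluator descriptor above requests four standard classification metrics. For binary classification they reduce to simple ratios over the confusion-matrix counts; a stdlib sketch (illustrative only, not the TopoBench evaluator, which relies on torchmetrics-style components):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

scores = binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(scores["accuracy"])  # 0.6
```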
🔎 The class TopoBenchConfig described in the appendix automates the configuration of TopoBench from the list of descriptors.
The alternative constructor, build, loads a predefined, parameterized configuration of TopoBench defined in __get_config_descriptors. This implementation is an example of parameterizing the TopoBench configuration with k_value for k-hop lifting, the dataset name data_name, and the learning rate lr used by the optimizer.
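A hypothetical skeleton of such a wrapper is sketched below. The class and method names mirror the article (TopoBenchWrapper, build, __get_config_descriptors), but the body is an assumption for illustration, not the author's actual implementation; where data_name and lr are stored is likewise a guess.

```python
class TopoBenchWrapper:
    """Hypothetical skeleton; the real class wraps TopoBench configuration."""

    def __init__(self, transform_desc: dict, evaluator_desc: dict):
        self.transform_desc = transform_desc
        self.evaluator_desc = evaluator_desc

    @classmethod
    def build(cls, k_value: int, data_name: str, lr: float):
        transform_desc, evaluator_desc = cls.__get_config_descriptors(k_value)
        wrapper = cls(transform_desc, evaluator_desc)
        wrapper.data_name = data_name  # assumption: kept for later use
        wrapper.lr = lr                # assumption: passed to the optimizer
        return wrapper

    @staticmethod
    def __get_config_descriptors(k_value: int):
        transform_desc = {
            "khop_lifting": {
                "transform_type": "lifting",
                "transform_name": "HypergraphKHopLifting",
                "k_value": k_value,
            }
        }
        evaluator_desc = {
            "task": "classification",
            "num_classes": 2,
            "metrics": ["accuracy", "precision", "recall", "f1"],
        }
        return transform_desc, evaluator_desc

wrapper = TopoBenchWrapper.build(k_value=2, data_name="MUTAG", lr=1e-3)
```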
🔎 Training Implementation
With the necessary components assembled, we can now proceed to train the lightning_graph_model. This model is derived from the original PyTorch module via a conversion process detailed in the Appendix. The training configuration is defined by three primary hyperparameters:
Note: In PyTorch Lightning, metrics are efficiently collected through the trainer's callback_metrics attribute.
📈 Evaluation
Datasets
The PyTorch Geometric library contains a rich set of graph datasets covering node classification, edge prediction, and graph classification for graph structures with varying degrees of homophily [ref 8].
TUDataset is a collection of over 120 datasets of varying sizes, drawn from a wide range of applications related to graph classification and regression. My evaluation uses two of these datasets: MUTAG and PROTEINS [ref 9].
MUTAG is a collection of 188 nitroaromatic chemical compounds. The primary goal is a binary classification task: predicting whether a given molecule has a mutagenic effect (specifically on the Salmonella typhimurium bacterium).
The PROTEINS dataset deals with much larger macromolecular structures, making it a more rigorous test of a model's ability to handle scale and complexity. The task is to determine whether a protein is an enzyme or a non-enzyme.
📌 I purposely selected the MUTAG dataset used in one of the tutorials of TopoBench so the reader can validate the results.
🔎 Training & Testing
The first step is to select a simple multi-layer perceptron model with two linear layers and their associated activation functions.
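To show what such a model computes, here is a stdlib sketch of the forward pass of a two-layer perceptron with a ReLU activation. In practice this would be built from torch.nn modules; the weights below are made up for the example.

```python
def linear(x, W, b):
    """y = W x + b for a single input vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def mlp_forward(x, W1, b1, W2, b2):
    """Two linear layers with a ReLU activation in between."""
    return linear(relu(linear(x, W1, b1)), W2, b2)

# Tiny deterministic example: 2 inputs -> 2 hidden units -> 1 output
W1, b1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]
W2, b2 = [[1.0, 2.0]], [0.1]
y = mlp_forward([2.0, 1.0], W1, b1, W2, b2)
print(y)  # [4.1]
```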
Finally, let’s train and evaluate on these two datasets with slightly different training parameters.
"data_domain": "graph",
"data_type": "TUDataset",
"data_name": "MUTAG",
"data_dir": "./data/MUTAG/"
💎 Evaluation of TopoBench model on Proteins dataset is available at Benchmarking Topological Deep Learning - Training & Test
📘 References
💎 Key takeaways, exercises, and a paper review are available in the original Substack article Benchmarking Topological Deep Learning
⏭️ Share in the comments the next topic you’d like me to tackle.
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design, and end-to-end deployment and support, with extensive knowledge in machine learning. He has been director of data engineering at Aideo Technologies since 2017, and he is the author of "Scala for Machine Learning" (Packt Publishing, ISBN 978-1-78712-238-3) and of the Hands-on Geometric Deep Learning newsletter.
Original substack article: https://tinyurl.com/2nrrb2hc