A Use-Case for Self-referential Structures and Recursion in RDBMSs

Evan Howlett

Published Jul 1, 2025

The Elegance of Recursion

Recursion gets a bad rap, but it's perfect for certain problems. Let me show you one. It’s a technique that allows us to solve complex problems by breaking them into smaller, self-similar pieces. In functional programming, recursion isn’t just a tool -- it’s a way of thinking. At the heart of this paradigm lies the linked list, a recursive data structure that perfectly encapsulates the essence of functional design. But what makes linked lists so special, and how can their recursive nature be applied to real-world problems, like modeling an organizational hierarchy in a database? We’ll explore the beauty of linked lists, debunk myths about recursion’s performance, and demonstrate how to leverage recursive structures in SQL to represent something practical.

The Linked List: The Poster Child of Functional Programming

In functional programming, recursive structures are ubiquitous. If I were to crown one data structure as the quintessential representation of functional programming, it would undoubtedly be the linked list. Why? Because its recursive nature aligns seamlessly with the principles of immutability, composition, and simplicity that are fundamental to functional programming.

Defining a Linked List

To understand the linked list, let me try to visualize its definition. In pseudocode, a linked list can be defined as follows:

type LinkedList = 
    State1 => (Value, LinkedList)
    or State2 => End

This definition describes a linked list as a type with two possible states:

State1: A node that contains a value and a reference to another linked list (the "next" node).
State2: The end of the list, often called the "nil" or "empty" state.

In essence, a linked list is either a node holding a value and pointing to another linked list or a marker indicating the list’s end. This recursive definition allows the structure to grow indefinitely while maintaining a simple, elegant design.

For example, consider how we might instantiate a linked list with three values:

let x = State1(1, State1(2, State1(3, State2)))

This creates a linked list containing the values 1, 2, and 3. Visually, it can be represented as:

State1(1) -> State1(2) -> State1(3) -> State2

Or, in a more detailed breakdown:

First Element: State1(1, pointer)
                           |
                           v
Second Element: State1(2, pointer)
                           |
                           v
Third Element: State1(3, pointer)
                           |
                           v
End: State2

From Theory to Practice

Now that we’ve established what a linked list is, I'd like to address a common misconception about recursion before exploring its practical application. Many developers shy away from recursion, believing that it's inherently slow or resource-intensive.

Is Recursion Really Slow?

A common criticism of recursion is that it’s slower and more memory-intensive than iterative approaches. This perception isn’t entirely baseless, but it’s not inherently true either.

In functional programming, recursion is often implemented using tail recursion. This is a technique that can be employed when the recursive call is the last operation in the function. When a compiler or runtime supports tail-call optimization (TCO), it can transform a tail-recursive function into assembly code that’s equivalent to a loop. This means:

No additional stack frames: The call stack doesn’t grow with each recursive call, preventing stack overflow.
Comparable performance: The generated machine code is as efficient as an iterative loop, with no difference in performance or memory usage.

Unfortunately, not all programming languages or runtimes implement TCO. Popular languages like Python or JavaScript (in some environments) lack this optimization, leading to the perception that recursion is slow. This is an implementation detail, not a fundamental flaw of recursion itself. In languages like Haskell, Scheme, or certain JavaScript runtimes (e.g., Node.js with specific configurations), tail-recursive functions can perform just as well as their iterative counterparts.

Because of these tooling limitations, many developers avoid recursion, missing out on its elegance and clarity. This creates a blind spot in design pedagogy, especially for problems where recursion is not just a solution but the most natural solution. So, what’s a real-world problem where recursion shines?

Modeling an organizational hierarchy!

Recommended by LinkedIn

Harmonizing Data Symphony: Choosing Between R and SQL…

Salmane Koraichi 2 years ago

OOP & SQL: Building Your Core IT Concepts

L. F. N. 1 year ago

SAS Viya empowers open-source

Deep Jariwala 2 years ago

A Real-World Use Case: Modeling an Org Chart

Imagine you’re tasked with representing a company’s organizational structure in a database. Most organizations follow a hierarchical model: employees report to managers, who report to higher-level managers, and so on, until you reach the CEO or company head. This hierarchy resembles a chain:

Employee -> Manager -> Manager’s Manager -> CEO

Does this look familiar? It’s strikingly similar to our linked list:

State1(value) -> State1(value) -> State1(value) -> State2

This similarity suggests that a recursive data structure, like a linked list, is a natural fit for modeling an org chart. By representing each employee as a node and their manager as the next node in the chain, we can traverse the hierarchy recursively to query relationships, such as “Who is my boss’s boss?” or “Who reports to this manager?”

Bringing Recursion to the Database

Now that we’ve identified the org chart as a recursive problem, how do we implement it in a database? SQL, the language of relational databases, might not seem like an obvious candidate for recursion, but modern SQL standards support recursive queries through Common Table Expressions (CTEs).

Implementing the Org Chart in SQL

To represent an organizational hierarchy, we need a table that captures the recursive relationship between employees and their managers. Here’s a simplified schema in SQL-like pseudocode:

CREATE TABLE Employee (
    id INT PRIMARY KEY,
    name TEXT,
    manager_id INT REFERENCES Employee(id)
);

This table has three columns:

id: A unique identifier for each employee.
name: The employee’s name.
manager_id: A foreign key referencing the id of the employee’s manager. If the employee has no manager (e.g., the CEO), this field is NULL.

Let’s populate the table with a small example:

INSERT INTO Employee (id, name, manager_id) VALUES
    (1, 'John', NULL),  -- CEO, no manager
    (2, 'Jack', 1),     -- Reports to John
    (3, 'Jake', 2);     -- Reports to Jack

This data represents a simple hierarchy: Jake reports to Jack, who reports to John, the CEO. Visually, it’s:

Jake -> Jack -> John (CEO)

To query the entire hierarchy, we can use a recursive CTE to traverse the chain from the top (the CEO, where manager_id is NULL) down to the lowest-level employee:

WITH RECURSIVE org AS (
    -- Base case: Start with the employee who has no manager (CEO)
    SELECT id, name, manager_id
    FROM Employee
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Recursive case: Join employees with their managers
    SELECT e.id, e.name, e.manager_id
    FROM Employee e
    INNER JOIN org o ON e.manager_id = o.id
)
SELECT id, name, manager_id
FROM org;

This query works as follows:

Base case: Select the employee with no manager (John, the CEO).
Recursive case: For each employee in the org CTE, find employees who report to them by joining on manager_id = id.
Union: Combine the base and recursive cases, repeating the recursive step until no more employees are found.
Result: Output the hierarchy in order, from CEO to subordinates.

Running this query on our example data would produce:

id name manager_id 
1 John NULL 
2 Jack 1 
3 Jake 2

This result mirrors the linked list structure, with each row representing a node and the manager_id acting as the pointer to the next node (or NULL for the end).

Our simple example assumes a single reporting chain, but real organizations are often more complex, with multiple levels and branches (e.g., teams, departments, locations). How does our recursive model scale to these scenarios? Unfortunately, that's a question for another article. You'll just have to tune in next time to find out!

Rick Marshall BSc BE 10mo

OK. Shoot me. This is just inaccurate and wrong. I can't even start to list the issues. And from someone who has made an art form out of linked lists.

3 Reactions

Ihe Onwuka 10mo

Self referential data structures are natural and powerful - and are a much more natural fit for things that devs use arrays for. Recursive programming and self-referential data structures do not belong in an RDBMS. Render to Caesar what is Caesars - not this.

1 Reaction

Kevin Ferron 10mo

How many RDBMS’s I’ve seen built out to normalization that end up needing a tree after having built static categorization? Pretty much all of them that became mature systems

2 Reactions

Ron Thompson 🦀 10mo

I'd never allow it in production. It's far too easy to miss the base case. If you're writing functional code in some languages, then you have no choice, but that's another reason to not write functional code in a language that forces you to do this. *All* forms of recursion can be expressed with an explicit stack, unrolled in a loop. Which both disallows the "messed up base case" AND allows you to more easily tune the memory usage/stack depth limits.

A Use-Case for Self-referential Structures and Recursion in RDBMSs

Evan Howlett

The Elegance of Recursion

The Linked List: The Poster Child of Functional Programming

Defining a Linked List

From Theory to Practice

Is Recursion Really Slow?

Recommended by LinkedIn

A Real-World Use Case: Modeling an Org Chart

Bringing Recursion to the Database

Implementing the Org Chart in SQL

More articles by Evan Howlett

Others also viewed

Why Every Data Aspirant Should Embrace Object-Oriented Programming for Clean Code

Understanding Data Structures in C: A Comprehensive Guide

SAS Viya for the Open-Source Programmer

Why use Python instead of Excel?

Data virtualisation in Python

Choosing the Right Programming Language for Your Data Engineering Pipeline: Python vs SQL

SQL and Python

Correlation Matrix, analyzing data using Power BI and Python: a comparative analysis.

Why Python should be the preferred programming language for your Big Data Analytics investments.

Explore content categories

The Elegance of Recursion

The Linked List: The Poster Child of Functional Programming

Defining a Linked List

From Theory to Practice

Is Recursion Really Slow?

Recommended by LinkedIn

A Real-World Use Case: Modeling an Org Chart

Bringing Recursion to the Database

Implementing the Org Chart in SQL

More articles by Evan Howlett

Scaling Up the Org Chart: From Simple Chains to Flexible Units

Others also viewed

Why Every Data Aspirant Should Embrace Object-Oriented Programming for Clean Code

Understanding Data Structures in C: A Comprehensive Guide

SAS Viya for the Open-Source Programmer

Why use Python instead of Excel?

Data virtualisation in Python

Choosing the Right Programming Language for Your Data Engineering Pipeline: Python vs SQL

SQL and Python

Correlation Matrix, analyzing data using Power BI and Python: a comparative analysis.

Why Python should be the preferred programming language for your Big Data Analytics investments.

Explore content categories