Replacing Semaphores in a DAG with Reservation Tables to Decrease Complexity, Allowing Easier Certification of Highly Complex Computerized Machines

Here is the LinkedIn version of our scientific publication about DAG conversion to deterministic reflexes.

We (Athos Silicon) can demonstrate that a directed acyclic graph (DAG), such as one used to represent task dependencies or instruction scheduling in computer architecture, can be transformed into a suite of deterministic loops. We will use graph theory and the concept of reservation tables (from computer architecture and instruction pipelining) to demonstrate how this transformation maintains dependencies and determinism.


🧠 Key Concepts

1) Graph Theory provides a powerful framework for modeling and analyzing the execution flow of ROS (Robot Operating System) nodes or middleware runnables in embedded and distributed systems. In this context, each node (in ROS) or runnable (in AUTOSAR-like systems) can be represented as a vertex in a directed graph, while edges represent dependencies, message flows, or triggering relationships between them.

Key Graph Concepts Applied:

  • Directed Acyclic Graph (DAG): Most ROS or middleware systems use a DAG to ensure predictable data and control flow. A DAG ensures there are no cyclic dependencies, which is essential for deterministic scheduling and deadlock avoidance.


  • Topological Sorting: By performing a topological sort on the graph, the system can determine a valid execution order for nodes or runnables that respects all data and control dependencies. This is critical for correct real-time behavior and coordination between components.
  • Critical Path Analysis: Identifying the longest path through the DAG helps detect latency bottlenecks and prioritize optimizations in high-performance or safety-critical systems.
  • Strongly Connected Components (SCCs): If cyclic behavior is allowed (e.g., feedback control loops), SCCs help analyze which components are interdependent and require careful timing or synchronization.

This abstraction allows designers to transform complex ROS systems or middleware into analyzable, schedulable, and verifiable execution models suitable for autonomous systems, robotics, and real-time safety-critical software.

2) Reservation Table: A reservation table is a key structure in out-of-order (OoO) execution engines, enabling instructions to be issued and executed as soon as their operands are ready, rather than strictly in program order. Each entry in the reservation table corresponds to a functional unit (e.g., ALU, FPU) and temporarily holds instructions waiting to be executed. When an instruction is decoded, it is placed into the reservation table along with metadata such as operand availability and destination register. If an operand is not yet available (e.g., it is still being produced by a previous instruction), the reservation station tracks the tag of the producing instruction. Once the operand becomes ready, typically signaled on the common data bus (CDB), the reservation station updates its state. When all operands are ready and the corresponding functional unit is free, the instruction can be issued for execution. This mechanism allows for non-blocking execution, exploiting instruction-level parallelism (ILP) and maximizing throughput by avoiding stalls due to data hazards. Additionally, it enables dynamic scheduling, where the actual execution order can differ from the program order without violating correctness. This flexibility is central to the performance advantages of OoO cores, particularly in complex workloads with many interdependent instructions and irregular data patterns.
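
As a rough sketch, an entry of this kind might be modeled as below; the field names are illustrative, not taken from any particular core:

#include <cstdint>

// Illustrative reservation-station entry (hypothetical field names).
struct ReservationEntry {
    bool          busy;        // entry currently holds a waiting instruction
    std::uint8_t  opcode;      // operation to perform once operands arrive
    bool          src1_ready;  // operand 1 available?
    std::uint32_t src1;        // operand 1 value, valid when src1_ready
    std::uint8_t  src1_tag;    // otherwise: tag of the producing instruction
    bool          src2_ready;  // operand 2 available?
    std::uint32_t src2;        // operand 2 value, valid when src2_ready
    std::uint8_t  src2_tag;    // otherwise: tag of the producing instruction
    std::uint8_t  dest_tag;    // tag broadcast on the CDB when this result completes
};

// When a result with tag t is broadcast on the common data bus,
// every waiting entry compares t against its pending operand tags.
void on_cdb_broadcast(ReservationEntry& e, std::uint8_t t, std::uint32_t value) {
    if (e.busy && !e.src1_ready && e.src1_tag == t) { e.src1 = value; e.src1_ready = true; }
    if (e.busy && !e.src2_ready && e.src2_tag == t) { e.src2 = value; e.src2_ready = true; }
    // The entry may issue once both operands are ready and its unit is free.
}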

3) Deterministic loops are loops whose behavior (output, timing, and state transitions) is completely predictable and repeatable for a given input. This predictability means that the loop always executes the same number of iterations and accesses memory or hardware resources in a known, structured way. Determinism is critical in systems requiring reliability, safety, and verification, such as real-time embedded systems, functional safety, and high-assurance computing. We call these deterministic loops "reflexes".

[Figure: Reflexes simplify computing and complexity]

  1. In contrast to dynamic or data-dependent loops, deterministic loops do not rely on unpredictable conditions or external inputs to determine how long they run. Instead, the control flow and resource usage are fully known at compile time or analyzable at design time. This allows for formal verification, timing analysis, and schedulability guarantees.
  2. For example, a deterministic loop may iterate over a fixed-size array with no branches depending on runtime data (see the sketch after this list). This allows tools or engineers to guarantee worst-case execution time (WCET), cache access patterns, or power consumption, which is crucial for systems like autonomous vehicles, avionics, or robotics, where timing and reliability are non-negotiable.
  3. In hardware-aware software design, deterministic loops map efficiently to pipelines and SIMD units, and can be analyzed using reservation tables, allowing the architecture to achieve high throughput with formal guarantees of correctness and performance.
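
A minimal sketch of such a reflex loop; the trip count N and the function name are illustrative:

#include <cstddef>

constexpr std::size_t N = 64;  // trip count fixed at compile time

// A "reflex": fixed iteration count, no data-dependent branches, and
// sequential accesses to fixed-size buffers, so the iteration count,
// WCET, and memory access pattern are all analyzable at design time.
void reflex_scale(const float (&in)[N], float (&out)[N], float gain) {
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = gain * in[i];  // identical work on every iteration, every run
    }
}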


🧩 Step-by-Step Transformation

1. Represent the Computation as a DAG

Let G=(V,E), where:

  • Each node v ∈ V is a task or instruction.
  • Each directed edge (u, v) ∈ E means that task v depends on task u.

This captures all data and control dependencies.
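
For instance, the DAG might be stored as an adjacency list; the following minimal sketch uses integer task IDs:

#include <vector>

// adj[u] lists the tasks that depend directly on u, one entry per edge (u, v).
using Dag = std::vector<std::vector<int>>;

// A small example: edges 0->1, 0->2, 1->3, 2->3 (a diamond of dependencies).
Dag make_example_dag() {
    Dag adj(4);
    adj[0] = {1, 2};
    adj[1] = {3};
    adj[2] = {3};
    return adj;
}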


2. Compute a Topological Sort

Use Kahn's algorithm or DFS-based topological sort to linearize the graph:

The result is a linear order v1, v2, …, vn in which every edge (u, v) points forward, i.e., u appears before v.


This ensures that no instruction is scheduled before its dependencies.
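
A standard implementation of Kahn's algorithm over the adjacency-list sketch above:

#include <queue>
#include <vector>

// Kahn's algorithm: repeatedly emit a vertex whose dependencies are all
// satisfied, then release the edges leaving it.
std::vector<int> topo_sort(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<int> indegree(n, 0);
    for (const auto& out : adj)
        for (int v : out) ++indegree[v];

    std::queue<int> ready;                    // vertices with no pending deps
    for (int v = 0; v < n; ++v)
        if (indegree[v] == 0) ready.push(v);

    std::vector<int> order;
    while (!ready.empty()) {
        int u = ready.front(); ready.pop();
        order.push_back(u);
        for (int v : adj[u])                  // u is done: release its edges
            if (--indegree[v] == 0) ready.push(v);
    }
    return order;  // fewer than n entries would mean a cycle, impossible in a DAG
}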


3. Create Deterministic Loops

From the topological order, you can unroll the DAG into one or more loops, where each loop:

  • Executes a subset of independent tasks in parallel.
  • Advances in time, respecting dependency delays.

Loop Form:

for (int t = 0; t < T; ++t) {
    // execute every v_i whose scheduled time slot is t; by construction,
    // all of its dependencies completed at earlier steps
    execute_op(v_i);
}


Each execute_op(v_i) happens only when its dependencies are met, controlled by loop-carried conditions or predicates derived from the graph edges.
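
A runnable sketch of this transformation: give each task the time slot 1 + the latest slot among its predecessors, then walk the slots in a deterministic outer loop. Here execute_op is a stand-in for the real work:

#include <algorithm>
#include <cstdio>
#include <vector>

void execute_op(int v) { std::printf("executing task %d\n", v); }

// level[v] = earliest step at which v may run (0 for source tasks).
// Tasks that share a level are mutually independent and could run in parallel.
void run_as_deterministic_loops(const std::vector<std::vector<int>>& adj,
                                const std::vector<int>& topo_order) {
    std::vector<int> level(adj.size(), 0);
    for (int u : topo_order)
        for (int v : adj[u])
            level[v] = std::max(level[v], level[u] + 1);

    int T = 0;  // number of time steps = critical-path length
    for (int l : level) T = std::max(T, l + 1);

    for (int t = 0; t < T; ++t)                        // the deterministic loop
        for (int v = 0; v < static_cast<int>(adj.size()); ++v)
            if (level[v] == t) execute_op(v);
}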


4. Build the Reservation Table

For each node v, determine:

  • When it can be executed (based on when its dependencies are complete).
  • Which resources (e.g., ALU, Memory) it needs.

Use a reservation table:

  • Rows = time cycles
  • Columns = resources
  • Mark a 1 if the resource is used by vi at that time.

Ensure:

  • No resource conflicts
  • No dependency violations

Example:

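One possible layout for the diamond-shaped example DAG above (edges 0->1, 0->2, 1->3, 2->3), assuming two resources and at most one task per resource per cycle:

Cycle | ALU | MEM
------+-----+-----
  0   | v0  |  -
  1   | v1  | v2
  2   | v3  |  -

v0 runs first; v1 and v2 are independent, so they share cycle 1 on different units; v3 waits until both predecessors are done. In code, the table can be a small 2-D array with a conflict check on every reservation (a minimal sketch):

#include <array>
#include <vector>

constexpr int kResources = 2;  // column 0 = ALU, column 1 = MEM

// table[t][r] holds the task occupying resource r at cycle t, or -1 if free.
using ReservationTable = std::vector<std::array<int, kResources>>;

// Reserve resource r at cycle t for task v; returns false on a conflict.
bool reserve(ReservationTable& table, int t, int r, int v) {
    if (t >= static_cast<int>(table.size()))
        table.resize(t + 1, {-1, -1});
    if (table[t][r] != -1) return false;  // slot already taken: resource conflict
    table[t][r] = v;
    return true;
}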

5. Formal Proof (Sketch)

Using graph theory:

  • Since G is a DAG, topological sort exists.
  • Every node has a finite dependency distance from its predecessors.
  • A deterministic schedule exists such that: ∀(u,v)∈E, schedule(v)>schedule(u)
  • With fixed time steps and finite resources, and the help of the reservation table, we can construct loop-based execution with known behavior and no hazards.

Thus, G → Deterministic Loops is a valid transformation.
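
In LaTeX form, the schedule used in this proof sketch is just the level assignment from Step 3 (a worked statement, not a full proof):

\[
\text{schedule}(v) =
\begin{cases}
0 & \text{if } v \text{ has no predecessors,} \\
1 + \max_{(u,v) \in E} \text{schedule}(u) & \text{otherwise.}
\end{cases}
\]
% For every edge (u, v): schedule(v) >= schedule(u) + 1 > schedule(u),
% so all dependencies are respected, and the number of time steps
% equals the critical-path length depth(G).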


📦 Resulting Execution Model

You get:

  • A schedule for the graph using a loop construct.
  • A reservation table ensuring deterministic timing and safe resource use.
  • The equivalent of static instruction scheduling used in compilers and VLIW architectures.


📘 Applications

  • Compiler optimizations (e.g., software pipelining)
  • Hardware synthesis (e.g., HLS tools)
  • Task scheduling in real-time OS or embedded systems
  • Dataflow architectures

Let's analyze the Big-O complexity improvement when transforming a computation represented as a dependency DAG into a suite of deterministic loops with reservation tables.


🎯 Goal

We want to estimate the speedup or complexity improvement of:

Executing a dependency graph serially (naive execution) vs. executing it as scheduled deterministic loops using reservation tables (parallelized, pipelined execution)

🧠 Assumptions and Definitions

Let:

  • G=(V,E): A DAG with ∣V∣=n nodes (tasks) and dependencies E.
  • T_serial: Time to execute all n tasks serially (naively).
  • T_parallel: Time to execute with deterministic loops guided by reservation tables.
  • p: Number of independent tasks that can be executed in parallel per cycle (determined by the DAG and available resources).
  • depth(G): The critical path length of the graph — the longest dependency chain.


🧮 Serial Execution Complexity

A naive approach (like depth-first traversal):


T_serial = O(n)

Each node is executed one after the other, regardless of whether it could have been done earlier.


🔁 Deterministic Loop Execution with Reservation Table

This enables:

  • Parallel execution of independent tasks
  • Proper scheduling of dependent tasks
  • Avoidance of hazards and stalls

So, the complexity is:

T_parallel = O(depth(G) + n / p)

Why?

  • You must at least wait for the critical path to complete.
  • Execution time is determined by how many tasks can be executed per cycle and how dependencies align.


📊 Big-O Improvement

Let’s define:

  • w=width(G): The maximum number of independent tasks at any level (a proxy for available parallelism).
  • r: Resource bound per cycle (e.g., number of ALUs).

Then:

p = min(w, r), so T_parallel = O(depth(G) + n / min(w, r))

So, the theoretical improvement is:


Speedup = T_serial / T_parallel = O(n / (depth(G) + n / min(w, r)))

This is the speedup gained by converting to deterministic loops with dependency-aware scheduling.
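
A sketch of how depth(G) and width(G) could be measured on the adjacency-list representation used earlier, reusing the level assignment from Step 3:

#include <algorithm>
#include <utility>
#include <vector>

// Returns {depth(G), width(G)}: the critical-path length in levels, and the
// maximum number of mutually independent tasks found in any single level.
std::pair<int, int> depth_and_width(const std::vector<std::vector<int>>& adj,
                                    const std::vector<int>& topo_order) {
    std::vector<int> level(adj.size(), 0);
    for (int u : topo_order)
        for (int v : adj[u])
            level[v] = std::max(level[v], level[u] + 1);

    int depth = 0;
    for (int l : level) depth = std::max(depth, l + 1);

    std::vector<int> per_level(depth, 0);
    int width = 0;
    for (int l : level) width = std::max(width, ++per_level[l]);
    return {depth, width};
}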


🚀 Example

Let’s take:

  • n=1000 tasks
  • depth(G)=100
  • w=100 (max parallelism per level)
  • r=10 resources (e.g., 10 ALUs)

Then:

T_serial = O(1000)
p = min(w, r) = min(100, 10) = 10
T_parallel = O(100 + 1000 / 10) = O(200)
Speedup ≈ 1000 / 200 = 5×

Even though the graph exposes up to 100-way parallelism per level, the 10 available ALUs cap the effective speedup at about 5×.

📘 Summary


Approach                                  | Complexity
------------------------------------------+---------------------------------
Serial execution                          | O(n)
Deterministic loops + reservation table   | O(depth(G) + n / min(w, r))

In practice, the actual complexity is bounded by both the DAG's structural parallelism and available hardware resources.

And of course, all of these apparatuses are patented.
