DFA Minimization techniques

Mahmoud Ashour

Published Jun 13, 2020

Introduction :

In the theory of computation, a deterministic finite automaton (DFA)—which is also known as deterministic finite acceptor (DFA) or deterministic finite-state automaton (DFSA)—is a finite state machine that either accepts or rejects a given symbols, Through a state sequence determined by the given symbols .The Deterministic refers to the uniqueness of the computation run.

What is Minimum DFA

DFA minimization means converting a given deterministic finite automata to its equivalent DFA but with a minimum number of states. Apparently, two DFAs are called equivalent ,if and only if they recognize the same regular language. For each regular language, there also exists a minimal automaton that accepts it, that is, a DFA with a minimum number of states and this DFA is unique. The minimal DFA makes sure to achieve the minimal cost for tasks assigned. In DFA , There are two main types of states that can be removed or merged from the original DFA without affecting the language.The first is Unreachable states which are states that are not reachable from the initial state of the DFA. The second is Non Distinguishable states which are those that cannot be distinguished from one another for any input. DFA minimization is done in three steps, which equals the removal or merger of the relevant states. Since the elimination of non distinguishable states is computationally the most expensive one, it is usually done as the last step.

Minimization of Reachable & Unreachable states

The Unreachable states occur in the state p of a DFA M=(Q, Σ, δ, q0, F) ,if no string w in Σ* exists where p=δ(q0, w). Where Q is the set of states, Σ is the set of symbols that were input, δ is the transition function which is mapping a state and an input symbol to a set of various states, δis its extension to strings, q0 is the initial state of DFA, and F is the set of final states. The Reachable states can be obtained with the following algorithm:

let reachable_states := {q0};

let new_states := {q0};

do {

temp := the empty set;

for each q in new_states do

for each c in Σ do

temp := temp ∪ {p such that p = δ(q,c)};

end;

new_states := temp \ reachable_states;

reachable_states := reachable_states ∪ new_states;

} while (new_states ≠ the empty set);

unreachable_states := Q \ reachable_states;

Unreachable states then can be removed normally from the DFA without affecting the language that it accepts.

Hopcroft's algorithm

Regarding the non distinguishable states , an algorithm for merging the of a DFA belongs to the American computer scientist Hopcroft, is based on dividing the DFA states into several groups by their behavior which is called partition refinement. These groups act whereby every two states of the same partition are equal if and only if they have the same behavior for all the input sequences. Which means, for every two states p1 and p2 that belong to the same class within the partition P, and every input word w, the transitions which are determined by w should always take states p1 and p2 to equal states, states that equally both accept, or states that both reject and that represent equivalence classes of the Myhill–Nerode equivalence relation,.

The following pseudocode describes the algorithm:

P := {F, Q \ F};

W := {F, Q \ F};

while (W is not empty) do

choose and remove a set A from W

for each c in Σ do

let X be the set of states for which a transition on c leads to a state in A

for each set Y in P for which X ∩ Y is nonempty and Y \ X is nonempty do

replace Y in P by the two sets X ∩ Y and Y \ X

if Y is in W

replace Y in W by the same two sets

else

if |X ∩ Y| <= |Y \ X|

add X ∩ Y to W

else

add Y \ X to W

end;

Explanation of the algorithm : It starts with a partition that is too occupied where each pair of states which are equal belong to the same set in the partition, but pairs that are inequivalent might also belong to the same set , at each step it splits sets of states into pairs of subsets that are necessarily inequivalent. The initial partition is a separation of the states into two subsets of states that clearly do not have the same behavior as each other: the accepting states and the rejecting states.The algorithm then chooses a set A from the current partition and a symbol c, and splits each of the sets of the partition into two partitions which can possibly be empty subsets which occur repeatedly, The subset of states that lead to A on input symbol c, and the subset of states that do not lead to A. Since set A is already known to have different behavior than the other sets of the partition, the subsets which lead to A also have different behavior than the subsets that do not lead to A. The algorithm terminates when no more splits of this type can be found.

Moore’s algorithm

On the other hand , the american professor Edward F. Moore worked on an algorithm which is like Hopcroft's algorithm, it also maintains a partition that starts off separating the accepting from the rejecting states, and repeatedly refines the partition until no more refinements can be made. Every step it replaces the current partition with the coarsest common refinement of s + 1 partitions, one of which is the current one and the others are the preimages of the current partition under the transition functions for each of the input symbols. When this replacement does not change the current partition the algorithm terminates. The algorithm’s time complexity is O(n2s) where each step of the algorithm can be performed in time O(ns) using a radix sort to reorder the states .The instances of the DFA minimization problem that cause the worst-case behavior are the same as for Hopcroft's algorithm. The number of steps that the algorithm performs can be much smaller than n, so on average time complexity its performance is O(n log n) or can achieve O(n log log n) depending on the random distribution on automata chosen.

The following algorithm mimics the Moore algorithm for DFA minimization.

We define an initial partition P of Q into classes of states Se as follow:

∀e∈Π : Se={q∈Q∣γ(q)=e}

∀e∈Π:Se={q∈Q∣γ(q)=e}

Then we split the classes in P as follows:

repeat successively for each class of states S until none changes

repeat ∀a∈Σ

If ∃S∈P,∀q∈S,δ(q,a)∈S ∃S′∈P,∀q∈S,δ(q,a)∈S′ then do nothing

else split S into subsets Si such that

for each subset Si there is a different class S′ ∈ P such that

∀q∈Si ,δ(q,a)∈S′

(where the subsets Si replace S’ in P)

Conclusion :

These algorithms which both try to achieve the same thing , minimization of a finite automata which is extremely useful in making the compilers execute faster, as it removes identical operations. Merging states like this should produce a smaller automaton that accomplishes exactly the same task as our original one.

Running a program which can run a second or two faster than other programs, can vastly increase the scope of the program. Therefore minimization is very essential in the process.

Resources :

Aho, Alfred V.; Hopcroft, John E.; Ullman, Jeffrey D. (1974), "4.13 Partitioning", The Design and Analysis of Computer Algorithms, Addison-Wesley, pp. 157–162.

Hopcroft, John E.; Ullman, Jeffrey D. (1979), Introduction to Automata Theory, Languages, and Computation, Reading/MA: Addison-Wesley, ISBN 978-0-201-02988-8

Câmpeanu, Cezar; Culik, Karel, II; Salomaa, Kai; Yu, Sheng (2001), "State Complexity of Basic Operations on Finite Languages", 4th International Workshop on Automata Implementation (WIA '99), Lecture Notes in Computer Science, 2214, Springer-Verlag, pp. 60–70,

Kameda, Tsunehiko; Weiner, Peter (1970), "On the state minimization of nondeterministic finite automata", IEEE Transactions on Computers, 100 (7): 617–627,

Hopcroft, John E.; Ullman, Jeffrey D. (1979), Introduction to Automata Theory, Languages, and Computation, Reading/MA: Addison-Wesley

Knuutila, Timo (2001), "Re-describing an algorithm by Hopcroft", Theoretical Computer Science, 250 (1–2): 333–363

Moore, Edward F. (1956), "Gedanken-experiments on sequential machines", Automata studies, Annals of mathematics studies, no. 34, Princeton, N. J.: Princeton University Press, pp. 129–153.

To view or add a comment, sign in

DFA Minimization techniques

Mahmoud Ashour

Hopcroft's algorithm

More articles by Mahmoud Ashour

Others also viewed

Linearization — Making Nonlinear Uncertainty Tractable

Chua's Double Scroll Attractor as a Hardware Entropy Source for Guided Entropy Principle Systems

Algorithm complexity. Asymptotic analysis. How to check the complexity of your code.

Air Traffic Coordination Algorithm

Can LLMs help better allocate your resources?!

The 78-Year Wait Is Over: Machines Are Finally Breeding (And It’s Bullish for America)

The Multi-Head Advantage — Parallelizing "Thought"

Physics-informed RAG for reliable LLMs (skip reflexive fine-tuning)

The Transformer Architecture of Modern AI

E-Discovery Tip Sheet: Logic

Explore content categories

Hopcroft's algorithm

More articles by Mahmoud Ashour

The Barid & DNS story

Telecom Rating Engines: The history and business

Dirty state — The whole story

Acorn RISC Machine

Sorting & Knowledge database in prolog

Scrum vs Kanban

Others also viewed

Linearization — Making Nonlinear Uncertainty Tractable

Chua's Double Scroll Attractor as a Hardware Entropy Source for Guided Entropy Principle Systems

Algorithm complexity. Asymptotic analysis. How to check the complexity of your code.

Air Traffic Coordination Algorithm

Can LLMs help better allocate your resources?!

The 78-Year Wait Is Over: Machines Are Finally Breeding (And It’s Bullish for America)

The Multi-Head Advantage — Parallelizing "Thought"

Physics-informed RAG for reliable LLMs (skip reflexive fine-tuning)

The Transformer Architecture of Modern AI

E-Discovery Tip Sheet: Logic

Explore content categories