DFA Minimization techniques

Introduction : 


In the theory of computation, a deterministic finite automaton (DFA)—which is also known as deterministic finite acceptor (DFA) or deterministic finite-state automaton (DFSA)—is a finite state machine that either accepts or rejects a given symbols, Through a state sequence determined by the given symbols .The  Deterministic refers to the uniqueness of the computation run.


What is Minimum DFA


DFA minimization means converting a given deterministic finite automata to its equivalent DFA but with a minimum number of states. Apparently, two DFAs are called equivalent ,if and only if they recognize the same regular language. For each regular language, there also exists a minimal automaton that accepts it, that is, a DFA with a minimum number of states and this DFA is unique. The minimal DFA makes sure to achieve the minimal cost for tasks assigned. In DFA , There are two main types of states that can be removed or merged from the original DFA without affecting the language.The first is Unreachable states which are states that are not reachable from the initial state of the DFA. The second is Non Distinguishable states which are those that cannot be distinguished from one another for any input. DFA minimization is done in three steps, which equals the removal or merger of the relevant states. Since the elimination of non distinguishable states is computationally the most expensive one, it is usually done as the last step.

Minimization of Reachable & Unreachable states

The Unreachable states occur in the state p of a DFA M=(Q, Σ, δ, q0, F) ,if no string w in Σ* exists where p=δ*(q0, w). Where Q is the set of states, Σ is the set of symbols that were input, δ is the transition function which is mapping a state and an input symbol to a set of various states, δ*is its extension to strings, q0 is the initial state of DFA, and F is the set of final states. The Reachable states can be obtained with the following algorithm:

let reachable_states := {q0};

let new_states := {q0};

do {

    temp := the empty set;

    for each q in new_states do

        for each c in Σ do

            temp := temp ∪ {p such that p = δ(q,c)};

        end;

    end;

    new_states := temp \ reachable_states;

    reachable_states := reachable_states ∪ new_states;

} while (new_states ≠ the empty set);

unreachable_states := Q \ reachable_states;

Unreachable states then can be removed normally from the DFA without affecting the language that it accepts.

Hopcroft's algorithm


Regarding the non distinguishable states , an algorithm for merging the of a DFA belongs to the American computer scientist Hopcroft, is based on dividing the DFA states into several groups by their behavior which is called partition refinement. These groups act whereby every two states of the same partition are equal if and only if they have the same behavior for all the input sequences. Which means, for every two states p1 and p2 that belong to the same class within the partition P, and every input word w, the transitions which are determined by w should always take states p1 and p2 to equal states, states that equally both accept, or states that both reject and that represent equivalence classes of the Myhill–Nerode equivalence relation,.

The following pseudocode describes the algorithm:

P := {F, Q \ F};

W := {F, Q \ F};

while (W is not empty) do

     choose and remove a set A from W

     for each c in Σ do

          let X be the set of states for which a transition on c leads to a state in A

          for each set Y in P for which X ∩ Y is nonempty and Y \ X is nonempty do

               replace Y in P by the two sets X ∩ Y and Y \ X

               if Y is in W

                    replace Y in W by the same two sets

               else

                    if |X ∩ Y| <= |Y \ X|

                         add X ∩ Y to W

                    else

                         add Y \ X to W

          end;

     end;

end;

 

Explanation of the algorithm : It starts with a partition that is too occupied where each pair of states which are equal belong to the same set in the partition, but pairs that are inequivalent might also belong to the same set , at each step it splits sets of states into pairs of subsets that are necessarily inequivalent. The initial partition is a separation of the states into two subsets of states that clearly do not have the same behavior as each other: the accepting states and the rejecting states.The algorithm then chooses a set A from the current partition and a symbol c, and splits each of the sets of the partition into two partitions which can possibly be empty subsets which occur repeatedly, The subset of states that lead to A on input symbol c, and the subset of states that do not lead to A. Since set A is already known to have different behavior than the other sets of the partition, the subsets which lead to A also have different behavior than the subsets that do not lead to A. The algorithm terminates when no more splits of this type can be found.

 

Moore’s algorithm 


On the other hand , the american professor Edward F. Moore worked on an algorithm which is like Hopcroft's algorithm, it also maintains a partition that starts off separating the accepting from the rejecting states, and repeatedly refines the partition until no more refinements can be made. Every step it replaces the current partition with the coarsest common refinement of s + 1 partitions, one of which is the current one and the others are the preimages of the current partition under the transition functions for each of the input symbols. When this replacement does not change the current partition the algorithm terminates. The algorithm’s time complexity is O(n2s) where each step of the algorithm can be performed in time O(ns) using a radix sort to reorder the states .The instances of the DFA minimization problem that cause the worst-case behavior are the same as for Hopcroft's algorithm. The number of steps that the algorithm performs can be much smaller than n, so on average time complexity its performance is O(n log n) or can achieve O(n log log n) depending on the random distribution on automata chosen.

The following algorithm mimics the Moore algorithm for DFA minimization.

We define an initial partition P of Q into classes of states  Se as follow:

e∈Π : Se={qQγ(q)=e}

∀e∈Π:Se={q∈Q∣γ(q)=e}

Then we split the classes in P as follows:

repeat successively for each class of states S until none changes

   repeat  ∀a∈Σ

     IfS∈P,∀qS,δ(q,a)∈S ∃S′∈P,∀q∈S,δ(q,a)∈S′ then do nothing

     else split S into subsets Si such that 

for each subset Si there is a different class S′ ∈ P such that 

qSi ,δ(q,a)∈S

       (where the subsets Si replace S’ in P)





Conclusion : 


These algorithms which both try to achieve the same thing , minimization of a finite automata which is extremely useful in making the compilers execute faster, as it removes identical operations. Merging states like this should produce a smaller automaton that accomplishes exactly the same task as our original one.

Running a program which can run a second or two faster than other programs, can vastly increase the scope of the program. Therefore minimization is very essential in the process.




Resources :


Aho, Alfred V.; Hopcroft, John E.; Ullman, Jeffrey D. (1974), "4.13 Partitioning", The Design and Analysis of Computer Algorithms, Addison-Wesley, pp. 157–162.

Hopcroft, John E.; Ullman, Jeffrey D. (1979), Introduction to Automata Theory, Languages, and Computation, Reading/MA: Addison-Wesley, ISBN 978-0-201-02988-8

Câmpeanu, Cezar; Culik, Karel, II; Salomaa, Kai; Yu, Sheng (2001), "State Complexity of Basic Operations on Finite Languages", 4th International Workshop on Automata Implementation (WIA '99), Lecture Notes in Computer Science, 2214, Springer-Verlag, pp. 60–70, 

Kameda, Tsunehiko; Weiner, Peter (1970), "On the state minimization of nondeterministic finite automata", IEEE Transactions on Computers, 100 (7): 617–627, 

Hopcroft, John E.; Ullman, Jeffrey D. (1979), Introduction to Automata Theory, Languages, and Computation, Reading/MA: Addison-Wesley

Knuutila, Timo (2001), "Re-describing an algorithm by Hopcroft", Theoretical Computer Science, 250 (1–2): 333–363

Moore, Edward F. (1956), "Gedanken-experiments on sequential machines", Automata studies, Annals of mathematics studies, no. 34, Princeton, N. J.: Princeton University Press, pp. 129–153.


To view or add a comment, sign in

More articles by Mahmoud Ashour

  • The Barid & DNS story

    While surfing the internet, we rarely think about what happens behind the scenes when we type a website name like…

  • Telecom Rating Engines: The history and business

    Telecommunications is considered all sorts of communication between users. Television, radios, phones , messages…

  • Dirty state — The whole story

    Being a Flutter developer second and a computer science student first makes observing details mandatory. This article…

  • Acorn RISC Machine

    Introduction Acorn RISC Machine (ARM) is a bunch of reduced instruction set computing (RISC) architectures of computer…

  • Sorting & Knowledge database in prolog

    Introduction All data in Prolog are represented by Prolog terms. Every term is either a variable or a compound term…

  • Scrum vs Kanban

    Introduction : The term “Agile” means 'moving quickly'. The Agile software development methodology is based on an…

Others also viewed

Explore content categories