Finding the Minimal FSM
|
A
minimal FSM, M, is one that uses the fewest possible states to
recognize a language L(M). The intuitive idea is that the elimination
from an FSM of all states which are equivalent will result in
the minimal FSM.
Roughly
speaking, two states are equivalent if they both take you to
the same place. The algorithm for finding the minimal FSM involves
partitioning the states of the given FSM (recall that a partition
of a set is one where each member of the set is in some subset
of the set and the subsets are disjoint) into states that satisfy
this notion of equivalence.
|
 |
| The
algorithm is illustrated by using it to find the minimal FSM
for the FSM shown in the figure in the above right. The main
purpose in providing an illustration of this algorithm is to
demonstrate the sense in which global information about
an FSM is required to construct a minimal FSM. |
|
The
table to the right illustrates the steps involved in the algorithm.
The first column simply numbers the states (or partitions) of
the new machine as they are determined. The second column lists
the partition at a particular level in the the partitioning (a
partitioning is a set of partitions where each partition
is a refinement of the parent partition). The third column displays
the transition matrix that provides the basis for determining
the next level of partitioning that may be required for the partition
shown in the corresponding row of column 2.
The
algorithm begins with an initial partition of the states that
places all final (or accepting) states in one subset and all
non final (or non accepting) states in another subset. These
two subsets are shown in column 2 of rows 1 and 2. The rightmost
column provides the transition table for each of these subsets.
This table is constructed for each state in the partition. These
are listed in column 1 of the Transition Tables. The cell entries
in columns 2 through 4 of the table are constructed as follows.
For the row state and each column input element, determine the
partition of which its target state is an element. For example,
consider the Transition Table that corresponds to partition 02.
q0 is the first state listed. On input element a,
the transition is to state q1 which is an element of
partition 02. Therefore, 02
is entered in this cell.
After
this is done for each state in the partition The Transition Table
has been completed. An examination of the Transition Table corresponding
to partition 02 immediately reveals which states are
in the same equivalence class at this level of the partitioning.
Two states are in the same equivalence class if their cell entries
are identical. In this and subsequent transition tables differing
shades of gray are used to visually group states which are equivalent.
The three different shades of gray in the transition table for
02 accent the three equivalence classes
that are formed from partition 02. These three partitions are shown in
the following rows and labeled 1021,
1022,
1023.
|
Trace of the Algorithm for Identifying
the Minimal FSM
| States |
Partition |
Transition Table
for Partition |
| 1 |
01 = {qhalt} |
|
| |
02 = {q0,q1,q2,q3,q4,q5,q6,q7,q8,q9} |
| State |
a |
b |
# |
| q0 |
02 |
|
|
| q1 |
02 |
02 |
|
| q2 |
|
|
01 |
| q3 |
02 |
02 |
|
| q4 |
|
02 |
|
| q5 |
|
|
01 |
| q6 |
|
02 |
|
| q7 |
|
02 |
|
| q8 |
|
02 |
|
| q9 |
|
|
01 |
|
| 2 |
1021 = {q0} |
|
| |
1022
= {q1,q3} |
| State |
a |
b |
# |
| q1 |
1022 |
1023 |
|
| q3 |
1024 |
1024 |
|
|
| 3 |
1023 = {q2,q5,q9} |
|
| |
1024
={q4,q6,q7,q8} |
| State |
a |
b |
# |
| q4 |
|
1023 |
|
|
q6 |
|
1024 |
|
|
q7 |
|
1024 |
|
|
q8 |
|
1023 |
|
|
| 4 |
210221 = {q1} |
|
| 5 |
210222 = {q3} |
|
| 6 |
210241 = {q4,q8} |
|
| |
210242
= {q6,q7} |
| State |
a |
b |
# |
| q6 |
|
210242 |
|
|
q7 |
|
210241 |
|
|
| 7 |
32102421 = {q6} |
|
| 8 |
32102422 = {q7} |
|
|
|
The
first number in the label indicates the level of partitioning,
the middle numbers in small print refer to the previous partition
that is being partitioned, and the last number refers to the
number of the new partition within the partitioning. Thus, for
label 1022,
1 is the next level of partitioning after
0, 02 is the parent partition, and 2 is the
number used to refer to the new partition.
A
partition label is shown in red if it is a terminal partition;
that is, if it can not be partitioned further. The partitioning
continues until no more partitioning is possible. In this case,
8 equivalence classes are discovered implying that the minimum
number of states required for an FSM to recognize the language:
{ab, aabb,aaabbb} is
eight.
|
 |
| The
partitioning that results is also shown as a tree in the figure
abovet. Again, the terminal partitions are shown in red and of
course appear as the bottom nodes of the tree. |
|
The
resulting minimal machine is shown to the right. The stare names
used in this depiction of the minimal FSM do not correspond to
the names used in the original machine. The correspondence is
between state names and partitions. In this diagram partition
1021
corresponds to q0, 210221
to q1,
1023 to
q2, 210222 to
q3, 210241 to
q4, 3210241 to
q5,3210242 to
q6,and 01 to qhalt.
Note,
that the algorithm that identified this machine examined the
entire set of rules; and the algorithm required that the learning
be completed. No simple extension of the "Learning FSMs"
considered earlier could identify this machine.
|
 |
|
This
minimal machine seems to capture the regularity in the input
strings better than the machines learned by the "Learning
FSMs". To illustrate this, consider the sets of input strings
AB = {ab,aabb,aaabbb}, the
set that we have examined, as well as the sets A = {aa,
aaaa,aaaaaa} and B =
{ab,aacc,aaaddd}. The Basic
"Learning FSM" (B_LFSM) required 14 control states
to recognize AB. It would also require 14 control states to recognize
set A and 14 control states to recognize set B. Further, the
graph of each of these machines is identical for each of these
input sets.
The
Extended "Learning FSM" (E_LFSM) required 11 control
states to recognize AB. It would also require 11 control states
to recognize set B but only 8 control states to recognize set
A .The graphs of the machines for AB and B are identical.
The minimal machine for AB required 8 control states. A minimal
machine for B could also require 8 control states but the graph
for these minimal machines would be quite different. And, the
minimal machine for set B would require 11 control states; the
same as required by the FSM learned by the E_LFSM.
Although each of these
types of machines:
- the FSM learned by the B_LFSM,
- the FSM learned by the E_LFSM,
and
- the corresponding minimal FSM,
can each recognize the same input
sets; the way the set is represented in the structure of the
machine can be quite different.
What
should we require of an "intelligent" learner?
Simply to recognize the input strings or to in some sense represent
the regularities that might be present in the input set? And,
what should we expect of the human learner? Do we simply
recognize the input strings or do we typically represent the
regularities that we find in the input set? And finally, in what
sense does even a minimal FSM represent the regularity? Is it
enough to implicitly capture regularities in the form of the
machine or should these regularities be represented in a more
explicit sense? These are difficult questions to which we will
return when we take up human and machine learning.
Finally,
what exactly is the regularity that we see in the set,
{ab,aabb,aaabbb}? Is a
string of a's followed
by an equal number of b's? Or,
is it the base string ab with
0, 1, or 2 ab strings
recursively embedded within the base string? If you recall the
particular Turing Machine that was
constructed to recognize the strings a**n b**n,
you might have noted that this latter
regularity seems to be the one implicit in that machine.
|
|
|