Machines/Automata
"It's
not what you are, but what you can effectively pretend to be.
[e.g., if computers couldn't pretend they were some type of machine
other than a finite state machine, they wouldn't even have had
the market penetration of castor oil. ]" (cfs)
|
|
|
Recall
that one of the initial motivations of the work on formal systems
was to place the idea of a proof on the simplest possible foundation.
If a proof could be done "mechanically," then we could
understand exactly what knowledge and capabilities were required
of the agent that carries out a proof. The goal, of course, was
to find the simplest "agent" possible. This work on
formal systems directly leads into thinking about computation.
Instead of the notion of proof, we simply generalize to the notion
of mechanically carrying out a set of instructions.
In considering thinking as computation,
we also need to understand the capabilities and resources that
a thinking agent must possess in order to realize thought. Again,
we hope that the agent can be simple....one that fits our intuitive
notion of mechanistic.
Automata theory or the mathematical
theory of computation is an important place to start because
it provides a way to precisely characterize what it means to
compute. And, once this precise characterization is available,
we can investigate the inherent limitations that follow from
this characterization.
Of the many different kinds of
automata, we will discuss only two in this section; namely, Finite
State Machines and Turing Machines. These types of machines represent
end points on an ordering of machines and therefore are an appropriate
choice for this introductory discussion. Before we examine these
types of machines in detail, we will note some of the characteristics
that these automata share. In particular, they both:
- have a fixed finite set of instructions,
or rules,
- there is a fixed finite set
of input and output symbols,
- at any point in time, the automaton
is in some distinct state, and
- the rules specify the conditions
under which the automaton transitions from its current state
to a new state.
These characteristics can be
better appreciated if we draw an analogy to a soda machine. Such
a machine has various mechanisms that serve to count the coins
put into the machine, dispense soda when appropriate (or fooled),
and provide change (hopefully). These mechanisms constitute the
rules that the machine possesses. The state of the machine consists
of the soda stored in it, the change stored in it, how much money
has been put into the coin slot, whether a selection has been
made, etc. Given some particular state, such as "75 cents
has been put in the coin slot, a selection has been made, ..."
the effect of a rule is to change the state of the machine, e.g.
"soda is dispensed, ...". One critical component missing
from the description of shared characteristics itemized above
is what causes a rule to be applied. We turn now to a description
of a Finite State Machine to show more exactly how one class
of automata operate.
|
Finite State Machines
A Finite State Machine (FSM)
is a quintuple <Q,I,O,M,G>. Q is a finite
set of states. I and O are finite sets
of input and output symbols, respectively. M
is the next state, or state transition, function
which maps Q x I into Q. G is the
output function that maps Q x I into O.
To make this clear consider some point in a computation. The
FSM is in some state say qk
where qk is an element of Q. Upon input i where
i is an element of I, the machine outputs some o = G(qk,i) and transitions to the new state ql
= M(qk,i).
|
| We
can represent an FSM several ways. Table 1 below shows a state
table representation of a simple Soda Machine. The first
entry in each row represents the current state of the machine.
The state set Q of this machine consists of the four
states {$.00, $.25, $.50, $.75}. At the top of each
column is listed the four types of inputs that this machine accepts.
This input set is {quarter, not-quarter, select, return},
or, in other words, the inputs are the inserting of a quarter
into the machine, the inserting of some coin other than a quarter,
the pressing of the soda selection button, or the pressing of
the coin return lever. The entries inside the table give the
next state of the machine followed by it output. The output set
is {accept (coin), reject (coin), give
(soda), dump (coins)}. The interpretation of an
entry in the table is, for example, if the present state of the
machine is $.00 (first row) and the input is quarter
(first column) then the machine's new state is $.25
and its output or action is to accept the coin. |
| |
| |
quarter |
not-quarter |
select |
return |
| $.00 |
$.25 / accept |
$.00 / reject |
$.00 / reject |
$.00 / reject |
| $.25 |
$.50 / accept |
$.25 / reject |
$.25 / reject |
$.00 / dump |
| $.50 |
$.75 / accept |
$.50 / reject |
$.50 / reject |
$.00 / dump |
| $.75 |
$.25 / reject |
$.75 / reject |
$.00 / give |
$.00 / dump |
Table 1. State Table Representation of Soda
FSM
|
|
|
In Figure 1 below, the same
machine is depicted as a directed graph. Here the nodes of the
graph (depicted as circles) represent distinct states and the
directed arcs (depicted as arrows) between nodes represent potential
transitions. Each arc is labeled with an input/output pair which
denotes the input which is associated with the transition and
the corresponding output that is generated. |
| |

Figure 1. Graphical Representation
of the Soda FSM |
|
| Now that we have an idea of how a Finite
State Machine works, we can consider the characteristics of a
finite state computation. Although it may not be apparent from
the soda machine, FSMs are capable of operating on infinitely
long strings of input symbols and generating infinitely long
strings of output symbols. For example, consider the FSMs depicted
in Figure 2. These machines are "recognizers";
they do not have outputs, rather they solve the problem of recognizing
input strings. In particular, the machines depicted at the top
of the figure will recognize, i.e., move into its accept state
at the end of input, when its input is a string of the form consisting
of n 0's followed by m 1's. In contrast to
that simply realized FSM, it is not possible to create an FSM
that will recognize strings that consist of n 0's followed
by n 1's! |
| |

Figure 2. Graphical Representation
of String Recognizers |
|
| Of
course, we can write an FSM that can recognizes specific instances
of such strings such as 01, 0011, and 000111. In fact, such specific
FSMs are depicted in Figure 2. And, as is also shown in this
figure, we can combine these machines to create a single FSM
that recognizes 01, 0011, 000111. However, doing this required
distinct states that demoted how many 0's had been recognized.
This is necessary because a state in an FSM is just a name, it
carries no additional information about how the computation reached
that state. Therefore, if there are two different paths by which
the computation reaches some state q and subsequent
computation depends on which path to q was taken then
q must be broken into two separate states. This is what
was done to create the combined FSM in Figure 2. There were separate
nodes that in effect denoted "seen 1 zero," "seen
2 zeros," and "seen 3 zeros." Thus, how many 1's
had to be recognized is encoded by the node from which the recognition
started. This counting of 0's has a limit, for there are only
a finite number of states in a Finite State Machine, and therefore,
it could not count to an arbitrarily large n. |
|
Nevertheless,
the lack of an infinite number of states is not the main problem.
Rather, it is the fact that the regularity of n 0's
followed by n 1's cannot be parsimoniously captured
or expressed within the finite state formalism, because the states
are just names that can't carry additional information concerning
how they were reached in the computation. If we adopt FSMs as
the basis for a cognitive model (i.e., our dumb mechanism) then
we best be certain that whatever regularities exhibited in human
cognition are expressible in our model. Otherwise our models
will not be parsimonious and we will be forced to argue that
regularities of the device's I/O (human cognition) are only reflected
in its I/O, not in its operation, much like our recognizer for
n 0's followed by n 1's where 0 < n
< 4.
One persuasive reason to expect
that regularities should be reflected in the operation of an
automaton, not just its I/O, has to do with how changes to the
function that an automaton calculates can be expressed and realized.
Imagine that we want to recognize strings for n 0's
followed by n 1's followed by b, where 0 <
n , 4 and b is a 0 or 1 depending on whether
n is even or odd. Thus, the strings the machine recognizes
are 011, 00110, and 0001111.
|
| |

Figure 3. Graphical Representation
of the String-Parity Recognizer |
|
| Such
a machine is depicted in Figure 3. As you may have suspected,
this machine has been built by modifying the combined machine
depicted at the bottom of Figure 2. Notice how the addition of
a parity bit has changed much of this machine's structure. Parity
recognition engendered breaking out 1's recognition into separate
computational paths. Thus the change, although it entails the
recognition of just one more bit, required global changes in
the machine's structure. In other words, the description of the
function, as captured in the finite state formalism, has changed
considerably. The task of realizing the change becomes complicated,
because apparently simple changes in the function that is computed
can entail large scale changes in the automaton. This is particularly
worrisome to the development of any theory of cognition based
on the finite state machine formalism. In addition, it is also
problematic if we are attempting to describe a device that changes
over time. A presumably human cognition changes over time ---
people learn! |
Turing Machines
|
Now
consider another kind of machine, a Turing Machine (TM). Turing
machines were first developed by Alan Turing to provide a formal
characterization for the informal concepts of algorithms and
functions computable by an algorithm. An algorithm is intuitively
defined as a finitely describable procedure that gives readily
interpreted step-by-step instructions for computing some function
in finite time.
As depicted in Figure 4 below,
a Turing Machine consists of three components:
- An infinite tape that is divided
into distinct cells that can hold one symbol;
- A Read/Write (R/W) Head that
can read or write the contents of the tape cell directly below
the head;
- and a Control Unit that controls
the Read/Write Head based on the unit's current state and the
content of the tape cell underneath the R/W Head.
Like an FSM, a Turing Machine
computation proceeds as a sequence of discrete steps. Again,
like an FSM, at any point in the computation, the control unit
can be in only one of a finite number of distinct states. Based
on the contents of the tape cell under the R/W Head and the current
state of the control unit, the unit will transition to one of
a fixed finite set of new states and take one of three possible
actions. These actions are:
- write any one of a fixed finite
set of tape symbols on the tape cell under the R/W Head;
- move the tape one tape cell
to the right;
- move the tape one tape cell
to the left.
|
| |

Figure 4. Pictorial Representation
of a Turing Machine (TM). |
|
|
The
behavior of a Turing Machine is described by a set of rules,
or quadruples, of the form < q, t, a, f > . So,
for instance, if q is the present state of the control
unit and t is the symbol in the tape cell under the
R/W Head, then the control unit takes action a and transition
to state f. Any finite set of such rules describes a
Turing Machine as long as no two rules have the same q
and the same t. This is known as the consistency condition.
It insures that given some state of the Control Unit and symbol
under the R/W Head, the Control Unit will not have alternative
applicable quadruples; there will be at most one quadruple whose
q matches the state and whose t matches the
tape symbol. Two consequences follow from this. The machine operates
deterministically, and, further, at some point in the computation,
there may be no applicable rules, in which case the machine is
said to have halted.
To use a TM as a computational
device and thus to associate a function with its operation, we
have to fix where its input and output are on the tape. Accordingly,
the input is presumed to be immediately to the right of the head
at the beginning of the computation and the output can be found
immediately to the right of the head when, and if, the machine
halts.
Now that we have described FSMs
and TMs, we can ask how they compare. Clearly, at the level of
their formal description, they are quite similar. In particular,
the TM's Control Unit is very similar to an FSM. It can be in
only one of a finite number of distinct states and there is a
finite number of distinct input symbols. And based on the current
state and the input (what is under the R/W Head), the unit makes
a transition to a unique new state and takes one of a finite
number of distinct actions, the latter being in essences the
Control Unit's output symbol.
This similarity in description
is also reflected in the underlying capabilities. Any function
computable by an FSM is computable by a Turing Machine. But there
is a major difference in that a TM's input and output are to
an infinite tape, and, in particular, the device's previous output
can be its subsequent input. The significance of this is that
the device can not manipulate symbols in special ways. In particular,
the previous history of the computation can be brought to bear
on the present steps of the computation. For instance, we can
write a TM that recognizes n 0s followed by n
1s for any n > 0 written on its tape. Roughly, each
time the machine sees a 0, it can go over to another section
of the tape and place a mark to denote that it has seen a 0.
It thereby maintains a running tally. Subsequently, when 1 recognition
begins, it can move the head over to the tally marks and erase
one tally mark (e.g., write a blank) for each 1 that is read.
Thus the machine can keep track of the number of 0s seen and
insure that there are the same number of 1s. Indeed, it can do
this for infinitely long strings because the tape is infinite.
Moreover, it can do this for any n > 0 using the same uniform
machine description, or quadruples. It's ability to do this rests
on the fact that the machine has procedures to store these tally
symbols, this string of 1s, and it also has the procedures that
appropriately interpret the significance of this string when
it comes time to perform 1s recognition. That is, the TM can
store and interpret symbolic structures.
|
An
alternative way to code this task in a Turing Machine would be
to sequentially intersperse the erasure of 0s and 1s. (Click
here for an illustration of this alternative)This exploits
the fact that the representation of 0s and 1s on the tape can
be used to encode how many there are. However, the somewhat more
subtle distinction realized in this approach between representation
and what a representation can encode or mean obscures certain
points that we will subsequently make
|
|
|
So,
there are functions computable on a Turing Machine which are
not computable on an FSM. Unlike an FSM's computational history
which must be captured in its current state symbol, a Turing
Machine's computational history includes the infinite tape's
contents. This allows arbitrary dependencies between the past
computational steps and the current steps.
These arbitrary dependencies
have certain consequences. In particular, the current step of
a Turing Machine's computation has a more complicated description.
It is described by the contents of the tape, the current state
of the Control Unit, and the location of the R/W Head. This information
is the machine's instantaneous description (see the bottom of
Figure 4)). However, this instantaneous description is somewhat
problematic, because it presumes that the starting state and
history of a computation requires recourse to a description of
an infinite tape, hardly a parsimonious basis for effective theory
formation. And again, this is a problem for the scientist who
may want to use the formalism to describe a cognitive theory.
And, it is also a problem for the device itself if that device
is going to change, in some regular fashion, either the function
it computes or how it computes. That is, it is a problem if learning
is presumed to be part of the device's capabilities.
|
Turing Machines, Universal Turing
Machine, Computability and Complexity
|
In this section, we step away
from our search for a mechanistic account in order to all too
briefly discuss two important concepts, complexity and computability.
The Turing Machine represents
one formal characterization of an algorithm. Other characterizations
have been proposed (e.g., Post's Productions and Church's Lambda
Calculus). Yet all these characterizations have been shown to
be equivalent in terms of their ability to compute the same functions.
Church's Thesis goes further to state that any characterization
will be equivalent. In other words, the Turing Machine has appropriately
captured the intuitive notion of algorithm and thus effectively
defines the class of functions computable by algorithm. This
thesis can never be proven because an intuitive notion of algorithm
is inherently informal and thus make a formal proof impossible.
Nonetheless, the thesis is widely accepted as being true.
Recall that Turing machines have
finite descriptions; they are based on finite sets of tape and
state symbols from which a finite set of quadruples is defined.
It follows from this that all possible TMs can be effectively
enumerated. In effect, we can place them in some order or list
(e.g. T0, T1, ...) and
name a machine by its place or index in the order. This can be
done even though there are infinite number of such machines,
much as we can place the Natural numbers in a fixed order, (1,
2, 3, 4,...). Although we will not demonstrate it, it also follows
from this that a universal Turing Machine (Tuniversal)
exists which can compute the same function of any other Turing
Machine given the other machine's index. So if GTn is
the functions associated with the nth Turing Machine then GTuniversal(n,x) = Gn(x). Given the index n, a universal machine
could use it to determine Tn, and then it could use Tn
to compute GTn(x).
To get a rough idea of how one
Turing machine, Tuniversal, might actually simulate another Turing
machine, Tk, consider what information would be
useful for deriving the next instantaneous description given
the present instantaneous description of Tk's
computation. First, there is the present instantaneous description
and, secondly, there are the quadruples that will determine the
next instantaneous description. Given such information, how might
we discern the next instantaneous description?
- Get Tk's
current state, qcurrent, and current tape symbol under the R/W
Head, tcurrent,
from the present instantaneous description.
- Look among the quadruples for
one that is applicable, in effect one of the form <qcurrent, tcurrent, a, f >.
- If you find such a quadruple
then use a and
f to update the instantaneous
description.
- If you don't find such a quadruple
then the machine halts and the instantaneous description remains
the same.
Starting with any instantaneous
description of Tk, this procedure could repeatedly be
applied to successive instantaneous descriptions to simulate
the operation of Tk. The machine's universality, or programmability,
depends on its ability to store and interpret another machine's
quadruples, initial tape configuration, and the instantaneous
descriptions which ensue. That is to say, the ability to simulate
hinges on the ability to manipulate symbols. And based on this
symbol manipulation, the above sketch of a procedure gives a
finite set of readily interpreted step-by-step instructions for
performing the simulation. Further, note that in the sense that
this procedure obeys our intuitive notion of algorithm then there
must be a Turing Machine, a Universal Turing Machine, that realizes
it if we accept Church's Thesis. Nevertheless, the reader should
consider more carefully what such a Universal Turing Machine
would look like and how it would operate as it performed these
various steps. For instance, how would it uniformly represent
any Turing machine's input, output, and quadruples? How would
it keep track of qcurrent
and tcurrent
when looking for the applicable
quadruple and how would it actually test for applicability? What
seems like simple steps in our algorithm, would tend to be rather
involved operations on a Turing Machine, requiring many secondary
operations not laid out in the four steps above. And the sequence
of successive instantaneous descriptions of our universal machine
as it simulated another machine would reflect these involved
operations, obscuring any appearance of correspondence to the
instantaneous descriptions of the machine being simulated. As
a practical matter, therefore, people don't use Turing machines
as a basis for describing particular algorithms or models.
But the benefit of Turing Machines
does not reside in any ability to express an algorithm and its
operation in an easily understood form. Rather, it is the simplicity
of the formal characterization itself that is desirable, e.g.,
there are finite sets of states and tape symbols, there are only
three actions that the Control Unit can take, etc. It is this
simplicity which assures us that it is a mechanistic account
and simplifies the task of showing equivalence between Turing
Machines and other formal characterizations.
We can also use the Turing Machine
as a yardstick to measure just how complex it is to solve a particular
kind of problem. We can ask how many steps a Turing Machine will
take in order to solve the problem of computing a particular
function. In the case of the problem of recognizing n
0s followed by n 1s, we could ask how many steps it
would take for our TM to perform the recognition. Well, it turns
out that the number of steps it takes is a polynomial function
of the "size or length" of the input, n (in effect,
each bit i has to be scanned, and the distance to the
tally mark is a linear function of i, so the actual
steps for all bits is the summation of i which roughly makes
the total number of steps < or = to c times n
squared for some constant c.). In such a case, we say
the algorithm is of polynomial time complexity where "time"
is marked by the number of steps.
Now, it turns out that we can
talk about the complexity of a problem regardless of the algorithm
employed to solve the problem (i.e., compute the function). Moreover,
many problems require more than polynomial time (e.g., n
to the k power for some k > 1). There are
classes of problems (which go by names such as NP-complete) for
which there is no known algorithm which can in general solve
those problems in polynomial time. And, it seems unlikely that
there is any polynomial time algorithm for these problems, because
such classes are built on the requirement that problems in the
class are "reducible" or translatable to each other
in such a way that having a polynomial time algorithm for one
means having a polynomial time algorithm for the others in the
class. Years of research has lent credence to the belief that
no such algorithm exists. Indeed such problems may require exponential
time (e.g. k to the n power). The difficulty
here is that as n becomes larger, the time time required for
such algorithms "explodes". For this reason, such problems
are considered intractable.
Intuitively, the importance of
this notion of intractable problems is that it means that some
problems cannot be easily solved. It does not mean that solving
some instance of the problem need also in intractable. For example,
imagine the problem of recognizing whether some English sentence
is grammatically correct. Now, this is a trivial problem for
particular sentences in English. We can just remember whether
a particular sentence is grammatical. And it is also trivial
for a finite subset of English sentences, we can do it in a way
similar to our FSM recognizer of n 0s followed by n
1 for 0 < n < 4. But for the entire English language
it is not a trivial problem and a general solution to the problem
may not be tractable. But here we can get into a finer grain
analysis of the complexity of a particular algorithm which may
perform well on some sentences but on others, the worse cases,
still be intractable.
Finally, we can ask whether there
are problems which are simply not computable, regardless of the
time. We saw that certain functions which are computable on a
TM are not computable on an FSM. That is the problem of computing
the function "n 0s followed by n 1s regardless
of the size of n" is not solvable on an
FSM. But are there problems that are not solvable on a TM? Well,
consider the problem of determining whether a TM ever halts.
Specifically, consider a TM defined whose finite control is a
single quadruple (q0,#,L,q0). Thus, if the R/W Head is scanning
the blank symbol, #, in state q0
then the control unit will move the tape left and take on stat
q0
again. Now, if the entire
tape to the right of the starting position is blank, the machine
will never halt. And, since there are only a finite number of
tape symbols, we could define a similar quadruple to the one
above for each permissible tape symbol as well.
In such simple cases, it is not
hard to determine whether the machine halts. But , for more complex
Turing machines it may not be so simple to determine. In general,
we would like to know if there is some Turing computable function
that will tell us whether any particular TM halts. In other words,
could one build a TM that when given the description of any TM
on its input tape determines whether that other machine halts
on some input.
It turns out that the answer
is no. So this halting function is one function which is not
computable by a Turing Machine.
|
FSMs, TMs, and Computers
TMs and FSMs are in sharp contrast
along the dimension of characterizing the current state of the
computation. The FSM has a finite set of unique simple state
names that must nevertheless capture any necessary distinctions
in the computation history of the device. By virtue of the state
names being a finite set, this severely restricts the dependencies
that can be realized between present and past steps of the computation.
TMs are, in essence, at the other extreme whereby an infinite
tape allows arbitrary dependencies between any two steps in a
computation to be realized. To help elucidate the nature of this
difference, consider another familiar finite state machine, a
digital computer. Now it may be surprising that a digital computer,
whether it is some super computer or some personal computer,
is an FSM. But the architecture of a computer can be decomposed
into logical devices that perform various boolean and arithmetic
operations on data and storage circuits (e.g., registers, cache,
main memory, disk storage, etc.) that store data. Data, in the
vast majority of digital computers, is represented as binary
(i.e., base 2) numbers where each digit is either a 1 or a 0.
Thus , the number we know as zero would be represented as 0,
and a one would be represented as 1, but two is represented as
10 and three as 11, by virtue of it being a base 2 representation
of numbers. However, the actual physical devices in the computer
represent the digits (or bits) of base 2 arithmetic (e.g., 1
or 0) by being in one of two states, lets call these states On
and Off. So the operation of a 1 being added to 0 would
be realized by some physical device in the computer making a
state transition from being Off to being On.
There is a fixed number of all
these various storage circuits and devices so even if we considered
all possible ways to assign 1 or 0 to these devices, the entire
computer at any point could be in only a fixed (but typically
very large) number of distinct states. And of course, this is
an upper bound on the number of states, since some of these states
would violate the internal logic devices, which in effect determine
which states are possible. Moreover, given any one of these states,
there is a fixed number of distinct states to which it can transition,
determined by the logical devices, and the conditions under which
it transitions to any of these new states must be a constant.
For it to be otherwise would mean that the computer was not reliable.
In other words, any function
computable by a computer is computable by an FSM. However, the
above low level description of computers does them a disservice.
Although, the functions computable by any computer belongs to
the class of Finite State Machines, HOW a computer computes is
more similar to a Turing Machine. Much of the computer's circuitry
and memory is used to insure that the computer can be described
and used at a "higher" level, for instance, by describing
in a computer language a computation to be carried out. This
additional memory encodes the data, and computations or program to
be carried out on that data. And other parts of the computer
encode procedures and logic for translating the program into
a particular sequence of state changes by which the low level
physical device realizes the computation. As an example, consider
the following little "program" to realize our n 0s
followed by n 1s string recognizer. It is written in English,
but would be trivial to realize in any number of high level computer
languages. (in fact, far more eloquently than it is expressed
here in English.)
"Read the string one bit at a time,
from left to right. As long as you have just seen zeros, increment
a running tally by one for each zero. But as soon as you read
a one, proceed to decrement the running tally as long as you
continue to read one, the tally is not negative, and part of
the string remains to be read. If the string ends with the tally
equal to zero then accept the string, else reject it."
Much like out Turing Machine
program, this little program expresses an algorithm for recognizing
n 0s followed by n 1s regardless of the size
of n. It has steps which imply procedures for generating.storing
the tally and procedures which interpret it. Together these procedures
place an interpretation on what the symbols, e.g. 1s and 0s that
comprise the tally mean. However, this increase in expressiveness
has come at a cost. Although this program expresses an algorithm
for arbitrary n, when realized on any particular computer n would
be bounded because eventually there would be insufficient
distinct states (i.e., memory) in which to represent the running
tally of zeros seen. The tally would bet too large to store.
So the function that can be computed in still limited by the
number of states. Yet, now we are dedicating some of the computer's
devices (in particular memory space) to be used for the program,
procedures, and logic that translate it into something that controls
the sequence of state transitions. Fewer devices means fewer
distinct states. Furthermore, state transitions themselves cannot
be arbitrary because that would mean the interpretation of the
program was arbitrary. Or in other words, transition between
any two arbitrary states may now have to be realized by a
sequence of state transitions. But remember, the number of different
states our computer could be in is an absolute upper bound on
the function it can compute.
Our discussion of FSM began by
showing them to be relatively unstructured devices - states were
just names and we could define any transition desired, that is,
given any input symbol we could define (or draw in the case of
the graph) a transition between any two states. But with the
computer there is considerable structure imposed on the states,
for within those states are stored programs and procedures for
interpreting programs. In addition, transitions are no longer
arbitrary. This structure reduced the number of states available
to a computation from what it would have been given some raw
enumeration of possible states based on the memory bits and logical
devices. As in the case of the FSM recognizer of n 0's
followed by n 1 for 0 < n < 4, the number
of states available constrains the function that can be computed.
The tradeoff is between what
functions can be computed and how functions are computed. As
a consequence, there is a considerable gain in expressibility,
a gain that is not to be undervalued. In our little program we
have expressed the general case for a recognizer of a string
of n 0s followed by n 1s. And in an actual
computer language, this could be expressed even more parsimoniously.
For many reasons, it will turn out the parsimonious expression
is critical. For one, parsimonious expression of such a regularity may
well be a methodological requirement. As we shall see when we
discuss natural language, Chomsky argued for the existence of
similar kinds of regularities in human language usage. Moreover,
it may well be that the device under study, human cognition,
is capable of generating and employing such parsimonious expressions
much like our computer and uses its program to control its state
transitions and thereby compute the function. For clearly, we
could be given such a program and proceed to follow it steps
in order to compute the function.
|
|
|