Encoding of Turing Machines and
the Universal Turing Machine
|
We
have presented a variety of Turing Machines (AnBn
machine and AnBnCn machine)
and in the essay on Machines
and Automata briefly discussed the important idea of a Universal
Turing Machine. Recall that a Universal Turing Machine (U) is
one that can take as input any Turing Machine (T) together with
T's input and provide the same result as T. Now there are an
infinite number of Turing Machines that can be specified. (Any
particular Turing Machine is of finite length, n; but
the set of such machines is infinite.) Note that once we write
down a Turing Machine its structure; the sets Q and S,
the start state, Q0 and the set of Halt states,
as well as the Tuples are fixed. The fixed structure that
defines a Universal Turing Machine must be capable of interpreting
any particular Turing Machine, T. Now, T may have finitely but
arbitrarily large Q and S, and Tuples.
The
trick, of course, is to devise a language within which any T
can be expressed. In this section we will
- illustrate the encoding of the
AnBn machine and AnBnCn
machine into a common language;
- and consider the possible use
of this encoding as another type of measure of the complexity
of a problem,
The development given below is
adapted from the text: Formal Languages and Automata Theory
by Vladimir Drobot. Computer Science Press: Rockville, MD, 1989.
A brief discussion of the encoding of Turing Machines may also
be found in: Introduction of Automata Theory, Languages, and
Computation by John E. Hopcroft and Jeffrey D. Ullman. Addison-Wesley:
Reading, MA, 1979.
The figure below reviews the
definition of a Turing Machine that will be used in this development.
|
| |
 |
|
| |
Definition of a Turing Machine |
|
Turing Machine Encoding
Next, we consider an input vocabulary,
S, for
the Universal Turing Machine, U, that is sufficient for encoding
any Turing Machine, T.
The fixed vocabulary consists
of 11 symbols:
- {1 ,0} out of which are constructed
names for each state and each symbol;
- R, L, and S to name the moves
of the tape head;
- 6 additional symbols that are
used as separators in the strings that describe the Turing Machine,
T and the starting tape of T.
The string that describes a Turing
Machine is of the form:
<seperator1><Nstates><seperator2><Nsymbols><seperator3><list
of the tuples><seperator4>
where in our application:
<seperator1> :=: Ý
<Nstates> :=: the number of states of the machine in binary
where 1 is
assigned to the start state and Nstates is assigned to the halt
states
<seperator2> :=: !
<Nsymbols> :=: the number of tape symbols of the machine
in binary where 1 is
assigned blank symbol. (# is the blank symbol in our application)
<seperator3> :=: @
<list of the tuples> is encoded as:
<statei><separator5><symboln><separator5><statej><separator5><symbolm><move><separator6>
where:
<statei> :=: the binary name of the state
<symboln> :=: the binary name of the tape symbol
<statej> :=: the binary name of the state
<symbolm> :=: the binary name of the tape symbol
<move> :=: R,L,or S
<separator5> :=: X
<separator6> :=: Y
<seperator4> :=:¦
Note that the order of the terms
that define a tuple is somewhat different from the order used
in our examples. Here the order is:
current state, current tape symbol,
new state, new tape symbol, action
rather than
current state, current tape symbol, new tape symbol, action ,
new state
The leftmost portion of the starting
tape is encoded as:
<Tseparator1><Nsymbols><seperator2><celli><Tseparator2><cellj>...<Tseparator2><celln><Tseparator3>
where:
<Tseparator1> :=: W
<Nsymbols> :=: as above
<seperator2> :=: as above
<celli> :=: the binary name of the tape symbol in the leftmost
cell of the tape
<Tseparator2> :=: %
<cellj>... <celln> :=: the binary names of the tape
symbols up to the last symbol on the tape
<Tseparator3> :=: $
|
The
table below provides the tuples of the AnBn Machine
considered earlier. To the right of this table is the encoding
of this machine into the fixed alphabet described above.
|
| |
Encoding of the AnBn Turing Machine
Tuples of the AnBn Machine
| current
control state |
symbol
under read head |
new
symbol |
action |
new
control state |
| q0 |
A |
# |
R |
q2 |
| q2 |
A |
A |
R |
q2 |
| q2 |
B |
B |
R |
q2 |
| q2 |
# |
# |
L |
q3 |
| q3 |
B |
# |
L |
q4 |
| q4 |
A |
A |
L |
q4 |
| q4 |
B |
B |
L |
q4 |
| q4 |
# |
# |
R |
q1 |
| q1 |
A |
# |
R |
q2 |
| q1 |
# |
# |
S |
qh |
|
The Number of States
= 6 which in binary is 110
The Number of Symbols = 3 which in binary is 11
The mapping between state names and binary names is:
q0 > 1, q2 > 10, q3 >
11, q4 > 100,
q1
> 101, qh >
110
The mapping between symbols and binary names is:
#
> 1, A > 10, B > 11
And thus mapping for each tuple is:
Tuple: q0
A # R q2 --> 1X10X10X1XRY
Tuple: q2 A A R q2 --> 10X10X10X10XRY
Tuple: q2 B B R q2 --> 10X11X10X11XRY
Tuple: q2 # # L q3 --> 10X1X11X1XLY
Tuple: q3 B # L q4 --> 11X11X100X1XLY
Tuple: q4 B B L q4 --> 100X11X100X11XLY
Tuple: q4 A A L q4 --> 100X10X100X10XLY
Tuple: q4 # # R q1 --> 100X1X101X1XRY
Tuple: q1 A # R q2 --> 101X10X10X1XRY
Tuple: q1 # # S qh
--> 101X1X110X1XSY |
|
|
| |
This TM encoded as a string
is:
Ý110!11@
1X10X10X1XRY10X10X10X10XRY10X11X10X11XRY10X1X11X1XLY11X11X100X1XLY100X11X100X11XLY
100X10X100X10XLY100X1X101X1XRY101X10X10X1XRY101X1X110X1XSY¦
where the different colorings
of the elements of the string are used to help you see the structure
of the machine encoded in this vocabulary.
The Length of the Encoded TM is 149
For this example we considered
the input string #AB which is encoded as W11!1%10%11$
Thus the input to a U might be
the string:
W11!1%10%11$1X10X10X1XRY10X10X10X10XRY10X11X10X11XRY10X1X11X1XLY11X11X100X1XLY
100X11X100X11XLY100X10X100X10XLY100X1X101X1XRY101X10X10X1XRY101X1X110X1XSY¦
The Length of the Encoded Start String is 12
The Total length is 161
|
|
| |
The
Next table below provides the tuples of the AnBnCn Machine
considered earlier. To the right of this table is the encoding
of this machine into the fixed alphabet described above. |
|
| |
Encoding of the AnBnCn Turing
Machine
Tuples of the AnBnCn TM
| current
control state |
symbol
under read head |
new
symbol |
action |
new
control state |
| q0 |
A |
# |
R |
q2 |
| q2 |
A |
A |
R |
q2 |
| q2 |
B |
B |
R |
q2 |
|
q2 |
C |
C |
R |
q2 |
|
q2 |
% |
% |
R |
q2 |
| q2 |
# |
# |
L |
q5 |
|
q5 |
C |
# |
L |
q3 |
|
q3 |
C |
C |
L |
q3 |
|
q3 |
% |
% |
L |
q3 |
| q3 |
B |
% |
L |
q4 |
| q4 |
B |
B |
L |
q4 |
| q4 |
A |
A |
L |
q4 |
| q4 |
# |
# |
R |
q1 |
| q1 |
A |
# |
R |
q2 |
|
q1 |
# |
# |
R |
q6 |
|
q1 |
% |
% |
R |
q6 |
|
q6 |
% |
% |
R |
q6 |
| q6 |
# |
# |
S |
qh |
|
The Number of States = 8 which
in binary is 1000
The Number of Symbols = 5 which in binary is 101
The mapping between state names and binary names is:
q0 > 1, q2 > 10, q5 >
11, q3 > 100,
q4
> 101, q1 >
110, q6 > 111, qh > 1000
The mapping between symbols and binary names is:
#
> 1, A > 10, B > 11, C > 100,
% > 101
And thus mapping for each tuple is:
Tuple: qo A # R q2 --> 1X10X10X1XRY
Tuple: q2 A A R q2 --> 10X10X10X10XRY
Tuple: q2 B B R q2 --> 10X11X10X11XRY
Tuple: q2 C C R q2 --> 10X100X10X100XRY
Tuple: q2 % % R q2 --> 10X101X10X101XRY
Tuple: q2 # # L q5 --> 10X1X11X1XLY
Tuple: q5 C # L q3 --> 11X100X100X1XLY
Tuple: q3 C C L q3 --> 100X100X100X100XLY
Tuple: q3 % % L q3 --> 100X101X100X101XLY
Tuple: q3 B % L q4 --> 100X11X101X101XLY
Tuple: q4 B B L q4 --> 101X11X101X11XLY
Tuple: q4 A A L q4 --> 101X10X101X10XLY
Tuple: q4 # # R q1 --> 101X1X110X1XRY
Tuple: q1 A # R q2 --> 110X10X10X1XRY
Tuple: q1 # # R q6 --> 110X1X111X1XRY
Tuple: q1 % % R q6 --> 110X101X111X101XRY
Tuple: q6 % % R q6 --> 111X101X111X101XRY
Tuple: q6 # # S qh --> 111X1X1000X1XSY
|
|
|
| |
This TM encoded as a string is:
Ý1000!101@1X10X10X1XRY10X10X10X10XRY10X11X10X11XRY10X100X10X100XRY10X101X10X101XRY10X1X11X1XLY
11X100X100X1XLY100X100X100X100XLY100X101X100X101XLY100X11X101X101XLY101X11X101X11XLY
101X10X101X10XLY101X1X110X1XRY110X10X10X1XRY110X1X111X1XRY110X101X111X101XRY111X101X111X101XRY
111X1X1000X1XSY¦
where the different colorings
of the elements of the string are used to help you see the structure
of the machine encoded in this vocabulary.
The Length of the Encoded TM is 288
For this example we considered
the input string #ABC which is encoded as W101!10%11%100%1$
The Length of this encoded input
string is 17.
Thus the input to a U might be
the string:
W101!10%11%100%1$1X10X10X1XRY10X10X10X10XRY10X11X10X11XRY10X100X10X100XRY10X101X10X101XRY
10X1X11X1XLY11X100X100X1XLY100X100X100X100XLY100X101X100X101XLY100X11X101X101XLY101X11X101X11XLY
101X10X101X10XLY101X1X110X1XRY110X10X10X1XRY110X1X111X1XRY110X101X111X101XRY111X101X111X101XRY
111X1X1000X1XSY¦
The Total length is 305.
|
|
| |
The
final table below provides the tuples of the AnBn Machine
where n takes on the values 1 though 6. This machine
is restricted to be realizable as a Finite State Machine. It
is included here to provide an example of the use of this encoding
as a measue of the "complexity" of the machine. To
the right of this table is the encoding of this machine into
the fixed alphabet described above. Note that the length of this
machine is 323 while the length of the entirely general AnBn
Turing Machine was 161! |
|
| |
Tuples of the AnBn Machine, n=1-6
| current
control state |
symbol
under read head |
new
symbol |
action |
new
control state |
| q0 |
A |
A |
R |
q1 |
| q1 |
A |
A |
R |
q2 |
| q2 |
A |
A |
R |
q3 |
| q3 |
A |
A |
R |
q4 |
| q4 |
A |
A |
R |
q5 |
| q5 |
A |
A |
R |
q6 |
| q6 |
B |
B |
R |
p1 |
| p1 |
B |
B |
R |
p2 |
| p2 |
B |
B |
R |
p3 |
| p3 |
B |
B |
R |
p4 |
| p4 |
B |
B |
R |
p5 |
| p5 |
B |
B |
R |
p6 |
| p6 |
# |
# |
R |
qh |
| q1 |
B |
B |
R |
p6 |
| q2 |
B |
B |
R |
p5 |
| q3 |
B |
B |
R |
p4 |
| q4 |
B |
B |
R |
p3 |
| q5 |
B |
B |
R |
p2 |
| q6 |
B |
B |
R |
p1 |
|
The Number of States = 14 which
in binary is 1110
The Number of Symbols = 3 which in binary is 11
The mapping between state names and binary name is:
q0 > 1, q1 > 10, q2 > 11, q3 >
100, q4 > 101,
q5 > 110, q6 > 111, p1 > 1000, p2 >
1001,
p3 > 1010, p4 > 1011, p5 > 1100, p6 >1101,
qh > 1110
The mapping between symbols and
binary name is:
# > 1, A > 10,
B > 11
And thus mapping for each tuple
is:
Tuple: q0 A A R q1 --> 1X10X10X10XRY
Tuple: q1 A A R q2 --> 10X10X11X10XRY
Tuple: q2 A A R q3 --> 11X10X100X10XRY
Tuple: q3 A A R q4 --> 100X10X101X10XRY
Tuple: q4 A A R q5 --> 101X10X110X10XRY
Tuple: q5 A A R q6 --> 110X10X111X10XRY
Tuple: q6 B B R p1 --> 111X11X1000X11XRY
Tuple: p1 B B R p2 --> 1000X11X1001X11XRY
Tuple: p2 B B R p3 --> 1001X11X1010X11XRY
Tuple: p3 B B R p4 --> 1010X11X1011X11XRY
Tuple: p4 B B R p5 --> 1011X11X1100X11XRY
Tuple: p5 B B R p6 --> 1100X11X1101X11XRY
Tuple: p6 # # R qh --> 1101X1X1110X1XRY
Tuple: q1 B B R p6 --> 10X11X1101X11XRY
Tuple: q2 B B R p5 --> 11X11X1100X11XRY
Tuple: q3 B B R p4 --> 100X11X1011X11XRY
Tuple: q4 B B R p3 --> 101X11X1010X11XRY
Tuple: q5 B B R p2 --> 110X11X1001X11XRY
Tuple: q6 B B R p1 --> 111X11X1000X11XRY
|
|
|
| |
This TM encoded as a string is:
Ý110!11@
1X10X10X1XRY10X10X10X10XRY10X11X10X11XRY10X1X11X1XLY11X11X100X1XLY100X11X100X11XLY
100X10X100X10XLY100X1X101X1XRY101X10X10X1XRY101X1X110X1XSY¦
Ý1110!11@
1X10X10X10XRY10X10X11X10XRY111X10X100X10XRY00X10X101X10XRY101X10X110X10XRY110X10X111X10XRY
111X11X1000X11XRY1000X11X1001X11XRY1001X11X1010X11XRY1010X11X1011X11XRY1011X11X1100X11XRY
1100X11X1101X11XRY1101X1X1110X1XRY10X11X1101X11XRY11X11X1100X11XRY100X11X1011X11XRY
101X11X1010X11XRY110X11X1001X11XRY111X11X1000X11XRY¦
where the different colorings
of the elements of the string are used to help you see the structure
of the machine encoded in this vocabulary.
The Length of the Encoded TM
is 323
For this example we considered
the input string #AB which is encoded as W11!1%10%11$
The Length of the Encoded Start
String is 12
Thus the input to a U might be
the string:
W11!1%10%11$1X10X10X10XRY10X10X11X10XRY111X10X100X10XRY00X10X101X10XRY101X10X110X10XRY
110X10X111X10XRY111X11X1000X11XRY1000X11X1001X11XRY1001X11X1010X11XRY1010X11X1011X11XRY
1011X11X1100X11XRY1100X11X1101X11XRY1101X1X1110X1XRY10X11X1101X11XRY11X11X1100X11XRY
100X11X1011X11XRY101X11X1010X11XRY110X11X1001X11XRY111X11X1000X11XRY¦
The Total length is 335
|
|
|