Production Rule Architectures

     In the 1970's AI researchers began to experiment with information processing architectures that included the capacity to invoke procedures based on the matching of a pattern...thus, the term pattern matching or pattern based invocation. Traditional computational languages employed what is often termed hierarchical control. There is a top-level function which explicitly calls or invokes other functions. Control is released to a function; and when that function has completed, control is returned to the higher level calling function. And in order to call or invoke a function the calling function must call the lower level function by name which ultimately means that it is called by its address. Think of looking for a business, say a dry cleaning firm, but having to know the name of the specific business, say "AAA VeryCleen" in order to retrieve their address and thus avail oneself of their dry cleaning service. Contrast this with something like the yellow pages where businesses are not listed by their name but by some set of words that convey what they will do for you; e.g. "Dry Cleaning," "Auto Repair," etc.

     If the computation involves computing some well-defined function, then these distinctions are of little consequence. It you wish to invert a matrix, then one can lay out in advance; and, once and for all, exactly what procedures are required and when they should be accessed. However, the human mind (and an intelligent artifact that we might wish to design) is typically interacting with input from an external world. And, this input is really quite different from the input that appears on the tape of a Turing Machine. In the Turing Machine model the input string has a well-defined beginning and end at the start of the computation. And, only the actions of the Turing Machine can alter the contents of the tape. The external world is typically less well-behaved. And, on reflection one might even wish to have an internal world that is less well-behaved. Instead of a well-defined hierarchy of control, researches began to consider what they referred to as a heterarchy of control (it was the 70's after all!).

     It was at this point in time that AI was focusing on two research areas; namely, natural language understanding and expert systems. Both of these areas of research brought researchers face to face with the problem of representing and using appropriately vast amounts of knowledge. Prior to this much of the research had been on what were often referred to as toy domains - simple puzzle problems; blocks world problems, and the like. These were excellent domains within which to develop and evaluate search strategies, but they provided little insight into problems of how to effectively search for solutions in "knowledge-rich" domains.

     Although the typical computer language utilizes a hierarchical control regime; one can embed a computer language of one's own design within an existing language. LISP was the main language used in AI research at this time. Any computer language can be thought of as a kind of "virtual machine." A programming language provides two things:

  • data structures together with basic functions and predicates defined for those data structures; and,
  • control constructs.

     These two components of a language may closely mirror the actual computational capabilities designed into the hardware or they may be quite removed from these capabilities. For example, the design of LISP was heavily influenced by the Lambda Calculus. Its "native" control construct is that of recursion. However, although most LISPers favored the mathematical elegance gained though the use of recursion, one might nonetheless implement a function using an iterative control structure since this would often run faster on the computer platform on which the program was running. Or, to take an extreme example, LISP "pretends" that memory will always be available and can be allocated to a process as it is needed. In fact, memory was a very scarce commodity within the machines of the 70's. In order to try to maintain this pretense the implementers of LISP provided a procedure called "garbage collection." When memory became low (or the procedure was invoked within a program) this procedure systematically and exhaustively checked to see whether memory that had been allocated to a process was now free and if so this memory was reclaimed for "reuse." (Something analogous to this happens when or if a browser clears items in its cache memory.)

     Now if you spend much of your time thinking in terms of some programming language such as LISP, then it is not at all a leap to think of this language as running on a "virtual" machine whose "hardware" characteristics were the same as the software characteristics of the language. (Of course, when garbage collection interrupted your program's execution or you exceeded the allowed stack depth, you were compelled to return to reality.) And, LISP is a language within which one could fairly easily embed a language of your own design. The design for the embedded languages were often referred to as an architecture.

     One of the more important architectures that was created during this period was referred to as production rule systems. This name probably came about because they could be thought of as distant cousins of the formal model of computation known as Post Production Systems developed in the 1930's by Emil Post. In a "pure" production rule system knowledge and procedure are completely confounded. All knowledge is represented as production rules. The form of this "data structure" was a generalization of the kind of 'if .... then...' form that we encountered in our brief excursion into an examination of automata. The 'if clause' consists of a set of conditions. But these conditions are much more complex than a symbol and a current state. There may be many conditions all of which must be established as "true" in the current state of the machine (the conditions are given a conjunctive interpretation); and these conditions may be expressed in a language that allows expressions which may include variables. The variables are given an "existential" interpretation. And, the 'then clause' is a sequence of action schema whose variables are bound by the values that are passed from the bindings that have been established in the process of matching the variables that occurred in the 'if clause.'

     But how do such complex "instructions" get invoked? Here is where a basic control regime is required. We noted at the start of this page that in traditional computer architectures the address of a function is required to cause the function to be executed. This is what is termed the fetch execute cycle. The basic mechanism of control is a cycle that involves fetching a function and then executing that function. The production rule systems substitute a basic control regime that is referred to as a recognition act cycle. A memory structure, sometimes referred to as working memory; is identified as the area where temporary expressions are stored. During the recognition phase all production rules are identified whose 'if clause' can be satisfied by some subset of these temporary expressions. This set of rules is often referred to as the Conflict Set. The conflict set contains all of the rules that are enabled by the current contents of the working memory. The idea is that the expressions in working memory reflect the system's ongoing computational processes, goals, and parameters for externally directed actions. The recognition phase is functionally a retrieval phase where knowledge is retrieved that may be relevant to the system's activities.

     The act portion of the cycle involves executing the 'then clause' or actions of some subset of production rules that are in the Conflict Set. In most production rule systems a conflict resolution phase is implemented that results in a single production rule from the Conflict Set being chosen for execution. Some of the actions in the production rule may add to or alter in some way the contents of working memory. Thus, the contents or working memory will typically change on each cycle. Additionally, it is sometimes assumed that an external world can alter working memory independently of the system's explicit actions.

Human Cognition - Table of Contents

 © Charles F. Schmidt