Implementing the Query System

Section 4.4.2 described how the query system works. Now we fill in the details by presenting a complete implementation of the system.

The Driver Loop and Instantiation

The driver loop for the query system repeatedly reads input expressions. If the expression is a rule or assertion to be added to the data base, then the information is added. Otherwise the expression is assumed to be a query. The driver passes this query to the evaluator qeval together with an initial frame stream consisting of a single empty frame. The result of the evaluation is a stream of frames generated by satisfying the query with variable values found in the data base. These frames are used to form a new stream consisting of copies of the original query in which the variables are instantiated with values supplied by the stream of frames, and this final stream is printed at the terminal:

(define input-prompt ";;; Query input:") (define output-prompt ";;; Query results:") (define (query-driver-loop) (prompt-for-input input-prompt) (let ((q (query-syntax-process (read)))) (cond ((assertion-to-be-added? q) (add-rule-or-assertion! (add-assertion-body q)) (newline) (display "Assertion added to data base.") (query-driver-loop)) (else (newline) (display output-prompt) (display-stream (stream-map (lambda (frame) (instantiate q frame (lambda (v f) (contract-question-mark v)))) (qeval q (singleton-stream '())))) (query-driver-loop)))))

Here, as in the other evaluators in this chapter, we use an abstract syntax for the expressions of the query language. The implementation of the expression syntax, including the predicate assertion-to-be-added? and the selector add-assertion-body, is given in section 4.4.4.7. Add-rule-or-assertion! is defined in section 4.4.4.5.

Before doing any processing on an input expression, the driver loop transforms it syntactically into a form that makes the processing more efficient. This involves changing the representation of pattern variables. When the query is instantiated, any variables that remain unbound are transformed back to the input representation before being printed. These transformations are performed by the two procedures query-syntax-process and contract-question-mark (section 4.4.4.7).

To instantiate an expression, we copy it, replacing any variables in the expression by their values in a given frame. The values are themselves instantiated, since they could contain variables (for example, if ?x in exp is bound to ?y as the result of unification and ?y is in turn bound to 5). The action to take if a variable cannot be instantiated is given by a procedural argument to instantiate.

(define (instantiate exp frame unbound-var-handler) (define (copy exp) (cond ((var? exp) (let ((binding (binding-in-frame exp frame))) (if binding (copy (binding-value binding)) (unbound-var-handler exp frame)))) ((pair? exp) (cons (copy (car exp)) (copy (cdr exp)))) (else exp))) (copy exp))

The Evaluator

The qeval procedure, called by the query-driver-loop, is the basic evaluator of the query system. It takes as inputs a query and a stream of frames, and it returns a stream of extended frames. It identifies special forms by a data-directed dispatch using get and put, just as we did in implementing generic operations in chapter 2. Any query that is not identified as a special form is assumed to be a simple query, to be processed by simple-query.

(define (qeval query frame-stream) (let ((qproc (get (type query) 'qeval))) (if qproc (qproc (contents query) frame-stream) (simple-query query frame-stream))))

Type and contents, defined in section 4.4.4.7, implement the abstract syntax of the special forms.

Simple queries

The simple-query procedure handles simple queries. It takes as arguments a simple query (a pattern) together with a stream of frames, and it returns the stream formed by extending each frame by all data-base matches of the query.

(define (simple-query query-pattern frame-stream) (stream-flatmap (lambda (frame) (stream-append-delayed (find-assertions query-pattern frame) (delay (apply-rules query-pattern frame)))) frame-stream))

For each frame in the input stream, we use find-assertions (section 4.4.4.3) to match the pattern against all assertions in the data base, producing a stream of extended frames, and we use apply-rules (section 4.4.4.4) to apply all possible rules, producing another stream of extended frames. These two streams are combined (using stream-append-delayed, section 4.4.4.6) to make a stream of all the ways that the given pattern can be satisfied consistent with the original frame (see exercise 4.71). The streams for the individual input frames are combined using stream-flatmap (section 4.4.4.6) to form one large stream of all the ways that any of the frames in the original input stream can be extended to produce a match with the given pattern.

Compound queries

And queries are handled as illustrated in figure 4.5 by the conjoin procedure. Conjoin takes as inputs the conjuncts and the frame stream and returns the stream of extended frames. First, conjoin processes the stream of frames to find the stream of all possible frame extensions that satisfy the first query in the conjunction. Then, using this as the new frame stream, it recursively applies conjoin to the rest of the queries.

(define (conjoin conjuncts frame-stream) (if (empty-conjunction? conjuncts) frame-stream (conjoin (rest-conjuncts conjuncts) (qeval (first-conjunct conjuncts) frame-stream))))

Or queries are handled similarly, as shown in figure 4.6. The output streams for the various disjuncts of the or are computed separately and merged using the interleave-delayed procedure from section 4.4.4.6. (See exercises 4.71 and 4.72.)

(define (disjoin disjuncts frame-stream) (if (empty-disjunction? disjuncts) the-empty-stream (interleave-delayed (qeval (first-disjunct disjuncts) frame-stream) (delay (disjoin (rest-disjuncts disjuncts) frame-stream))))) (put 'or 'qeval disjoin)

The predicates and selectors for the syntax of conjuncts and disjuncts are given in section 4.4.4.7.

Filters

Not is handled by the method outlined in section 4.4.2. We attempt to extend each frame in the input stream to satisfy the query being negated, and we include a given frame in the output stream only if it cannot be extended.

(define (negate operands frame-stream) (stream-flatmap (lambda (frame) (if (stream-null? (qeval (negated-query operands) (singleton-stream frame))) (singleton-stream frame) the-empty-stream)) frame-stream)) (put 'not 'qeval negate)

Lisp-value is a filter similar to not. Each frame in the stream is used to instantiate the variables in the pattern, the indicated predicate is applied, and the frames for which the predicate returns false are filtered out of the input stream. An error results if there are unbound pattern variables.

(define (lisp-value call frame-stream) (stream-flatmap (lambda (frame) (if (execute (instantiate call frame (lambda (v f) (error "Unknown pat var -- LISP-VALUE" v)))) (singleton-stream frame) the-empty-stream)) frame-stream)) (put 'lisp-value 'qeval lisp-value)

Execute, which applies the predicate to the arguments, must eval the predicate expression to get the procedure to apply. However, it must not evaluate the arguments, since they are already the actual arguments, not expressions whose evaluation (in Lisp) will produce the arguments. Note that execute is implemented using eval and apply from the underlying Lisp system.

(define (execute exp) (apply (eval (predicate exp) user-initial-environment) (args exp)))

The always-true special form provides for a query that is always satisfied. It ignores its contents (normally empty) and simply passes through all the frames in the input stream. Always-true is used by the rule-body selector (section 4.4.4.7) to provide bodies for rules that were defined without bodies (that is, rules whose conclusions are always satisfied).

(define (always-true ignore frame-stream) frame-stream) (put 'always-true 'qeval always-true)

The selectors that define the syntax of not and lisp-value are given in section 4.4.4.7.

Finding Assertions by Pattern Matching

Find-assertions, called by simple-query (section 4.4.4.2), takes as input a pattern and a frame. It returns a stream of frames, each extending the given one by a data-base match of the given pattern. It uses fetch-assertions (section 4.4.4.5) to get a stream of all the assertions in the data base that should be checked for a match against the pattern and the frame. The reason for fetch-assertions here is that we can often apply simple tests that will eliminate many of the entries in the data base from the pool of candidates for a successful match. The system would still work if we eliminated fetch-assertions and simply checked a stream of all assertions in the data base, but the computation would be less efficient because we would need to make many more calls to the matcher.

(define (find-assertions pattern frame) (stream-flatmap (lambda (datum) (check-an-assertion datum pattern frame)) (fetch-assertions pattern frame)))

Check-an-assertion takes as arguments a pattern, a data object (assertion), and a frame and returns either a one-element stream containing the extended frame or the-empty-stream if the match fails.

(define (check-an-assertion assertion query-pat query-frame) (let ((match-result (pattern-match query-pat assertion query-frame))) (if (eq? match-result 'failed) the-empty-stream (singleton-stream match-result))))

The basic pattern matcher returns either the symbol failed or an extension of the given frame. The basic idea of the matcher is to check the pattern against the data, element by element, accumulating bindings for the pattern variables. If the pattern and the data object are the same, the match succeeds and we return the frame of bindings accumulated so far. Otherwise, if the pattern is a variable we extend the current frame by binding the variable to the data, so long as this is consistent with the bindings already in the frame. If the pattern and the data are both pairs, we (recursively) match the car of the pattern against the car of the data to produce a frame; in this frame we then match the cdr of the pattern against the cdr of the data. If none of these cases are applicable, the match fails and we return the symbol failed.

(define (pattern-match pat dat frame) (cond ((eq? frame 'failed) 'failed) ((equal? pat dat) frame) ((var? pat) (extend-if-consistent pat dat frame)) ((and (pair? pat) (pair? dat)) (pattern-match (cdr pat) (cdr dat) (pattern-match (car pat) (car dat) frame))) (else 'failed)))

Here is the procedure that extends a frame by adding a new binding, if this is consistent with the bindings already in the frame:

(define (extend-if-consistent var dat frame) (let ((binding (binding-in-frame var frame))) (if binding (pattern-match (binding-value binding) dat frame) (extend var dat frame))))

If there is no binding for the variable in the frame, we simply add the binding of the variable to the data. Otherwise we match, in the frame, the data against the value of the variable in the frame. If the stored value contains only constants, as it must if it was stored during pattern matching by extend-if-consistent, then the match simply tests whether the stored and new values are the same. If so, it returns the unmodified frame; if not, it returns a failure indication. The stored value may, however, contain pattern variables if it was stored during unification (see section 4.4.4.4). The recursive match of the stored pattern against the new data will add or check bindings for the variables in this pattern. For example, suppose we have a frame in which ?x is bound to (f ?y) and ?y is unbound, and we wish to augment this frame by a binding of ?x to (f b). We look up ?x and find that it is bound to (f ?y). This leads us to match (f ?y) against the proposed new value (f b) in the same frame. Eventually this match extends the frame by adding a binding of ?y to b. ?X remains bound to (f ?y). We never modify a stored binding and we never store more than one binding for a given variable.

The procedures used by extend-if-consistent to manipulate bindings are defined in section 4.4.4.8.

Patterns with dotted tails

If a pattern contains a dot followed by a pattern variable, the pattern variable matches the rest of the data list (rather than the next element of the data list), just as one would expect with the dotted-tail notation described in exercise 2.20. Although the pattern matcher we have just implemented doesn't look for dots, it does behave as we want. This is because the Lisp read primitive, which is used by query-driver-loop to read the query and represent it as a list structure, treats dots in a special way.

When read sees a dot, instead of making the next item be the next element of a list (the car of a cons whose cdr will be the rest of the list) it makes the next item be the cdr of the list structure. For example, the list structure produced by read for the pattern (computer ?type) could be constructed by evaluating the expression (cons 'computer (cons '?type '())), and that for (computer . ?type) could be constructed by evaluating the expression (cons 'computer '?type).

Thus, as pattern-match recursively compares cars and cdrs of a data list and a pattern that had a dot, it eventually matches the variable after the dot (which is a cdr of the pattern) against a sublist of the data list, binding the variable to that list. For example, matching the pattern (computer . ?type) against (computer programmer trainee) will match ?type against the list (programmer trainee).

Rules and Unification

Apply-rules is the rule analog of find-assertions (section 4.4.4.3). It takes as input a pattern and a frame, and it forms a stream of extension frames by applying rules from the data base. Stream-flatmap maps apply-a-rule down the stream of possibly applicable rules (selected by fetch-rules, section 4.4.4.5) and combines the resulting streams of frames.

(define (apply-rules pattern frame) (stream-flatmap (lambda (rule) (apply-a-rule rule pattern frame)) (fetch-rules pattern frame)))

Apply-a-rule applies rules using the method outlined in section 4.4.2. It first augments its argument frame by unifying the rule conclusion with the pattern in the given frame. If this succeeds, it evaluates the rule body in this new frame.

Before any of this happens, however, the program renames all the variables in the rule with unique new names. The reason for this is to prevent the variables for different rule applications from becoming confused with each other. For instance, if two rules both use a variable named ?x, then each one may add a binding for ?x to the frame when it is applied. These two ?x's have nothing to do with each other, and we should not be fooled into thinking that the two bindings must be consistent. Rather than rename variables, we could devise a more clever environment structure; however, the renaming approach we have chosen here is the most straightforward, even if not the most efficient. (See exercise 4.79.) Here is the apply-a-rule procedure:

(define (apply-a-rule rule query-pattern query-frame) (let ((clean-rule (rename-variables-in rule))) (let ((unify-result (unify-match query-pattern (conclusion clean-rule) query-frame))) (if (eq? unify-result 'failed) the-empty-stream (qeval (rule-body clean-rule) (singleton-stream unify-result))))))

The selectors rule-body and conclusion that extract parts of a rule are defined in section 4.4.4.7.

We generate unique variable names by associating a unique identifier (such as a number) with each rule application and combining this identifier with the original variable names. For example, if the rule-application identifier is 7, we might change each ?x in the rule to ?x-7 and each ?y in the rule to ?y-7. (Make-new-variable and new-rule-application-id are included with the syntax procedures in section 4.4.4.7.)

(define (rename-variables-in rule) (let ((rule-application-id (new-rule-application-id))) (define (tree-walk exp) (cond ((var? exp) (make-new-variable exp rule-application-id)) ((pair? exp) (cons (tree-walk (car exp)) (tree-walk (cdr exp)))) (else exp))) (tree-walk rule)))

The unification algorithm is implemented as a procedure that takes as inputs two patterns and a frame and returns either the extended frame or the symbol failed. The unifier is like the pattern matcher except that it is symmetrical -- variables are allowed on both sides of the match. Unify-match is basically the same as pattern-match, except that there is extra code (marked "***" below) to handle the case where the object on the right side of the match is a variable.

(define (unify-match p1 p2 frame) (cond ((eq? frame 'failed) 'failed) ((equal? p1 p2) frame) ((var? p1) (extend-if-possible p1 p2 frame)) ((var? p2) (extend-if-possible p2 p1 frame)) ; *** ((and (pair? p1) (pair? p2)) (unify-match (cdr p1) (cdr p2) (unify-match (car p1) (car p2) frame))) (else 'failed)))

In unification, as in one-sided pattern matching, we want to accept a proposed extension of the frame only if it is consistent with existing bindings. The procedure extend-if-possible used in unification is the same as the extend-if-consistent used in pattern matching except for two special checks, marked "***" in the program below. In the first case, if the variable we are trying to match is not bound, but the value we are trying to match it with is itself a (different) variable, it is necessary to check to see if the value is bound, and if so, to match its value. If both parties to the match are unbound, we may bind either to the other.

The second check deals with attempts to bind a variable to a pattern that includes that variable. Such a situation can occur whenever a variable is repeated in both patterns. Consider, for example, unifying the two patterns (?x ?x) and (?y <expression involving ?y>) in a frame where both ?x and ?y are unbound. First ?x is matched against ?y, making a binding of ?x to ?y. Next, the same ?x is matched against the given expression involving ?y. Since ?x is already bound to ?y, this results in matching ?y against the expression. If we think of the unifier as finding a set of values for the pattern variables that make the patterns the same, then these patterns imply instructions to find a ?y such that ?y is equal to the expression involving ?y. There is no general method for solving such equations, so we reject such bindings; these cases are recognized by the predicate depends-on?.^{noteIn general, unifying ?y with an expression involving ?y would require our being able to find a fixed point of the equation ?y = <expression involving ?y>. It is sometimes possible to syntactically form an expression that appears to be the solution. For example, ?y� = �(f ?y) seems to have the fixed point (f (f (f ... ))), which we can produce by beginning with the expression (f ?y) and repeatedly substituting (f ?y) for ?y. Unfortunately, not every such equation has a meaningful fixed point. The issues that arise here are similar to the issues of manipulating infinite series in mathematics. For example, we know that 2 is the solution to the equation y = 1 + y/2. Beginning with the expression 1 + y/2 and repeatedly substituting 1 + y/2 for y gives which leads to However, if we try the same manipulation beginning with the observation that - 1 is the solution to the equation y = 1 + 2y, we obtain which leads to Although the formal manipulations used in deriving these two equations are identical, the first result is a valid assertion about infinite series but the second is not. Similarly, for our unification results, reasoning with an arbitrary syntactically constructed expression may lead to errors.} On the other hand, we do not want to reject attempts to bind a variable to itself. For example, consider unifying (?x ?x) and (?y ?y). The second attempt to bind ?x to ?y matches ?y (the stored value of ?x) against ?y (the new value of ?x). This is taken care of by the equal? clause of unify-match.

(define (extend-if-possible var val frame) (let ((binding (binding-in-frame var frame))) (cond (binding (unify-match (binding-value binding) val frame)) ((var? val) ; *** (let ((binding (binding-in-frame val frame))) (if binding (unify-match var (binding-value binding) frame) (extend var val frame)))) ((depends-on? val var frame) ; *** 'failed) (else (extend var val frame)))))

Depends-on? is a predicate that tests whether an expression proposed to be the value of a pattern variable depends on the variable. This must be done relative to the current frame because the expression may contain occurrences of a variable that already has a value that depends on our test variable. The structure of depends-on? is a simple recursive tree walk in which we substitute for the values of variables whenever necessary.

(define (depends-on? exp var frame) (define (tree-walk e) (cond ((var? e) (if (equal? var e) true (let ((b (binding-in-frame e frame))) (if b (tree-walk (binding-value b)) false)))) ((pair? e) (or (tree-walk (car e)) (tree-walk (cdr e)))) (else false))) (tree-walk exp))

Maintaining the Data Base

One important problem in designing logic programming languages is that of arranging things so that as few irrelevant data-base entries as possible will be examined in checking a given pattern. In our system, in addition to storing all assertions in one big stream, we store all assertions whose cars are constant symbols in separate streams, in a table indexed by the symbol. To fetch an assertion that may match a pattern, we first check to see if the car of the pattern is a constant symbol. If so, we return (to be tested using the matcher) all the stored assertions that have the same car. If the pattern's car is not a constant symbol, we return all the stored assertions. Cleverer methods could also take advantage of information in the frame, or try also to optimize the case where the car of the pattern is not a constant symbol. We avoid building our criteria for indexing (using the car, handling only the case of constant symbols) into the program; instead we call on predicates and selectors that embody our criteria.

(define THE-ASSERTIONS the-empty-stream) (define (fetch-assertions pattern frame) (if (use-index? pattern) (get-indexed-assertions pattern) (get-all-assertions))) (define (get-all-assertions) THE-ASSERTIONS) (define (get-indexed-assertions pattern) (get-stream (index-key-of pattern) 'assertion-stream))

Get-stream looks up a stream in the table and returns an empty stream if nothing is stored there.

(define (get-stream key1 key2) (let ((s (get key1 key2))) (if s s the-empty-stream)))

Rules are stored similarly, using the car of the rule conclusion. Rule conclusions are arbitrary patterns, however, so they differ from assertions in that they can contain variables. A pattern whose car is a constant symbol can match rules whose conclusions start with a variable as well as rules whose conclusions have the same car. Thus, when fetching rules that might match a pattern whose car is a constant symbol we fetch all rules whose conclusions start with a variable as well as those whose conclusions have the same car as the pattern. For this purpose we store all rules whose conclusions start with a variable in a separate stream in our table, indexed by the symbol ?.

(define THE-RULES the-empty-stream) (define (fetch-rules pattern frame) (if (use-index? pattern) (get-indexed-rules pattern) (get-all-rules))) (define (get-all-rules) THE-RULES) (define (get-indexed-rules pattern) (stream-append (get-stream (index-key-of pattern) 'rule-stream) (get-stream '? 'rule-stream)))

Add-rule-or-assertion! is used by query-driver-loop to add assertions and rules to the data base. Each item is stored in the index, if appropriate, and in a stream of all assertions or rules in the data base.

(define (add-rule-or-assertion! assertion) (if (rule? assertion) (add-rule! assertion) (add-assertion! assertion))) (define (add-assertion! assertion) (store-assertion-in-index assertion) (let ((old-assertions THE-ASSERTIONS)) (set! THE-ASSERTIONS (cons-stream assertion old-assertions)) 'ok)) (define (add-rule! rule) (store-rule-in-index rule) (let ((old-rules THE-RULES)) (set! THE-RULES (cons-stream rule old-rules)) 'ok))

To actually store an assertion or a rule, we check to see if it can be indexed. If so, we store it in the appropriate stream.

(define (store-assertion-in-index assertion) (if (indexable? assertion) (let ((key (index-key-of assertion))) (let ((current-assertion-stream (get-stream key 'assertion-stream))) (put key 'assertion-stream (cons-stream assertion current-assertion-stream)))))) (define (store-rule-in-index rule) (let ((pattern (conclusion rule))) (if (indexable? pattern) (let ((key (index-key-of pattern))) (let ((current-rule-stream (get-stream key 'rule-stream))) (put key 'rule-stream (cons-stream rule current-rule-stream)))))))

The following procedures define how the data-base index is used. A pattern (an assertion or a rule conclusion) will be stored in the table if it starts with a variable or a constant symbol.

The key under which a pattern is stored in the table is either ? (if it starts with a variable) or the constant symbol with which it starts.

The index will be used to retrieve items that might match a pattern if the pattern starts with a constant symbol.

Exercise 4.70. What is the purpose of the let bindings in the procedures add-assertion! and add-rule! ? What would be wrong with the following implementation of add-assertion! ? Hint: Recall the definition of the infinite stream of ones in section 3.5.2: (define ones (cons-stream 1 ones)).

(define (add-assertion! assertion) (store-assertion-in-index assertion) (set! THE-ASSERTIONS (cons-stream assertion THE-ASSERTIONS)) 'ok)

Stream Operations

The query system uses a few stream operations that were not presented in chapter 3.

Stream-append-delayed and interleave-delayed are just like stream-append and interleave (section 3.5.3), except that they take a delayed argument (like the integral procedure in section 3.5.4). This postpones looping in some cases (see exercise 4.71).

(define (stream-append-delayed s1 delayed-s2) (if (stream-null? s1) (force delayed-s2) (cons-stream (stream-car s1) (stream-append-delayed (stream-cdr s1) delayed-s2)))) (define (interleave-delayed s1 delayed-s2) (if (stream-null? s1) (force delayed-s2) (cons-stream (stream-car s1) (interleave-delayed (force delayed-s2) (delay (stream-cdr s1))))))

Stream-flatmap, which is used throughout the query evaluator to map a procedure over a stream of frames and combine the resulting streams of frames, is the stream analog of the flatmap procedure introduced for ordinary lists in section 2.2.3. Unlike ordinary flatmap, however, we accumulate the streams with an interleaving process, rather than simply appending them (see exercises 4.72 and 4.73).

(define (stream-flatmap proc s) (flatten-stream (stream-map proc s))) (define (flatten-stream stream) (if (stream-null? stream) the-empty-stream (interleave-delayed (stream-car stream) (delay (flatten-stream (stream-cdr stream))))))

The evaluator also uses the following simple procedure to generate a stream consisting of a single element:

(define (singleton-stream x) (cons-stream x the-empty-stream))

Query Syntax Procedures

Type and contents, used by qeval (section 4.4.4.2), specify that a special form is identified by the symbol in its car. They are the same as the type-tag and contents procedures in section 2.4.2, except for the error message.

(define (type exp) (if (pair? exp) (car exp) (error "Unknown expression TYPE" exp))) (define (contents exp) (if (pair? exp) (cdr exp) (error "Unknown expression CONTENTS" exp)))

The following procedures, used by query-driver-loop (in section 4.4.4.1), specify that rules and assertions are added to the data base by expressions of the form (assert! <rule-or-assertion>):

(define (assertion-to-be-added? exp) (eq? (type exp) 'assert!)) (define (add-assertion-body exp) (car (contents exp)))

Here are the syntax definitions for the and, or, not, and lisp-value special forms (section 4.4.4.2):

(define (empty-conjunction? exps) (null? exps)) (define (first-conjunct exps) (car exps)) (define (rest-conjuncts exps) (cdr exps)) (define (empty-disjunction? exps) (null? exps)) (define (first-disjunct exps) (car exps)) (define (rest-disjuncts exps) (cdr exps)) (define (negated-query exps) (car exps)) (define (predicate exps) (car exps)) (define (args exps) (cdr exps))

The following three procedures define the syntax of rules:

(define (rule? statement) (tagged-list? statement 'rule)) (define (conclusion rule) (cadr rule)) (define (rule-body rule) (if (null? (cddr rule)) '(always-true) (caddr rule)))

Query-driver-loop (section 4.4.4.1) calls query-syntax-process to transform pattern variables in the expression, which have the form ?symbol, into the internal format (? symbol). That is to say, a pattern such as (job ?x ?y) is actually represented internally by the system as (job (? x) (? y)). This increases the efficiency of query processing, since it means that the system can check to see if an expression is a pattern variable by checking whether the car of the expression is the symbol ?, rather than having to extract characters from the symbol. The syntax transformation is accomplished by the following procedure:^{noteMost Lisp systems give the user the ability to modify the ordinary read procedure to perform such transformations by defining reader macro characters. Quoted expressions are already handled in this way: The reader automatically translates 'expression into (quote expression) before the evaluator sees it. We could arrange for ?expression to be transformed into (? expression) in the same way; however, for the sake of clarity we have included the transformation procedure here explicitly. Expand-question-mark and contract-question-mark use several procedures with string in their names. These are Scheme primitives.}

(define (query-syntax-process exp) (map-over-symbols expand-question-mark exp)) (define (map-over-symbols proc exp) (cond ((pair? exp) (cons (map-over-symbols proc (car exp)) (map-over-symbols proc (cdr exp)))) ((symbol? exp) (proc exp)) (else exp))) (define (expand-question-mark symbol) (let ((chars (symbol->string symbol))) (if (string=? (substring chars 0 1) "?") (list '? (string->symbol (substring chars 1 (string-length chars)))) symbol)))

Once the variables are transformed in this way, the variables in a pattern are lists starting with ?, and the constant symbols (which need to be recognized for data-base indexing, section 4.4.4.5) are just the symbols.

(define (var? exp) (tagged-list? exp '?)) (define (constant-symbol? exp) (symbol? exp))

Unique variables are constructed during rule application (in section 4.4.4.4) by means of the following procedures. The unique identifier for a rule application is a number, which is incremented each time a rule is applied.

(define rule-counter 0) (define (new-rule-application-id) (set! rule-counter (+ 1 rule-counter)) rule-counter) (define (make-new-variable var rule-application-id) (cons '? (cons rule-application-id (cdr var))))

When query-driver-loop instantiates the query to print the answer, it converts any unbound pattern variables back to the right form for printing, using

(define (contract-question-mark variable) (string->symbol (string-append "?" (if (number? (cadr variable)) (string-append (symbol->string (caddr variable)) "-" (number->string (cadr variable))) (symbol->string (cadr variable))))))

Frames and Bindings

Frames are represented as lists of bindings, which are variable-value pairs:

(define (make-binding variable value) (cons variable value)) (define (binding-variable binding) (car binding)) (define (binding-value binding) (cdr binding)) (define (binding-in-frame variable frame) (assoc variable frame)) (define (extend variable value frame) (cons (make-binding variable value) frame))

Exercise 4.71. Louis Reasoner wonders why the simple-query and disjoin procedures (section 4.4.4.2) are implemented using explicit delay operations, rather than being defined as follows:

(define (simple-query query-pattern frame-stream) (stream-flatmap (lambda (frame) (stream-append (find-assertions query-pattern frame) (apply-rules query-pattern frame))) frame-stream)) (define (disjoin disjuncts frame-stream) (if (empty-disjunction? disjuncts) the-empty-stream (interleave (qeval (first-disjunct disjuncts) frame-stream) (disjoin (rest-disjuncts disjuncts) frame-stream))))

Can you give examples of queries where these simpler definitions would lead to undesirable behavior?

Exercise 4.72. Why do disjoin and stream-flatmap interleave the streams rather than simply append them? Give examples that illustrate why interleaving works better. (Hint: Why did we use interleave in section 3.5.3?)

Exercise 4.73. Why does flatten-stream use delay explicitly? What would be wrong with defining it as follows:

(define (flatten-stream stream) (if (stream-null? stream) the-empty-stream (interleave (stream-car stream) (flatten-stream (stream-cdr stream)))))

Exercise 4.74. Alyssa P. Hacker proposes to use a simpler version of stream-flatmap in negate, lisp-value, and find-assertions. She observes that the procedure that is mapped over the frame stream in these cases always produces either the empty stream or a singleton stream, so no interleaving is needed when combining these streams.

a. Fill in the missing expressions in Alyssa's program.

(define (simple-stream-flatmap proc s) (simple-flatten (stream-map proc s))) (define (simple-flatten stream) (stream-map <??> (stream-filter <??> stream)))

b. Does the query system's behavior change if we change it in this way?

Exercise 4.75. Implement for the query language a new special form called unique. Unique should succeed if there is precisely one item in the data base satisfying a specified query. For example,

(unique (job ?x (computer wizard)))

should print the one-item stream

(unique (job (Bitdiddle Ben) (computer wizard)))

since Ben is the only computer wizard, and

(unique (job ?x (computer programmer)))

should print the empty stream, since there is more than one computer programmer. Moreover,

(and (job ?x ?j) (unique (job ?anyone ?j)))

should list all the jobs that are filled by only one person, and the people who fill them.

There are two parts to implementing unique. The first is to write a procedure that handles this special form, and the second is to make qeval dispatch to that procedure. The second part is trivial, since qeval does its dispatching in a data-directed way. If your procedure is called uniquely-asserted, all you need to do is

(put 'unique 'qeval uniquely-asserted)

and qeval will dispatch to this procedure for every query whose type (car) is the symbol unique.

The real problem is to write the procedure uniquely-asserted. This should take as input the contents (cdr) of the unique query, together with a stream of frames. For each frame in the stream, it should use qeval to find the stream of all extensions to the frame that satisfy the given query. Any stream that does not have exactly one item in it should be eliminated. The remaining streams should be passed back to be accumulated into one big stream that is the result of the unique query. This is similar to the implementation of the not special form.

Test your implementation by forming a query that lists all people who supervise precisely one person.