Logic programming excels in providing interfaces to data bases for information retrieval. The query language we shall implement in this chapter is designed to be used in this way.
In order to illustrate what the query system does, we will show how it can be used to manage the data base of personnel records for Microshaft, a thriving high-technology company in the Boston area. The language provides pattern-directed access to personnel information and can also take advantage of general rules in order to make logical deductions.
The personnel data base for Microshaft contains assertions about company personnel. Here is the information about Ben Bitdiddle, the resident computer wizard:
(address (Bitdiddle Ben) (Slumerville (Ridge Road) 10))
(job (Bitdiddle Ben) (computer wizard))
(salary (Bitdiddle Ben) 60000)
Each assertion is a list (in this case a triple) whose elements can themselves be lists.
As resident wizard, Ben is in charge of the company's computer division, and he supervises two programmers and one technician. Here is the information about them:
(address (Hacker Alyssa P) (Cambridge (Mass Ave) 78))
(job (Hacker Alyssa P) (computer programmer))
(salary (Hacker Alyssa P) 40000)
(supervisor (Hacker Alyssa P) (Bitdiddle Ben))
(address (Fect Cy D) (Cambridge (Ames Street) 3))
(job (Fect Cy D) (computer programmer))
(salary (Fect Cy D) 35000)
(supervisor (Fect Cy D) (Bitdiddle Ben))
(address (Tweakit Lem E) (Boston (Bay State Road) 22))
(job (Tweakit Lem E) (computer technician))
(salary (Tweakit Lem E) 25000)
(supervisor (Tweakit Lem E) (Bitdiddle Ben))
There is also a programmer trainee, who is supervised by Alyssa:
(address (Reasoner Louis) (Slumerville (Pine Tree Road) 80))
(job (Reasoner Louis) (computer programmer trainee))
(salary (Reasoner Louis) 30000)
(supervisor (Reasoner Louis) (Hacker Alyssa P))
All of these people are in the computer division, as indicated by the word computer as the first item in their job descriptions.
Ben is a high-level employee. His supervisor is the company's big wheel himself:
(supervisor (Bitdiddle Ben) (Warbucks Oliver))
(address (Warbucks Oliver) (Swellesley (Top Heap Road)))
(job (Warbucks Oliver) (administration big wheel))
(salary (Warbucks Oliver) 150000)
Besides the computer division supervised by Ben, the company has an accounting division, consisting of a chief accountant and his assistant:
(address (Scrooge Eben) (Weston (Shady Lane) 10))
(job (Scrooge Eben) (accounting chief accountant))
(salary (Scrooge Eben) 75000)
(supervisor (Scrooge Eben) (Warbucks Oliver))
(address (Cratchet Robert) (Allston (N Harvard Street) 16))
(job (Cratchet Robert) (accounting scrivener))
(salary (Cratchet Robert) 18000)
(supervisor (Cratchet Robert) (Scrooge Eben))
There is also a secretary for the big wheel:
(address (Aull DeWitt) (Slumerville (Onion Square) 5))
(job (Aull DeWitt) (administration secretary))
(salary (Aull DeWitt) 25000)
(supervisor (Aull DeWitt) (Warbucks Oliver))
The data base also contains assertions about which kinds of jobs can be done by people holding other kinds of jobs. For instance, a computer wizard can do the jobs of both a computer programmer and a computer technician:
(can-do-job (computer wizard) (computer programmer))
(can-do-job (computer wizard) (computer technician))
A computer programmer could fill in for a trainee:
(can-do-job (computer programmer)
(computer programmer trainee))
(can-do-job (administration secretary)
(administration big wheel))
The query language allows users to retrieve information from the data base by posing queries in response to the system's prompt. For example, to find all computer programmers one can say
;;; Query input:
(job ?x (computer programmer))
The system will respond with the following items:
;;; Query results:
(job (Hacker Alyssa P) (computer programmer))
(job (Fect Cy D) (computer programmer))
The input query specifies that we are looking for entries in the data base that match a certain pattern. In this example, the pattern specifies entries consisting of three items, of which the first is the literal symbol job, the second can be anything, and the third is the literal list (computer programmer). The "anything" that can be the second item in the matching list is specified by a pattern variable, ?x. The general form of a pattern variable is a symbol, taken to be the name of the variable, preceded by a question mark. We will see below why it is useful to specify names for pattern variables rather than just putting ? into patterns to represent "anything." The system responds to a simple query by showing all entries in the data base that match the specified pattern.
A pattern can have more than one variable. For example, the query
(address ?x ?y)
will list all the employees' addresses.
A pattern can have no variables, in which case the query simply determines whether that pattern is an entry in the data base. If so, there will be one match; if not, there will be no matches.
The same pattern variable can appear more than once in a query, specifying that the same "anything" must appear in each position. This is why variables have names. For example,
(supervisor ?x ?x)
finds all people who supervise themselves (though there are no such assertions in our sample data base).
The query
(job ?x (computer ?type))
matches all job entries whose third item is a two-element list whose first item is computer:
(job (Bitdiddle Ben) (computer wizard))
(job (Hacker Alyssa P) (computer programmer))
(job (Fect Cy D) (computer programmer))
(job (Tweakit Lem E) (computer technician))
This same pattern does not match
(job (Reasoner Louis) (computer programmer trainee))
because the third item in the entry is a list of three elements, and the pattern's third item specifies that there should be two elements. If we wanted to change the pattern so that the third item could be any list beginning with computer, we could specifynoteThis uses the dotted-tail notation introduced in exercise 2.20.
(job ?x (computer . ?type))
For example,
(computer . ?type)
matches the data
(computer programmer trainee)
with ?type as the list (programmer trainee). It also matches the data
(computer programmer)
with ?type as the list (programmer), and matches the data
(computer)
with ?type as the empty list ().
We can describe the query language's processing of simple queries as follows:
Note that if the pattern has no variables, the query reduces to a determination of whether that pattern is in the data base. If so, the empty assignment, which assigns no values to variables, satisfies that pattern for that data base.
a. all people supervised by Ben Bitdiddle;
b. the names and jobs of all people in the accounting division;
c. the names and addresses of all people who live in Slumerville.
Simple queries form the primitive operations of the query language. In order to form compound operations, the query language provides means of combination. One thing that makes the query language a logic programming language is that the means of combination mirror the means of combination used in forming logical expressions: and, or, and not. (Here and, or, and not are not the Lisp primitives, but rather operations built into the query language.)
We can use and as follows to find the addresses of all the computer programmers:
(and (job ?person (computer programmer))
(address ?person ?where))
The resulting output is
(and (job (Hacker Alyssa P) (computer programmer))
(address (Hacker Alyssa P) (Cambridge (Mass Ave) 78)))
(and (job (Fect Cy D) (computer programmer))
(address (Fect Cy D) (Cambridge (Ames Street) 3)))
(and <query1> <query2> ... <queryn>)
is satisfied by all sets of values for the pattern variables that simultaneously satisfy <query1> ... <queryn>.
As for simple queries, the system processes a compound query by finding all assignments to the pattern variables that satisfy the query, then displaying instantiations of the query with those values.
Another means of constructing compound queries is through or. For example,
(or (supervisor ?x (Bitdiddle Ben))
(supervisor ?x (Hacker Alyssa P)))
will find all employees supervised by Ben Bitdiddle or Alyssa P. Hacker:
(or (supervisor (Hacker Alyssa P) (Bitdiddle Ben))
(supervisor (Hacker Alyssa P) (Hacker Alyssa P)))
(or (supervisor (Fect Cy D) (Bitdiddle Ben))
(supervisor (Fect Cy D) (Hacker Alyssa P)))
(or (supervisor (Tweakit Lem E) (Bitdiddle Ben))
(supervisor (Tweakit Lem E) (Hacker Alyssa P)))
(or (supervisor (Reasoner Louis) (Bitdiddle Ben))
(supervisor (Reasoner Louis) (Hacker Alyssa P)))
In general,
(or <query1> <query2> ... <queryn>)
is satisfied by all sets of values for the pattern variables that satisfy at least one of <query1> ... <queryn>.
Compound queries can also be formed with not. For example,
(and (supervisor ?x (Bitdiddle Ben))
(not (job ?x (computer programmer))))
finds all people supervised by Ben Bitdiddle who are not computer programmers. In general,
(not <query1>)
is satisfied by all assignments to the pattern variables that do not satisfy <query1>.noteActually, this description of not is valid only for simple cases. The real behavior of not is more complex. We will examine not's peculiarities in sections 4.4.2 and 4.4.3.
The final combining form is called lisp-value. When lisp-value is the first element of a pattern, it specifies that the next element is a Lisp predicate to be applied to the rest of the (instantiated) elements as arguments. In general,
(lisp-value <predicate> <arg1> ... <argn>)
will be satisfied by assignments to the pattern variables for which the <predicate> applied to the instantiated <arg1> ... <argn> is true. For example, to find all people whose salary is greater than $30,000 we could writenoteLisp-value should be used only to perform an operation not provided in the query language. In particular, it should not be used to test equality (since that is what the matching in the query language is designed to do) or inequality (since that can be done with the same rule shown below).
(and (salary ?person ?amount)
(lisp-value > ?amount 30000))
a. the names of all people who are supervised by Ben Bitdiddle, together with their addresses;
b. all people whose salary is less than Ben Bitdiddle's, together with their salary and Ben Bitdiddle's salary;
c. all people who are supervised by someone who is not in the computer division, together with the supervisor's name and job.
In addition to primitive queries and compound queries, the query language provides means for abstracting queries. These are given by rules. The rule
(rule (lives-near ?person-1 ?person-2)
(and (address ?person-1 (?town . ?rest-1))
(address ?person-2 (?town . ?rest-2))
(not (same ?person-1 ?person-2))))
specifies that two people live near each other if they live in the same town. The final not clause prevents the rule from saying that all people live near themselves. The same relation is defined by a very simple rule:noteNotice that we do not need same in order to make two things be the same: We just use the same pattern variable for each -- in effect, we have one thing instead of two things in the first place. For example, see ?town in the lives-near rule and ?middle-manager in the wheel rule below. Same is useful when we want to force two things to be different, such as ?person-1 and ?person-2 in the lives-near rule. Although using the same pattern variable in two parts of a query forces the same value to appear in both places, using different pattern variables does not force different values to appear. (The values assigned to different pattern variables may be the same or different.)
The following rule declares that a person is a "wheel" in an organization if he supervises someone who is in turn a supervisor:
(rule (wheel ?person)
(and (supervisor ?middle-manager ?person)
(supervisor ?x ?middle-manager)))
The general form of a rule is
(rule <conclusion> <body>)
where <conclusion> is a pattern and <body> is any query.noteWe will also allow rules without bodies, as in same, and we will interpret such a rule to mean that the rule conclusion is satisfied by any values of the variables. We can think of a rule as representing a large (even infinite) set of assertions, namely all instantiations of the rule conclusion with variable assignments that satisfy the rule body. When we described simple queries (patterns), we said that an assignment to variables satisfies a pattern if the instantiated pattern is in the data base. But the pattern needn't be explicitly in the data base as an assertion. It can be an implicit assertion implied by a rule. For example, the query
(lives-near ?x (Bitdiddle Ben))
results in
(lives-near (Reasoner Louis) (Bitdiddle Ben))
(lives-near (Aull DeWitt) (Bitdiddle Ben))
To find all computer programmers who live near Ben Bitdiddle, we can ask
(and (job ?x (computer programmer))
(lives-near ?x (Bitdiddle Ben)))
As in the case of compound procedures, rules can be used as parts of other rules (as we saw with the lives-near rule above) or even be defined recursively. For instance, the rule
(rule (outranked-by ?staff-person ?boss)
(or (supervisor ?staff-person ?boss)
(and (supervisor ?staff-person ?middle-manager)
(outranked-by ?middle-manager ?boss))))
says that a staff person is outranked by a boss in the organization if the boss is the person's supervisor or (recursively) if the person's supervisor is outranked by the boss.
a. all people who can replace Cy D. Fect;
b. all people who can replace someone who is being paid more than they are, together with the two salaries.
(meeting accounting (Monday 9am))
(meeting administration (Monday 10am))
(meeting computer (Wednesday 3pm))
(meeting administration (Friday 1pm))
Each of the above assertions is for a meeting of an entire division. Ben also adds an entry for the company-wide meeting that spans all the divisions. All of the company's employees attend this meeting.
(meeting whole-company (Wednesday 4pm))
a. On Friday morning, Ben wants to query the data base for all the meetings that occur that day. What query should he use?
b. Alyssa P. Hacker is unimpressed. She thinks it would be much more useful to be able to ask for her meetings by specifying her name. So she designs a rule that says that a person's meetings include all whole-company meetings plus all meetings of that person's division. Fill in the body of Alyssa's rule.
(rule (meeting-time ?person ?day-and-time)
<rule-body>)
c. Alyssa arrives at work on Wednesday morning and wonders what meetings she has to attend that day. Having defined the above rule, what query should she make to find this out?
(lives-near ?person (Hacker Alyssa P))
Alyssa P. Hacker is able to find people who live near her, with whom she can ride to work. On the other hand, when she tries to find all pairs of people who live near each other by querying
(lives-near ?person-1 ?person-2)
she notices that each pair of people who live near each other is listed twice; for example,
(lives-near (Hacker Alyssa P) (Fect Cy D))
(lives-near (Fect Cy D) (Hacker Alyssa P))
Why does this happen? Is there a way to find a list of people who live near each other, in which each pair appears only once? Explain.
We can regard a rule as a kind of logical implication: If an assignment of values to pattern variables satisfies the body, then it satisfies the conclusion. Consequently, we can regard the query language as having the ability to perform logical deductions based upon the rules. As an example, consider the append operation described at the beginning of section 4.4. As we said, append can be characterized by the following two rules:
To express this in our query language, we define two rules for a relation
(append-to-form x y z)
which we can interpret to mean "x and y append to form z":
(rule (append-to-form () ?y ?y))
(rule (append-to-form (?u . ?v) ?y (?u . ?z))
(append-to-form ?v ?y ?z))
The first rule has no body, which means that the conclusion holds for any value of ?y. Note how the second rule makes use of dotted-tail notation to name the car and cdr of a list.
Given these two rules, we can formulate queries that compute the append of two lists:
;;; Query input:
(append-to-form (a b) (c d) ?z)
;;; Query results:
(append-to-form (a b) (c d) (a b c d))
What is more striking, we can use the same rules to ask the question "Which list, when appended to (a b), yields (a b c d)?" This is done as follows:
;;; Query input:
(append-to-form (a b) ?y (a b c d))
;;; Query results:
(append-to-form (a b) (c d) (a b c d))
We can also ask for all pairs of lists that append to form (a b c d):
;;; Query input:
(append-to-form ?x ?y (a b c d))
;;; Query results:
(append-to-form () (a b c d) (a b c d))
(append-to-form (a) (b c d) (a b c d))
(append-to-form (a b) (c d) (a b c d))
(append-to-form (a b c) (d) (a b c d))
(append-to-form (a b c d) () (a b c d))
The query system may seem to exhibit quite a bit of intelligence in using the rules to deduce the answers to the queries above. Actually, as we will see in the next section, the system is following a well-determined algorithm in unraveling the rules. Unfortunately, although the system works impressively in the append case, the general methods may break down in more complex cases, as we will see in section 4.4.3.
(rule (?x next-to ?y in (?x ?y . ?u)))
(rule (?x next-to ?y in (?v . ?z))
(?x next-to ?y in ?z))
What will the response be to the following queries?
(?x next-to ?y in (1 (2 3) 4))
(?x next-to 1 in (2 1 3 1))
(son Adam Cain)
(son Cain Enoch)
(son Enoch Irad)
(son Irad Mehujael)
(son Mehujael Methushael)
(son Methushael Lamech)
(wife Lamech Ada)
(son Ada Jabal)
(son Ada Jubal)
Formulate rules such as "If S is the son of F, and F is the son of G, then S is the grandson of G" and "If W is the wife of M, and S is the son of W, then S is the son of M" (which was supposedly more true in biblical times than today) that will enable the query system to find the grandson of Cain; the sons of Lamech; the grandsons of Methushael. (See exercise 4.69 for some rules to deduce more complicated relationships.)