Now that we have seen all the elements of the compiler, let us examine an example of compiled code to see how things fit together. We will compile the definition of a recursive factorial procedure by calling compile:
(compile
'(define (factorial n)
(if (= n 1)
1
(* (factorial (- n 1)) n)))
'val
'next)
We have specified that the value of the define expression should be placed in the val register. We don't care what the compiled code does after executing the define, so our choice of next as the linkage descriptor is arbitrary.
Compile determines that the expression is a definition, so it calls compile-definition to compile code to compute the value to be assigned (targeted to val), followed by code to install the definition, followed by code to put the value of the define (which is the symbol ok) into the target register, followed finally by the linkage code. Env is preserved around the computation of the value, because it is needed in order to install the definition. Because the linkage is next, there is no linkage code in this case. The skeleton of the compiled code is thus
<save env if modified by code to compute value>
<compilation of definition value, target val, linkage next>
<restore env if saved above>
(perform (op define-variable!)
(const factorial)
(reg val)
(reg env))
(assign val (const ok))
The expression that is to be compiled to produce the value for the variable factorial is a lambda expression whose value is the procedure that computes factorials. Compile handles this by calling compile-lambda, which compiles the procedure body, labels it as a new entry point, and generates the instruction that will combine the procedure body at the new entry point with the run-time environment and assign the result to val. The sequence then skips around the compiled procedure code, which is inserted at this point. The procedure code itself begins by extending the procedure's definition environment by a frame that binds the formal parameter n to the procedure argument. Then comes the actual procedure body. Since this code for the value of the variable doesn't modify the env register, the optional save and restore shown above aren't generated. (The procedure code at entry2 isn't executed at this point, so its use of env is irrelevant.) Therefore, the skeleton for the compiled code becomes
(assign val (op make-compiled-procedure)
(label entry2)
(reg env))
(goto (label after-lambda1))
entry2
(assign env (op compiled-procedure-env) (reg proc))
(assign env (op extend-environment)
(const (n))
(reg argl)
(reg env))
<compilation of procedure body>
after-lambda1
(perform (op define-variable!)
(const factorial)
(reg val)
(reg env))
(assign val (const ok))
A procedure body is always compiled (by compile-lambda-body) as a sequence with target val and linkage return. The sequence in this case consists of a single if expression:
(if (= n 1)
1
(* (factorial (- n 1)) n))
Compile-if generates code that first computes the predicate (targeted to val), then checks the result and branches around the true branch if the predicate is false. Env and continue are preserved around the predicate code, since they may be needed for the rest of the if expression. Since the if expression is the final expression (and only expression) in the sequence making up the procedure body, its target is val and its linkage is return, so the true and false branches are both compiled with target val and linkage return. (That is, the value of the conditional, which is the value computed by either of its branches, is the value of the procedure.)
<save continue, env if modified by predicate and needed by branches>
<compilation of predicate, target val, linkage next>
<restore continue, env if saved above>
(test (op false?) (reg val))
(branch (label false-branch4))
true-branch5
<compilation of true branch, target val, linkage return>
false-branch4
<compilation of false branch, target val, linkage return>
after-if3
The predicate (= n 1) is a procedure call. This looks up the operator (the symbol =) and places this value in proc. It then assembles the arguments 1 and the value of n into argl. Then it tests whether proc contains a primitive or a compound procedure, and dispatches to a primitive branch or a compound branch accordingly. Both branches resume at the after-call label. The requirements to preserve registers around the evaluation of the operator and operands don't result in any saving of registers, because in this case those evaluations don't modify the registers in question.
(assign proc
(op lookup-variable-value) (const =) (reg env))
(assign val (const 1))
(assign argl (op list) (reg val))
(assign val (op lookup-variable-value) (const n) (reg env))
(assign argl (op cons) (reg val) (reg argl))
(test (op primitive-procedure?) (reg proc))
(branch (label primitive-branch17))
compiled-branch16
(assign continue (label after-call15))
(assign val (op compiled-procedure-entry) (reg proc))
(goto (reg val))
primitive-branch17
(assign val (op apply-primitive-procedure)
(reg proc)
(reg argl))
after-call15
The true branch, which is the constant 1, compiles (with target val and linkage return) to
(assign val (const 1))
(goto (reg continue))
The code for the false branch is another a procedure call, where the procedure is the value of the symbol *, and the arguments are n and the result of another procedure call (a call to factorial). Each of these calls sets up proc and argl and its own primitive and compound branches. Figure 5.17 shows the complete compilation of the definition of the factorial procedure. Notice that the possible save and restore of continue and env around the predicate, shown above, are in fact generated, because these registers are modified by the procedure call in the predicate and needed for the procedure call and the return linkage in the branches.
(define (factorial-alt n)
(if (= n 1)
1
(* n (factorial-alt (- n 1)))))
Compile this procedure and compare the resulting code with that produced for factorial. Explain any differences you find. Does either program execute more efficiently than the other?
(define (factorial n)
(define (iter product counter)
(if (> counter n)
product
(iter (* counter product)
(+ counter 1))))
(iter 1 1))
Annotate the resulting code, showing the essential difference between the code for iterative and recursive versions of factorial that makes one process build up stack space and the other run in constant stack space.
;; compute (- n 1), which is the argument for factorial
|
(assign val (op make-compiled-procedure) (label entry16)
|
after-call20
|
(assign val (op lookup-variable-value) (const a) (reg env))
(assign val (op +) (reg val) (const 1))
In this exercise we will extend our compiler to support open coding of selected primitives. Special-purpose code will be generated for calls to these primitive procedures instead of the general procedure-application code. In order to support this, we will augment our machine with special argument registers arg1 and arg2. The primitive arithmetic operations of the machine will take their inputs from arg1 and arg2. The results may be put into val, arg1, or arg2.
The compiler must be able to recognize the application of an open-coded primitive in the source program. We will augment the dispatch in the compile procedure to recognize the names of these primitives in addition to the reserved words (the special forms) it currently recognizes.noteMaking the primitives into reserved words is in general a bad idea, since a user cannot then rebind these names to different procedures. Moreover, if we add reserved words to a compiler that is in use, existing programs that define procedures with these names will stop working. See exercise 5.44 for ideas on how to avoid this problem. For each special form our compiler has a code generator. In this exercise we will construct a family of code generators for the open-coded primitives.
a. The open-coded primitives, unlike the special forms, all need their operands evaluated. Write a code generator spread-arguments for use by all the open-coding code generators. Spread-arguments should take an operand list and compile the given operands targeted to successive argument registers. Note that an operand may contain a call to an open-coded primitive, so argument registers will have to be preserved during operand evaluation.
b. For each of the primitive procedures =, *, -, and +, write a code generator that takes a combination with that operator, together with a target and a linkage descriptor, and produces code to spread the arguments into the registers and then perform the operation targeted to the given target with the given linkage. You need only handle expressions with two operands. Make compile dispatch to these code generators.
c. Try your new compiler on the factorial example. Compare the resulting code with the result produced without open coding.
d. Extend your code generators for + and * so that they can handle expressions with arbitrary numbers of operands. An expression with more than two operands will have to be compiled into a sequence of operations, each with only two inputs.