Mutable List Structure

The basic operations on pairs -- cons, car, and cdr -- can be used to construct list structure and to select parts from list structure, but they are incapable of modifying list structure. The same is true of the list operations we have used so far, such as append and list, since these can be defined in terms of cons, car, and cdr. To modify list structures we need new operations.

The primitive mutators for pairs are set-car! and set-cdr!. Set-car! takes two arguments, the first of which must be a pair. It modifies this pair, replacing the car pointer by a pointer to the second argument of set-car!.^{noteSet-car! and set-cdr! return implementation-dependent values. Like set!, they should be used only for their effect.}

As an example, suppose that x is bound to the list ((a b) c d) and y to the list (e f) as illustrated in figure 3.12. Evaluating the expression (set-car! x y) modifies the pair to which x is bound, replacing its car by the value of y. The result of the operation is shown in figure 3.13. The structure x has been modified and would now be printed as ((e f) c d). The pairs representing the list (a b), identified by the pointer that was replaced, are now detached from the original structure.^{noteWe see from this that mutation operations on lists can create "garbage" that is not part of any accessible structure. We will see in section 5.3.2 that Lisp memory-management systems include a garbage collector, which identifies and recycles the memory space used by unneeded pairs.}

Compare figure 3.13 with figure 3.14, which illustrates the result of executing (define z (cons y (cdr x))) with x and y bound to the original lists of figure 3.12. The variable z is now bound to a new pair created by the cons operation; the list to which x is bound is unchanged.

The set-cdr! operation is similar to set-car!. The only difference is that the cdr pointer of the pair, rather than the car pointer, is replaced. The effect of executing (set-cdr! x y) on the lists of figure 3.12 is shown in figure 3.15. Here the cdr pointer of x has been replaced by the pointer to (e f). Also, the list (c d), which used to be the cdr of x, is now detached from the structure.

Cons builds new list structure by creating new pairs, while set-car! and set-cdr! modify existing pairs. Indeed, we could implement cons in terms of the two mutators, together with a procedure get-new-pair, which returns a new pair that is not part of any existing list structure. We obtain the new pair, set its car and cdr pointers to the designated objects, and return the new pair as the result of the cons.^{noteGet-new-pair is one of the operations that must be implemented as part of the memory management required by a Lisp implementation. We will discuss this in section 5.3.1.}

(define (cons x y) (let ((new (get-new-pair))) (set-car! new x) (set-cdr! new y) new))

Exercise 3.12. The following procedure for appending lists was introduced in section 2.2.1:

(define (append x y) (if (null? x) y (cons (car x) (append (cdr x) y))))

Append forms a new list by successively consing the elements of x onto y. The procedure append! is similar to append, but it is a mutator rather than a constructor. It appends the lists by splicing them together, modifying the final pair of x so that its cdr is now y. (It is an error to call append! with an empty x.)

(define (append! x y) (set-cdr! (last-pair x) y) x)

Here last-pair is a procedure that returns the last pair in its argument:

(define (last-pair x) (if (null? (cdr x)) x (last-pair (cdr x))))

Consider the interaction

(define x (list 'a 'b)) (define y (list 'c 'd)) (define z (append x y)) z (a b c d) (cdr x) <response> (define w (append! x y)) w (a b c d) (cdr x) <response>

What are the missing <response>s? Draw box-and-pointer diagrams to explain your answer.

Exercise 3.13. Consider the following make-cycle procedure, which uses the last-pair procedure defined in exercise 3.12:

(define (make-cycle x) (set-cdr! (last-pair x) x) x)

Draw a box-and-pointer diagram that shows the structure z created by

(define z (make-cycle (list 'a 'b 'c)))

What happens if we try to compute (last-pair z)?

Exercise 3.14. The following procedure is quite useful, although obscure:

(define (mystery x) (define (loop x y) (if (null? x) y (let ((temp (cdr x))) (set-cdr! x y) (loop temp x)))) (loop x '()))

Loop uses the "temporary" variable temp to hold the old value of the cdr of x, since the set-cdr! on the next line destroys the cdr. Explain what mystery does in general. Suppose v is defined by (define v (list 'a 'b 'c 'd)). Draw the box-and-pointer diagram that represents the list to which v is bound. Suppose that we now evaluate (define w (mystery v)). Draw box-and-pointer diagrams that show the structures v and w after evaluating this expression. What would be printed as the values of v and w ?

Sharing and identity

We mentioned in section 3.1.3 the theoretical issues of "sameness" and "change" raised by the introduction of assignment. These issues arise in practice when individual pairs are shared among different data objects. For example, consider the structure formed by

(define x (list 'a 'b)) (define z1 (cons x x))

As shown in figure 3.16, z1 is a pair whose car and cdr both point to the same pair x. This sharing of x by the car and cdr of z1 is a consequence of the straightforward way in which cons is implemented. In general, using cons to construct lists will result in an interlinked structure of pairs in which many individual pairs are shared by many different structures.

**Figure 3.16:** The list `z1` formed by `(cons x x)`.

**Figure 3.17:** The list `z2` formed by `(cons (list 'a 'b) (list 'a 'b))`.

In contrast to figure 3.16, figure 3.17 shows the structure created by

(define z2 (cons (list 'a 'b) (list 'a 'b)))

In this structure, the pairs in the two (a b) lists are distinct, although the actual symbols are shared.^{noteThe two pairs are distinct because each call to cons returns a new pair. The symbols are shared; in Scheme there is a unique symbol with any given name. Since Scheme provides no way to mutate a symbol, this sharing is undetectable. Note also that the sharing is what enables us to compare symbols using eq?, which simply checks equality of pointers.}

When thought of as a list, z1 and z2 both represent "the same" list, ((a b) a b). In general, sharing is completely undetectable if we operate on lists using only cons, car, and cdr. However, if we allow mutators on list structure, sharing becomes significant. As an example of the difference that sharing can make, consider the following procedure, which modifies the car of the structure to which it is applied:

(define (set-to-wow! x) (set-car! (car x) 'wow) x)

Even though z1 and z2 are "the same" structure, applying set-to-wow! to them yields different results. With z1, altering the car also changes the cdr, because in z1 the car and the cdr are the same pair. With z2, the car and cdr are distinct, so set-to-wow! modifies only the car:

z1 ((a b) a b) (set-to-wow! z1) ((wow b) wow b) z2 ((a b) a b) (set-to-wow! z2) ((wow b) a b)

One way to detect sharing in list structures is to use the predicate eq?, which we introduced in section 2.3.1 as a way to test whether two symbols are equal. More generally, (eq? x y) tests whether x and y are the same object (that is, whether x and y are equal as pointers). Thus, with z1 and z2 as defined in figures 3.16 and 3.17, (eq? (car z1) (cdr z1)) is true and (eq? (car z2) (cdr z2)) is false.

As will be seen in the following sections, we can exploit sharing to greatly extend the repertoire of data structures that can be represented by pairs. On the other hand, sharing can also be dangerous, since modifications made to structures will also affect other structures that happen to share the modified parts. The mutation operations set-car! and set-cdr! should be used with care; unless we have a good understanding of how our data objects are shared, mutation can have unanticipated results.^{noteThe subtleties of dealing with sharing of mutable data objects reflect the underlying issues of "sameness" and "change" that were raised in section 3.1.3. We mentioned there that admitting change to our language requires that a compound object must have an "identity" that is something different from the pieces from which it is composed. In Lisp, we consider this "identity" to be the quality that is tested by eq?, i.e., by equality of pointers. Since in most Lisp implementations a pointer is essentially a memory address, we are "solving the problem" of defining the identity of objects by stipulating that a data object "itself" is the information stored in some particular set of memory locations in the computer. This suffices for simple Lisp programs, but is hardly a general way to resolve the issue of "sameness" in computational models.}

Exercise 3.15. Draw box-and-pointer diagrams to explain the effect of set-to-wow! on the structures z1 and z2 above.

Exercise 3.16. Ben Bitdiddle decides to write a procedure to count the number of pairs in any list structure. "It's easy," he reasons. "The number of pairs in any structure is the number in the car plus the number in the cdr plus one more to count the current pair." So Ben writes the following procedure:

(define (count-pairs x) (if (not (pair? x)) 0 (+ (count-pairs (car x)) (count-pairs (cdr x)) 1)))

Show that this procedure is not correct. In particular, draw box-and-pointer diagrams representing list structures made up of exactly three pairs for which Ben's procedure would return 3; return 4; return 7; never return at all.

Exercise 3.17. Devise a correct version of the count-pairs procedure of exercise 3.16 that returns the number of distinct pairs in any structure. (Hint: Traverse the structure, maintaining an auxiliary data structure that is used to keep track of which pairs have already been counted.)

Exercise 3.18. Write a procedure that examines a list and determines whether it contains a cycle, that is, whether a program that tried to find the end of the list by taking successive cdrs would go into an infinite loop. Exercise 3.13 constructed such lists.

Exercise 3.19. Redo exercise 3.18 using an algorithm that takes only a constant amount of space. (This requires a very clever idea.)

Mutation is just assignment

When we introduced compound data, we observed in section 2.1.3 that pairs can be represented purely in terms of procedures:

(define (cons x y) (define (dispatch m) (cond ((eq? m 'car) x) ((eq? m 'cdr) y) (else (error "Undefined operation -- CONS" m)))) dispatch) (define (car z) (z 'car)) (define (cdr z) (z 'cdr))

The same observation is true for mutable data. We can implement mutable data objects as procedures using assignment and local state. For instance, we can extend the above pair implementation to handle set-car! and set-cdr! in a manner analogous to the way we implemented bank accounts using make-account in section 3.1.1:

(define (cons x y) (define (set-x! v) (set! x v)) (define (set-y! v) (set! y v)) (define (dispatch m) (cond ((eq? m 'car) x) ((eq? m 'cdr) y) ((eq? m 'set-car!) set-x!) ((eq? m 'set-cdr!) set-y!) (else (error "Undefined operation -- CONS" m)))) dispatch) (define (car z) (z 'car)) (define (cdr z) (z 'cdr)) (define (set-car! z new-value) ((z 'set-car!) new-value) z) (define (set-cdr! z new-value) ((z 'set-cdr!) new-value) z)

Assignment is all that is needed, theoretically, to account for the behavior of mutable data. As soon as we admit set! to our language, we raise all the issues, not only of assignment, but of mutable data in general.^{noteOn the other hand, from the viewpoint of implementation, assignment requires us to modify the environment, which is itself a mutable data structure. Thus, assignment and mutation are equipotent: Each can be implemented in terms of the other.}