2011-12-18

SICP Exercise 2.63: Binary Trees to Ordered Lists

Each of the following two procedures converts a binary tree to a list.
(define (tree->list-1 tree)
  (if (null? tree)
      '()
      (append (tree->list-1 (left-branch tree))
              (cons (entry tree)
                    (tree->list-1 (right-branch tree))))))
(define (tree->list-2 tree)
  (define (copy-to-list tree result-list)
    (if (null? tree)
        result-list
        (copy-to-list (left-branch tree)
                      (cons (entry tree)
                            (copy-to-list (right-branch tree)
                                          result-list)))))
  (copy-to-list tree '()))
  1. Do the two procedures produce the same result for every tree? If not, how do the results differ? What lists do the two procedures produce for the trees in figure 2.16?
  2. Do the two procedures have the same order of growth in the number of steps required to convert a balanced tree with n elements to a list? If not, which one grows more slowly?
(a) What Do the Procedures Do?
Before we actually look in detail at the internals of the two procedures, let's see how they behave with various trees. It's worth noting that the "Sets as binary trees" section is actually using binary search trees to represent sets, not just binary trees. As a result we'll test the procedures using valid binary search trees:
> (tree->list-1 '())
'()
> (tree->list-2 '())
'()
> (tree->list-1 (make-tree 5 '() '()))
'(5)
> (tree->list-2 (make-tree 5 '() '()))
'(5)
> (tree->list-1 (make-tree 3 (make-tree 2 (make-tree 1 '() '()) '()) '()))
'(1 2 3)
> (tree->list-2 (make-tree 3 (make-tree 2 (make-tree 1 '() '()) '()) '()))
'(1 2 3)
> (tree->list-1 (make-tree 1 '() (make-tree 2 '() (make-tree 3 '() '()))))
'(1 2 3)
> (tree->list-2 (make-tree 1 '() (make-tree 2 '() (make-tree 3 '() '()))))
'(1 2 3)
> (tree->list-1 (make-tree 5
                           (make-tree 3
                                      (make-tree 1
                                                 '()
                                                 (make-tree 2 '() '()))
                                      (make-tree 4 '() '()))
                           (make-tree 6
                                      '()
                                      (make-tree 7 '() '()))))
'(1 2 3 4 5 6 7)
> (tree->list-2 (make-tree 5
                           (make-tree 3
                                      (make-tree 1
                                                 '()
                                                 (make-tree 2 '() '()))
                                      (make-tree 4 '() '()))
                           (make-tree 6
                                      '()
                                      (make-tree 7 '() '()))))
'(1 2 3 4 5 6 7)
As the exercise also asks what the two procedures produce for the trees in figure 2.16, let's try those out too:
> (tree->list-1 (make-tree 7
                           (make-tree 3
                                      (make-tree 1 '() '())
                                      (make-tree 5 '() '()))
                           (make-tree 9
                                      '()
                                      (make-tree 11 '() '()))))
'(1 3 5 7 9 11)
> (tree->list-2 (make-tree 7
                           (make-tree 3
                                      (make-tree 1 '() '())
                                      (make-tree 5 '() '()))
                           (make-tree 9
                                      '()
                                      (make-tree 11 '() '()))))
'(1 3 5 7 9 11)
> (tree->list-1 (make-tree 3
                           (make-tree 1 '() '())
                           (make-tree 7
                                      (make-tree 5 '() '())
                                      (make-tree 9
                                                 '()
                                                 (make-tree 11 '() '())))))
'(1 3 5 7 9 11)
> (tree->list-2 (make-tree 3
                           (make-tree 1 '() '())
                           (make-tree 7
                                      (make-tree 5 '() '())
                                      (make-tree 9
                                                 '()
                                                 (make-tree 11 '() '())))))
'(1 3 5 7 9 11)
> (tree->list-1 (make-tree 5
                           (make-tree 3
                                      (make-tree 1 '() '())
                                      '())
                           (make-tree 9
                                      (make-tree 7 '() '())
                                      (make-tree 11 '() '()))))
'(1 3 5 7 9 11)
> (tree->list-2 (make-tree 5
                           (make-tree 3
                                      (make-tree 1 '() '())
                                      '())
                           (make-tree 9
                                      (make-tree 7 '() '())
                                      (make-tree 11 '() '()))))
'(1 3 5 7 9 11)
Okay, so it looks from the results as if both procedures are performing the equivalent of in-order (or symmetric) tree traversals. Given that these are binary search trees containing numeric values, with the left sub-tree of a node containing only values numerically less than the current node's value and the right-sub-tree of a node containing only values numerically greater than the current node's value, this means that we're expecting both procedures to produce a list containing all of the node values from the tree in numerically ascending order. Let's now look at what each of the two procedures do in turn in order to confirm this.

The first procedure, tree->list-1, processes the child nodes of each node recursively, until it reaches the null children. For any non-null node it appends the tree->list-1 for the left node onto the result of consing the current node's value onto the tree->list-1 for the right node. Assuming that we're dealing with a valid binary search tree then, for any node, this will produce a list containing the current node's value at some index, i such that all the elements in the list before index i are numerically less than the current node's value, while all the elements in the list that are after index i are numerically greater than the current node's value. Given that this is done recursively, we can say that tree->list-1 will indeed produce a list containing all of the node values from the tree in numerically ascending order.

Now let's look at tree->list-1. This procedure (or at least the inner procedure, copy-to-list) is more iterative in nature, building up a result list as it goes along, and stopping once it's run out of tree to process. For any non-null node it recursively calls copy-to-list with the left node as the tree to process, and the result of consing the current node's value onto the list returned by recursively calling copy-to-list with the right node as the tree and the result list as built so far. With the same assumption as before this means that: firstly the head of the list produced by the cons is less than all of the elements in the tail of the list, and secondly all of the elements to be consed onto the head of this list by the call to copy-to-list with the left node as the tree are less than the head of the result list so far. As this is performed recursively, we can also say that tree->list-2 will produce a list containing all of the node values from the tree in numerically ascending order.

How Do the Procedures Grow?
Well, first of all both procedures visit each node once in their traversal of the tree. This means that there's going to be a factor of n in both orders of growth. However, they differ in how they build up the results list...

The first procedure, tree->list-1, uses append to join the sublist produced for the left child node onto the list produced by consing the current node's value onto the sublist produced for the right child node. Now append is an Θ(n) operation, as it must iterate through, in this application, the entirety of the sublist produced for the left child node, and use cons (which we're assuming to be an Θ(1) operation) to append each element in reverse order onto the sublist produced for the right child node. As append is invoked once for each of the n nodes in the tree, tree->list-1 is an Θ(n2) operation.

On the other hand, in each recursive call the second procedure, tree->list-2, uses cons to put the current node's value onto the result list. As noted above we're assuming this to be an Θ(1) operation and, as it's performed once per node, this is performed a total of n times. As a result, tree->list-2 is an Θ(n) operation, and so grows more slowly than tree->list-1.

A Weight off My Shoulders

SICP is quite heavy going... Or at least it's quite heavy going when you're trying to work through it in large chunks on a regular basis and keep a (more than) full-time job going as a software engineer. Not to mention having a life outside work as well...

You may have noticed I've been somewhat lax in keeping up with SICP. I'm not the only one. Several of my colleagues have not been able to keep up with the exercises and, when we reached the end of chapter 2, we took the decision to stop work through the book as a group and instead we'll work through the Berkeley SICP video lectures together. Working through the book is still encouraged, but has been left to individuals to drive their own progress.

So I've been taking a little break from the book. Hey, it's been somewhat busy both at work and home of late, and I've not found enough time to do any of my huge list of personal projects. However, I'm back! I might not manage to maintain the same velocity as I've done previously, but I still intend on working my way through the whole book and posting my solutions to the exercises.

Anyway, I've dropped the progress chart, as that no longer makes sense. And, going forward, the SICP exercise index won't have any week dates on it for the exercises, since they don't make sense either.

SICP Exercise 2.62: Unioning Sets with Ordered Representations

Give a Θ(n) implementation of union-set for sets represented as ordered lists.

Just like in exercise 2.59, the implementation of union-set is similar in many respects to intersection-set. We'll iterate through the two list representations in parallel, comparing the head items of each list at each iteration. However, unlike intersection-set, which only keeps an item if it is present in the heads of both lists, we want to keep all of the items from both lists - but we need to remove duplicates and ensure the resulting set is ordered.

So what are the cases we're going to deal with? Well:
  • If set1 is null then the union is just set2.
  • Conversely, if set2 is null then the union is just set1.
  • If the head items of both lists are the same then we can produce the union by consing that item onto the result of generating the union of the cdrs of both lists. As with intersection-set it doesn't really matter whether we use the head if set1 or set2 for the cons, so we'll arbitrarily use the head of set1.
  • If the head item of set1 precedes the head item of set2 then we can produce the union by consing the head item of set1 onto the result of generating the union of the cdr of set1 and set2.
  • Conversely, if the head item of set2 precedes the head item of set1 then we can produce the union by consing the head item of set2 onto the result of generating the union of set1 and the cdr of set2.
Here's the procedure:
(define (union-set set1 set2)
  (cond ((null? set1) set2)
        ((null? set2) set1)
        (else (let ((x1 (car set1))
                    (x2 (car set2)))
                (cond ((= x1 x2)
                       (cons x1
                             (union-set (cdr set1)
                                        (cdr set2))))
                      ((< x1 x2)
                       (cons x1
                             (union-set (cdr set1)
                                        set2)))
                      ((< x2 x1)
                       (cons x2
                             (union-set set1
                                        (cdr set2)))))))))
Let's see it in action:
> (union-set '() '())
'()
> (union-set '(1 2 3) '())
'(1 2 3)
> (union-set '() '(4 5 6))
'(4 5 6)
> (union-set '(1 2 3) '(4 5 6))
'(1 2 3 4 5 6)
> (union-set '(4 5 6) '(1 2 3))
'(1 2 3 4 5 6)
> (union-set '(1 3 5) '(2 4 6))
'(1 2 3 4 5 6)
> (union-set '(2 4 6) '(1 3 5))
'(1 2 3 4 5 6)
> (union-set '(1 2 3) '(2 3 4))
'(1 2 3 4)