2011-12-30

SICP Exercise 2.74: Insatiable Insanity

Insatiable Enterprises, Inc., is a highly decentralized conglomerate company consisting of a large number of independent divisions located all over the world. The company's computer facilities have just been interconnected by means of a clever network-interfacing scheme that makes the entire network appear to any user to be a single computer. Insatiable's president, in her first attempt to exploit the ability of the network to extract administrative information from division files, is dismayed to discover that, although all the division files have been implemented as data structures in Scheme, the particular data structure used varies from division to division. A meeting of division managers is hastily called to search for a strategy to integrate the files that will satisfy headquarters' needs while preserving the existing autonomy of the divisions.

Show how such a strategy can be implemented with data-directed programming. As an example, suppose that each division's personnel records consist of a single file, which contains a set of records keyed on employees' names. The structure of the set varies from division to division. Furthermore, each employee's record is itself a set (structured differently from division to division) that contains information keyed under identifiers such as address and salary. In particular:
  1. Implement for headquarters a get-record procedure that retrieves a specified employee's record from a specified personnel file. The procedure should be applicable to any division's file. Explain how the individual divisions' files should be structured. In particular, what type information must be supplied?
  2. Implement for headquarters a get-salary procedure that returns the salary information from a given employee's record from any division's personnel file. How should the record be structured in order to make this operation work?
  3. Implement for headquarters a find-employee-record procedure. This should search all the divisions' files for the record of a given employee and return the record. Assume that this procedure takes as arguments an employee's name and a list of all the divisions' files.
  4. When Insatiable takes over a new company, what changes must be made in order to incorporate the new personnel information into the central system?
Before we even begin on the exercise, it's worth noting that keying employment records by employee name is generally a bad idea. Names are not guaranteed to be unique identifiers of people. As a result Insatiable Enterprises may find itself in a situation where two employees in the same division share one employment record, where a division cannot employ someone because the division already employs someone with the same name, or where searching for an employee using find-employee-record returns only the first record encountered in all of the divisions' files, despite there being employees in different divisions with the same name.

Human Resource Management Systems (HRMS) that have had even the smallest amount of thought put into their data model will uses something like an employee ID or a government-issued ID (such as a social security number or national insurance number) as a key. Such IDs are intended to be unique, and so avoid the collisions.

Anyway, we'll run with employee name as the key since that's what the exercise states...

(a) Retrieving an Employee's Record for a Given Division
Before we start to address get-record itself, we should first have a think about the general problem. We have a potentially large number of divisions, each of which structures its data file in a different way. We wish to expose a common interface for accessing this data, regardless of which division the data is from, and so abstract away these differences. In order to do this we will need to be able to apply operations (such as get-record) to any of the divisions' files and successfully navigate the data structures and perform the appropriate processing.

As noted in the exercise, a strategy based upon data-directed programming can be used to achieve this without ending up with a large amount of per-data structure logic (and a large maintenance headache!) embedded in each procedure in our common interface. Instead we can install operations on a per-type basis in a table, tag each data structure with an appropriate type tag and then use the tag embedded in a particular piece of data to retrieve the appropriate version of an operation we wish to apply.

In section 2.4.3 the authors provide us with two procedures for installing and retrieving operations: get and put respectively (and, if you're using DrRacket, I gave the appropriate magic for getting these working in the last exercise). These two procedures have a pair of common operands: the name of the operation and the type of the data the operation is applied to. Given this, it seems obvious that the name of the operation should correspond to the operation we are trying to apply (such as get-record), while the type of the data should be a unique per-division identifier. If we continue with the "names as keys" theme then this could be as simple as the division's name.

Section 2.4.3 also shows us that, provided the data is tagged with the type, we can then extract the type from the data and use that to retrieve operations appropriate to the data type. This then answers the question posed in the exercise: "what type information must be supplied?" Each division's file must be tagged with the division's identifier under which its operations are registered. We can perform the tagging and extract both the tag and the data using the procedures attach-tag, type-tag and contents, which were introduced in section 2.4.2.

Now let's start thinking about implementing get-record...

First let's extend the statement from the previous paragraph a bit further. Not only must each division's file must be tagged with the division's identifier under which its operations are registered, but any data structure returned by our common interface to which further data-directed operations may be applied must be tagged with the division's identifier.

For example, get-record is going to return a record to which get-salary may be applied. In order for get-salary to find the appropriate per-division operation to apply to retrieve the salary the record itself must be tagged with the division's identifier. Rather than requiring each division's implementation of get-record to perform this tagging we can actually move the tagging into the common get-record implementation, simplifying the process.

We also need to think about what we want the actual interface to be. Let's make it a requirement that each division's get-record operation returns the record (untagged) if it's present, and returns #f if it's not. That way we can try to retrieve the record, check to see if we've got it and, if so, tag it with the division's tag.

That should give us enough to implement get-record:
(define (get-record employee-name personnel-file)
  (let ((record ((get 'get-record (type-tag personnel-file))
                   employee-name
                   (contents personnel-file))))
    (if record
        (attach-tag (type-tag personnel-file) record)
        #f)))
We'll test these out at the end, where I'll also introduce a couple of divisions' personnel files and some extensions.

(b) Retrieving an Employee's Salary from their Employee Record
It's worth noting that we've already answered the question posed in this part of the exercise: "How should the record be structured in order to make this operation work?" We previously stated that "...any data structure returned by our common interface to which further data-directed operations may be applied must be tagged with the division's identifier" and ensured that the implementation of get-record applied the tag for us automatically.

Given that employee records returned by the common interface are tagged in this way we can then similarly use the tag to retrieve a pre-installed per-division operation with the appropriate name (i.e. 'get-salary) and then apply that to retrieve the employee's salary. In this case, however, get-salary doesn't need to tag it with the division's identifier, as the salary is not a "data structure returned by our common interface to which further data-directed operations may be applied" - it's simply a numeric value.

All we need to do then for get-salary is to use the tag from the record to get the appropriate operation to apply to extract the salary and then apply this operation to the content of the record:
(define (get-salary record)
  ((get 'get-salary (type-tag record)) (contents record)))
We can similarly produce a get-address procedure:
(define (get-address record)
  ((get 'get-address (type-tag record)) (contents record)))

(c) Finding an Employee Across Multiple Divisions
For this part we're told to assume that the find-employee-record procedure we implement takes not only the employee name, but also a list of all the divisions' files as arguments. Personally I'd prefer it if the latter argument weren't necessary and that part of installing a division's procedures were also registering the file. If I've got some spare time I'll post a followup showing how this could be achieved. However, in the meantime, we've got a list of files...

Now we've already written a procedure, get-record, that retrieves a employee's record by employee name from a specified division's file or, by our definition, returns #f if the record is not present. So finding an employee's record across a list of divisions' files should just be a case of iterating through the files, trying to find the employee's record in each one, returning it as soon as we find it or returning #f if it's not present in any of the divisions' files.

Here's one implementation that will achieve this:
(define (find-employee-record employee-name personnel-files)
  (if (null? personnel-files)
      #f
      (let ((record (get-record employee-name (car personnel-files))))
        (if record
            record
            (find-employee-record employee-name (cdr personnel-files))))))
(d) Insatiable Ingesting Another Company
Given the framework we've put in place this should provide a fairly low bar for Insatiable Enterprises, Inc. to incorporate the personnel files of a newly acquired company into its systems... Assuming, of course, that they happen to have implemented their HRMS in Scheme, or in a way that is easily interfaced to Scheme. All that should be required is to:
  1. Define a unique division ID under which procedures specific to the division can be installed.
  2. Write the set of procedures that comply with the common interface requirements and that retrieve the required information from the division's personnel file.
  3. Write a procedure to install these procedures as operations under the appropriate keys.
  4. Invoke this procedure, installing the operations.
  5. Update the list of division files to include the new division's file, suitably tagged.
Note the last step. I've assumed there's a list of all divisions' files kicking around somewhere that has each division's file tagged with its division ID so that this can be used for find-employee-record.

Putting it Into Practice
Okay, that's the common interface defined and we've also specified a set of steps for adding a newly acquired company's personnel file into the system. Let's put it into practice. We'll define a couple of divisions' personnel files using different data structures, give the divisions IDs, produce the required common interface procedures, install them as operations, build a list of divisions' files and try it out...

For our first division's representation we can take the binary search tree set representation introduced in the subsection Sets as binary trees, and the lookup mechanism introduced in the subsection Sets and information retrieval, and modify it so that it supports Scheme String keys (using the string=?, string<? and string>? comparators):
(define (entry tree) (car tree))
(define (left-branch tree) (cadr tree))
(define (right-branch tree) (caddr tree))
(define (make-tree entry left right)
  (list entry left right))

(define (list->tree elements)
  (car (partial-tree elements (length elements))))

(define (partial-tree elts n)
  (if (= n 0)
      (cons '() elts)
      (let ((left-size (quotient (- n 1) 2)))
        (let ((left-result (partial-tree elts left-size)))
          (let ((left-tree (car left-result))
                (non-left-elts (cdr left-result))
                (right-size (- n (+ left-size 1))))
            (let ((this-entry (car non-left-elts))
                  (right-result (partial-tree (cdr non-left-elts)
                                              right-size)))
              (let ((right-tree (car right-result))
                    (remaining-elts (cdr right-result)))
                (cons (make-tree this-entry left-tree right-tree)
                      remaining-elts))))))))

(define (key record)
  (car record))

(define (lookup given-key set-of-records)
  (if (null? set-of-records)
      #f
      (let ((entry-key (key (entry set-of-records))))
        (cond ((string=? given-key entry-key) (entry set-of-records))
              ((string<? given-key entry-key)
               (lookup given-key (left-branch set-of-records)))
              ((string>? given-key entry-key)
               (lookup given-key (right-branch set-of-records)))))))
Using this we can unveil to the world Unquenchable Industries, which uses a tree keyed on employee name to hold each record, with each record consisting of a tree where the fields are keyed by field name (i.e. "address" and "salary"):
              
(define unquenchable-industries-employees
  (list->tree (list (cons "John Doe"
                          (list->tree (list (cons "address" "55 Ghetto Grove")
                                            (cons "salary" 100))))
                    (cons "Myddle Mann"
                          (list->tree (list (cons "address" "24 Suburb Street")
                                            (cons "salary" 40000))))
                    (cons "Ritchie Rich"
                          (list->tree (list (cons "address" "1 Park Avenue")
                                            (cons "salary" 250000)))))))
We'll give our division the following ID:
(define unquenchable-industries-id 'unquenchable-industries)
Now let's implement our common interface procedures. There are just three of these in our definitions above: get-record, get-salary and get-address. The functionality we require of the 'get-record operation is provided directly by lookup - it looks up an entry by its key, returning the entry if it's present or #f if it's not. Note that lookup returns a pair with the key as the first element and the data as the second. This means that our 'get-salary and 'get-address operations will need to first extract the tree holding the record, lookup the appropriate field and, if it's present, extract the data it holds. We'll need to wrap all these up in a procedure that installs them under the appropriate operation and division keys.

Here's the procedures defined and installed:
(define (install-unquenchable-industries)
  (define (get-salary record)
    (let ((salary (lookup "salary" (cdr record))))
      (if salary
          (cdr salary)
          #f)))
  (define (get-address record)
    (let ((address (lookup "address" (cdr record))))
      (if address
          (cdr address)
          #f)))
  ; Install the procedures
  (put 'get-record unquenchable-industries-id lookup)
  (put 'get-salary unquenchable-industries-id get-salary)
  (put 'get-address unquenchable-industries-id get-address))

(install-unquenchable-industries)
As noted above, the last step is to tag the division's file and create the list of divisions' files:
(define personnel-files
  (list
   (attach-tag unquenchable-industries-id unquenchable-industries-employees)))
Using this we can find employee records and extract information about them:
> (get-salary (find-employee-record "Ritchie Rich" personnel-files))
250000
> (get-address (find-employee-record "John Doe" personnel-files))
"55 Ghetto Grove"
Now in a dramatic and stock-market unsettling move, Insatiable Industries, Inc. acquires Rapacious Ventures. Rapacious Ventures happen to also use a Scheme-based HRMS but, being a bit more forward thinking, they use nested R6RS hashtables. Here's Rapacious Ventures' personnel file:
(require rnrs/hashtables-6)

(define rapacious-ventures-employees (make-hashtable string-hash string=?))
(hashtable-set! rapacious-ventures-employees
           "Fred Blogs"
           (let ((h (make-hashtable string-hash string=?)))
             (hashtable-set! h "salary" 12345)
             (hashtable-set! h "address" "123 Wibble Street")
             h))
(hashtable-set! rapacious-ventures-employees
           "John Smith"
           (let ((h (make-hashtable string-hash string=?)))
             (hashtable-set! h "salary" 30000)
             (hashtable-set! h "address" "42 Nowhere Road")
             h))
(hashtable-set! rapacious-ventures-employees
           "Jack Copper"
           (let ((h (make-hashtable string-hash string=?)))
             (hashtable-set! h "salary" 25000)
             (hashtable-set! h "address" "99 Letsbee Avenue")
             h))
Let's bring Rapacious Ventures into the Insatiable Enterprises, Inc. fold...

First we give the new division an ID:
(define rapacious-ventures-id 'rapacious-ventures)
Next we implement and install the common interface procedures. We can look up entries in their personnel file using hashtable-ref, which takes the hashtable to use, the key we want to lookup and a default value to return if the key is not present. If we're happy with the employee record missing the employee's name then this is pretty close to the 'get-record operation, so we can simply wrap this in a procedure that wires through the appropriate operands, and fixing #f as the default value. Similarly, our implementations for the 'get-salary and 'get-address operations can simply call hashtable-ref with the appropriate key:
(define (install-rapacious-ventures)
  (define (get-record employee-name personnel-file)
    (hashtable-ref personnel-file employee-name #f))
  (define (get-salary record)
    (hashtable-ref record "salary" #f))
  (define (get-address record)
    (hashtable-ref record "address" #f))
  ; Install the procedures
  (put 'get-record rapacious-ventures-id get-record)
  (put 'get-salary rapacious-ventures-id get-salary)
  (put 'get-address rapacious-ventures-id get-address))

(install-rapacious-ventures)
Finally, we extend the tag the division's file and add it in to the the list of divisions' files:
(define personnel-files
  (list
   (attach-tag unquenchable-industries-id unquenchable-industries-employees)
   (attach-tag rapacious-ventures-id rapacious-ventures-employees)))
Using this we should still be able to access Unquenchable Industries' employee records and access Rapacious Ventures employee records:
> (get-salary (find-employee-record "Myddle Mann" personnel-files))
40000
> (get-address (find-employee-record "Ritchie Rich" personnel-files))
"1 Park Avenue"
> (get-salary (find-employee-record "John Smith" personnel-files))
30000
> (get-address (find-employee-record "Jack Copper" personnel-files))
"99 Letsbee Avenue"

2011-12-24

SICP Exercise 2.73: Data-Directed Derivatives

Section 2.3.2 described a program that performs symbolic differentiation:
(define (deriv exp var)
  (cond ((number? exp) 0)
        ((variable? exp) (if (same-variable? exp var) 1 0))
        ((sum? exp)
         (make-sum (deriv (addend exp) var)
                   (deriv (augend exp) var)))
        ((product? exp)
         (make-sum
           (make-product (multiplier exp)
                         (deriv (multiplicand exp) var))
           (make-product (deriv (multiplier exp) var)
                         (multiplicand exp))))
        (else (error "unknown expression type -- DERIV" exp))))
We can regard this program as performing a dispatch on the type of the expression to be differentiated. In this situation the "type tag" of the datum is the algebraic operator symbol (such as +) and the operation being performed is deriv. We can transform this program into data-directed style by rewriting the basic derivative procedure as
(define (deriv exp var)
   (cond ((number? exp) 0)
         ((variable? exp) (if (same-variable? exp var) 1 0))
         (else ((get 'deriv (operator exp)) (operands exp)
                                            var))))
(define (operator exp) (car exp))
(define (operands exp) (cdr exp))
  1. Explain what was done above. Why can't we assimilate the predicates number? and same-variable? into the data-directed dispatch?
  2. Write the procedures for derivatives of sums and products, and the auxiliary code required to install them in the table used by the program above.
  3. Choose any additional differentiation rule that you like, such as the one for exponents (exercise 2.56), and install it in this data-directed system.
  4. In this simple algebraic manipulator the type of an expression is the algebraic operator that binds it together. Suppose, however, we indexed the procedures in the opposite way, so that the dispatch line in deriv looked like
    ((get (operator exp) 'deriv) (operands exp) var)
    
    What corresponding changes to the derivative system are required?
Before we get started let's give ourselves the get and put procedures we need... As the book notes, we discuss the construction of such procedures in section 3.3.3. To save you looking ahead, here's the appropriate code you'll need if you're following along... I'll note (yet again) that I'm using DrRacket as my scheme interpreter - in order to get this to fly I had to include a couple of packages:
#lang racket

(require rnrs/base-6)
(require rnrs/mutable-pairs-6)

(define (assoc key records)
  (cond ((null? records) false)
        ((equal? key (caar records)) (car records))
        (else (assoc key (cdr records)))))

(define (make-table)
  (let ((local-table (list '*table*)))
    (define (lookup key-1 key-2)
      (let ((subtable (assoc key-1 (cdr local-table))))
        (if subtable
            (let ((record (assoc key-2 (cdr subtable))))
              (if record
                  (cdr record)
                  false))
            false)))
    (define (insert! key-1 key-2 value)
      (let ((subtable (assoc key-1 (cdr local-table))))
        (if subtable
            (let ((record (assoc key-2 (cdr subtable))))
              (if record
                  (set-cdr! record value)
                  (set-cdr! subtable
                            (cons (cons key-2 value)
                                  (cdr subtable)))))
            (set-cdr! local-table
                      (cons (list key-1
                                  (cons key-2 value))
                            (cdr local-table)))))
      'ok)    
    (define (dispatch m)
      (cond ((eq? m 'lookup-proc) lookup)
            ((eq? m 'insert-proc!) insert!)
            (else (error "Unknown operation -- TABLE" m))))
    dispatch))

(define operation-table (make-table))
(define get (operation-table 'lookup-proc))
(define put (operation-table 'insert-proc!))

(a) What Was Done

In the original version the type of an expression was tested through various predicates. Two of these, sum? and product?, relied upon the prefix notation used for the expressions to extract the car of the expression and compare it to the symbol corresponding to the operation. In order to move to a data-directed style of implementation a new procedure, operator, has been introduced to extract the symbol corresponding to the current operation. A corresponding procedure, operands, has also been introduced that extracts the operands for the operation. This also makes use of the fact that expressions defined using prefix notation, simply returning the cdr of the expression.

The former of these two procedures is used to obtain the type tag (i.e. '+ or '*) to use with get to retrieve the appropriate 'deriv procedure to apply. The latter procedure is then used to obtain the operands to pass to the obtained procedure, and the obtained procedure is invoked.

This allows the deriv procedure to support differentiation of any number of different operations to be supported without requiring continual updates to the procedure - provided, of course, that the operations are expressed using prefix notation. Adding a new operation is simply a case of registering the corresponding procedure under the 'deriv operation with the appropriate symbol corresponding to the type.

Unfortunately, number? and variable? cannot be handled with the implementation as it stands, as these are not expressed as a list with an operator as their first element and operands as their tail. As a result the implementations of operator and operands will not be able to cope with them - the interpreter will raise an error if they are applied to numbers or variables. This prevents us from registering appropriate procedures to cope with them (and so, by proxy, prevents us from encapsulating same-variable? in a procedure and registering that).

Note that we could provide a work-around for this by extending operator and operands appropriately. We could extend operator so that it checks whether or not an expression is a number or variable before returning the car of the expression. If it's either of the two then we could return proxy operations (e.g. 'number and 'variable) as appropriate. This would allow us to register procedures. Similarly, we could extend operator so that it checks whether it's dealing with a list or not before returning the cdr of the expression. If it's not a list then we can simply return a single element list. This would lead to the following implementation:
(define (deriv exp var)
  ((get 'deriv (operator exp)) (operands exp) var))

(define (operator exp)
  (cond ((number? exp) 'number)
        ((variable? exp) 'variable)
        (else (car exp))))

(define (operands exp)
  (if (pair? exp)
      (cdr exp)
      (list exp)))
Following the packaging mechanism introduced in section 2.4.3 we could then install the appropriate procedures as follows:
(define (install-number-routines)
  (define (derivative ops var) 0)
  (put 'deriv 'number derivative))
    
(define (install-variable-routines)
  (define (derivative ops var)
    (if (same-variable? (car ops) var) 1 0))
  (put 'deriv 'variable derivative))

(install-number-routines)
(install-variable-routines)
We won't use this though - we'll stick to the exercise!

(b) Data-Directed Derivatives of Sums and Products

The procedures for derivatives of sums and products correspond very closely to the appropriate consequent expressions in their corresponding clauses in the original implementations. However, note that the implementations of addend, augend, multiplier and multiplicand provided in the book will not work for us here as they assume that they are operating on a complete expression in prefix form (i.e. a list of 3 elements with the operator as the first element and the operands as the second and third elements), whereas the procedures will be dealing with a list containing just the operands.

Rather than redefine the selectors, we'll simply process the operands directly. We can access the first operand using car, and the second operand using cadr. So we simply replace invocations of addend and multiplier on the expression with caring the operands and, similarly, replace invocations of augend and multiplicand on the expression with cadring the operands.

We then need to install the procedures into the table used by the program. We saw an example of doing this above, where we saw how we could deal with number? and variable? by abusing the procedures operator and operands... You may have noticed above that the inner procedures that were actually installed above were both called derivative rather than deriv. There was a reason for that, and it's here that we can see the reason...

The calculation of derivatives for both sums and products requires recursive calls to the deriv procedure defined above. If we don't rename the inner procedures that calculate the derivatives for sums and products to something other than deriv then the in-scope association for deriv will be to the inner procedure being defined, and so the inner procedures' recursive calls will be to themselves. To avoid this (which would cause the calls to ultimately fail) we must rename the inner procedures to something other than deriv. I chose to go with the expanded name, derivative, and applied this name change to the procedures for dealing with numbers and variables too in order to maintain consistency among the packages.

Anyway, here are the packages for sums and products:
(define (install-sum-routines)
  (define (derivative ops var)
    (make-sum
     (deriv (car ops) var)
     (deriv (cadr ops) var)))
  (put 'deriv '+ derivative))

(define (install-product-routines)
  (define (derivative ops var)
    (make-sum
     (make-product (car ops)
                   (deriv (cadr ops) var))
     (make-product (deriv (car ops) var)
                   (cadr ops))))
  (put 'deriv '* derivative))

(install-sum-routines)
(install-product-routines)
As usual, we'll test them out. In this case I got a somewhat unexpected result for the third one:
> (deriv '(+ x 3) 'x)
1
> (deriv '(* x y) 'x)
'y
> (deriv '(* (* x y) (+ x 3)) 'x)
(mcons
 '+
 (mcons
  (mcons '* (mcons 'x (mcons 'y '())))
  (mcons (mcons '* (mcons 'y (mcons (mcons '+ (mcons 'x (mcons 3 '()))) '()))) '())))
If you recall from the book, the result for calculating the derivative of the last expression should be as follows:
> (deriv '(* (* x y) (+ x 3)) 'x)
'(+ (* x y) (* y (+ x 3)))
Now if you look closely at the result we actually get then you'll notice that there is some similarity to the expected result. In fact, if you were to replace mcons with cons in the produced result and evaluate it then you would get the expected result.

So what's going on? Well, the R6RS specification banished procedures such as set-cdr! to their own library, as these can give rise to mutable pairs. LISP and Scheme are often referred to as functional programming languages and it's my understanding that this change in R6RS is an effort to isolate procedures that have side-effects from the standard language, as these are not true functions. It's also my understanding that including this library under DrRacket effectively results in replacing the normal (immutable pair) implementations of cons, car, etc. with their mutable pair equivalents, which includes replacing cons with mcons, and that's what we're seeing here.

If you want more details on this then you can read your way through the DrRacket team's original post where they discuss these changes. Googling for "mutable pair racket" will also throw up a bunch of results that will give you further info.

(c) Data-Directed Derivatives of Exponents

In exercise 2.56 we extended the original deriv procedure to support exponents by using '** as the operator representing an exponent and adding in the following clause into the cond:
        …
        ((exponentiation? exp)
         (make-product
           (make-product (exponent exp)
                         (make-exponentiation (base exp)
                                              (make-sum (exponent exp) -1)))
           (deriv (base exp) var)))
        …
Converting this to the data-directed version is again straightforward. The two operands to the operator are the base and exponent, in that order, so we can use car and cadr to extract these in an analogous manner to sums and products and so convert the consequent expression into a derivative procedure. All that remains is to install it into the table for the 'deriv operation under the '** type. This gives the following implementation:
(define (install-exponent-routines)
  (define (derivative ops var)
    (make-product
     (make-product (cadr ops)
                   (make-exponentiation (car ops)
                                        (make-sum (cadr ops) -1)))
     (deriv (car ops) var)))
  (put 'deriv '** derivative))
After installing this we can - in order to get this to fly I st examples as we used in exercise 2.56:
> (deriv (make-exponentiation 'a 5) 'a)
(mcons '* (mcons 5 (mcons (mcons '** (mcons 'a (mcons 4 '()))) '())))
> (deriv (make-exponentiation 'a 'b) 'a)
(mcons
 '*
 (mcons
  'b
  (mcons (mcons '** (mcons 'a (mcons (mcons '+ (mcons 'b (mcons -1 '()))) '()))) '())))
> (deriv (make-exponentiation 'a (make-sum 'a 'b)) 'a)
(mcons
 '*
 (mcons
  (mcons '+ (mcons 'a (mcons 'b '())))
  (mcons
   (mcons
    '**
    (mcons
     'a
     (mcons (mcons '+ (mcons (mcons '+ (mcons 'a (mcons 'b '()))) (mcons -1 '()))) '())))
   '())))
For comparison, here's the results we got in exercise 2.56:
> (deriv (make-exponentiation 'a 5) 'a)
(* 5 (** a 4))
> (deriv (make-exponentiation 'a 'b) 'a)
(* b (** a (+ b -1)))
> (deriv (make-exponentiation 'a (make-sum 'a 'b)) 'a)
(* (+ a b) (** a (+ (+ a b) -1)))
Again, if you were to replace mcons with cons in the produced results and evaluate them then you would get the results we originally got.

(d) Inverted Indexing

For the final part of the exercise, the authors question what changes would be required if we were to index the procedures in the opposite way, replacing the dispatch line in deriv with:
((get (operator exp) 'deriv) (operands exp) var)
Now the naïve solution would be to simply modify all of our install-XXX-routines procedures above, swapping the 'deriv and operator symbol entries around as the operands to put, so that the operator symbol became the operation and 'deriv became the type. This would work. However, it doesn't quite capture the subtlety of the change that is made...

By swapping the order the authors have changed the types of packages that should be registered in our system. In the first three parts of the exercise a package encapsulates operations related to a particular arithmetic operation. We have a package for sums, a package for products and a package for exponents. Sure, each package only installs one procedure, a way of calculating derivatives involving that arithmetic operation, but this is the implied packaging. However, for the fourth part of the operation, a package should encapsulate a set of procedures for processing an arithmetic expression, and so we should have a single package that installs procedures that can be used to calculate the derivatives of an arithmetic expression that involves various arithmetic operations. In this vein we could also, for example, have a single package that defines how to evaluate installs procedures that can be used to evaluate an arithmetic expression, or reduce it to minimal terms.

So our solution here should have a single procedure that contains inner procedures for each supported arithmetic operation and registers all of those. Here's one possible implementation:
(define (install-derivative-routines)
  (define (sum ops var)
    (make-sum
     (deriv (car ops) var)
     (deriv (cadr ops) var)))
  (define (product ops var)
    (make-sum
     (make-product (car ops)
                   (deriv (cadr ops) var))
     (make-product (deriv (car ops) var)
                   (cadr ops))))
  (define (exponent ops var)
    (make-product
     (make-product (cadr ops)
                   (make-exponentiation (car ops)
                                        (make-sum (cadr ops) -1)))
     (deriv (car ops) var)))
  (put '+ 'deriv sum)
  (put '* 'deriv product)
  (put '** 'deriv exponent))
If we update our deriv procedure as described and install the derivative routines we can, of course, run all the tests again:
> (deriv '(+ x 3) 'x)
1
> (deriv '(* x y) 'x)
'y
> (deriv '(* (* x y) (+ x 3)) 'x)
(mcons
 '+
 (mcons
  (mcons '* (mcons 'x (mcons 'y '())))
  (mcons (mcons '* (mcons 'y (mcons (mcons '+ (mcons 'x (mcons 3 '()))) '()))) '())))
> (deriv (make-exponentiation 'a 5) 'a)
(mcons '* (mcons 5 (mcons (mcons '** (mcons 'a (mcons 4 '()))) '())))
> (deriv (make-exponentiation 'a 'b) 'a)
(mcons
 '*
 (mcons
  'b
  (mcons (mcons '** (mcons 'a (mcons (mcons '+ (mcons 'b (mcons -1 '()))) '()))) '())))
> (deriv (make-exponentiation 'a (make-sum 'a 'b)) 'a)
(mcons
 '*
 (mcons
  (mcons '+ (mcons 'a (mcons 'b '())))
  (mcons
   (mcons
    '**
    (mcons
     'a
     (mcons (mcons '+ (mcons (mcons '+ (mcons 'a (mcons 'b '()))) (mcons -1 '()))) '())))
   '())))
As you can see, these results match the mutable pair results we saw previously.

2011-12-23

SICP Exercise 2.72: Efficiency of Huffman Encoding

Consider the encoding procedure that you designed in exercise 2.68. What is the order of growth in the number of steps needed to encode a symbol? Be sure to include the number of steps needed to search the symbol list at each node encountered. To answer this question in general is difficult. Consider the special case where the relative frequencies of the n symbols are as described in exercise 2.71, and give the order of growth (as a function of n) of the number of steps needed to encode the most frequent and least frequent symbols in the alphabet.

Here's the encoding procedure, encode-symbol, we designed in exercise 2.68:
(define (encode-symbol symbol tree)
  (cond ((not (memq symbol (symbols tree)))
         (error "bad symbol -- ENCODE-SYMBOL" symbol))
        ((leaf? tree) '())
        ((memq symbol (symbols (left-branch tree)))
         (cons 0 (encode-symbol symbol (left-branch tree))))
        ((memq symbol (symbols (right-branch tree)))
         (cons 1 (encode-symbol symbol (right-branch tree))))))
And to recap, the summary of exercise 2.71 was that, for an alphabet of n symbols where relative frequencies of the symbols are 1, 2, 4, ..., 2n-1, a Huffman tree will encode the most frequent symbol using a 1-bit encoding, while the least frequent symbol will be encoded using n-1 bits. The reasoning for these bit lengths is that for such an alphabet generate-huffman-tree will produce a tree that is n levels deep in which each non-leaf node has a right child node that is a leaf. The most frequent symbol will be in the right child of the root node (and so only one edge needs to be traversed to get to the symbol) while the least frequent symbol will be in the nth level of the tree, in the left-most branch of the tree (and so n-1 edges need to be traversed to get to the symbol).

As with previous exercises we will assume that list operations (such as list, cons, car, cadr and caddr) are all Θ(1) operations. Since left-branch and right-branch simply delegate to car and cadr respectively this means we can assume they're Θ(1) too. If we also assume that eq? is Θ(1) then this allows us to assume that leaf? is Θ(1) too... and if leaf? is Θ(1) then that means symbols will also be Θ(1), as it performs a fixed number of operations, all of which are Θ(1).

Now, how about memq? Well, memq iterates through a list and compares each element to the specified symbol. If it matches then it returns the tail of the list starting at the first matching element, otherwise it returns #f. Now iterating through a list is Θ(n), and we've already stated that we're assuming list operations are Θ(1), so this will make memq an Θ(n) operation.

Okay, now we've stated all our assumptions about the orders of growth of the operations used by encode-symbol. Now we can go on to calculate the best case (when the symbol is the most frequent in the alphabet) and worst case (when the symbol is the least frequent in the alphabet) orders of growth.

Let's start with the best case: encoding the most frequent symbol. In this case the symbol's corresponding leaf node is the right child node of the root, and so the order of operations is:
  1. Start at the root of the tree.
  2. Ensure that symbol is in the list of symbols for the root - otherwise we'd throw an error. The check uses symbols and memq, and so this is Θ(n+1)=Θ(n).
  3. Check whether the root is a leaf node (which it's not) using leaf?, an Θ(1) operation.
  4. Check whether the symbol is under the left branch of the tree (which it's not). This uses left-branch, symbols and memq, and so this is Θ(1+n+1)=Θ(n).
  5. Check whether the symbol is under the right branch of the tree which it is). This uses right-branch, symbols and memq, and so is also Θ(1+n+1)=Θ(n).
  6. cons the value 1 onto the result of invoking encode-symbol on the right branch of the tree. cons is an Θ(1) operation. The recursive call works out at Θ(n+1+1)=Θ(n), using the following two steps:
    1. Ensure that symbol is in the list of symbols for the node (Θ(n+1)).
    2. Check whether the node is a leaf node, which it is, so return the empty list (Θ(1)).
Put that all together and you get an order of growth when the symbol is the most frequent in the alphabet of Θ(4n+2)=Θ(n).

Now let's move onto the worst case: encoding the least frequent symbol. In this case the symbol's corresponding leaf node is the left most child node at the lowest level of the tree. That means that at each non-leaf node we'll need to perform the following steps:
  1. Ensure that symbol is in the list of symbols for the root (Θ(n)).
  2. Check whether the root is a leaf node, which it's not (Θ(1)).
  3. Check whether the symbol is under the left branch of the tree, which it is (Θ(n)).
  4. cons the value 0 onto the result of invoking encode-symbol on the left branch of the tree (Θ(1), plus the cost of the recursive call).
So, if we ignore the the recursive call, for any non-leaf node we process the order of growth is Θ(2n+2)=Θ(n). Starting at the root there'll be n-1 non-leaf nodes we'll need to process before we reach the leaf node corresponding to the symbol we're looking for, so the order of growth of processing all of the non-leaf nodes will be Θ(n(n-1))=Θ(n2-n)=Θ(n2).

We just need to incorporate the order of growth of processing a leaf node into this. We already calculated this as Θ(n) in step 6 for the best case, so the total order of growth for encoding the least frequent symbol is Θ(n2+n)=Θ(n2).

So, to summarize, for an alphabet of n symbols where relative frequencies of the symbols are 1, 2, 4, ..., 2n-1, a Huffman tree will encode:
  • The most frequent symbol with an order of growth of Θ(n)
  • The least frequent symbol with an order of growth of Θ(n2)