2011-12-24

SICP Exercise 2.73: Data-Directed Derivatives

Section 2.3.2 described a program that performs symbolic differentiation:
(define (deriv exp var)
  (cond ((number? exp) 0)
        ((variable? exp) (if (same-variable? exp var) 1 0))
        ((sum? exp)
         (make-sum (deriv (addend exp) var)
                   (deriv (augend exp) var)))
        ((product? exp)
         (make-sum
           (make-product (multiplier exp)
                         (deriv (multiplicand exp) var))
           (make-product (deriv (multiplier exp) var)
                         (multiplicand exp))))
        (else (error "unknown expression type -- DERIV" exp))))
We can regard this program as performing a dispatch on the type of the expression to be differentiated. In this situation the "type tag" of the datum is the algebraic operator symbol (such as +) and the operation being performed is deriv. We can transform this program into data-directed style by rewriting the basic derivative procedure as
(define (deriv exp var)
   (cond ((number? exp) 0)
         ((variable? exp) (if (same-variable? exp var) 1 0))
         (else ((get 'deriv (operator exp)) (operands exp)
                                            var))))
(define (operator exp) (car exp))
(define (operands exp) (cdr exp))
  1. Explain what was done above. Why can't we assimilate the predicates number? and same-variable? into the data-directed dispatch?
  2. Write the procedures for derivatives of sums and products, and the auxiliary code required to install them in the table used by the program above.
  3. Choose any additional differentiation rule that you like, such as the one for exponents (exercise 2.56), and install it in this data-directed system.
  4. In this simple algebraic manipulator the type of an expression is the algebraic operator that binds it together. Suppose, however, we indexed the procedures in the opposite way, so that the dispatch line in deriv looked like
    ((get (operator exp) 'deriv) (operands exp) var)
    
    What corresponding changes to the derivative system are required?
Before we get started let's give ourselves the get and put procedures we need... As the book notes, we discuss the construction of such procedures in section 3.3.3. To save you looking ahead, here's the appropriate code you'll need if you're following along... I'll note (yet again) that I'm using DrRacket as my scheme interpreter - in order to get this to fly I had to include a couple of packages:
#lang racket

(require rnrs/base-6)
(require rnrs/mutable-pairs-6)

(define (assoc key records)
  (cond ((null? records) false)
        ((equal? key (caar records)) (car records))
        (else (assoc key (cdr records)))))

(define (make-table)
  (let ((local-table (list '*table*)))
    (define (lookup key-1 key-2)
      (let ((subtable (assoc key-1 (cdr local-table))))
        (if subtable
            (let ((record (assoc key-2 (cdr subtable))))
              (if record
                  (cdr record)
                  false))
            false)))
    (define (insert! key-1 key-2 value)
      (let ((subtable (assoc key-1 (cdr local-table))))
        (if subtable
            (let ((record (assoc key-2 (cdr subtable))))
              (if record
                  (set-cdr! record value)
                  (set-cdr! subtable
                            (cons (cons key-2 value)
                                  (cdr subtable)))))
            (set-cdr! local-table
                      (cons (list key-1
                                  (cons key-2 value))
                            (cdr local-table)))))
      'ok)    
    (define (dispatch m)
      (cond ((eq? m 'lookup-proc) lookup)
            ((eq? m 'insert-proc!) insert!)
            (else (error "Unknown operation -- TABLE" m))))
    dispatch))

(define operation-table (make-table))
(define get (operation-table 'lookup-proc))
(define put (operation-table 'insert-proc!))

(a) What Was Done

In the original version the type of an expression was tested through various predicates. Two of these, sum? and product?, relied upon the prefix notation used for the expressions to extract the car of the expression and compare it to the symbol corresponding to the operation. In order to move to a data-directed style of implementation a new procedure, operator, has been introduced to extract the symbol corresponding to the current operation. A corresponding procedure, operands, has also been introduced that extracts the operands for the operation. This also makes use of the fact that expressions defined using prefix notation, simply returning the cdr of the expression.

The former of these two procedures is used to obtain the type tag (i.e. '+ or '*) to use with get to retrieve the appropriate 'deriv procedure to apply. The latter procedure is then used to obtain the operands to pass to the obtained procedure, and the obtained procedure is invoked.

This allows the deriv procedure to support differentiation of any number of different operations to be supported without requiring continual updates to the procedure - provided, of course, that the operations are expressed using prefix notation. Adding a new operation is simply a case of registering the corresponding procedure under the 'deriv operation with the appropriate symbol corresponding to the type.

Unfortunately, number? and variable? cannot be handled with the implementation as it stands, as these are not expressed as a list with an operator as their first element and operands as their tail. As a result the implementations of operator and operands will not be able to cope with them - the interpreter will raise an error if they are applied to numbers or variables. This prevents us from registering appropriate procedures to cope with them (and so, by proxy, prevents us from encapsulating same-variable? in a procedure and registering that).

Note that we could provide a work-around for this by extending operator and operands appropriately. We could extend operator so that it checks whether or not an expression is a number or variable before returning the car of the expression. If it's either of the two then we could return proxy operations (e.g. 'number and 'variable) as appropriate. This would allow us to register procedures. Similarly, we could extend operator so that it checks whether it's dealing with a list or not before returning the cdr of the expression. If it's not a list then we can simply return a single element list. This would lead to the following implementation:
(define (deriv exp var)
  ((get 'deriv (operator exp)) (operands exp) var))

(define (operator exp)
  (cond ((number? exp) 'number)
        ((variable? exp) 'variable)
        (else (car exp))))

(define (operands exp)
  (if (pair? exp)
      (cdr exp)
      (list exp)))
Following the packaging mechanism introduced in section 2.4.3 we could then install the appropriate procedures as follows:
(define (install-number-routines)
  (define (derivative ops var) 0)
  (put 'deriv 'number derivative))
    
(define (install-variable-routines)
  (define (derivative ops var)
    (if (same-variable? (car ops) var) 1 0))
  (put 'deriv 'variable derivative))

(install-number-routines)
(install-variable-routines)
We won't use this though - we'll stick to the exercise!

(b) Data-Directed Derivatives of Sums and Products

The procedures for derivatives of sums and products correspond very closely to the appropriate consequent expressions in their corresponding clauses in the original implementations. However, note that the implementations of addend, augend, multiplier and multiplicand provided in the book will not work for us here as they assume that they are operating on a complete expression in prefix form (i.e. a list of 3 elements with the operator as the first element and the operands as the second and third elements), whereas the procedures will be dealing with a list containing just the operands.

Rather than redefine the selectors, we'll simply process the operands directly. We can access the first operand using car, and the second operand using cadr. So we simply replace invocations of addend and multiplier on the expression with caring the operands and, similarly, replace invocations of augend and multiplicand on the expression with cadring the operands.

We then need to install the procedures into the table used by the program. We saw an example of doing this above, where we saw how we could deal with number? and variable? by abusing the procedures operator and operands... You may have noticed above that the inner procedures that were actually installed above were both called derivative rather than deriv. There was a reason for that, and it's here that we can see the reason...

The calculation of derivatives for both sums and products requires recursive calls to the deriv procedure defined above. If we don't rename the inner procedures that calculate the derivatives for sums and products to something other than deriv then the in-scope association for deriv will be to the inner procedure being defined, and so the inner procedures' recursive calls will be to themselves. To avoid this (which would cause the calls to ultimately fail) we must rename the inner procedures to something other than deriv. I chose to go with the expanded name, derivative, and applied this name change to the procedures for dealing with numbers and variables too in order to maintain consistency among the packages.

Anyway, here are the packages for sums and products:
(define (install-sum-routines)
  (define (derivative ops var)
    (make-sum
     (deriv (car ops) var)
     (deriv (cadr ops) var)))
  (put 'deriv '+ derivative))

(define (install-product-routines)
  (define (derivative ops var)
    (make-sum
     (make-product (car ops)
                   (deriv (cadr ops) var))
     (make-product (deriv (car ops) var)
                   (cadr ops))))
  (put 'deriv '* derivative))

(install-sum-routines)
(install-product-routines)
As usual, we'll test them out. In this case I got a somewhat unexpected result for the third one:
> (deriv '(+ x 3) 'x)
1
> (deriv '(* x y) 'x)
'y
> (deriv '(* (* x y) (+ x 3)) 'x)
(mcons
 '+
 (mcons
  (mcons '* (mcons 'x (mcons 'y '())))
  (mcons (mcons '* (mcons 'y (mcons (mcons '+ (mcons 'x (mcons 3 '()))) '()))) '())))
If you recall from the book, the result for calculating the derivative of the last expression should be as follows:
> (deriv '(* (* x y) (+ x 3)) 'x)
'(+ (* x y) (* y (+ x 3)))
Now if you look closely at the result we actually get then you'll notice that there is some similarity to the expected result. In fact, if you were to replace mcons with cons in the produced result and evaluate it then you would get the expected result.

So what's going on? Well, the R6RS specification banished procedures such as set-cdr! to their own library, as these can give rise to mutable pairs. LISP and Scheme are often referred to as functional programming languages and it's my understanding that this change in R6RS is an effort to isolate procedures that have side-effects from the standard language, as these are not true functions. It's also my understanding that including this library under DrRacket effectively results in replacing the normal (immutable pair) implementations of cons, car, etc. with their mutable pair equivalents, which includes replacing cons with mcons, and that's what we're seeing here.

If you want more details on this then you can read your way through the DrRacket team's original post where they discuss these changes. Googling for "mutable pair racket" will also throw up a bunch of results that will give you further info.

(c) Data-Directed Derivatives of Exponents

In exercise 2.56 we extended the original deriv procedure to support exponents by using '** as the operator representing an exponent and adding in the following clause into the cond:
        …
        ((exponentiation? exp)
         (make-product
           (make-product (exponent exp)
                         (make-exponentiation (base exp)
                                              (make-sum (exponent exp) -1)))
           (deriv (base exp) var)))
        …
Converting this to the data-directed version is again straightforward. The two operands to the operator are the base and exponent, in that order, so we can use car and cadr to extract these in an analogous manner to sums and products and so convert the consequent expression into a derivative procedure. All that remains is to install it into the table for the 'deriv operation under the '** type. This gives the following implementation:
(define (install-exponent-routines)
  (define (derivative ops var)
    (make-product
     (make-product (cadr ops)
                   (make-exponentiation (car ops)
                                        (make-sum (cadr ops) -1)))
     (deriv (car ops) var)))
  (put 'deriv '** derivative))
After installing this we can - in order to get this to fly I st examples as we used in exercise 2.56:
> (deriv (make-exponentiation 'a 5) 'a)
(mcons '* (mcons 5 (mcons (mcons '** (mcons 'a (mcons 4 '()))) '())))
> (deriv (make-exponentiation 'a 'b) 'a)
(mcons
 '*
 (mcons
  'b
  (mcons (mcons '** (mcons 'a (mcons (mcons '+ (mcons 'b (mcons -1 '()))) '()))) '())))
> (deriv (make-exponentiation 'a (make-sum 'a 'b)) 'a)
(mcons
 '*
 (mcons
  (mcons '+ (mcons 'a (mcons 'b '())))
  (mcons
   (mcons
    '**
    (mcons
     'a
     (mcons (mcons '+ (mcons (mcons '+ (mcons 'a (mcons 'b '()))) (mcons -1 '()))) '())))
   '())))
For comparison, here's the results we got in exercise 2.56:
> (deriv (make-exponentiation 'a 5) 'a)
(* 5 (** a 4))
> (deriv (make-exponentiation 'a 'b) 'a)
(* b (** a (+ b -1)))
> (deriv (make-exponentiation 'a (make-sum 'a 'b)) 'a)
(* (+ a b) (** a (+ (+ a b) -1)))
Again, if you were to replace mcons with cons in the produced results and evaluate them then you would get the results we originally got.

(d) Inverted Indexing

For the final part of the exercise, the authors question what changes would be required if we were to index the procedures in the opposite way, replacing the dispatch line in deriv with:
((get (operator exp) 'deriv) (operands exp) var)
Now the naïve solution would be to simply modify all of our install-XXX-routines procedures above, swapping the 'deriv and operator symbol entries around as the operands to put, so that the operator symbol became the operation and 'deriv became the type. This would work. However, it doesn't quite capture the subtlety of the change that is made...

By swapping the order the authors have changed the types of packages that should be registered in our system. In the first three parts of the exercise a package encapsulates operations related to a particular arithmetic operation. We have a package for sums, a package for products and a package for exponents. Sure, each package only installs one procedure, a way of calculating derivatives involving that arithmetic operation, but this is the implied packaging. However, for the fourth part of the operation, a package should encapsulate a set of procedures for processing an arithmetic expression, and so we should have a single package that installs procedures that can be used to calculate the derivatives of an arithmetic expression that involves various arithmetic operations. In this vein we could also, for example, have a single package that defines how to evaluate installs procedures that can be used to evaluate an arithmetic expression, or reduce it to minimal terms.

So our solution here should have a single procedure that contains inner procedures for each supported arithmetic operation and registers all of those. Here's one possible implementation:
(define (install-derivative-routines)
  (define (sum ops var)
    (make-sum
     (deriv (car ops) var)
     (deriv (cadr ops) var)))
  (define (product ops var)
    (make-sum
     (make-product (car ops)
                   (deriv (cadr ops) var))
     (make-product (deriv (car ops) var)
                   (cadr ops))))
  (define (exponent ops var)
    (make-product
     (make-product (cadr ops)
                   (make-exponentiation (car ops)
                                        (make-sum (cadr ops) -1)))
     (deriv (car ops) var)))
  (put '+ 'deriv sum)
  (put '* 'deriv product)
  (put '** 'deriv exponent))
If we update our deriv procedure as described and install the derivative routines we can, of course, run all the tests again:
> (deriv '(+ x 3) 'x)
1
> (deriv '(* x y) 'x)
'y
> (deriv '(* (* x y) (+ x 3)) 'x)
(mcons
 '+
 (mcons
  (mcons '* (mcons 'x (mcons 'y '())))
  (mcons (mcons '* (mcons 'y (mcons (mcons '+ (mcons 'x (mcons 3 '()))) '()))) '())))
> (deriv (make-exponentiation 'a 5) 'a)
(mcons '* (mcons 5 (mcons (mcons '** (mcons 'a (mcons 4 '()))) '())))
> (deriv (make-exponentiation 'a 'b) 'a)
(mcons
 '*
 (mcons
  'b
  (mcons (mcons '** (mcons 'a (mcons (mcons '+ (mcons 'b (mcons -1 '()))) '()))) '())))
> (deriv (make-exponentiation 'a (make-sum 'a 'b)) 'a)
(mcons
 '*
 (mcons
  (mcons '+ (mcons 'a (mcons 'b '())))
  (mcons
   (mcons
    '**
    (mcons
     'a
     (mcons (mcons '+ (mcons (mcons '+ (mcons 'a (mcons 'b '()))) (mcons -1 '()))) '())))
   '())))
As you can see, these results match the mutable pair results we saw previously.

No comments:

Post a Comment