2011-09-25

SICP Exercise 2.12: Tolerating Engineers

After debugging her program, Alyssa shows it to a potential user, who complains that her program solves the wrong problem. He wants a program that can deal with numbers represented as a center value and an additive tolerance; for example, he wants to work with intervals such as 3.5± 0.15 rather than [3.35, 3.65]. Alyssa returns to her desk and fixes this problem by supplying an alternate constructor and alternate selectors:
(define (make-center-width c w)
  (make-interval (- c w) (+ c w)))
(define (center i)
  (/ (+ (lower-bound i) (upper-bound i)) 2))
(define (width i)
  (/ (- (upper-bound i) (lower-bound i)) 2))
Unfortunately, most of Alyssa's users are engineers. Real engineering situations usually involve measurements with only a small uncertainty, measured as the ratio of the width of the interval to the midpoint of the interval. Engineers usually specify percentage tolerances on the parameters of devices, as in the resistor specifications given earlier.

Define a constructor make-center-percent that takes a center and a percentage tolerance and produces the desired interval. You must also define a selector percent that produces the percentage tolerance for a given interval. The center selector is the same as the one shown above.

The percentage tolerance is another way of specifying the width of an interval, so our constructor will share some commonality with make-center-width. It will take two parameters, the first of which is the center value of the interval. It will also call make-interval with two bounds calculated by respectively subtracting and adding the width of the interval from the center value. Where it will differ will be in the second operand and what we do with it. The second parameter will express the width of the interval as a percentage of the center value, rather than as an absolute value. We will then have to use this to calculate the width of the interval so that we can use that to calculate the lower and upper bounds of the interval. This is simply a case of multiplying the center value by the percentage value and then dividing by 100.

Here's our constructor:
(define (make-center-percent c p)
  (let ((w (/ (* c p) 100)))
    (make-interval (- c w) (+ c w))))
To define the selector percent we need to find the width of the interval and then determine what percentage of the center value the width is. I.e. reverse the calculation we used above to produce the width from the percentage tolerance. Given that we have the center and width selectors Alyssa defined above we can define our selector simply as:
(define (percent i)
  (/ (* (width i) 100) (center i)))
Alternatively, if we express this as an algebraic expression it's easy to see how we could reduce the operations in the calculation. If we express the lower bound of the interval as li and the upper bound as ui then the calculation of (percent i) is equivalent to:
  100 × (ui - li)/2
     (li + ui)/2
= 100 × (ui - li)/2
        (li + ui)/2
= 100 × ui - li
        li + ui
So we could also define percent as:
(define (percent i)
  (* 100
     (/ (- (upper-bound i) (lower-bound i))
        (+ (upper-bound i) (lower-bound i)))))
This removes the divide-by-two that occurs in the calculations of both the width and the center values, only to have the division of width by center values to cancel out the divide-by-twos.

Let's test it out to confirm it works:
> (define cp1 (make-center-percent 100 10))
> (define cp2 (make-center-percent 50 25))
> (center cp1)
100
> (center cp2)
50
> (percent cp1)
10
> (percent cp2)
25

SICP Exercise 2.11: Cryptic Multiplication

In passing, Ben also cryptically comments: "By testing the signs of the endpoints of the intervals, it is possible to break mul-interval into nine cases, only one of which requires more than two multiplications."' Rewrite this procedure using Ben's suggestion.

Okay, so here Ben is suggesting that we don't need to calculate all of {p1, p2, p3, p4} in order to calculate mul-interval. Instead by looking at the signs of the lower and upper bounds of the two intervals it is possible, in the majority of cases, to know which two of {p1, p2, p3, p4} to use as the lower and upper bounds and so only need to calculate those. However, before we can produce such a procedure we need to determine the mapping from signs of bounds of intervals to pairs.

Before we get into that though, in order to simplify the discussion here I'm going to introduce a shorthand for refering to lower and upper bounds of intervals: for any interval x I'm going to use lx to represent the lower bound of the interval and ux to represent the upper bound. I'll also note that the set of values {p1, p2, p3, p4} corresponds to the set of products {lxly, lxuy, uxly, uxuy}. I'm going to use the latter in this discussion as we will no longer calculate the former set up front in the final procedure.

Now, back to determining the mapping from signs of bounds of intervals to pairs... In order to do this we must first consider not just the signs of the bounds but also the relationship between their absolute values. We can easily show this to be important if we consider multiplying a pair of intervals where the bounds are all non-negative (non-negative intervals) and then compare this to multiplying a pair of intervals where the bounds are all negative (negative intervals):
  • In the former case we know that lx and ly are the two smallest values from the two intervals and ux and uy are the two largest values from the two intervals. As a result we can deduce that lxly will be the minimum of the four values in the set, while uxuy will be the maximum.
  • Looking at the latter case we can first observe that all four values in the set will be positive (as each value is the result of multiplying two negative numbers). However, looking at the absolute values of the bounds we can see that lx and ly have the two largest absolute values and ux and uy have the two smallest absolute values. As a result we can deduce that uxuy will be the minimum of the four values in the set, while lxly will be the maximum.
We can also use the absolute values to deduce what happens when one interval, let's say x, is a non-negative interval and the other interval, y, is negative. In this case all of the values in the resulting set will be negative as each value is the product of a (non-negative) bound of x and a (negative) bound of y. As all the values in the set are negative, the lower bound of the resulting interval will be the value with the largest absolute value, while the upper bound will be the value with the smallest absolute value. So to produce the lower bound we need to determine the bounds which have the largest absolute values and multiply those together. Similarly, to produce the upper bound we need to determine the bounds which have the smallest absolute values and multiply those together. As x is a non-negative interval, the bound with the largest absolute value is ux, while the bound with the smallest is lx. However, as y is a negative interval the converse is true: the bound with the largest absolute value is ly, while the bound with the smallest is uy. Therefore the lower bound for the interval produced by multiplying x and y will be uxly and the upper bound will be lxuy.

When we have a mixture of types of intervals things become slightly more complex, but there's an additional technique we can add to considering the signs and absolute values. Note that it is always possible to split up the set {lxly, lxuy, uxly, uxuy} into two subsets: one containing all the negative values and the other containing all the non-negative values. When there are a mixture of interval types (e.g. one is a negative interval, the other is a non-negative interval, or one is a non-negative interval and the other is a zero-spanning interval (where the lower bound is negative and the upper bound is non-negative)) we are guaranteed that both subsets will be non-empty. In such cases the lower bound will then be the value with the largest absolute value from the set of negative values, while the upper bound will be the value with the largest absolute value from the set of positive values.

For example, let's consider the case where the interval x spans zero and interval y is non-negative:
  • In this case we can see that any value in the set that is a multuple of lx will be negative, while any value in the set that is a multuple of ux will be positive. This gives us our two subsets:
    • {lxly, lxuy} contains the negative values.
    • {uxly, uxuy} contains the non-negative values.
  • To determine the lower bound we need to find the value with the largest absolute value from the set of negative values. The only difference between the two values is the y bound, so we just need to determine which of the two bounds of y has the largest absolute value. Now as y is a non-negative interval this will be its upper bound, uy. As a result, the lower bound for the interval produced by multiplying x and y will be lxuy.
  • To determine the upper bound we need to find the value with the largest absolute value from the set of positive values. Again the only difference between the two values is the y bound, so we just need to determine which of the two bounds of y has the largest absolute value, and we already know this to be uy. As a result, the upper bound for the interval produced by multiplying x and y will be uxuy.
We can continue this for all of the different combinations of negative, non-negative and zero-spanning intervals.

However, as Ben hinted, there's one case that we can't determine in this manner. It's the case when we have two zero-spanning intervals. In this case our two subsets are:
  • {lxuy, uxly} for the negative values.
  • {lxly, uxuy} for the non-negative values.
Unfortunately it's not possible to determine which member of each subset has the largest absolute value without examining the values. Why is this? Well, as both intervals are zero-spanning it's not possible to say anything about the relationship between the absolute values of their bounds. I.e. for a given zero-spanning interval i, we don't know whether it's the case that |li|≤|ui| or it's the case that |li|≥|ui|. As a result, when we multiply one bound from x with one bound from y and multiply the other bound from x with the other bound from y we won't be able to tell which will provide the larger or smaller value without actually comparing the values. So in this case we'll need to calculate all four values and then perform the appropriate comparisons to determine our bounds.

Okay, so let's work through each of the valid combinations of signs for the lower and upper bounds of x and y and determine how to calculate the lower and upper values for the result of multiplying x and y. I say "valid" as we need to remember that the lower bound is always less than or equal to the upper bound, so we'll never encounter an interval where the upper bound is negative, while the lower bound is non-negative. N.B. In the following table I'm using -ve to indicate a negative value and +ve to represent a non-negative value.

lx ux ly uy lxly lxuy uxly uxuy min max
+ve +ve +ve +ve +ve +ve +ve +ve lxly uxuy
+ve +ve -ve +ve -ve +ve -ve +ve uxly uxuy
+ve +ve -ve -ve -ve -ve -ve -ve uxly lxuy
-ve +ve +ve +ve -ve -ve +ve +ve lxuy uxuy
-ve +ve -ve +ve +ve -ve -ve +ve min(lxuy,uxly) max(lxly,uxuy)
-ve +ve -ve -ve +ve +ve -ve -ve uxly lxly
-ve -ve +ve +ve -ve -ve -ve -ve lxuy uxly
-ve -ve -ve +ve +ve -ve +ve -ve lxuy lxly
-ve -ve -ve -ve +ve +ve +ve +ve uxuy lxly

Now we now know how to calculate the lower and upper bounds, here's my procedure for doing so:
(define (mul-interval x y)
  (define not-negative? (lambda (x) (not (negative? x))))
  (let ((lx (lower-bound x))
        (ux (upper-bound x))
        (ly (lower-bound y))
        (uy (upper-bound y)))
    (cond ((and (not-negative? lx)
                (not-negative? ux)
                (not-negative? ly)
                (not-negative? uy))
             (make-interval (* lx ly) (* ux uy)))
          ((and (not-negative? lx)
                (not-negative? ux)
                (negative? ly)
                (not-negative? uy))
             (make-interval (* ux ly) (* ux uy)))
          ((and (not-negative? lx)
                (not-negative? ux)
                (negative? ly)
                (negative? uy))
             (make-interval (* ux ly) (* lx uy)))
          ((and (negative? lx)
                (not-negative? ux)
                (not-negative? ly)
                (not-negative? uy))
             (make-interval (* lx uy) (* ux uy)))
          ((and (negative? lx)
                (not-negative? ux)
                (negative? ly)
                (not-negative? uy))
             (make-interval (min (* lx uy) (* ux ly)) (max (* lx ly) (* ux uy))))
          ((and (negative? lx)
                (not-negative? ux)
                (negative? ly)
                (negative? uy))
             (make-interval (* ux ly) (* lx ly)))
          ((and (negative? lx)
                (negative? ux)
                (not-negative? ly)
                (not-negative? uy))
             (make-interval (* lx uy) (* ux ly)))
          ((and (negative? lx)
                (negative? ux)
                (negative? ly)
                (not-negative? uy))
             (make-interval (* lx uy) (* lx ly)))
          ((and (negative? lx)
                (negative? ux)
                (negative? ly)
                (negative? uy))
             (make-interval (* ux uy) (* lx ly)))
          (else (error "Invalid Combination!")))))
Finally, we can see this in action:
> (mul-interval (make-interval 1 2) (make-interval 2 3))
'(2 . 6)
> (mul-interval (make-interval 1 2) (make-interval -2 3))
'(-4 . 6)
> (mul-interval (make-interval 1 2) (make-interval -2 -1))
'(-4 . -1)
> (mul-interval (make-interval -1 2) (make-interval 2 3))
'(-3 . 6)
> (mul-interval (make-interval -1 2) (make-interval -2 3))
'(-4 . 6)
> (mul-interval (make-interval -1 2) (make-interval -2 -1))
'(-4 . 2)
> (mul-interval (make-interval -2 -1) (make-interval 2 3))
'(-6 . -2)
> (mul-interval (make-interval -2 -1) (make-interval -2 3))
'(-6 . 4)
> (mul-interval (make-interval -2 -1) (make-interval -2 -1))
'(1 . 4)

2011-09-24

SICP Exercise 2.10: Divide by Zero

Ben Bitdiddle, an expert systems programmer, looks over Alyssa's shoulder and comments that it is not clear what it means to divide by an interval that spans zero. Modify Alyssa's code to check for this condition and to signal an error if it occurs.

I must confess that, being the old AI graduate that I am, I read this and thought "expert systems? what on earth have they got to do with this exercise?" Nothing, of course - and this nicely illustrates why natural language understanding is difficult. Anyway...

How can we tell if an interval spans zero? Well, the lower bound of the number will be zero or less, while the upper bound of the number will be zero or more. Here's my modified procedure:
(define (div-interval x y)
  (if (and (<= (lower-bound y) 0) (>= (upper-bound y) 0))
      (error "Divisor spans zero")
      (mul-interval x
                    (make-interval (/ 1.0 (upper-bound y))
                                   (/ 1.0 (lower-bound y))))))
And here it is in action:
> (div-interval (make-interval 3 9) (make-interval 1 5))
'(0.6000000000000001 . 9.0)
> (div-interval (make-interval 3 9) (make-interval -4 -1))
'(-9.0 . -0.75)
> (div-interval (make-interval 3 9) (make-interval -4 0))
. . Divisor spans zero
> (div-interval (make-interval 3 9) (make-interval 0 5))
. . Divisor spans zero
> (div-interval (make-interval 3 9) (make-interval -4 5))
. . Divisor spans zero