2011-08-31

SICP Exercise 1.13: Fibonacci Numbers


Prove that Fib(n) is the closest integer to ϕn/√5, where ϕ = (1 + √5)/2. Hint: Let ψ = (1 - √5)/2. Use induction and the definition of the Fibonacci numbers (see section 1.2.2) to prove that Fib(n) = (ϕn - ψn)/√5.

The definition of Fib(n) given in section 1.2.2 is:
Fib(n) =0if n = 0
1if n = 1
Fib(n - 1) + Fib(n - 2)otherwise
Abelson & Sussman have given us a hint here. They suggest proving that Fib(n) = (ϕn - ψn)/√5. Let's start with proving that and then move on to figuring out why the hint is correct.

First, let's prove the base cases of n = 0 and n = 1. When n = 0:
0 - ψ0)/√5
= (1 - 1)/√5
= 0/√5
= 0
= Fib(0)
and when n = 1:
1 - ψ1)/√5
= (((1 + √5)/2)1 - ((1 - √5)/2)1)/√5
= ((1 + √5)/2 - (1 - √5)/2)/√5
= ((1 + √5 - 1 + √5)/2)/√5
= 2√5/2/√5
= √5/√5
= 1
= Fib(1)
To then prove by induction that their hint holds for all n we then first make the assumptions that the following two formulae hold:
  Fib(n - 1) = (ϕn-1 - ψn-1)/√5
  Fib(n)     = (ϕn - ψn)/√5
Next we try to prove that, given these assumptions, the formula also holds for n + 1. I.e. we need to prove that Fib(n + 1) = (ϕn+1 - ψn+1)/√5. Before we do this, a little side track... We already know that ϕ2 = ϕ + 1. We can also prove the corresponding equation of ψ2 = ψ + 1:
  ψ2
= ((1 - √5)/2)2
= (1 - √5)/2)(1 - √5)/2)
= (1 - √5)(1 - √5)/4
= (1 - 2√5 + 5)/4
= (6 - 2√5)/4
= (3 - √5)/2
= (1 - √5)/2 + 1
= ψ + 1
Okay, having done that, back to our proof for n + 1:
  Fib(n + 1)
= Fib(n) + Fib(n - 1)
= (ϕn - ψn)/√5 + (ϕn-1 - ψn-1)/√5
= (ϕn - ψn + ϕn-1 - ψn-1)/√5
= ((ϕn + ϕn-1) - (ψn + ψn-1))/√5
= ((ϕ + 1)ϕn-1 - (ψ + 1)ψn-1)/√5
= (ϕ2ϕn-1 - ψ2ψn-1)/√5
= (ϕn+1 - ψn+1)/√5

So we've proved by induction that Fib(n) = (ϕn - ψn)/√5. But why does that prove that Fib(n) is the closest integer to ϕn/√5?

Well if Fib(n) is the closest integer to ϕn/√5 then ϕn/√5 must be within 1/2 of Fib(n). In other words, we need to prove that:
|(Fib(n) - ϕn/√5)| ≤ 1/2
Recall that we've already shown that Fib(n) = (ϕn - ψn)/√5, so we can proceed as follows:
  |(Fib(n) - ϕn/√5)| ≤ 1/2
⇒ |((ϕn - ψn)/√5 - ϕn/√5)| ≤ 1/2
⇒ |(-ψn/√5)| ≤ 1/2
⇒ |ψn/√5| ≤ 1/2
⇒ |ψn| ≤ √5/2
So we just need to to prove that |ψn| ≤ √5/2:
  • We know that, for any value of x such that -1 ≤ x ≤ 1 and any value of n ≥ 1, it's the case that 0 ≤ |(xn)| ≤ 1
  • ψ = (1 - √5)/2 ≈ -0.6180, so 0 ≤ |ψn| ≤ 1
  • √5/2 ≈ 1.1180
So |ψn| ≤ √5/2, and so Fib(n) produces the closest integer to ϕn/√5

SICP Exercise 1.12: Pascal's Triangle

The following pattern of numbers is called Pascal's triangle.
    1
   1 1
  1 2 1
 1 3 3 1
1 4 6 4 1
The numbers at the edge of the triangle are all 1, and each number inside the triangle is the sum of the two numbers above it. Write a procedure that computes elements of Pascal's triangle by means of a recursive process.

As we've not been introduced to lists yet we can't write a procedure to generate rows iteratively, but we can write one that will return a single element from the triangle corresponding to a particular row and column. Just to be different, I'm going to use one-based indexing for referencing the rows and columns in the triangle. This means:
  • Row 1 is the row at the top of the triangle. This has a single column containing the value 1.
  • Column 1 is the left-most number in a row. Note that column 1 has the value 1 in all rows. This means that if the value in column 1 is requested we can simply return 1 without any further calculation.
  • Column n is the nth number (or right-most number) in row n. This also has the value 1 in all rows and similarly means that if the value in column n is requested we can simply return 1 without any further calculation.

What else can we say before we start? Well, the value for a particular row and column is only defined for row values of 1 or more, and for column values between 1 and the row number we're trying to calculate it for. The error operator hasn't been introduced at this point in SICP, so we'll need to do something else to indicate when we get bad input values. Let's return #f for now.

Okay, given all that, here's the procedure:
(define (pascal-tri row column)
  (cond ((or (< row 1) (< column 1) (> column row)) #f)
        ((or (= column 1) (= column row)) 1)
        (else (+ (pascal-tri (- row 1) (- column 1))
                 (pascal-tri (- row 1) column)))))

Let's test it! We know that row 5 has the column values {1, 4, 6, 4 1}, so let's try that:
> (pascal-tri 5 0)
#f
> (pascal-tri 5 1)
1
> (pascal-tri 5 2)
4
> (pascal-tri 5 3)
6
> (pascal-tri 5 4)
4
> (pascal-tri 5 5)
1
> (pascal-tri 5 6)
#f

SICP Exercise 1.11: Recursive and Iterative Functions

A function f is defined by the rule that f(n) = n if n < 3 and f(n) = f(n - 1) + 2f(n - 2) + 3f(n - 3) if n ≥ 3. Write a procedure that computes f by means of a recursive process. Write a procedure that computes f by means of an iterative process.

The description implies that we have two cases to deal with, depending upon whether n is less than 3 or not. For both the recursive and iterative procedures we can achieve this with an if conditional to separate the two cases, and have the consequent to return n. However it's in the alternative where the two procedures will differ.

For the recursive procedure, the alternative is achieved via a direct translation of the function for n ≥ 3, performing the appropriate recursive calls of f and combining the results as defined:
(define (f n)
  (if (< n 3)
      n
      (+ (f (- n 1))
         (* 2 (f (- n 2)))
         (* 3 (f (- n 3))))))

Let's look at the results of this:
> (f 0)
0
> (f 1)
1
> (f 2)
2
> (f 3)
4
> (f 4)
11
> (f 5)
25
> (f 6)
59

For the iterative approach we need to accumulate the result as we go. To do this we can observe that, for any value of n ≥ 3, we need to know the result of (f n) for the preceding 3 values of n. We know what these are for n = 3: {0, 1, 2}. So we can start with that as our base case. Then all we need to do is, for each increasing value of n check to see if we've reached our target yet. If so then the third value in this set is the result; if not then we can calculate the next value in the sequence by plugging these three values into the formula, generate the next set by dropping the first item, and appending the newly calculated value as the new third item and loop again.
(define (f n)
  (define (iter a b c count)
    (if (= count n)
        c
        (iter b c (+ c (* 2 b) (* 3 a)) (+ count 1))))
  (if (< n 3)
      n
      (iter 0 1 2 2)))
Notice how this implementation will always perform at least two calls to iter:
  • Evaluating (f 3) requires evaluating (iter 0 1 2 2).
  • The predicate for (iter 0 1 2 2) evaluates to false.
  • This causes the alternative to be evaluated, and so (iter 1 2 4 3) is evaluated.
  • This time the predicate evaluates to true and the result is returned.
Given that we only ever try to evaluate iter if n ≥ 3, we could remove the first of these iterations by starting the iteration with the appropriate values for n = 3. I.e. {1, 2, 4}:
(define (f n)
  (define (iter a b c count)
    (if (= count n)
        c
        (iter b c (+ c (* 2 b) (* 3 a)) (+ count 1))))
  (if (< n 3)
      n
      (iter 1 2 4 3)))
This gives the same output as before - I did check, honest! I'll not bother repeating it though.

2011-08-30

SICP Exercise 1.10: Ackermann's Function

The following procedure computes a mathematical function called Ackermann's function.
(define (A x y)
  (cond ((= y 0) 0)
        ((= x 0) (* 2 y))
        ((= y 1) 2)
        (else (A (- x 1)
                 (A x (- y 1))))))
What are the values of the following expressions?
(A 1 10)

(A 2 4)

(A 3 3)
Consider the following procedures, where A is the procedure defined above:
(define (f n) (A 0 n))

(define (g n) (A 1 n))

(define (h n) (A 2 n))

(define (k n) (* 5 n n))
Give concise mathematical definitions for the functions computed by the procedures f, g, and h for positive integer values of n. For example, (k n) computes 5n2.

Ah, more substitution! What joy!
  (A 1 10)
= (A (- 1 1) (A 1 (- 10 1)))
= (A 0 (A 1 9))
= (A 0 (A (- 1 1) (A 1 (- 9 1))))
= (A 0 (A 0 (A 1 8)))
= (A 0 (A 0 (A (- 1 1) (A 1 (- 8 1)))))
= (A 0 (A 0 (A 0 (A 1 7))))
= (A 0 (A 0 (A 0 (A (- 1 1) (A 1 (- 7 1))))))
= (A 0 (A 0 (A 0 (A 0 (A 1 6)))))
= (A 0 (A 0 (A 0 (A 0 (A (- 1 1) (A 1 (- 6 1)))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 1 5))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A (- 1 1) (A 1 (- 5 1))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 1 4)))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A (- 1 1) (A 1 (- 4 1)))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 1 3))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A (- 1 1) (A 1 (- 3 1))))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 1 2)))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A (- 1 1) (A 1 (- 2 1)))))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 1 1))))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 2)))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (* 2 2)))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 4))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (* 2 4))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 8)))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (* 2 8)))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 16))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (* 2 16))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 32)))))
= (A 0 (A 0 (A 0 (A 0 (* 2 32)))))
= (A 0 (A 0 (A 0 (A 0 64))))
= (A 0 (A 0 (A 0 (* 2 64))))
= (A 0 (A 0 (A 0 128)))
= (A 0 (A 0 (* 2 128)))
= (A 0 (A 0 256))
= (A 0 (A 0 256))
= (A 0 (* 2 256))
= (A 0 512)
= (* 2 512)
= 1024
Now let's remember that (A 1 10) evaluates to 1024. It'll come in handy in a minute!
  (A 2 4)
= (A (- 2 1) (A 2 (- 4 1))
= (A 1 (A 2 3))
= (A 1 (A (- 2 1) (A 2 (- 3 1))))
= (A 1 (A 1 (A 2 2)))
= (A 1 (A 1 (A (- 2 1) (A 2 (- 2 1)))))
= (A 1 (A 1 (A 1 (A 2 1))))
= (A 1 (A 1 (A 1 2)))
= (A 1 (A 1 (A (- 1 1) (A 1 (- 2 1)))))
= (A 1 (A 1 (A 0 (A 1 1))))
= (A 1 (A 1 (A 0 2)))
= (A 1 (A 1 (* 2 2)))
= (A 1 (A 1 4))
= (A 1 (A 1 4))
= (A 1 (A (- 1 1) (A 1 (- 4 1))))
= (A 1 (A 0 (A 1 3)))
= (A 1 (A 0 (A (- 1 1) (A 1 (- 3 1)))))
= (A 1 (A 0 (A 0 (A 1 2))))
= (A 1 (A 0 (A 0 (A (- 1 1) (A 1 (- 2 1))))))
= (A 1 (A 0 (A 0 (A 0 (A 1 1)))))
= (A 1 (A 0 (A 0 (A 0 2))))
= (A 1 (A 0 (A 0 (* 2 2))))
= (A 1 (A 0 (A 0 4)))
= (A 1 (A 0 (* 2 4)))
= (A 1 (A 0 8))
= (A 1 (* 2 8))
= (A 1 16)
= (A (- 1 1) (A 1 (- 16 1)))
= (A 0 (A 1 15))
= (A 0 (A (- 1 1) (A 1 (- 15 1))))
= (A 0 (A 0 (A 1 14)))
= (A 0 (A 0 (A (- 1 1) (A 1 (- 14 1)))))
= (A 0 (A 0 (A 0 (A 1 13))))
= (A 0 (A 0 (A 0 (A (- 1 1) (A 1 (- 13 1))))))
= (A 0 (A 0 (A 0 (A 0 (A 1 12)))))
= (A 0 (A 0 (A 0 (A 0 (A (- 1 1) (A 1 (- 12 1)))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 1 11))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A (- 1 1) (A 1 (- 11 1))))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 1 10)))))))
And substituting in the result from the previous evaluation gives:
= (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 1024))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 (* 2 1024))))))
= (A 0 (A 0 (A 0 (A 0 (A 0 2048)))))
= (A 0 (A 0 (A 0 (A 0 (* 2 2048)))))
= (A 0 (A 0 (A 0 (A 0 4096))))
= (A 0 (A 0 (A 0 (* 2 4096))))
= (A 0 (A 0 (A 0 8192)))
= (A 0 (A 0 (* 2 8192)))
= (A 0 (A 0 16384))
= (A 0 (* 2 16384))
= (A 0 32768)
= (* 2 32768)
= 65536
And again, it may be judicious to remember that (A 2 4) evaluates to 65536...
  (A 3 3)
= (A (- 3 1) (A 3 (- 3 1)))
= (A 2 (A 3 2))
= (A 2 (A (- 3 1) (A 3 (- 2 1))))
= (A 2 (A 2 (A 3 1)))
= (A 2 (A 2 2))
= (A 2 (A (- 2 1) (A 2 (- 2 1))))
= (A 2 (A 1 (A 2 1)))
= (A 2 (A 1 2))
= (A 2 (A (- 1 1) (A 1 (- 2 1))))
= (A 2 (A 0 (A 1 1)))
= (A 2 (A 0 2))
= (A 2 (* 2 2))
= (A 2 4)
= 65536
So what about the definitions of f, g and h?

How about if we try subsititution... We need to remember that we're to provide definitions for all positive integer values of n. I.e. n > 0. Here's f:
  (f n)
= (A 0 n)
= (cond ((= n 0) 0)
        ((= 0 0) (* 2 n))
        ((= n 1) 2)
        (else (A (- 0 1)
                 (A 0 (- n 1)))))
= (cond (#f 0)
        ((= 0 0) (* 2 n))
        ((= n 1) 2)
        (else (A (- 0 1)
                 (A 0 (- n 1)))))
= (cond (#f 0)
        (#t (* 2 n))
        ((= n 1) 2)
        (else (A (- 0 1)
                 (A 0 (- n 1)))))
= (* 2 n)
So (f n) computes 2n.

Now for g. First, let's evaluate a base case, when n = 1.
  (g 1)
= (A 1 1)
= (cond ((= 1 0) 0)
        ((= 1 0) (* 2 1))
        ((= 1 1) 2)
        (else (A (- 1 1)
                 (A 1 (- 1 1))))))
= (cond (#f 0)
        ((= 1 0) (* 2 1))
        ((= 1 1) 2)
        (else (A (- 1 1)
                 (A 1 (- 1 1))))))
= (cond (#f 0)
        (#f (* 2 1))
        ((= 1 1) 2)
        (else (A (- 1 1)
                 (A 1 (- 1 1))))))
= (cond (#f 0)
        (#f (* 2 1))
        (#t 2)
        (else (A (- 1 1)
                 (A 1 (- 1 1))))))
= 2
And now for any n > 1:
  (g n)
= (A 1 n)
= (cond ((= n 0) 0)
        ((= 1 0) (* 2 n))
        ((= n 1) 2)
        (else (A (- 1 1)
                 (A 1 (- n 1))))))
= (cond (#f 0)
        ((= 1 0) (* 2 n))
        ((= n 1) 2)
        (else (A (- 1 1)
                 (A 1 (- n 1))))))
= (cond (#f 0)
        (#f (* 2 n))
        ((= n 1) 2)
        (else (A (- 1 1)
                 (A 1 (- n 1))))))
= (cond (#f 0)
        (#f (* 2 n))
        (#f 2)
        (else (A (- 1 1)
                 (A 1 (- n 1))))))
= (A (- 1 1) (A 1 (- n 1)))
= (A 0 (A 1 (- n 1)))
= (* 2 (A 1 (- n 1)))
= (* 2 (g (- n 1)))
Let's stop here a minute... When n is 1, (g n) computes 2, and when n is greater than 1, (g n) computes 2 times (g (- n 1)). In order words, (g n) computes the result of multiplying together n 2s, or to put it more succinctly, it computes 2n.

Finally, let's try h. The base case is easy enough, as it's almost identical to that of g:
  (h 1)
= (A 2 1)
= (cond ((= 1 0) 0)
        ((= 2 0) (* 2 1))
        ((= 1 1) 2)
        (else (A (- 1 1)
                 (A 1 (- 2 1))))))
= (cond (#f 0)
        ((= 2 0) (* 2 1))
        ((= 1 1) 2)
        (else (A (- 1 1)
                 (A 1 (- 2 1))))))
= (cond (#f 0)
        (#f (* 2 1))
        ((= 1 1) 2)
        (else (A (- 1 1)
                 (A 1 (- 2 1))))))
= (cond (#f 0)
        (#f (* 2 1))
        (#t 2)
        (else (A (- 1 1)
                 (A 1 (- 2 1))))))
= 2
However, for for any n > 1 is a "bit more tricky":
  (h n)
= (A 2 n)
= (cond ((= n 0) 0)
        ((= 2 0) (* 2 n))
        ((= n 1) 2)
        (else (A (- n 1)
                 (A n (- 2 1)))))
= (cond (#f 0)
        ((= 2 0) (* 2 n))
        ((= n 1) 2)
        (else (A (- n 1)
                 (A n (- 2 1)))))
= (cond (#f 0)
        (#f (* 2 n))
        ((= n 1) 2)
        (else (A (- n 1)
                 (A n (- 2 1)))))
= (cond (#f 0)
        (#f (* 2 n))
        (#f 2)
        (else (A (- n 1)
                 (A n (- 2 1)))))
= (A (- n 1) (A n (- 2 1)))
= (A (- n 1) (A n 1))
... and so on. Let's not do that.

Instead, let's try some substitutions for small values of n. We know already that (h 1) is 2, so let's try with the subsequent values of n... And you'll forgive me, I hope, if skip the step-by-step evaluations of cond operators...
  (h 2)
= (A 2 2)
= (A (- 2 1) (A 2 (- 2 1)))
= (A 1 (A 2 1))
= (A 1 2)
= (g 2)
We know that (g n) computes 2n, so (g 2) (and therefore (h 2)) computes 22.
  (h 3)
= (A 2 3)
= (A (- 2 1) (A 2 (- 3 1)))
= (A 1 (A 2 2))
= (A 1 4))
= (g 4)
= 24
= 16
Note that we substitutute 4 for (A 2 2). This comes from the evaluation of (h 2) above. We'll apply similar substitutions below.
  (h 4)
= (A 2 4)
= (A (- 2 1) (A 2 (- 4 1)))
= (A 1 (A 2 3))
= (A 1 (g 4))
= (A 1 16)
= (g 16)
= 216
= 65536

  (h 5)
= (A 2 5)
= (A (- 2 1) (A 2 (- 5 1)))
= (A 1 (A 2 4))
= (A 1 65536)
= (g 65536)
= 265536
= a very big number
Okay, I think that's enough.

(h n) calculates 2(h (- n 1)), with (h 1) evaluating to 2. So (h 2) evaluates to 22, (h 3) evaluates to 222, (h 4) evaluates to 2222 and so on. This type of iterative exponentation is known as tetration, which thankfully has a nice concise way of being expressed: (h n) computes n2.

SICP Exercise 1.9: Evaluation by Substitution

Each of the following two procedures defines a method for adding two positive integers in terms of the procedures inc, which increments its argument by 1, and dec, which decrements its argument by 1.
(define (+ a b)
  (if (= a 0)
      b
      (inc (+ (dec a) b))))

(define (+ a b)
  (if (= a 0)
      b
      (+ (dec a) (inc b))))
Using the substitution model, illustrate the process generated by each procedure in evaluating (+ 4 5). Are these processes iterative or recursive?

The joy of evaluating by hand...

Here's what happens when we evaluate (+ 4 5) using the first procedure:
  (+ 4 5)
= (inc (+ (dec 4) 5))
= (inc (+ 3 5))
= (inc (inc (+ (dec 3) 5)))
= (inc (inc (+ 2 5)))
= (inc (inc (inc (+ (dec 2) 5))))
= (inc (inc (inc (+ 1 5))))
= (inc (inc (inc (inc (+ (dec 1) 5)))))
= (inc (inc (inc (inc (+ 0 5)))))
= (inc (inc (inc (inc 5))))
= (inc (inc (inc 6)))
= (inc (inc 7))
= (inc 8)
= 9
As can be seen by the chain of deferred inc operations that build up during the evaluation of this procedure, it's recursive.

And here's what happens when we evaluate (+ 4 5) using the second procedure:
  (+ 4 5)
= (+ (dec 4) (inc 5))
= (+ 3 6)
= (+ (dec 3) (inc 6))
= (+ 2 7)
= (+ (dec 2) (inc 7))
= (+ 1 8)
= (+ (dec 1) (inc 8))
= (+ 0 9)
= 9
Here we have no chain of deferred operations building up - each dec and inc is evaluated immediately. As a result, the second procedure is iterative.

SICP Exercise 1.8: Cube Roots

Newton's method for cube roots is based on the fact that if y is an approximation to the cube root of x, then a better approximation is given by the value
Use this formula to implement a cube-root procedure analogous to the square-root procedure.

The sqrt procedure from last exercise is pretty close to what we require here. We just need a different implementation of improve that implements the above formula and a modified version of good-enough? that uses the cube-root version of improve to test the guesses.

I'm assuming the same definition of square from exercise 1.3. Given that, here's what I came up with:
(define (good-enough-cbrt? guess x)
  (< (abs (- guess (improve-cbrt guess x))) (/ guess 1000000)))

(define (improve-cbrt guess x)
  (/ (+ (/ x (square guess)) (* 2 guess)) 3))

(define (cbrt-iter guess x)
  (if (good-enough-cbrt? guess x)
      guess
      (cbrt-iter (improve-cbrt guess x)
                 x)))

(define (cbrt x) (cbrt-iter 1.0 x))
And here's it in use:
> (cbrt (* 3 3 3))
3.0000005410641766
> (cbrt 0.5)
0.7937005260076354
> (* (cbrt 0.5) (cbrt 0.5) (cbrt 0.5))
0.5000000000444796

SICP Exercise 1.7: Not Quite Good-Enough Square Roots

The good-enough? test used in computing square roots will not be very effective for finding the square roots of very small numbers. Also, in real computers, arithmetic operations are almost always performed with limited precision. This makes our test inadequate for very large numbers. Explain these statements, with examples showing how the test fails for small and large numbers. An alternative strategy for implementing good-enough? is to watch how guess changes from one iteration to the next and to stop when the change is a very small fraction of the guess. Design a square-root procedure that uses this kind of end test. Does this work better for small and large numbers?

Here we're using the following procedure definitions from section 1.1.7:
(define (average x y)
  (/ (+ x y) 2))

(define (improve guess x)
  (average guess (/ x guess)))

(define (good-enough? guess x)
  (< (abs (- (square guess) x)) 0.001))

(define (sqrt-iter guess x)
  (if (good-enough? guess x)
      guess
      (sqrt-iter (improve guess x)
                 x)))

(define (sqrt x)
  (sqrt-iter 1.0 x))

"Very small" numbers (i.e. numbers <0.001), have correspondingly "very small" square roots. As a result, good-enough? is not precise enough to allow sqrt-iter to loop enough times to get to an accurate solution for such numbers. As x tends towards 0, the guess at which good-enough? will return #t tends towards 0.03125, the square root of 0.0009765625 (which is less than 0.001). For example, using DrRacket:
> (sqrt 0.0000000000000000004)
0.03125
It should be noted, of course, that assuming that the usual fixed-width mantissa and exponent method of representing floating point numbers is used by the interpreter, it is not always possible for the interpreter to perform calculations with two numbers that differ by many orders of magnitude and still retain precision. As the initial guess is set at 1.0, when the x passed is small enough, as in the example above, this loss of precision will result in the initial invocation of improve evaluating to 0.5, and thereafter it will follow the path to 0.03125.

Further examples of how the approximation tends towards 0.03125 follow:
> (sqrt 0.001)
0.04124542607499115
> (sqrt 0.0001)
0.03230844833048122
> (sqrt 0.00001)
0.03135649010771716
> (sqrt 0.000001)
0.031260655525445276
> (sqrt 0.0000001)
0.03125106561775382
> (sqrt 0.00000001)
0.03125010656242753
> (sqrt 0.000000001)
0.03125001065624928
> (sqrt 0.0000000001)
0.03125000106562499

The inability to accurately perform calculations with numbers that are many orders of magnitude different also affects the calculation of sqrt with very large numbers. However, in this case the problem is that it is not possible for interpreter to represent a number that is within 0.001 of the original x, unless it is x itself. As a result, unless the evaluation of improve results in the exact square root of x and the subsequent calculation of (square guess) in good-enough? does manage to accurately produce x, an infinite loop will occur.

For example, using DrRacket, (sqrt 11730317098663263) results in an infinite loop.

To resolve this issue we could redefine good-enough? as follows so that it considers a guess good enough if refining the guess via improve changes the guess by less than a millionth of the guess:
(define (good-enough? guess x)
  (< (abs (- guess (improve guess x))) (/ guess 1000000)))
Using this redefined good-enough? gives, for example, the following results:
> (sqrt 11730317098663263)
22373979042.98612
> (sqrt 0.0000000001)
9.999999999999999e-006

SICP Exercise 1.6: A New If

Alyssa P. Hacker doesn't see why if needs to be provided as a special form. "Why can't I just define it as an ordinary procedure in terms of cond?" she asks. Alyssa's friend Eva Lu Ator claims this can indeed be done, and she defines a new version of if:
(define (new-if predicate then-clause else-clause)
  (cond (predicate then-clause)
        (else else-clause)))
Eva demonstrates the program for Alyssa:
(new-if (= 2 3) 0 5)
5
(new-if (= 1 1) 0 5)
0
Delighted, Alyssa uses new-if to rewrite the square-root program:
(define (sqrt-iter guess x)
  (new-if (good-enough? guess x)
          guess
          (sqrt-iter (improve guess x)
                     x)))
What happens when Alyssa attempts to use this to compute square roots? Explain.

In order to evaluate sqrt-iter the interpreter must evaluate new-if. Now as Scheme uses applicative-order evaluation (for non-special forms), prior to evaluating new-if the interpreter will attempt to evaluate all three of the predicate, then-clause and else-clause operands. Note that this means that an attempt will be made to evaluate both the then-clause and else-clause operands, regardless of what the predicate evaluates to.

When the interpreter tries to evaluate the else-clause an infinite recursion occurs. This happens as evaluating the else-clause operand requires evaluation of a further sqrt-iterator expression which, of course, requires a further evaluation the else-clause operand, and so on.

SICP Exercise 1.5: Applicative-Order and Normal-Order Evaluation

Ben Bitdiddle has invented a test to determine whether the interpreter he is faced with is using applicative-order evaluation or normal-order evaluation. He defines the following two procedures:
(define (p) (p))
(define (test x y)
  (if (= x 0)
      0
      y))
Then he evaluates the expression
(test 0 (p))
What behavior will Ben observe with an interpreter that uses applicative-order evaluation? What behavior will he observe with an interpreter that uses normal-order evaluation? Explain your answer. (Assume that the evaluation rule for the special form if is the same whether the interpreter is using normal or applicative order: The predicate expression is evaluated first, and the result determines whether to evaluate the consequent or the alternative expression.)

In the case of applicative-order evaluation, the interpreter attempts to evaluate the operands before evaluating the operator. When it tries to evaluate the second operand of the expression (test 0 (p)) (i.e. (p)) prior to applying the operator test it enters an infinite evaluation loop and so either does not terminate or runs out of resources (depending upon how the interpreter is implemented). The infinite evaluation loop occurs when evaluating p because the value of p is defined to be the value of evaluating p, so in order to evaluate p the interpreter must first evaluate p!

In the case of normal-order evaluation, the interpreter does not try to evaluate any operands until they are required in order to evaluate the value of the operator. So under normal-order evaluation, neither of the operands to test are evaluated prior to evaluating test. Instead, the interpreter starts by evaluating the body of test, which leads it to evaluate an if conditional expression. In order to do this it must first evaluate the predicate, which causes the interpreter to evaluate the first operand to test. This evaluates to 0 and so the predicate evaluates to #t (as 0 = 0). As the predicate is true, the consequent expression is then evaluated and so the overall expression yields 0. No attempt is made to evaluate the alternative expression, (p), and so no infinite evaluation loop results.

SICP Exercise 1.4: Compound Expressions as Operators

Observe that our model of evaluation allows for combinations whose operators are compound expressions. Use this observation to describe the behavior of the following procedure:
(define (a-plus-abs-b a b)
  ((if (> b 0) + -) a b))
The procedure a-plus-abs-b determines whether or not b is negative and uses this to determine whether to respectively subtract or add the value of b from a.

As subtracting a negative value is equivalent to adding the absolute value of that value, and adding a positive value is equivalent to adding the absolute value of that value, the behaviour of the procedure is equivalent to adding the absolute value of b to a - the behaviour documented by the procedure name.

SICP Exercise 1.3: Summing Squares

Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers.

From section 1.1.4 of SICP we already have:
(define (square x) (* x x))
(define (sum-of-squares x y)
  (+ (square x) (square y)))
So all we need to do is determine which of three numbers are the largest two and then evaluate sum-of-squares for those. An alternate way of finding the largest two of three numbers is to find the smallest number - and then the other two numbers are the larger ones. That's how I chose to do it:
(define (sum-of-squares-of-top-two a b c)
  (cond ((and (< a b) (< a c)) (sum-of-squares b c))
        ((and (< b a) (< b c)) (sum-of-squares a c))
        (else (sum-of-squares a b))))
And here it is in action:
> (sum-of-squares-of-top-two 2 3 4)
25
> (sum-of-squares-of-top-two 5 4 3)
41
> (sum-of-squares-of-top-two 6 4 2)
52

SICP Exercise 1.2: From Infix to Prefix

Translate the following expression into prefix form

Oh the joys of producing equations in MS Word... Anyway, I translate this as:
(/ (+ 5
      4
      (- 2
         (- 3
            (+ 6
               (/ 4 5)))))
   (* 3
      (- 6 2)
      (- 2 7)))
The result of interpreting this is interesting:
Rather than converting the integer values to decimal fractions (or, more likely, floating point numbers), Scheme represents the result as a rational number.

SICP Exercise 1.1: Interpreter Evaluation

The first exercise is simply a case of evaluating some expressions in your chosen Scheme interpreter and recording the output.  Nothing overly complex here, apart from chosing your interpreter.  After hunting around for a while I settled on the DrRacket environment, which includes support for both the Racket dialect of Scheme and the R6RS scheme standard.  DrRacket provides a development environment incorporating an editor, interpreter, debugger and other bits and bobs.  The editor includes all the usual editory features you'd expect, but it's worth noting that it provides syntax highlighting, auto-indentation, and bracket matching.  The last of these is pretty crucial when writing any LISP-dialect code, and the others help a lot too.

For various reasons I've also tinkered with the Kawa interpreter.  However, I found it to have some "interesting idiosyncrasies", such as scoping issues and buggy interpreter interactions, so I try to stick to DrRacket now.

Anyway, without further ado, exercise 1.1.

Below is a sequence of expressions.  What is the result printed by the interpreter in response to each expression?  Assume that the sequence is to be evaluated in the order in which it is presented.

> 10
10
Expressions consisting of only digits are evaluated to the decimal value they represent.

> (+ 5 3 4)
12
Scheme uses prefix notation (also known as Polish notation) for its expressions.  Unlike the infix notation we know and love from our maths lessons at school, in prefix notation the operator comes first within the parenthesised expression with the operands following after, so this expression is equivalent to 5 + 3 + 4 in infix notation.  If you've never prefix notation before then it can take a little while to get used to.

> (- 9 1)
8
This is equivalent to 9 - 1 in infix notation. It's worth noting that you can apply more than two operands to the - operator. Scheme's interpretation of such expressions is equivalent to taking the first operand as the initial value and then subtracting each and every subsequent operand in turn from the value. I.e. (- 10 5 3 1) is equivalent to 10 - 5 - 3 - 1 in infix notation.

> (/ 6 2)
3
This is equivalent to 6 / 2 in infix notation. Note that division has an equivalent interpretation to subtraction when there are more than two operands. I.e. (/ 15 5 3) is equivalent to (15 / 5) / 3 in infix notation.

> (+ (* 2 4) (- 4 6))
6
This is equivalent to (2 * 4) + (4 - 6) in infix notation.

> (define a 3)
As noted in the text, the result printed by the interpreter in response to evaluating the define operator depends greatly on the interpreter in use. DrRacket doesn't print anything. However, we've now associated the value 3 with the variable a...

> (define b (+ a 1))
...and the value 4 with the variable b. Obviously the order of evaluation is critial here. As the definition of b relies upon the variable a, the expression which defines a must preceed the expression defining b!

> (+ a b (* a b))
19
This is equivalent to a + b + (a * b) in infix notation, and relies upon the preceeding definitions of a (= 3) and b (= 4).

> (= a b)
#f
Boolean values are represented in Scheme by #t and #f for true and false respectively. The book notes a couple of interesting things here:
  • You can also use true and false (although evaluating these returns #t and #f respectively).
  • When the interpreter evaluates a predicate it evaluates #f as a false value and any other value as true. Which means that #t and true aren't strictly necessary.

> (if (and (> b a) (< b (* a b)))
      b
      a)
4
Our first conditional expression. The interpreter first evaluates the predicate (the first operand). If it evaluates to true* then it returns the value produced by evaluating the consequent (the second operand). If, on the other hand, the predicate evaluates to false then it returns the value produced by evaluating the alternative (the third operand). In this case the predicate evaluates to true (as, using substitution, (> b a) = (> 4 3) = #t), so it returns the value produced by evaluating the consequent.

*Of course, when we say "true" we mean any value other than #f.

> (cond ((= a 4) 6)
        ((= b 4) (+ 6 7 a))
        (else 25))
16
The cond operand is a multi-branch conditional expression. Each operand in a correctly constructed cond expression consists of a pair of sub-expressions. The first being a predicate and the second being an arbitrary expression. The interpreter's evaluation of a cond expression is equivalent to starting at the first operand and then performing the following set of steps:
  1. Evaluate the predicate.
  2. If the predicate evaluates to true then return the value produced by evaluating the corresponding arbitrary expression. Do not evaluate any further operands in the cond expression.
  3. If the predicate evaluates to false then move onto the next operand and repeat the procedure.
What happens if none of the predicates evaluate to true? In the case of DrRacket it returns an empty (void) value.

Anyway, in this exercise, the predicate for the first operand evaluates to false (a equals 3, not 4), so it falls through to the second operand where the predicate evaluates to true (b does equal 4) and so the the second operand's arbitrary expression, (+ 6 7 a), is evaluated and its result is returned as the value of the expression.

> (+ 2 (if (> b a) b a))
6
In this expression, the if conditional is embedded within an outer expression so in order to interpret the latter the interpreter must first interpret the former. As b is greater than a the predicate evaluates to true and so the consequent of the if operator is evaluated and returned as its value (= b = 4), which then allows the outer expression to be evaluated as (+ 2 4).

> (* (cond ((> a b) a)
           ((< a b) b)
           (else -1))
     (+ a 1))
16
The final expression to be evaluated again embeds a conditional expression (a cond this time). The predicate for the second operand is the first one that evaluates to true, so the whole cond expression evaluates to 4. However, before the interpreter can evaluate the * expression it must also evaluate its second operand as well. (+ a 1) = (+ 3 1) = 4, so the * expression can be evaluated as (* 4 4).

Phew! That took a while to write up. And I know that quite a few of the ones between here and where I've got to so far have a lot of text in them too. It may take me a while to catch-up...

A Day of SICP

So I happen to have the day off today. It's almost like I planned it, what with the first post last night. Anyway, I'm going to have a shot at posting up as many of my answers to the exercises that I've done from SICP so far. That's only 76 exercises - how hard can it be?...

2011-08-29

SICP

It's always worthwhile brushing up on the fundamentals once in a while.  To this end some of my colleagues and I have been working through Structure and Interpretation of Computer Programs, a.k.a. SICP, since it's oft noted as a classic text.  I'm going to do my best to join the raft of existing blogs out there that document people's progress through this.

We've already got through chapter 1, Building Abstractions with Procedures, and are about half way through chapter 2, Building Abstractions with Data.  In case you're not familiar with the text it's only got 5 chapters, despite weighing in at just over 600 pages, so I've quite a bit of catching-up to do.

And then for my next trick, I'll finally get around to restarting, and properly writing up, TAOCP.