The following eight-symbol alphabet with associated relative frequencies was designed to efficiently encode the lyrics of 1950s rock songs. (Note that the ``symbols'' of an ``alphabet'' need not be individual letters.)
A | 2 | NA | 16 |
BOOM | 1 | SHA | 3 |
GET | 2 | YIP | 9 |
JOB | 2 | WAH | 1 |
Use
generate-huffman-tree
(exercise 2.69) to generate a corresponding Huffman tree, and use encode (exercise 2.68) to encode the following message:Get a job
Sha na na na na na na na na
Get a job
Sha na na na na na na na na
Wah yip yip yip yip yip yip yip yip yip
Sha boom
How many bits are required for the encoding? What is the smallest number of bits that would be needed to encode this song if we used a fixed-length code for the eight-symbol alphabet?
Okay, first let's build our Huffman tree:
> (define fifties-song-tree (generate-huffman-tree '((A 2) (BOOM 1) (GET 2) (JOB 2) (NA 16) (SHA 3) (YIP 9) (WAH 1)))) > fifties-song-tree '((leaf NA 16) ((leaf YIP 9) (((leaf A 2) ((leaf WAH 1) (leaf BOOM 1) (WAH BOOM) 2) (A WAH BOOM) 4) ((leaf SHA 3) ((leaf JOB 2) (leaf GET 2) (JOB GET) 4) (SHA JOB GET) 7) (A WAH BOOM SHA JOB GET) 11) (YIP A WAH BOOM SHA JOB GET) 20) (NA YIP A WAH BOOM SHA JOB GET) 36)And now let's encode our song:
> (define encoded-song (encode '(GET A JOB SHA NA NA NA NA NA NA NA NA GET A JOB SHA NA NA NA NA NA NA NA NA WAH YIP YIP YIP YIP YIP YIP YIP YIP YIP SHA BOOM) fifties-song-tree)) > encoded-song '(1 1 1 1 1 1 1 0 0 1 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 1 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1 1 0 1 1 0 1 1) > (length encoded-song) 84So we need 84 bits to encode the song using our Huffman tree.
How about if we use a fixed-length code? Well, we have eight symbols to encode so, using a fixed-length code, we could encode the message using 3 bits per symbol (as 23=8). As there are a total of 36 symbols to encode, that means we could encode this using a fixed-length code using 36×3=108 bits.
So using Huffman encoding has saved us 24 bits - a 22% reduction in the size of the message.
No comments:
Post a Comment