2.1 Introduction to Data Abstraction

In this chapter we are going to look at more complex data. All the procedures in Chapter 1 operate on simple numerical data. The focus in Chapter 1 was on building abstractions by combining procedures to form compound procedures, in this chapter we will explore the means for building abstractions by combining data objects to form compound data.

2.1.1 Example: Arithmetic Operations for Rational Numbers

The basic idea of data abstraction is to structure the programs that are to use compound data objects so that they operate on “abstract data”. That is, our programs should use data in such a way as to make no assumptions about the data that are not strictily necessary for performing the task at hand. At the same time, a “concrete” data representation is defined independent of the program that uses the data. The interface between these two part of our system will be a set of procedures, called selectors and constructors, that implement the abstract data in terms of the concrete representation.

Pairs

Our language provides a compound structure called a pair, which can be constructed with the primitive procedure cons. This procedure takes two arguments and returnas a compound data object that contains the two arguments as parts. Given a pair we can extract the parts using the primitive procedures car and cdr (pronounced “could-er”). Thus we can use cons, car and cdr as follows:

(define x (cons 1 2))

(car x)
; 1

(cdr x)
; 2

Moreover, cons can be used to form pairs whose elements are pairs, and so on:

(define x (cons 1 2))

(define y (cons 3 4))

(define z (cons x y))

(car (car z))  ; or equivalently (caar z)
; 1

(car (cdr z))  ; or equivalently (cadr z)
; 3

Data objects constructed from pairs are called list-structured data.

Aside: two primitive procedures in Scheme: display is used to print data and newline starts a new line for printing. Neither of these procedures returns a useful value.

So using the ideas above we can represent rational numbers in the following way:

(define (make-rat n d)
    (let ((g (gcd n d)))
         (cons (/ n g) (/ d g))))

(define (numer x) (car x))

(define (denom x) (cdr x))

(define (print-rat x)
    (newline)
    (display (numer x))
    (display "/")
    (display (denom x)))

And we can define the arithmetic operations on the rationals with the following procedures:

(define (add-rat x y)
    (make-rat (+ (* (numer x) (denom y))
                 (* (numer y) (denom x)))
              (* (denom x) (denom y))))

(define (sub-rat x y)
    (make-rat (- (* (numer x) (denom y))
                 (* (numer y) (denom x)))
              (* (denom x) (denom y))))

(define (mul-rat x y)
    (make-rat (* (numer x) (numer y))
              (* (denom x) (denom y))))

(define (div-rat x y)
    (make-rat (* (numer x) (denom y))
              (* (denom x) (numer y))))

(define (equal-rat? x y)
    (= (* (numer x) (denom y))
       (* (numer y) (denom x))))

Demonstrating a couple of these procedures:

(define one-third (make-rat 1 3))

(print-rat (add-rat one-third one-third))
; 2/3

2.1.2 Abstraction Barriers

We defined the rational-number operations in terms of a constructor make-rat and selectors numer and denom. In general, the underlying idea of data abstraction is to identify for each type of data object a basic set of operations in terms of which all manipulations of data objects of that type will be expressed, and then to use only those operations in manipulating the data.

2.1.3 What Is Meant by Data?

In general we can think of data as defined by some collection of selectors and constructors, together with specified conditions that these procedures must fulfill in order to be a valid representation.

For example, in the rational-number system, make-rat, numer, and denom must satisfy the condition that if x is (make-rat n d), then (/ (numer x) (denom x)) = (/ n d).

This point of view can serve to define not only “high-level” data objects, such as rational numbers, but lower level objects as well. Consider the notion of a pair, which we used in order to define our rational numbers. We never actually said what a pair was, only that the language supplied procedures cons, cars, and cdr for operating on pairs. But the only thing we need to know about these three operations is that if we glue two objects together using cons we can retrieve the objects using car and cdr. That is, the operations satisfy the condition that, for any objects x and y, if z is (cons x y) then (car z) is x and (cdr z) is y. Indeed, we mention that these three procedures are included as primitives in our language. However, any triple of procedures that satisfies the above condition can be used as the basis for implementing pairs. This point is illustrated strinkingly by the fact that we could implement cons, cars and cdr without using any data structures at all but only using procedures. Here are the definitions:

(define (cons x y)
    (define (dispatch m)
        (cond ((= m 0) x)
              ((= m 1) y)
              (else (error "Argument not 0 or 1 - CONS"))))
     dispatch)
(define (car z) (z 0))
(define (cdr z) (z 1))

This use of procedures corresponds to nothing like our intuitive notion of what data should be. Nevertheless, all we need to do to show that this is a vaild way to represent pairs is to verify that theses procedures satisfy the condition given above.

The subtle point to notice is that the value returned by (cons x y) is a procedure – namely that internally defined procedure dispatch.