TALLYING BY CATEGORY

Given a data set, one frequently wants to classify the data in some way and to tally the number of data values in each class. In Scheme, it is natural to represent a data set as a list of data values (which may themselves be data structures of some sort) and the classification method as a procedure that takes any datum as its argument and returns an exact integer in the range from 0 to n - 1, where n is the number of different classes of data.
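As an illustration of such a classification method, here is a minimal sketch that sorts test scores into ten classes. The name score->class and the choice of ten-point bins are assumptions made for this example only.

```scheme
;; Hypothetical classifier: maps an exact integer score from 0 to 100
;; to one of ten classes, numbered 0 through 9.
;; Scores 0-9 fall in class 0, 10-19 in class 1, and so on.  A perfect
;; score of 100 is folded into class 9 so that the result never
;; exceeds n - 1.
(define score->class
  (lambda (score)
    (min 9 (quotient score 10))))

(score->class 57)     ; => 5
(score->class 100)    ; => 9
```

The call to min matters: without it, a score of 100 would yield 10, which is outside the range from 0 to n - 1 when n is 10.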

The histogram procedure below takes as arguments a data set, a positive integer indicating the number of data classes, and a classification method, and returns a vector that indicates how many members of each class are in the data set. The algorithm is straightforward: traverse the data set, tallying each datum as it is visited.

(define histogram
  (lambda (data-set number-of-classes criterion)
    (let ((result (make-vector number-of-classes 0)))
      (for-each (lambda (datum)
                  (vector-bump! result (criterion datum)))
                data-set)
      result)))

The vector-bump! procedure called by histogram increments one of the elements of a vector by a given amount (or by 1, if no amount is specified). Here's its definition:

(define vector-bump!
  (lambda (vec index . opt)
    (let ((increment (if (null? opt) 1 (car opt))))
      (vector-set! vec index (+ (vector-ref vec index)
                                increment)))))
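
To see the two procedures working together, consider the following sketch. It repeats the definitions above so that it can be run on its own. The sample data (a short list of integers) and the classifier named parity are assumptions made for illustration.

```scheme
;; Definitions repeated from above so the example is self-contained.
(define vector-bump!
  (lambda (vec index . opt)
    (let ((increment (if (null? opt) 1 (car opt))))
      (vector-set! vec index (+ (vector-ref vec index)
                                increment)))))

(define histogram
  (lambda (data-set number-of-classes criterion)
    (let ((result (make-vector number-of-classes 0)))
      (for-each (lambda (datum)
                  (vector-bump! result (criterion datum)))
                data-set)
      result)))

;; vector-bump! with and without the optional increment.
(define counts (make-vector 3 0))
(vector-bump! counts 1)       ; counts is now #(0 1 0)
(vector-bump! counts 2 5)     ; counts is now #(0 1 5)

;; Hypothetical classifier: class 0 for even integers, class 1 for odd.
(define parity
  (lambda (n)
    (if (even? n) 0 1)))

(histogram '(3 1 4 1 5 9 2 6) 2 parity)   ; => #(3 5)
```

The sample list contains three even numbers (4, 2, 6) and five odd ones, so element 0 of the result is 3 and element 1 is 5.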

The following procedure prints a bar graph of the contents of a vector of the sort that the histogram procedure returns: one row per class, showing the class index, a bar of asterisks, and the count in parentheses. The internally defined procedure print-width determines the number of characters in the string representation of a given number; it is used to compute the width of the index column.

(define bar-graph
  (let ((print-width
         (lambda (num)
           (string-length (number->string num)))))

    (lambda (vec)
      (let* ((len (vector-length vec))
             (index-width (print-width (- len 1))))
        (do ((index 0 (+ index 1)))
            ((= index len))
          (let ((value (vector-ref vec index)))
            (display (right-justify (number->string index)
                                    index-width))
            (display " | ")
            (display (make-string value #\*))
            (display " (")
            (display value)
            (display ")")
            (newline)))))))
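
The width calculation can be tried in isolation. In the sketch below, print-width is lifted out of bar-graph to the top level purely so that it can be called directly; the vector lengths 10 and 11 are arbitrary sample values.

```scheme
;; print-width as defined inside bar-graph, moved to top level here
;; only so that it can be tried on its own.
(define print-width
  (lambda (num)
    (string-length (number->string num))))

;; The largest index of a vector of length len is len - 1, so the
;; index column needs (print-width (- len 1)) characters.
(print-width (- 10 1))    ; => 1, since indices 0 through 9 fit in one column
(print-width (- 11 1))    ; => 2, since index 10 needs two columns
```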

The right-justify procedure takes a string and a desired width as arguments, and returns a newly allocated string of the desired width, formed either by padding the given string with spaces on the left or by truncating the given string (discarding characters from its left end).

(define right-justify
  (lambda (str desired-width)
    (let ((len (string-length str)))
      (if (< len desired-width)
          (string-append (make-string (- desired-width len)
                                      #\space)
                         str)
          (substring str (- len desired-width) len)))))
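
The padding and truncating cases can be checked with a few sample calls. The sketch below repeats the definition so that it can be run on its own; the argument strings are arbitrary examples.

```scheme
;; Definition repeated from above so the example is self-contained.
(define right-justify
  (lambda (str desired-width)
    (let ((len (string-length str)))
      (if (< len desired-width)
          (string-append (make-string (- desired-width len)
                                      #\space)
                         str)
          (substring str (- len desired-width) len)))))

(right-justify "7" 3)       ; => "  7"   (padded on the left)
(right-justify "42" 2)      ; => "42"    (already the desired width)
(right-justify "12345" 3)   ; => "345"   (leftmost characters dropped)
```

With all of the definitions on this page loaded, the call (bar-graph (vector 2 0 3)) should print the following three rows. The second row has an empty bar, so two spaces separate the vertical line from the count.

    0 | ** (2)
    1 |  (0)
    2 | *** (3)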


This document is available on the World Wide Web as

http://www.math.grin.edu/~stone/events/scheme-workshop/histogram.html


created July 12, 1995
last revised June 24, 1996

John David Stone (stone@math.grin.edu)