Workshop on Scheme

An overview of Scheme

Program structure

A Scheme program is a sequence of zero or more definitions and commands. The definitions and commands can be interleaved in any order, provided that no variable is evaluated before it has been defined. In batch mode, the usual program structure is a sequence of definitions followed by a single command, typically a call to a defined procedure. In interactive mode, it is usual for the programmer to give commands after each new procedure or group of procedures has been submitted, in order to test and debug the procedures.
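The batch-mode layout described above might look like the following minimal sketch. The names greeting and show-greeting are invented for illustration.

```scheme
;;; Definitions come first ...
(define greeting "Hello, world!")

(define (show-greeting)
  (display greeting)
  (newline))

;;; ... followed by a single command, here a call to a defined procedure.
(show-greeting)
```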

Scheme has lexical scope and block structure. A procedure definition can contain other, local, procedure definitions, which can contain still other definitions, and so on without limit. Moreover, variables can acquire local bindings without the explicit definition of a procedure, bindings that are accessible only inside a single expression. The scopes of such bindings can be similarly nested without limit.
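As a small illustration of both kinds of local binding, the sketch below nests a procedure definition inside another and then uses let to bind a variable for a single expression. The procedure hypotenuse and its local names are hypothetical.

```scheme
(define (hypotenuse a b)
  ;; square is a local procedure, visible only inside hypotenuse.
  (define (square x) (* x x))
  ;; let binds sum-of-squares only within the let expression itself.
  (let ((sum-of-squares (+ (square a) (square b))))
    (sqrt sum-of-squares)))

(hypotenuse 3 4)
```

Outside hypotenuse, neither square nor sum-of-squares is bound by this code.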

A comment begins with one or more semicolons and ends at the end of the line. All semicolons after the first are rhetorical; accumulations of two, three, and four semicolons are traditionally used to signal header comments for successively larger blocks of code. It is traditional to write the comment for a definition above the definition itself. ``Wing comments'' (beginning in the middle of a line, following code on the same line) are traditionally used to signal subtle or unexpected coding choices.
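One way these conventions could be applied is sketched below. The procedure circle-area is made up for the example, and the constant it uses is only a rough approximation of pi.

```scheme
;;;; Geometry utilities (four semicolons: header for the largest block)

;;; Circle procedures (three semicolons: header for a group of definitions)

;; circle-area computes the area of a circle of the given radius.
(define (circle-area radius)
  (* 3.14159 radius radius))   ; wing comment: pi is only approximated here
```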

Definitions

Superficially, there are two kinds of definitions: Variable definitions, which allocate and initialize storage for global variables, and procedure definitions, which extend Scheme's repertoire of predefined procedures. However, procedure names in Scheme are merely variables that happen to have procedures as values, so in fact all definitions have the same underlying semantics.
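The sketch below shows a variable definition, a procedure definition in the customary shorthand, and an equivalent definition written so that the shared semantics are visible. The names limit, double, and twice are hypothetical.

```scheme
;; A variable definition: the name limit is bound to a number.
(define limit 100)

;; A procedure definition in the usual shorthand.
(define (double n) (* 2 n))

;; The same kind of definition written out in full: the name twice is
;; bound to a procedure value created by a lambda expression.
(define twice (lambda (n) (* 2 n)))
```

Both double and twice are ordinary variables whose values happen to be procedures, so (double limit) and (twice limit) yield the same result.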

Commands

Syntactically, every command in Scheme is an expression (the command is: evaluate this expression!). Every Scheme expression has a value. Some, notably assignment expressions and calls to procedures that perform input and output, also have side effects, and the programmer may issue a command either in order to obtain the value of the expression or to accomplish the side effect.
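To make the distinction concrete, here is a brief sketch with one command issued for its value and one issued for its side effect. The variable counter is invented for the example.

```scheme
(define counter 0)

;; Issued for its value: the sum is computed, but counter is unchanged.
(+ counter 1)

;; Issued for its side effect: the assignment changes counter. The value
;; of the set! expression itself should not be relied upon.
(set! counter (+ counter 1))
```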

The predefined Scheme procedures that have side effects return values that are deliberately unspecified by the Scheme standard. Implementers frequently arrange for such procedures to return values not belonging to any of the usual Scheme data types, so that programmers will not be able to write non-portable Scheme code that incorrectly relies on one implementation's choice of return values. Authors of procedure libraries should, as a matter of style, follow the same practice.

Data types

Scheme recognizes values of the following nine data types, which are guaranteed to be mutually exclusive: Booleans, characters, pairs, the null object, symbols, numbers, strings, vectors, and procedures. Boolean and character values are exactly like their counterparts in Pascal.

Pairs are two-field records in which the value stored in each field can (independently) be of any type. Since a pair can be stored in either field of another pair, one can get the effect of linked structures without explicit pointers. Of course, pointer wizardry must be used behind the scenes in the implementation.

Storage allocation and deallocation are, however, handled entirely by the run-time system, more or less invisibly. The Scheme programmer's perspective is that storage is allocated automatically and never deallocated. (According to the standard, the run-time system is supposed to deallocate storage only if it can prove that it is impossible for the program ever to access that storage again.)

The null object is a one-of-a-kind value used mainly as a sentinel or a conventional failure signal. It is somewhat like Pascal's nil value.
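As a rough sketch of how pairs and the null object work together, the code below builds one isolated pair and one linked chain of pairs that uses the null object as its sentinel. The names point, chain, and count-links are hypothetical.

```scheme
;; A single pair whose two fields hold values of different types.
(define point (cons 3 #t))

;; A linked structure: each pair holds a number and the next pair, and
;; the null object, written '(), marks the end of the chain.
(define chain (cons 1 (cons 2 (cons 3 '()))))

;; Walk the chain, counting pairs until the sentinel is reached.
(define (count-links structure)
  (if (null? structure)
      0
      (+ 1 (count-links (cdr structure)))))

(count-links chain)   ; → 3
```

No pointer appears anywhere in the code; car and cdr select the first and second fields of a pair.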

Symbols are data values of completely arbitrary significance, indistinguishable except for their names. They are somewhat like values of enumerated types in Pascal, except that they need not be declared before use and are not ordered.
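A short sketch of symbols used in the manner of an enumerated type follows. The procedure next-signal and the particular symbols are invented; note that none of the symbols is declared anywhere before use.

```scheme
(define signal 'green)

;; Symbols are compared by identity with eq?; there is no ordering.
(define (next-signal s)
  (cond ((eq? s 'green) 'yellow)
        ((eq? s 'yellow) 'red)
        (else 'green)))

(next-signal signal)   ; → yellow
```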

Numbers are subclassified into four subtypes, each embedded in the next: integers, rational numbers, real numbers, and complex numbers. An implementation of Scheme must provide integers, but the other subtypes are optional. (The Scheme standard also allows implementers to provide other kinds of numbers but does not describe or constrain their attributes.)

The integer subtype differs from Pascal's integer type in that there is no maximum or minimum value. A few implementations of Scheme provide only a finite range of integers, which is technically permitted by the standard but strongly deprecated.

Typically, real numbers are equivalent to the local C compiler's version of the double data type, with all the attendant problems of limited range and precision.

Scheme distinguishes between ``exact'' numbers, for which the internal representation is guaranteed to be mathematically perfect, and ``inexact'' numbers, the internal representations for which may be only approximate. Theoretically, this distinction is independent of the subtyping system. In practice, many implementations treat all real and complex numbers as inexact while attempting to keep all integer and rational values exact whenever possible.
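The following sketch probes these properties with the standard predicates exact? and inexact?. The variable names are hypothetical, and the results for one-third and root-two depend on the implementation, as the comments note.

```scheme
;; An integer far beyond the range of a 32-bit or 64-bit machine word.
(define big (expt 2 100))

;; An exact rational, in implementations that provide rationals.
(define one-third (/ 1 3))

;; Typically an inexact real, subject to limited precision.
(define root-two (sqrt 2))

(exact? big)
(exact? one-third)     ; implementation-dependent
(inexact? root-two)
```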

Strings are sequences of 0 or more characters. There is no maximum string length. One-character strings are distinct from the characters they contain.
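A few expressions, offered as a sketch, show the difference between a character and a one-character string. The variable names are invented for the example.

```scheme
(define no-chars "")       ; a string of zero characters
(define letter #\a)        ; a character
(define one-letter "a")    ; a string containing that one character

(string-length no-chars)                    ; → 0
(char? one-letter)                          ; → #f
(char=? (string-ref one-letter 0) letter)   ; → #t
```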

Vectors are one-dimensional arrays of values of any types, not necessarily homogeneous. Indexing is zero-based, as in C.
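For instance, a vector might mix several types, as in this sketch. The name mixed is hypothetical.

```scheme
;; Four elements of four different types in one vector.
(define mixed (vector 1 "two" #\3 'four))

(vector-length mixed)   ; → 4
(vector-ref mixed 0)    ; → 1, since the first element has index 0
(vector-ref mixed 3)    ; the last element has index 3, not 4
```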

The procedure data type includes both what a Pascal programmer would call functions and what she would call procedures; a Pascal procedure corresponds to a Scheme procedure with an irrelevant return value that is typically discarded. Scheme supplies many procedures as built-ins, and the programmer can explicitly define still others, but there is a third possibility for which there is no analogue in Pascal or C: A procedure can be constructed as a value during the execution of the program itself. It can, for instance, be returned as the value of another procedure. Such dynamically constructed procedures can then be invoked immediately or stored in variables and invoked later, still under program control and without the intervention of the programmer.

This possibility has far-reaching consequences, profoundly affecting the way a Scheme programmer approaches certain problems and the style in which many of her programs are written.
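A minimal sketch of a procedure constructed during execution follows. The names make-adder and add5 are hypothetical; each call to make-adder returns a new procedure as its value.

```scheme
;; make-adder builds and returns a procedure that adds n to its argument.
(define (make-adder n)
  (lambda (x) (+ x n)))

;; Store a constructed procedure in a variable and invoke it later ...
(define add5 (make-adder 5))
(add5 10)                ; → 15

;; ... or invoke a freshly constructed procedure immediately.
((make-adder 2) 40)      ; → 42
```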

The Scheme standard also recognizes two other data types, ports (sources or sinks for data values, usually files) and the eof object (which an input procedure returns when the port it is reading from can supply no more values). The standard does not specify whether values of these types are distinct from values of the other types.

Expressions

There are several kinds of Scheme expressions:

Parameter passing

When a procedure is called, the arguments are always passed by value, and a similar mechanism is used when the procedure returns a value to the caller. However, pairs, vectors, and strings are treated as ``containers,'' so that a procedure can accept such a value as an argument, change its contents (either or both of the components of a pair, any or all of the elements of a vector or the characters in a string), and return the same container with the new contents.

From the Pascal or C++ programmer's point of view, the effect is as if arguments of structured types were always passed by reference. The truth underlying this interpretation is that both Pascal's variable-parameters and Scheme's mechanism for passing values of ``container'' types involve the transfer of machine addresses to the called procedure rather than the copying of the data structure's contents into newly allocated storage.
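The contrast can be sketched with two hypothetical procedures, one that assigns to its parameter and one that changes the contents of a container it receives.

```scheme
;; Assigning to a parameter changes only the procedure's local binding.
(define (try-to-reset n)
  (set! n 0)
  n)

;; Changing a container's contents is visible to the caller, because the
;; caller and the procedure share the same vector.
(define (zero-first! v)
  (vector-set! v 0 0)
  v)

(define total 7)
(define data (vector 7 8 9))

(try-to-reset total)   ; returns 0, but total is still 7
(zero-first! data)     ; returns data, whose element 0 is now 0
```

The vector is built with the vector procedure rather than written as a literal constant, since the contents of literal constants should not be changed.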

Some Scheme procedures have a fixed arity; others require some minimum number of arguments, but can accept more; still others can take any number of arguments whatever. Programmers can define their own variable-arity procedures. The behavior of a procedure may depend on the number of arguments it receives (e.g., if the / procedure is given two arguments, it returns their quotient; if it is given only one, it returns the reciprocal of its argument).
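A sketch of programmer-defined variable-arity procedures follows, using the dotted parameter notation in which the name after the dot receives any extra arguments collected into a list. The procedures average and tally are hypothetical.

```scheme
;; At least one argument is required; any others are collected in more.
(define (average n . more)
  (/ (apply + n more) (+ 1 (length more))))

;; Any number of arguments, including none, is collected in items.
(define (tally . items)
  (length items))

(average 1 2 3)   ; → 2
(tally)           ; → 0
(/ 8 2)           ; two arguments: the quotient, 4
(/ 4)             ; one argument: the reciprocal of 4
```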


This document is available on the World Wide Web as

http://www.math.grin.edu/~stone/events/scheme-workshop/overview.html


created July 7, 1995
last revised July 15, 1995

John David Stone (stone@math.grin.edu)