For this workshop, we will be writing both a scanner and a parser directly in Scheme rather than using scanner/parser-generators such as lex or yacc to do the ``dirty'' work.
If we are to write our own scanner, we need to deal with a sticky issue: input. A scanner should take as its input a ``stream'' of characters. Although this could be simulated by passing in a string, we will instead use Scheme's input port mechanism.
We will use the following procedures for our input:
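(read-char input-port) --> character-or-eof
    read-char produces the next character. If no character is available, read-char produces the end-of-file object.
(peek-char input-port) --> character-or-eof
    peek-char produces the next character, as read-char does, but does not ``consume'' it, so the next call to read-char will return the same character.
(eof-object? character-or-eof) --> boolean
    eof-object? produces #t if its argument is the end-of-file object and #f otherwise.
(open-input-file string) --> input-port
    open-input-file produces an input port that reads from the file named by string.
(open-input-string string) --> input-port
    open-input-string produces an input port backed by string. Successive invocations of read-char on the port will return the characters of string, in order. When the characters in string are exhausted, invocations of read-char or peek-char on the port will return the end-of-file object.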
> (define ip (open-input-string "hi"))
> (peek-char ip)
#\h
> (peek-char ip)
#\h
> (read-char ip)
#\h
> (peek-char ip)
#\i
> (read-char ip)
#\i
> (peek-char ip)
#!eof
> (eof-object? (peek-char ip))
#t
> (read-char ip)
#!eof
> (read-char ip)
#!eof
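The one-character lookahead that peek-char provides is what lets a scanner decide where a token ends without consuming the character that follows it. The sketch below illustrates the idea using only the procedures described above. The names digit? and read-unsigned-integer are hypothetical helpers invented for this illustration, and returning #f when no digit is present is an arbitrary choice.

```scheme
;; Hypothetical helper: is c one of the characters 0-9?
;; (char? c) is false for the end-of-file object, so eof is handled too.
(define (digit? c)
  (and (char? c) (char>=? c #\0) (char<=? c #\9)))

;; Hypothetical helper: consume consecutive digits from the port ip and
;; return them as a number, or #f if the next character is not a digit.
;; The first non-digit character is only peeked at, never consumed.
(define (read-unsigned-integer ip)
  (let loop ((digits '()))
    (if (digit? (peek-char ip))
        (loop (cons (read-char ip) digits))
        (if (null? digits)
            #f
            (string->number (list->string (reverse digits)))))))

(define ip (open-input-string "42)"))
(read-unsigned-integer ip)  ; => 42
(peek-char ip)              ; => #\)  (still available to the caller)
```

Because the closing parenthesis was only peeked at, whatever runs next can still read it as its own token.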
As an example, we will implement a scanner and parser for lists of numbers. The scanner accepts a character stream (an input port) according to the grammar, and returns token records:
(lparen) -- a left-parenthesis token
(rparen) -- a right-parenthesis token
(datum number) -- a datum token. Note that number should be a normal, everyday Scheme number (which will display in decimal form, though that's Scheme's doing, not yours).
(eof) -- an end-of-file token (returned when there is nothing to scan).
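A scanner producing these four tokens might be sketched as follows. This is only an illustration under several assumptions: tokens are modeled as plain tagged lists such as (lparen) and (datum 23) rather than as whatever record mechanism the workshop uses, numbers are taken to be unsigned decimal integers, whitespace is skipped between tokens, and the names digit?, scan-number, and scan-token are hypothetical.

```scheme
;; Hypothetical helper: is c one of the characters 0-9?
(define (digit? c)
  (and (char? c) (char>=? c #\0) (char<=? c #\9)))

;; Hypothetical helper: consume consecutive digits, accumulating their value.
(define (scan-number ip)
  (let loop ((value 0))
    (if (digit? (peek-char ip))
        (loop (+ (* 10 value)
                 (- (char->integer (read-char ip)) (char->integer #\0))))
        value)))

;; Return the next token on the input port ip.
;; Tokens are tagged lists here (an assumption of this sketch).
(define (scan-token ip)
  (let ((c (peek-char ip)))
    (cond
      ((eof-object? c) '(eof))
      ((char-whitespace? c) (read-char ip) (scan-token ip))
      ((char=? c #\() (read-char ip) '(lparen))
      ((char=? c #\)) (read-char ip) '(rparen))
      ((digit? c) (list 'datum (scan-number ip)))
      (else (error "scan-token" "unexpected character" c)))))

(define ip (open-input-string "(1 23)"))
(scan-token ip)  ; => (lparen)
(scan-token ip)  ; => (datum 1)
(scan-token ip)  ; => (datum 23)
(scan-token ip)  ; => (rparen)
(scan-token ip)  ; => (eof)
```

Once the input is exhausted, every further call keeps returning (eof), which mirrors the behavior of read-char on an exhausted port.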
Scanning and parsing Scheme extends the list-of-numbers scanner and parser. The scanner is considerably more complicated: it accepts strings based on a slightly simplified Scheme grammar. The parser, however, is almost as easy as the one for lists of numbers: it is based on Scheme's datum grammar and should return a Scheme datum.
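To give a feel for why the parser stays simple, here is a minimal recursive-descent sketch over a tiny datum grammar, assumed for illustration only: a datum is either a number or a parenthesized sequence of data. The real Scheme datum grammar is larger, and the workshop's list-of-numbers grammar is not reproduced here, so the grammar, the tagged-list token representation, and the names parse-datum and parse-list are all assumptions. To keep the sketch self-contained, the parser consumes a list of tokens rather than calling a scanner.

```scheme
;; parse-datum: token-list -> (datum . remaining-tokens)
;; Assumes the token list ends with an (eof) token.
(define (parse-datum tokens)
  (let ((tok (car tokens)))
    (case (car tok)
      ((datum) (cons (cadr tok) (cdr tokens)))
      ((lparen) (parse-list (cdr tokens) '()))
      (else (error "parse-datum" "unexpected token" tok)))))

;; parse-list: collect data until the matching (rparen) token.
;; acc holds the data parsed so far, in reverse order.
(define (parse-list tokens acc)
  (let ((tok (car tokens)))
    (if (eq? (car tok) 'rparen)
        (cons (reverse acc) (cdr tokens))
        (let ((result (parse-datum tokens)))
          (parse-list (cdr result) (cons (car result) acc))))))

(define tokens
  '((lparen) (datum 1) (lparen) (datum 2) (datum 3) (rparen) (rparen) (eof)))
(car (parse-datum tokens))  ; => (1 (2 3))
```

Each grammar rule becomes one procedure, and nesting falls out of the mutual recursion between parse-datum and parse-list.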