Scanning and Parsing

For this workshop, we will write both a scanner and a parser directly in Scheme rather than using scanner and parser generators such as lex or yacc to do the "dirty" work.

If we are to write our own scanner, we need to deal with a sticky issue: input. A scanner should take as its input a "stream" of characters. Although this could be simulated by passing in a string, we will instead use Scheme's input port mechanism.

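We will use the following procedures for our input:

  (open-input-string string)  returns an input port that delivers the characters of string
  (peek-char port)            returns the next character on port without consuming it
  (read-char port)            returns the next character on port and consumes it
  (eof-object? obj)           returns #t if obj is the end-of-file object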

Here is an example of their use:
> (define ip (open-input-string "hi"))
> (peek-char ip)
#\h
> (peek-char ip)
#\h
> (read-char ip)
#\h
> (peek-char ip)
#\i
> (read-char ip)
#\i
> (peek-char ip)
#!eof
> (eof-object? (peek-char ip))
#t
> (read-char ip)
#!eof
> (read-char ip)
#!eof
> 
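
The peek-then-read pattern in the transcript above is the basic loop a scanner is built on. As a minimal sketch, the following collects every character from a port into a list. The name port->char-list is a hypothetical helper introduced here for illustration; it is not part of the workshop code.

```scheme
;; port->char-list is a hypothetical helper, not part of the workshop code.
;; It peeks to test for end of file, and only then reads to consume.
(define (port->char-list ip)
  (let loop ((chars '()))
    (if (eof-object? (peek-char ip))
        (reverse chars)                       ; characters were consed in reverse order
        (loop (cons (read-char ip) chars)))))

(port->char-list (open-input-string "hi"))    ; => (#\h #\i)
```

Because peek-char does not consume input, the loop can decide what to do with a character before committing to reading it, which is the property a scanner relies on when it has to stop at a delimiter.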

Scanning and Parsing Lists of Numbers

As an example, we will implement a scanner and parser for lists of numbers. The scanner accepts a character stream (an input port) according to the grammar, and returns token records:

The parser accepts an input port and calls the scanner on that port to get token records. Both are defined in scan-numlist.ss.
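
To make the division of labor concrete, here is a rough sketch of such a scanner and parser. It is not the contents of scan-numlist.ss. It assumes a grammar in which a list is an open parenthesis, zero or more unsigned decimal integers separated by whitespace, and a close parenthesis. It also represents tokens as tagged lists rather than records; make-token, token-kind, token-value, digit?, scan-number, scan, and parse are all names chosen for this illustration.

```scheme
;; Assumption: tokens are tagged lists (kind value), standing in for
;; the workshop's token records, which are not shown in this section.
(define (make-token kind value) (list kind value))
(define (token-kind tok) (car tok))
(define (token-value tok) (cadr tok))

(define (digit? c)
  (and (char? c) (char>=? c #\0) (char<=? c #\9)))

;; Consume consecutive digits and accumulate their decimal value.
(define (scan-number ip)
  (let loop ((n 0))
    (let ((c (peek-char ip)))
      (if (digit? c)
          (begin
            (read-char ip)
            (loop (+ (* n 10) (- (char->integer c) (char->integer #\0)))))
          (make-token 'number n)))))

;; Return the next token from the port, skipping whitespace.
(define (scan ip)
  (let ((c (peek-char ip)))
    (cond
      ((eof-object? c) (make-token 'eof #f))
      ((char-whitespace? c) (read-char ip) (scan ip))
      ((char=? c #\() (read-char ip) (make-token 'lparen #f))
      ((char=? c #\)) (read-char ip) (make-token 'rparen #f))
      ((digit? c) (scan-number ip))
      (else (error "scan" "unexpected character" c)))))

;; Parse "(" number* ")" and return a Scheme list of numbers.
(define (parse ip)
  (if (not (eq? (token-kind (scan ip)) 'lparen))
      (error "parse" "expected ( at start of list")
      (let loop ((nums '()))
        (let ((tok (scan ip)))
          (case (token-kind tok)
            ((number) (loop (cons (token-value tok) nums)))
            ((rparen) (reverse nums))
            (else (error "parse" "unexpected token" tok)))))))

(parse (open-input-string "(1 22 333)"))      ; => (1 22 333)
```

The parser never touches characters directly; it only asks the scanner for the next token, so the scanner can change without affecting the parser.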

Scanning and Parsing Scheme

Scanning and parsing Scheme extends the scanner and parser for lists of numbers. The scanner is considerably more complicated: it accepts strings based on a slightly simplified Scheme grammar. The parser, however, is almost as easy as the one for lists of numbers. It is based on Scheme's datum grammar and should return a Scheme datum.
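
The following sketch suggests why the parser stays simple: once the scanner produces tokens, a datum is either an atom or a parenthesized sequence of datums, so two mutually recursive procedures are enough. It assumes a far smaller grammar than the one the workshop uses (only parentheses, numbers, and symbols; no strings, characters, quote forms, or dotted pairs), and every name in it (delimiter?, scan-atom, scan-token, datum-from-token, parse-rest, parse-datum) is hypothetical.

```scheme
;; Assumption: tokens are pairs (kind . value); kinds are
;; lparen, rparen, atom, and eof.
(define (delimiter? c)
  (or (eof-object? c)
      (char-whitespace? c)
      (char=? c #\()
      (char=? c #\))))

;; Read characters up to the next delimiter; the text becomes a
;; number if string->number accepts it, otherwise a symbol.
(define (scan-atom ip)
  (let loop ((chars '()))
    (if (delimiter? (peek-char ip))
        (let ((text (list->string (reverse chars))))
          (cons 'atom (or (string->number text) (string->symbol text))))
        (loop (cons (read-char ip) chars)))))

(define (scan-token ip)
  (let ((c (peek-char ip)))
    (cond
      ((eof-object? c) (cons 'eof #f))
      ((char-whitespace? c) (read-char ip) (scan-token ip))
      ((char=? c #\() (read-char ip) (cons 'lparen #f))
      ((char=? c #\)) (read-char ip) (cons 'rparen #f))
      (else (scan-atom ip)))))

;; Build the datum that starts with the token tok.
(define (datum-from-token tok ip)
  (case (car tok)
    ((atom) (cdr tok))
    ((lparen) (parse-rest ip))
    (else (error "parse" "unexpected token" tok))))

;; Parse the remaining elements of a list, up to the closing paren.
(define (parse-rest ip)
  (let ((tok (scan-token ip)))
    (if (eq? (car tok) 'rparen)
        '()
        ;; let forces the first element to be parsed before the rest
        (let ((first (datum-from-token tok ip)))
          (cons first (parse-rest ip))))))

(define (parse-datum ip)
  (datum-from-token (scan-token ip) ip))

(parse-datum (open-input-string "(define (f x) (+ x 1))"))
;; => (define (f x) (+ x 1))
```

Nested lists need no special handling: parse-rest calls datum-from-token for each element, which in turn calls parse-rest again whenever it sees an open parenthesis.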


ehilsdal@cs.indiana.edu