The grammar form, obtained by loading the file grammar.ss, contains a context-free
grammar and returns an acceptor for the grammar. The grammar is expressed
in "parenthesized, prefix, extended BNF". The returned acceptor is a
predicate that takes a datum and returns a boolean indicating if the datum
can be generated by the given grammar. Expressing any Scheme-like syntax
is straightforward, and a hook for introducing non-terminals defined using
arbitrary Scheme code allows any grammar that defines a subset of Scheme's
datum syntax to be handled.
An extended-BNF grammar for the grammar form follows:
The <start> variable must be one of the
production variables, and indicates the starting non-terminal of the grammar.
alt indicates alternation,
seq
sequencing,
lst descent into and sequencing through a
sub-list,
star zero or more (Kleene's star),
plus one
or more,
opt optional (zero or one), and
dot a dotted pair, where the first element of
dot recognizes a list of the car elements of an improper list,
while the second element recognizes the trailing cdr element.
In predicate, the expression shall eveluate to a predicate
that is passed the car of the list being accepted and returns a boolean
indicating whether it should be accepted.
In cfa, the expression shall evaluate to a procedure that
takes the list being accepted and returns #f if it is to be
rejected, or a tail of the list if the corresponding prefix of the list is
accepted.
If the non-terminal of
(report-if-bad
datum non-terminal) Bad
datum: input" is printed with a maximum print
depth of two. Also, from the time when the first such message is generated
until the grammar expression returns, alts fail immediately.
This keeps other alt elements from being tried when it is known that the
acceptor fails. Control does return through dynamically enclosing
report-if-bad elements, allowing the location of the original
error to be traced.
An acceptor for the grammar form may be obtained with:
(grammar grammar-expression
(datum (predicate (lambda (x) #t)))
(expression datum)
(grammar-expression
(report-if-bad 'grammar-expression
(lst 'grammar start (plus production))))
(start variable)
(variable (predicate symbol?))
(production (report-if-bad 'production (lst variable (star element))))
(element (alt terminal non-terminal))
(terminal (lst 'quote datum))
(non-terminal
(report-if-bad 'non-terminal
(alt variable
(lst 'alt (star element))
(lst 'seq (star element))
(lst 'lst (star element))
(lst 'star element)
(lst 'plus element)
(lst 'opt element)
(lst 'dot element element)
(lst 'predicate expression)
(lst 'cfa expression)
(lst 'report-if-bad datum non-terminal)))))
For a more substantial example, study the
acceptor for the context-free grammar of
Scheme transliterated from the R4RS formal syntax. Of particular
interest are the way procedure calls are defined to obtain suitable
Bad expression messages, and the way dot is used
in defining <formals>.
Chris Haynes / chaynes@indiana.edu