C311 script7.txt -- 10/14/96 --- RULES FOR CPS. --- To develop a general technique for CPS conversion, we focus on a specific language (which we will see easily generalizes to many, but not all, other languages). 1. variables and literals 2. procedural expressions 3. primitive applications not implemented using procedure calls don't invoke procedures passed as arguments not first-class (lambda (x y) (+ x y)) (define-record prim-app (primitive rands)) 4. special forms such that all subexps evaluate in arbitrary order something is done after they are evaluated e.g. applications, set! 5. special forms s.t. first those subsexps in certain positions are evaluated in arbitrary order (head position) then exactly one of the remaining subexps is evaluated and its value returned immediately (tail position) tail position expressions may be w/i scope of variables, called the binding positions of the forms --- For each form with subexpressions, we indicate whether the subexpression is in head (H), tail (T), or niether position w.r.t. the form. (The significance of head position will become clear in a bit.) (lambda (v...v) E) (prim-op H ... H) (H ... H) (set! v H) (if H T T) (let ((b H) ... (b H)) T) (letrec ((b H) ... (b H)) T) (case H ((x...x) T) ... ((x...) T)) (v-c H (x (b...b) T) ... (x (b...b) T)) -- special problem --- We say an expression is SIMPLE if its evaluation can't possibly be responsible for recursive control behavior because it does not contain applications. (We shall see later that a less restrictive characterization is possible.) A TAIL-FORM expression is one in which every subexpression in nontail position is simple. Tail form is a static characterization of when iterative control behavior is guaranteed. +++ Examples: (car x)&simple&tail-form (if p x (car (cdr x))) simple tail form (f (car x)) not simple tail form (car (f x)) not simple not tail form (if p x (f (cdr x))) not simple tail form (if (f x) x (f (cdr x))) not simple not tail form (lambda (x) (f x)) simple tail form (lambda (x) (car (f x))) simple not tail form --- If an expression is in tail form, and any procedures accessible through variable bindings are also in tail form, then the expression will execute with iterative control behavior in a properly tail-recursive implementation.} A procedure is in FIRST-ORDER FORM if it contains no lambda subexpressions and if all operator expressions are variables referring to top-level bindings. FACT-ITER-ACC, etc, are in first-order form, but (lambda (x) (x 3)) and (lambda () (lambda (x) x)) are not GOAL: translate any program into first-order form. --- Now for a systematic approach to CPS, in two stages. Stage 1: (lambda (x_1 ... x_n) E) => (lambda (x_1 ... x_n k) _(k E)_) where (k E) is unprocessed, indicated by _(...)_ +++ Stage 2: Four transformation rules applied to an unprocessed expression of the form _(k E)_ (there may be more than one--pick one). +++ C_simple: _(k E)_ => (k E), if E is simple. (k (lambda (x y k) (k (g (- n 1))))) must still process inner k +++ C_eta: _(k (E_1 ... E_n))_ => (E_1 ... E_n k), where E_1, ..., E_n are simple. There may still be lambda subdexpressions to process, e.g. _(k (f (lambda (k) _(k (g x))_)))_ => (f (lambda (k) _(k (g x))_)) => ... --- Consider next type of transformation: (k (+ (f x) (g y))) = (f x (lambda (v) (k (+ v (g y))))) = (f x (lambda (v) (g y (lambda (w) (k (+ v w)))))) +++ We find an INITIAL EXPRESSION I of E, i.e. an innermost nonsimple expression that can be evaluated first (is in head position wrt E). E.g. in (k (* n (fact (- n 1)))), I = (fact (- n 1)) since (- n 1) is simple in (f (g x) (h y)), I = (g x) or (h y), but not (f ...). --- Excercise: put a box around the innermost expression: (f (g x) y) (f x y) (f (+ x y) z) (if (> x y) (f (g x)) (h (j x))) (if (f 3) (g 4) (g 5)) (let ((y 4)) (f 4)) (let ((x (if (> x y) (g x) y)) (y (f 5))) (x y)) --- (define initial-exp (lambda (exp) (letrec ((loop (lambda (ls) (cond ((null? ls) exp) ((simple? (car ls)) (loop (cdr ls))) (else (initial-exp (car ls))))))) (loop (head-exps exp))))) --- C_app: _(k E)_ => (E1 .... En (lambda (v) _(k E{v/I)_)) where (E1 ... En) is an initial exp of (k E) and v is a previously unused variable. -- Picture on board with context around I -- E{v/I is the positional substitution of v for I in E (capture does not occur). In the program use eq? (by assq). --- Finally, consider (k (if (null? ls) 0 (+ (length (cdr ls)) 1))) => (if (null? ls) (k 0) (k (+ (length (cdr ls)) 1))) +++ More complicated (k (* (if (null? ls) x (+ (length (cdr ls)) 1)) 3)) = (if (null? ls) (k (* x 3)) (k (* (+ (length (cdr ls)) 1) 3))) --- A complication (k (h (let ((g 3)) (f g x)))) (f, not + in book) => (let ((g 3)) (k (h (f g x)))) But (k (g (let ((g 3)) (f g x)))) (f, not + in book) /=> (let ((g 3)) (k (g (f g x)))) +++ Solution: alpha-convert (k (g (let ((g 3)) (f g x)))) (f, not + again) = (k (g (let ((g: 3)) (f g: x)))) by alpha-conversion = (let ((g: 3)) (k (g (f g: x)))) ... = (let ((g: 3)) (f g: x (lambda (v) (g v k)))) --- Write E for alpha-conv E to rename x1,..,xp (we add a colon to the name) Picture with (k ... (if H T1 T2) ...) => (if H (k ... T1 ...) (k ... T2 ...)) +++ C_special: _(k E)_ => I{_(k E{T_1/I)_/T_1,...,_(k E{T_n/I)_/T_n where I is an initial expression of E with T_1,...,T_n (n>0) subexpressions in tail position and the binding variables of I that occur free in E are b_1,...,b_n. --- CALL CHAINS -- more general definition of proper tail-recursion, -- clarify connection between proper tail recursion and iterative control behavior -- less restrictive characterization of simple expressions so tail form is less restrictive and CPS algorithm is improved +++ (+ (fact-iter n) m) => (fact-iter n (lambda (v) (k (+ v m)))) but fact-iter calls fact-iter-acc which is tail-recursive, so CPS alg can treat (+ (fact-iter n) m) as simple, so _(k (+ (fact-iter n) m))_ => (k (+ (fact-iter n) m)) --- Simple expressions have control behavior that isn't recursive. If an exp contains no available applications, then it's clearly simple. Less restrictive def of simple is also possible. +++ A CALL CHAIN is assoc with every call to a procedure: extend the current call chain when calling a procedure. (define f (lambda (x) (g x))) (define g (lambda (y) (f (h y)))) (define h (lambda (z) z)) Infinite call chain created by (f 3), but doesn't contain (h y). Tempting to think of as a stack pushed on call and poped on return, but proceudres with iterative control behavior only pop stack once when they return. --- DEFINITION: A properly tail-recursive implementation, or simply a tail-recursive implementation, is one in which any chain of tail calls results in bounded growth of control space. +++ Call chains may be infinite. A call chain is UNBOUNDED if it's infinite or length can't be known w/o run-time data. An unbounded call chain must contain an unbounded number of occurences of at least one application. +++ DEFINITION: An application is REPEATING if it MAY occur an unbounded number of times in some call chain. DEFINITION: An expression is SIMPLE if it is known that none of its available applications are repeating. E.g. (+ (fact-iter n) m) is simple, since the appl is not repeating. --- Can't always know if an exp is repeating, but using a CALL GRAPH it is often possible to show that an expr isn't repeating. If edge from A to B isn't in a cycle, it's appl can't be repeating. (letrec ((p1 (lambda (n) (list (p3 n)))) (p2 (lambda (n) (- n 1))) (p3 (lambda (n) (if (p4 n) 1 (* n (p5 n))))) (p4 (lambda (n) (zero? n))) (p5 (lambda (n) (p3 (p2 n))))) (p1 5)) --- END ---