MiniScheme interpreter

This page describes the MiniScheme interpreter you will implement over several weeks, starting with homework 5. Each part builds on the previous parts so make sure you keep up with the work or you’ll become very far behind.

Additional parts will be added as new assignments are released.

Part 1. Environments

We need to create an environment to hold the data for the expressions we will interpret. The scoping rules for Scheme determine the structure of this environment. Consider the following three examples. First

(let ([x 2] [y 3])
  (+ x y))

This has an environment with bindings for x and y created in a let-block. These bindings are used in the body of the let.

Next, consider the following. This has a let-block that creates a lambda-expression, which is called in the body of the let:

(let ([f (lambda (x) (+ x 4))])
  (f 5))

When this is evaluated we want to bind x to the value of the argument, 5, and then evaluate the body of f using that binding.

Finally, we combine these. At the outer level in the following expression we have a let-block with bindings for x and y. The body is another, nested, let, which binds a lambda expression with a parameter x. The body of the interior let is a call to the function.

(let ([x 2] [y 3])
  (let ([f (lambda(x) (+ x y))])
    (f 5)))

When we evaluate this we first make an environment with bindings of x and y to 2 and 3 respectively, then we use this to evaluate the inner let-expression. In that expression we make a binding of f to the value of the lambda-expression (a closure, of course), and then we call f with argument 5. This requires us to evaluate the body of f with x bound to 5. The body of f does not have a binding for y, so we look it up in the outer environment and see that its value is 3. Finally, we evaluates (+ x y) with x bound to 5 and y bound to 3, yielding 8 for the value of the full expression.

Environments are extended in two ways. Let-expressions have bindings that extend the current environment; the body of the let is evaluated in the extended environment. Lambda-expressions do not extend the environment; they evaluate to closures that store the current environment from the location where the (lambda (...) ...) is evaluated.

When the closure that results from a lambda-expression is called, the closure’s environment is extended with the parameters from the lambda-expression being bound to the values of the arguments.

We will define the environment as an association list, where symbols are associated with values. There are two ways we might do this. In the first example above, where x is bound to 2 and y to 3, we might use the list '((x 2) (y 3)), or we might use '((x y) (2 3)). The former structure is closer to the way the bindings appear in let-expressions; the latter is closer to the components of a call. The former structure might appear simpler, but the latter is actually easier to code and we will go with that.

Scheme, and many other languages you have and likely will learn, employs lexical scoping. When we want to resolve the binding for a free variable, we look first in the current scope, then in the surrounding scope, then in the scope that surrounds that, until the variable is found or we reach the outermost scope. To implement this our environments will be structured as a list with four elements: the symbol 'env, a list of symbols, a list of values, and the previous environment. The top-most environment does not have a previous-environment (by definition) so we’ll use null to represent the empty environment.

Thus, the environment for the expression

(let ([x 2] [y 3])
  (let ([z 4] [x 5])
    (+ x (+ y z))))

will be something like

'(env (z x) (4 5)
      (env (x y) (2 3)
           ()))

When we resolve the bindings for x, y and z to evaluate (+ x (+ y z)) we find the binding 5 for x (there are two bindings for x, but the one we want is is the first one we come to), and of course we find 4 for z and 3 for y. This leads to the correct value, 12 for the expression.

Similarly, in the expression

(let ([x 2] [y 3])
  (let ([f (lambda(x) (+ x y))])
    (f 5)))

we evaluate the call (f 5) by evaluating the body of f in an environment that first has x bound to 5, and then has the environment surrounding the definition of f.

You will see in later parts how this environment is created. At present we need to create the tools that will allow this.

Part 2. The environment data type

The two most important features of an environment are that we need to be able to look up symbols to get the values they are bound to, and we need to be able to extend an environment with new bindings to get a new environment. We’ll define an environment as either the empty environment, with no bindings, or an extended environment with a list of symbols, a corresponding list of the values those symbols are bound to, and a previous environment that is being extended. Here is the definition of the empty environment and a constructor for an extended environment.

; The empty environment is null.
(define empty-env null)
(define empty-env? null?)

(struct env (syms vals previous) #:transparent)

Note that env is the procedure you will use every time you need to extend an environment when evaluating a let-expression or a function call. For example, when evaluating the expression (let ([x 1] [y 2]) ...) we might use the following.

(define env-a
  (env '(x y) '(1 2) empty-env))

We can further extend this environment.

(define env-b
  (env '(x z) '(5 7) env-a))

There are two recognizers.

(env? x)
(empty-env? x)

The first returns #t if x is an extended environment and the second returns #t if x is an empty environment.

The accessor functions for the different fields of an extended environment are provided by struct.

(env-syms e)
(env-vals e)
(env-previous e)

All that remains is to build some helper functions to look up bindings in an environment.

Part 3. Implementing the environment lookup

The file env.rkt contains the code given above for the environment data type. Add to this file function (env-lookup environment symbol), which takes an environment and a symbol and returns the first binding for that symbol in the environment. For example, with environments env-a and env-b defined as

(define env-a
  (env '(x y) '(1 2) empty-env))
(define env-b
  (env '(x z) '(5 7) env-a))

we should have the following behavior:

If env-lookup does not find a binding for the symbol you should invoke the error procedure (error sym string), as in

(error 'env-lookup "No binding for ~s" sym)

Make sure you provide env-lookup at the top of env.rkt so that modules you’ll write in future parts (and in the tests described below) can use that procedure.

The file env-tests.rkt contains a test-env environment that maps x to 1 and y to 2. In addition it defines an env-tests test suite that tests the basic behavior of the environment data type. Extend this test suite with additional tests for env-lookup. You will probably want to define new extended environments for your tests.

In particular, be sure to test at least the following situations

This is the end of homework 5. The other parts will be completed in homeworks 6 and 7.

Part 4: MiniScheme version A

We will start with a very basic version of our MiniScheme language, version A, and gradually add language features. As we do so, we will update our parser and interpreter to implement the new language features.

As a convention, the parser and interpreter for MiniScheme will reside in files parse.rkt and interp.rkt, respectively. After you complete each of the remaining parts, you should be sure to commit your code. That way, you can always retrieve an earlier version if you need to.

In addition to parse.rkt and interp.rkt, you will use env.rkt from Part 3 so make sure your env-lookup procedure works!

Here’s what you’ll have in each file.

As you go, think about where to put new definitions. Racket has trouble with circular requirements (such as parse.rkt requiring interp.rkt and interp.rkt requiring parse.rkt). You probably want your parse.rkt and env.rkt files to not require any other module, and your interp.rkt file to require both parse.rkt and env.rkt.

In addition to those files, we’ll have some testing files.

Version A of MiniScheme is given by the following grammar.

EXP → number                      parse into lit-exp

Our parse procedure will take an input expression and return a parse tree for an EXP. The only expressions in MiniScheme A are numbers. Admittedly, this is not very exciting but it is a place to start. Our interpreter for MiniScheme A will be very basic as well.

In parse.rkt we need a data type to hold EXP nodes that represent numbers. An easy solution is to make a list out of an atom (such as ‘lit-exp) that identifies the data type and the numeric value being represented. There are other possible representations and it doesn’t matter which you choose as long as you have a constructor (which I’ll call lit-exp), recognizer lit-exp? and getter lit-exp-num. You can use any names you want; the only required names are parse (for the parser), eval-exp (for the interpreter) and init-env (for the initial environment).

The parse function simply creates a lit-exp when it sees a number (and throws an error otherwise). It looks like this

(define (parse input)
  (cond [(number? input) (lit-exp input)]
        [else (error 'parse "Invalid syntax ~s" input)]))

Save this code and the code that implements the lit-exp data type, in parse.rkt. Make this file into a module by adding the appropriate provide lines to make the parse procedure and the procedures for lit-exp (or whatever you call it) available to other modules.

Before moving on to the interpreter, copy your code for env.rkt and env-tests.rkt to your new repository. Remove the line (run-tests env-tests) from env-tests.rkt since the env-tests test suite will be run by tests.rkt from now on.

In parse-tests.rkt, add tests for parse. In particular, you should test that (parse 5) returns something that lit-exp? evaluates to true. (You may wish to use the test-pred procedure.)

(test-pred "Literal"
           lit-exp?
           (parse 5))

Test that when you parse a number, you can extract the number from the resultant parse tree using lit-exp-num.

For the interpreter, you know that Scheme evaluates all integers as themselves. Our evaluation function will be very simple. It looks like this.

(define (eval-exp tree e)
  (cond [(lit-exp? tree) (lit-exp-num tree)]
        [else (error 'eval-exp "Invalid tree: ~s" tree)]))

Save this code as interp.rkt. Make sure you require parse.rkt and (for the next step) env.rkt. Similarly, make sure you provide eval-exp for use by other modules.

We can interpret expressions in the command interpreter of the interp.rkt file. Run this file in DrRacket to load the interpreter and parser into memory, then type into the command interpreter:

> (define T (parse '23))
> (eval-exp T empty-env)

This should print the value 23.

It quickly becomes tedious to always invoke your interpreter by specifically calling the interpreter eval-exp after calling the parser on the quoted expression. It would be nice if we could write a read-eval-print loop for MiniScheme. This is precisely what minischeme.rkt does.

Running this program in DrRacket will give you an input box that allows you to type expressions and get back their value as determined by your parse and interp modules. . For example, if you enter the MiniScheme expression 23 this evaluates it and prints its value 23.

The read-eval-print procedure assumes that your parse procedure is named parse; that your evaluator is called eval-exp that takes as arguments a parse tree and an environment, in that order; and an initial environment named init-env.

Inside interp.rkt, define and provide an init-env. For the init-env, use your env constructor to create an environment mapping x to 23 and y to 42`. For MiniScheme A, these mappings are of no use, but they will be shortly.

The last thing we need to do for MiniScheme A is to write some tests. In interp-tests.rkt, write a test like

(test-eqv? "Number"
             (eval-exp (lit-exp 5) empty-env)
             5)

Note that we’re explicitly passing the parse tree (lit-exp 5) rather than calling (parse 5). This lets you separately test parsing from evaluating. In subsequent parts, you’ll be modifying parse. You don’t want a bug in parse to show up as a bug in eval-exp. By not using parse in your interpreter tests, you can keep the two separate.

Congratulations, at this point you should have a working interpreter for MiniScheme A! Make sure you commit your work at this point.

Part 5: MiniScheme version B, variables and definitions

MiniScheme A is somewhat lacking in utility. Our specification for MiniScheme B will be only slightly more interesting.

We will start with the following grammar for MiniScheme B.

EXP → number                      parse into lit-exp
    | symbol                      parse into var-exp

The parser is a simple modification to parse. You need to add a line to (parse input) to handle the case where (symbol? input) is #t. Of course, you need a var-exp data type including a constructor (I call it var-exp), recognizer (var-exp?) and getter (var-exp-symbol).

To evaluate a variable expression, MiniScheme B needs to be able to look up references. We evaluate a var-exp tree node in a given environment by calling lookup in that environment on the var-exp-symbol. Since we asked you to include bindings for symbols x and y in the initial environment in Part 4, you should be able to evaluate the MiniScheme expressions x or y to get their values. Any other symbol at this point should give you an error message.

Make sure you commit your work at this point.

Part 6: MiniScheme version C, calls to primitive procedures

This is a good point to add primitive arithmetic operators to our environment. Nothing needs to be done for parsing–operators like +, - and so forth are symbols, so they will be parsed to var-exp nodes. Our environment needs to associate these symbols to values. There are many ways to do this; the way we will use will be easy to expand to procedures derived from lambda expressions. We will first make a data type prim-proc to represent primitive procedures. This is simple; the only data this type needs to carry is the symbol for the operator, so this looks just like the var-exp type. Make a constructor, a recognizer, and a getter for the data type.

Think about which file should contain this data type definition. Keep in mind that nothing in the parser needed to change to support these primitive procedures.

Next, we make a list of the primitive arithmetic operators. You can start with the following and later expand it.

(define primitive-operators '(+ - * /)

We can define a primitive operator environment and make our init-env extend that instead of the empty-env.

(define prim-env
  (env primitave-operators
       (map prim-proc primitave-operators)
       empty-env))

(define init-env
  (env '(x y)
       '(23 42)
       prim-env))

This means that when we evaluate + by looking it up in the environment we will get the structure '(prim-proc +)

We will now extend the grammar to include applications so we can use our primitive operators.

EXP → number                      parse into lit-exp
    | symbol                      parse into var-exp
    | (EXP EXP*)                  parse into app-exp

This gives us a language that can do something. You need to implement an app-exp data type that can hold a procedure (which is itself a tree) and a list of argument expressions (again, these are trees). The constructor for that might be (app-exp proc args). Update the parser to build an app-exp node when the expression being parsed is a list.

We now have a parse procedure that looks like this.

(define (parse input)
  (cond [(number? input) ...]
        [(symbol? input) ...]
        [(list? input)
         (cond [(empty? input) (error 'parse "Invalid syntax ~s" input)]
               [else (app-exp (parse (first input)) ...)])]
        [else (error 'parse "Invalid syntax ~s" input)]))

Remember to parse both the operator and the list of operands.

One thing to notice about this is that we have duplicated the (error ...) line. We can simplify our code slightly by extracting that out into a 0-argument parse-error procedure.

(define (parse input)
  (letrec ([parse-error (λ () (error 'parse "Invalid syntax ~s" input))])
    (cond [(number? input) ...]
          [(symbol? input) ...]
          [(list? input)
           (cond [(empty? input) (parse-error)]
                 [else (app-exp (parse (first input)) ...)])]
          [else (parse-error)])))

Add tests to parse-tests to test that applications are parsed correctly. Make sure you test parsing applications with different numbers of parameters (foo), (bar 1), (baz x y), etc. Add a test that '() causes an error. Look at Part 3 for how to write a test for an error.

In the interp module we extend eval-exp to evaluate an app-exp node by calling a new procedure apply-proc with the evaluated operator and the list of evaluated arguments. Here is apply-proc.

(define (apply-proc proc args)
  (cond [(prim-proc? proc)
         (apply-primitive-op (prim-proc-op proc) args)]
        [else (error 'apply-proc "bad procedure: ~s" proc)]))

The apply-primitive-op procedure takes a symbol corresponding to the primitive procedure and a list of argument values. Here is one possible apply-primitive-op.

(define (apply-primitive-op op args)
  (cond [(eq? op '+) (apply + args)]
        [(eq? op '-) (apply - args)]
        [(eq? op '*) (apply * args)]
        [(eq? op '/) (apply / args)]
        [else (error 'apply-primitive-op "Unknown primitive: ~s" op)]))

Our language now handles calls to primitive operators, such as (+ 2 4) or (+ x y). We are getting somewhere!

Next extend MiniScheme C to support three new primitive procedures that each take one argument: add1, sub1, and negate. The first two should be obvious; the negate procedure negates its argument: (negate 6) is -6, and (negate (negate 6)) is 6.

What kind of Scheme doesn’t have list processing functions? Extend MiniScheme C to implement list, cons, car, and cdr. The initial environment should also include a new variable, 'null bound to the empty list.

Add tests to interp-tests to test evaluating some of your primitive procedures. Make sure you specify your tests in terms of app-exp, var-exp, and lit-exp.

Our methodology should now be pretty clear. At each step we have a new line in the grammar to handle a new kind of Scheme expression. We update the parser, which requires making a new tree data type to handle the new parsed expression. We then update the eval-exp procedure to evaluate the new tree node. For the remaining steps we will be more brief.

Make sure you commit your code at this point.

Part 7: MiniScheme D, Conditionals

Let’s update our language to include conditional evaluation. We will adopt the convention that 0 and False represent false, and everything else represents true. Note that #t and #f are not values in MiniScheme. You should assign the value 'True and 'False to the symbols 'True and 'False. True expressions, such as (eqv? 2 (+ 1 1)) should evaluate to True, not to #t.

Write MiniScheme D, which implements if-expressions. You will need to add False and True to the initial environment as described above. The meaning of (if foo bar baz) is just what you’d expect: If foo evaluates to False or 0, then the value of the if-then-else expression is obtained by evaluating baz; otherwise, the value is obtained by evaluating bar.

This is the grammar for MiniScheme version D.

EXP → number                      parse into lit-exp
    | symbol                      parse into var-exp
    | (if EXP EXP EXP)            parse into ite-exp
    | (EXP EXP*)                  parse into app-exp

You need to make a new data type and update the parser in parse.rkt, and update the eval-exp procedure in interp.rkt. For the parser, note that both if expressions and application expressions are lists. We know a list represents an if-expression if its first element is the symbol ‘if. Put the test for this in the inner cond after the test for an empty list. We will assume a list represents an application expression if we don’t recognize its first element as a special form. So far, if is our only special form. Later parts will have more special forms.

Finally, extend MiniScheme D to implement the primitives eqv?, lt?, gt?, leq? and geq? where eqv? behaves like Scheme’s eqv? and lt?, gt?, leq?, and geq? behave like the usual inequality operators <, >, <=, and >=. Each of these should return 'True or 'False and not #t or #f.

Add primitive procedures null?, list?, and number? which behave like their Scheme counterparts, but return True or False rather than #t or #f.

Add parser and interpreter tests.

Make sure you commit your code at this point.

Part 8: MiniScheme E, Let expressions

The grammar for MiniScheme E is

EXP → number                      parse into lit-exp
    | symbol                      parse into var-exp
    | (if EXP EXP EXP)            parse into ite-exp
    | (let (LET-BINDINGS) EXP)    parse into let-exp
    | (EXP EXP*)                  parse into app-exp
LET-BINDINGS → LET-BINDING*
LET-BINDING → [symbol EXP]

As you can see, we have added new clause for the let expression. To make eval-exp clearer, I suggest that you make a let-exp data type that contains three children:

  1. A list of the symbols that are bound in the binding list
  2. A list of the parsed expressions (i.e., trees) that the symbols are bound to
  3. The let body.

Thus, although we have grammar symbols for LET-BINDING and LET-BINDINGS, we choose to build the tree slightly differently.

After the parser is extended to handle let expressions, we extend eval-exp to handle the let-exp nodes created by the parser. This should be straightforward–we evaluate a let-exp node in an environment by extending the environment with the let symbols bound to the values of the let bindings (map a curried version of eval-exp onto the binding expressions), and then evaluate the let body within this extended environment.

When you are finished you should be able to evaluate expressions such as

(let ([a 1]
      [b 5])
  (+ a b))

and

(let ([a (* 2 3)]
      [b 24])
  (let ([c (- b a)])
    (* c (+ a b))))

Add parser and interpreter tests.

Make sure you commit your code at this point.

This is the end of homework 6. The remainder will be completed in homework 7.

Part 9: MiniScheme F, Lambda expressions and closures

No language would be complete without the ability to create new procedures. Our new version of MiniScheme will implement lambda expressions. A lambda expression should evaluate to a structure containing the formal parameters, the body, and the environment that was current when the procedure was created (i.e., when the lambda expression was evaluated. This structure is known as a closure. You should start by making a data type for closures that holds three parts: parameters, body, and environment.

We parse a lambda expression such as

(lambda (x y) (+ x y))

into a lambda-exp tree node.

This is a new kind of tree node with two parts: the parameter list and the tree that results from parsing the body. The parse function doesn’t track the environment, so it can’t build a full closure. Parsing a lambda expression just gives a tree; it is when we evaluate that tree that we get a closure. If exp is the tree we get from parsing such a lambda expression, (eval-exp exp e) builds a closure with exp’s parameter list and body combined with the environment e.

We are ready for MiniScheme F. The syntax is extended once more, this time to include lambda expressions. Here is the grammar for MiniSchemeF.

EXP → number                      parse into lit-exp
    | symbol                      parse into var-exp
    | (if EXP EXP EXP)            parse into ite-exp
    | (let (LET-BINDINGS) EXP)    parse into let-exp
    | (lambda (PARAMS) EXP)       parse into lambda-exp
    | (EXP EXP*)                  parse into app-exp
LET-BINDINGS → LET-BINDING*
LET-BINDING → [symbol EXP]
PARAMS → symbol*

We parse a lambda expression into a lambda-exp node that stores the parameter list and the parsed body. In eval-exp, we evaluate a lambda-exp node as a closure that has the lambda-exp’s parameter list and parsed body and also the current environment.

In MiniScheme C, we defined apply-proc like this.

(define (apply-proc proc args)
  (cond [(prim-proc? proc)
         (apply-primitive-op (prim-proc-op proc) args)]
        [else (error 'apply-proc "bad procedure: ~s" proc)]))

We now extend this with a case for proc being a closure. To evaluate the application of a closure to some argument values we start with the environment from the closure, extend that environment with bindings of the parameters to args, and call eval-exp on the closure’s body with this extended environment. We have already written procedures to handle each of these steps; it is just a matter of calling them. After implementing this you should be able to do the following.

MS> ((lambda (x) x) 1) 
1 
MS> ((lambda (x y) (* x y)) 2 4) 
8 
MS> (let ((sqr (lambda (x) (* x x)))) (sqr 64)) 
4096 
MS> (let ((sqr (lambda (x) (* x x)))) (let ((cube (lambda (x) (* x (sqr x))))) (cube 3))) 
27

Add parser and interpreter tests.

Make sure you commit your code at this point.

Part 10: MiniScheme G, Variables, assignments, and sequencing

Our next feature will be variable assignment, with set!. Unfortunately, our implementation of environments does not provide a way to change the value bound to a variable. We will modify our old implementation so that variable names are bound to a mutable data type called a box, which is provided by DrRacket.

Take a moment to familiarize yourself with boxes in Racket:

> (define abox (box 17)) 
> (box? abox) 
#t 
> (unbox abox) 
17
> (set-box! abox 32) 
> (unbox abox) 
32

When variables are created (i.e., when we extend an environment) we will bind them to boxes. When they are referenced we will unbox their bindings. We will take these tasks sequentially.

First, when eval-exp currently evaluates a var-exp, it gets the symbol from the expression and looks it up in the environment. When our variables all refer to boxes, eval-exp needs to do an extra step: It gets the symbol from the expression, looks it up in the environment, and unboxes the result.

Secondly, whenever the environment is extended, the new bindings will be boxes that contain values. This occurs in two places. One is when we evaluate a let-expression in eval-exp, the other is when we apply a closure in apply-proc. For the latter our code used to be a recursive call to eval-exp on the body from the closure, using the environment (env params vals e). After we introduce boxes we will still do this with a recursive call to eval-exp on the body only now we need to box the argument values as we extend the environment. We handle let-expressions in the same way.

There are two ways to implement this—you can either change the calls to env to map box onto the values, or change the code for env itself to always box values when it puts them in a new environment. Take your pick; one approach is as easy as the other.

At this point your interpreter should be running exactly as it did for MiniScheme F: let-expressions, lambda expressions and applications should all work correctly. Make sure this is the case before you proceed. We will now take advantage of our boxed bindings to implement set!.

MiniScheme G will implement variable assignment in the form of set! expressions. Note that we will not be implementing set! as a primitive function, but as a special form. When evaluating (set! x 5) we don’t want to evaluate variable x to its previous value, as a call would, but rather to store value 5 in its box.

The grammar for MiniScheme G is as follows.

EXP → number                      parse into lit-exp
    | symbol                      parse into var-exp
    | (if EXP EXP EXP)            parse into ite-exp
    | (let (LET-BINDINGS) EXP)    parse into let-exp
    | (lambda (PARAMS) EXP)       parse into lambda-exp
    | (set! symbol EXP)           parse into set-exp
    | (begin EXP*)                parse into begin-exp
    | (EXP EXP*)                  parse into app-exp
LET-BINDINGS → LET-BINDING*
LET-BINDING → [symbol EXP]
PARAMS → symbol*

Let’s start with set!.

We need to extend eval-exp to handle set-exp tree nodes. This is just a matter of putting all of the pieces together: we lookup the symbol from the expression (the variable being assigned to) in the current environment; this should give us a box. We call set-box! on this box with the value we get from recursively calling eval-exp on the expression part of the set-exp.

Here is what we can do when this is implemented.

MS> (set! + -) 
# 
MS> (+ 2 2) 
0 
MS> (set! + (lambda (x y) (- x (negate y)))) 
# 
MS> (+ 2 2) 
4 
MS> (+ 2 5) 
7 
MS> exit 
returning to Scheme proper

Now that we have introduced side effects, it seems a natural next step to implement sequencing of expressions which we will do via begin. The grammar for MiniScheme G contains the following rule.

EXP → (begin EXP*)

Evaluating (begin e1 e2 ... en) results in the evaluation of e1, e2, …, en in that order. The returned result is the last expression, en.

A begin-exp holds a list of parsed expression. You will need to think about how to add begin-exp to your eval-exp procedures. You need to iterate through the list of expressions in such a way that

(let ([x 1] [y 2])
  (begin (set! x 23)
         (+ x y))) 

returns 25; the whole point of begin is that the subexpressions might have side effects that alter the environment. Perhaps this will encourage you to be more appreciative of functional programming.

Make sure to add parser and interpreter tests as you go.

As usual, it’s a good idea to commit your code at this point.

Part 11: MiniScheme H, Recursion

It looks like we’re about done, but let’s take a closer look. What happens if we try to define a recursive procedure in MiniScheme G? Let’s try the ever-familiar factorial function.

MS> (let ([fac (lambda (n)
                 (if (eqv? n 0)
                     1
                     (* n (fac (- n 1)))))])
      (fac 4)) 

This gives an error message saying there is no binding for fac. But we bound fac using let. Why is MiniScheme reporting that fac is unbound? The problem is in the recursive call to fac in (* n (fac (- n 1))). When we evaluated the lambda expression to create the closure, we did so in an environment in which fac was not bound. Because procedures use static environments when they are executed, the recursive call failed. The same thing would happen in Scheme itself; this is why we have letrec.

Recall what happens when a procedure is created. A closure is created that contains the environment at the time the lambda was evaluated, along with the body of the function and the formal parameters. MiniScheme F has no problems with this.

When a procedure is called the free variables in the body are looked up in the environment that was present at the time the lambda was evaluated. This is where MiniScheme ran into problems with the factorial example: the variable fac in the line

(* n (fac (- n 1))))))

was not bound to anything at the time the procedure was created, and so we got an error.

There is a clever way to get around this problem. Try running the following code:

MS> (let ([fac 0])
      (let ([f (lambda (n)
                 (if (eqv? n 0)
                     1
                     (* n (fac (- n 1)))))])
        (begin
          (set! fac f)
          (fac 4))))

This works correctly. You can use this pattern for all recursive functions.

So then, it appears that recursive procedures are really “syntactic sugar”; we will rewrite letrec-expressions as let-expressions inside let-expressions with set!s to tie everything together. Here is the grammar for our final language, MiniScheme H.

EXP → number                      parse into lit-exp
    | symbol                      parse into var-exp
    | (if EXP EXP EXP)            parse into ite-exp
    | (let (LET-BINDINGS) EXP)    parse into let-exp
    | (lambda (PARAMS) EXP)       parse into lambda-exp
    | (set! symbol EXP)           parse into set-exp
    | (begin EXP*)                parse into begin-exp
    | (letrec (LET-BINDINGS) EXP) translate to equivalent let-exp
    | (EXP EXP*)                  parse into app-exp
LET-BINDINGS → LET-BINDING*
LET-BINDING → [symbol EXP]
PARAMS → symbol*

The way we are handling letrecs is what is known as a syntactic transformation. When the parser sees a letrec expression it can either produce an equivalent let expression and parse that, or it can directly create the appropriate let-exp tree. The latter is what I do, but either approach works.

To implement MiniScheme H you should only have to modify the parser. You may want to use a helper function parse-letrec to do the work so you don’t clutter your parser. You may want to make helper functions for each of the special forms, although you certainly do not have to.

In the factorial example above, we first bound fac to 0. Then, in an inner let-expression, we defined a new, placeholder variable f to the standard, recursive implementation of factorial. Finally, we used begin and set! to set the value of fac to our placeholder variable f.

When creating the placeholder variables, we don’t want to shadow existing bindings so we need a way to create some fresh symbols to use as the placeholders. The procedure gensym always returns fresh, unused symbols that we can use for this purpose.

> (gensym)
g62

When you have this completed the following examples should work.

MS> (letrec ([fac (lambda (x) (if (eqv? x 0) 1 (* x (fac (sub1 x)))))]) (fac 4))
        24
MS> (letrec ([fac (lambda (x) (if (eqv? x 0) 1 (* x (fac (sub1 x)))))]) (fac 10))
        3628800
MS> (letrec ([even? (lambda (n) (if (eqv? 0 n) True (odd? (sub1 n))))]  
             [odd? (lambda (n) (if (eqv? 0 n) False (even? (sub1 n))))] )
   (even? 5))
False

Make sure you have written tests for the parser.

Congratulations! You have reached the end of Homework 7 and should have a working MiniScheme that supports recursion. This is quite a feat!

Make sure you’ve committed and pushed your code to GitHub.