Homework 8: MiniScheme F, G & H

Due: Friday, November 15 at 23:59
First Commit: Monday, November 11 at 23:59

I would encourage you to start and complete this assignment early! The content of MiniScheme is in scope for Exam 2, which is happening the same week (in fact the day before) this homework is due.

This assignment has 3 parts and concludes our MiniScheme implementation! Each part builds a new version of MiniScheme, adding functionality step by step. Each part requires testing. You’ll definitely want to add testing for each part as you’re working on it.

Your implementations will be in several files. The start of each file should be

#lang racket
; Your name(s) here.

Preliminaries

Click on the assignment link. A reminder that new teams are not allowed for Homework 8 and that I have copied over the teams from Homeworks 5/6. Please reach out to me if that will be an issue.

Submission

To submit your homework, you must commit and push to GitHub before the deadline.

Note that apart from a new HONORCODE.md, there are no starter files for Homework 8. This is because you will be directly building on your Homework 6 work!

Start by copying all of your .rkt files from Homework 6 into your repository and committing them. Upon submission, your repository should contain the following files:

parse.rkt
parse-tests.rkt
interp.rkt
interp-tests.rkt
env.rkt
env-tests.rkt
minischeme.rkt
test.rkt
HONORCODE.md

Any additional files you have added to your repository should be removed from the main branch. (You’re free to make other branches, if you desire, but make sure main contains the version of the code you want graded.)

Make sure you put your name (and your partner’s name if you’re working with one) as a comment at the top of each file.

Part F: Lambda expressions and closures

No language would be complete without the ability to create new procedures. Our new version of MiniScheme will implement lambda expressions. A lambda expression should evaluate to a structure containing the formal parameters, the body, and the environment that was current when the procedure was created (i.e., when the lambda expression was evaluated). This structure is known as a closure. You should start by making a data type for closures that holds three parts: parameters, body, and environment.

We parse a lambda expression such as

(lambda (x y) (+ x y))

into a lambda-exp tree node.

This is a new kind of tree node with two parts: the parameter list and the tree that results from parsing the body. The parse function doesn’t track the environment, so it can’t build a full closure. Parsing a lambda expression just gives a tree; it is when we evaluate that tree that we get a closure.

If exp is the tree we get from parsing such a lambda expression, (eval-exp exp e) builds a closure with exp’s parameter list and body combined with the environment e.

Here is the grammar for MiniScheme F:

EXP → number                      parse into lit-exp
    | symbol                      parse into var-exp
    | (if EXP EXP EXP)            parse into ite-exp
    | (let (LET-BINDINGS) EXP)    parse into let-exp
    | (lambda (PARAMS) EXP)       parse into lambda-exp
    | (EXP EXP*)                  parse into app-exp
LET-BINDINGS → LET-BINDING*
LET-BINDING → [symbol EXP]
PARAMS → symbol*

We parse a lambda expression into a lambda-exp node that stores the parameter list and the parsed body.

Evaluating

In eval-exp, we evaluate a lambda-exp node as a closure that has the lambda-exp’s parameter list and parsed body and also the current environment.

When we handled app-exp in eval-exp for MiniScheme C, we defined apply-proc as below:

(define (apply-proc proc args)
  (cond [(prim-proc? proc)
         (apply-primitive-op (prim-proc-op proc) args)]
        [else (error 'apply-proc "bad procedure: ~s" proc)]))

Now, extend the cond with a case for proc being a closure. To evaluate the application of a closure to some argument values we start with the environment from the closure, extend that environment with bindings of the parameters to args, and call eval-exp on the closure’s body with this extended environment. We have already written procedures to handle each of these steps; it is just a matter of calling them!

When you finish your implementation, you should be able to evaluate the following expressions in minischeme.rkt:

MS> ((lambda (x) x) 1) 
1 
MS> ((lambda (x y) (* x y)) 2 4) 
8 
MS> (let ([sqr (lambda (x) (* x x))]) (sqr 64)) 
4096 
MS> (let ([sqr (lambda (x) (* x x))]) (let ([cube (lambda (x) (* x (sqr x)))]) (cube 3))) 
27

As always, make sure to test your implementation! Specifically, add both parser and interpreter tests. Committing your code is also recommended at this stage.

Part G: Mutation & Sequencing

Overview

Our next feature will be variable assignment with set!. Unfortunately, our implementation of environments does not provide a way to change the value bound to a variable. We will modify our old implementation so that variable names are bound to a mutable data type called a box, which is provided by Racket.

Take a moment to familiarize yourself with boxes in Racket:

> (define abox (box 17)) 
> (box? abox) 
#t 
> (unbox abox) 
17
> (set-box! abox 32) 
> (unbox abox) 
32

When variables are created (i.e., when we extend an environment) we will bind them to boxes. When they are referenced we will unbox their bindings.

Updating our existing implementation

We need to thread the mutability idea throughout our implementation so that it works for all variables (whether or not we ultimately reassign them!).

First, when eval-exp currently evaluates a var-exp, it gets the symbol from the expression and looks it up in the environment. When our variables all refer to boxes, eval-exp needs to do an extra step: It gets the symbol from the expression, looks it up in the environment, and unboxes the result.

Secondly, whenever the environment is extended, the new bindings will be boxes that contain values. This occurs in two places. One is when we evaluate a let-expression in eval-exp, the other is when we apply a closure in apply-proc. For the latter our code used to be a recursive call to eval-exp on the body from the closure, using the environment (env params vals e). After we introduce boxes we will still do this with a recursive call to eval-exp on the body, only now we need to box the argument values as we extend the environment. We handle let-expressions in the same way.

There are two ways to implement these changes: you can either change the calls to env to map box onto the values, or change the code for env itself to always box values when it puts them in a new environment. Your choice!

At this point your interpreter should be running exactly as it did for MiniScheme F: let-expressions, lambda expressions and applications should all work correctly. Make sure this is the case before you proceed!

Adding `set!` and `begin`

MiniScheme G will implement variable assignment in the form of set! expressions. Note that we will not be implementing set! as a primitive function, but as a special form. When evaluating (set! x 5) we don’t want to evaluate variable x to its previous value, as a call would, but rather to store value 5 in its box.

The grammar for MiniScheme G is as follows:

EXP → number                      parse into lit-exp
    | symbol                      parse into var-exp
    | (if EXP EXP EXP)            parse into ite-exp
    | (let (LET-BINDINGS) EXP)    parse into let-exp
    | (lambda (PARAMS) EXP)       parse into lambda-exp
    | (set! symbol EXP)           parse into set-exp
    | (begin EXP*)                parse into begin-exp
    | (EXP EXP*)                  parse into app-exp
LET-BINDINGS → LET-BINDING*
LET-BINDING → [symbol EXP]
PARAMS → symbol*

Let’s start with set!.

We need to extend eval-exp to handle set-exp tree nodes. This is just a matter of putting all of the pieces together: we lookup the symbol from the expression (the variable being assigned to) in the current environment; this should give us a box. We call set-box! on this box with the value we get from recursively calling eval-exp on the expression part of the set-exp.

Here is what we can do when this is implemented:

MS> (set! + -) 
#<void>
MS> (+ 2 2) 
0 
MS> (set! + (lambda (x y) (- x (negate y)))) 
#<void>
MS> (+ 2 2) 
4 
MS> (+ 2 5) 
7 

Now that we have introduced side effects, it seems a natural next step to implement sequencing of expressions which we will do via begin. The grammar for MiniScheme G contains the following rule.

EXP → (begin EXP*)

Evaluating (begin e1 e2 ... en) results in the evaluation of e1, e2, …, en in that order. The returned result is the last expression, en.

A begin-exp holds a list of parsed expressions. You will need to think about how to add begin-exp to your eval-exp procedures. You need to iterate through the list of expressions in such a way that

(let ([x 1] [y 2])
    (if (gt? x y)
        (begin (set! x 24)
               x)
        (begin (set! y 25)
               y)))

returns 25; the whole point of begin is that the subexpressions might have side effects that alter the environment.

Again, a reminder to add parse and interpreter tests as you go. Make sure to commit your code when you are done with Part G as well.

Part H: Recursion

It may seem like we are done, but let’s see what happens if we try to define a recursive procedure (for instance, factorial!) in MiniScheme G using normal let:

MS> (let ([fac (lambda (n)
                 (if (eqv? n 0)
                     1
                     (* n (fac (- n 1)))))])
      (fac 4)) 

This gives an error message saying there is no binding for fac. The problem is in the recursive call to fac in (* n (fac (- n 1))). When we evaluated the lambda expression to create the closure, we did so in an environment in which fac was not bound. Because procedures use static environments when they are executed, the recursive call failed. The same thing happens in Scheme/Racket: this is exactly why we have letrec!

Recall what happens when a procedure is created. A closure is created that contains the environment at the time the lambda was evaluated, along with the body of the function and the formal parameters. MiniScheme F has no problems with this.

When a procedure is called the free variables in the body are looked up in the environment that was present at the time the lambda was evaluated. This is where MiniScheme ran into problems with the factorial example: the variable fac in the line

(* n (fac (- n 1))))))

was not bound to anything at the time the procedure was created, and so we got an error.

There is a surprisingly simple way to get around this problem. Try running the following code in minischeme.rkt based on your MiniScheme G implementation:

MS> (let ([fac 0])
      (let ([f (lambda (n)
                 (if (eqv? n 0)
                     1
                     (* n (fac (- n 1)))))])
        (begin
          (set! fac f)
          (fac 4))))

This works correctly and you can actually follow this pattern for all recursive functions!

Therefore, we can consider recursive procedures to be “syntactic sugar” (syntactic simplifications for ease of use) and, to implement them, we will rewrite letrec-expressions as let-expressions inside let-expressions with set!s to tie everything together.

Here is the grammar for our final language, MiniScheme H:

EXP → number                      parse into lit-exp
    | symbol                      parse into var-exp
    | (if EXP EXP EXP)            parse into ite-exp
    | (let (LET-BINDINGS) EXP)    parse into let-exp
    | (letrec (LET-BINDINGS) EXP) translate to equivalent let-exp
    | (lambda (PARAMS) EXP)       parse into lambda-exp
    | (set! symbol EXP)           parse into set-exp
    | (begin EXP*)                parse into begin-exp
    | (EXP EXP*)                  parse into app-exp
LET-BINDINGS → LET-BINDING*
LET-BINDING → [symbol EXP]
PARAMS → symbol*

The way we are handling letrecs is what is known as a syntactic transformation. When the parser sees a letrec expression it can either produce an equivalent let expression and parse that, or it can directly create the appropriate let-exp tree. Either works!

Implementation Details

To implement MiniScheme H you should only have to modify the parser (remind yourself why this is before you start). You may want to use a helper function parse-letrec to do the work so you don’t clutter your parser. You may want to make helper functions for each of the special forms, although you certainly do not have to.

In the factorial example above, we first bound fac to 0. Then, in an inner let-expression, we defined a new, placeholder variable f to the standard, recursive implementation of factorial. Finally, we used begin and set! to set the value of fac to our placeholder variable f.

When creating the placeholder variables, we don’t want to shadow existing bindings, so we need a way to create some fresh symbols to use as the placeholders. The Racket procedure gensym always returns fresh, unused symbols that we can use for this purpose.

> (gensym)
g62

When you have adapted your parser, the following examples should work in your miniScheme implementation:

MS> (letrec ([fac (lambda (x) (if (eqv? x 0) 1 (* x (fac (sub1 x)))))]) (fac 4))
        24
MS> (letrec ([fac (lambda (x) (if (eqv? x 0) 1 (* x (fac (sub1 x)))))]) (fac 10))
        3628800
MS> (letrec ([even? (lambda (n) (if (eqv? 0 n) True (odd? (sub1 n))))]  
             [odd? (lambda (n) (if (eqv? 0 n) False (even? (sub1 n))))] )
   (even? 5))
False

Make sure you have written additional tests for the parser that address letrec as well as tests testing the evaluation of recursive functions.

Congratulations! You have reached the end of Homework 8 and should have a working MiniScheme that supports recursion. This is quite a feat!

Optional: Code Refactoring

Once you’ve finished your implementations of both parse.rkt and interp.rkt, you might want to consider doing a refactoring pass. Make sure to commit your working code before you embark on this task or make copies of your implementations to save!

Refactoring is the idea of taking an existing code base and reworking it to be easier to read and debug. Depending on the code base there are many ways to refactor, but here you might consider adding in additional abstraction layers or introducing calls to helper functions.

Some elements that might lend themselves to refactoring are below, although there are likely other places in your implementation as well:

How you handle 'True and 'False in apply-primitive-op - if you have a lot of repetitive if statements, think about how you could replace them with a helper procedure!
The cond cases in parse, especially the (list? input) subcases.
When/how to throw errors: can you build a more generic “error helper” procedure that you reference at different locations?

I recommend taking a bit of time on this step, as both a victory lap for completing the project and as a way to reinforce what each element is doing and how you can write more streamlined, elegant code.

Finishing up

Make sure you (and your partner, if you have one!) have signed the Honor Code and then add/commit/push. Also make sure your name(s) is in all of the files you submit. Your friendly graders thank you!