Homework 5: MiniScheme Beginnings

Due: Friday, October 18 at 23:59
First Commit: Monday, October 14 at 23:59

This assignment helps you set the foundation for the MiniScheme project. Specifically, you’ll start by implementing environments and then continue to implementing to MiniScheme parts A & B.

This is the first of three homework assignments that constitute the MiniScheme project.

Your implementations will be in several files. The start of each file should be

#lang racket
; Your name(s) here.

Preliminaries

Click on the assignment link. You should make a new team as normal. However, you cannot change teams for subsequent MiniScheme Homeworks, so your team name and choice of partner should reflect this. I might recommend a variant of “steve-elinor-minischeme” as a team name.

As a reminder, if you’re working with a partner, one partner should create a new team. The second partner should click the link and choose the appropriate team. It’s extremely helpful for the graders if you include your name and your partner’s name in the team name. (Please don’t choose the wrong team, there’s a maximum of two people and if you join the wrong one, you’ll prevent the correct person from joining.)

Once you have accepted the assignment and created/joined a team, you can clone the repository on your computer by following the instruction and begin working. But before you do, read the entire assignment and be sure to check out the expected coding style, as posted on Ed.

Submission

To submit your homework, you must commit and push to GitHub before the deadline.

Your repository should contain the following files

env.rkt
env-tests.rkt
parse.rkt
parse-tests.rkt
interp.rkt
interp-tests.rkt
minischeme.rkt
HONORCODE.md

It may also a .gitignore file which tells Git to ignore files matching patterns in your working directory.

Any additional files you have added to your repository should be removed from the main branch. (You’re free to make other branches, if you desire, but make sure main contains the version of the code you want graded.)

Make sure you put your name (and your partner’s name if you’re working with one) as a comment at the top of each file.

MiniScheme Environments

As a reminder, we will be building an interpreter over the next few weeks for a subset of the Scheme language. The first thing we need to do is develop a way to hold bindings and their values to execute code. In other words: we need environments!

Overview

We need to create an environment to hold the data for the expressions we will interpret. The scoping rules for Scheme determine the structure of this environment. Consider the following three examples. First

(let ([x 2] [y 3])
  (+ x y))

This has an environment with bindings for x and y created in a let-block. These bindings are used in the body of the let.

Next, consider the following. This has a let-block that creates a lambda expression, which is called in the body of the let:

(let ([f (lambda (x) (+ x 4))])
  (f 5))

When this is evaluated we want to bind x to the value of the argument, 5, and then evaluate the body of f using that binding.

Finally, we combine these. At the outer level in the following expression we have a let-block with bindings for x and y. The body is another, nested, let, which binds a lambda expression with a parameter x. The body of the interior let is a call to the function.

(let ([x 2] [y 3])
  (let ([f (lambda(x) (+ x y))])
    (f 5)))

When we evaluate this we first make an environment with bindings of x and y to 2 and 3 respectively, then we use this to evaluate the inner let-expression. In that expression we make a binding of f to the value of the lambda-expression (a closure), and then we call f with argument 5. This requires us to evaluate the body of f with x bound to 5. The body of f does not have a binding for y, so we look it up in the outer environment and see that its value is 3. Finally, we evaluates (+ x y) with x bound to 5 and y bound to 3, yielding 8 for the value of the full expression.

Environments are extended in two ways. Let expressions have bindings that extend the current environment; the body of the let is evaluated in the extended environment. Lambda-expressions do not extend the environment; they evaluate to closures that store the current environment from the location where the (lambda (...) ...) is evaluated.

When the closure that results from a lambda-expression is called, the closure’s environment is extended with the parameters from the lambda-expression being bound to the values of the arguments.

We will define the environment as an association list, where symbols are associated with values. There are two ways we might do this. In the first example above, where x is bound to 2 and y to 3, we might use the list '((x 2) (y 3)), or we might use '((x y) (2 3)). The former structure is closer to the way the bindings appear in let-expressions; the latter is closer to the components of a call. The former structure might appear simpler, but the latter is actually easier to code and we will go with that.

Scheme, and many other languages you have and likely will learn, employs lexical scoping. When we want to resolve the binding for a free variable, we look first in the current scope, then in the surrounding scope, then in the scope that surrounds that, until the variable is found or we reach the outermost scope. To implement this our environments will be structured as a “list” (hint: what should it actually be if not a literal list as described here?) with four elements: the symbol 'env, a list of symbols, a list of values, and the previous environment. The top-most environment does not have a previous-environment (by definition) so we’ll use null to represent the empty environment.

Thus, the environment for the expression

(let ([x 2] [y 3])
  (let ([z 4] [x 5])
    (+ x (+ y z))))

will be something like

'(env (z x) (4 5)
      (env (x y) (2 3)
           ()))

When we resolve the bindings for x, y and z to evaluate (+ x (+ y z)) we find the binding 5 for x (there are two bindings for x, but the one we want is is the first one we come to), and of course we find 4 for z and 3 for y. This leads to the correct value, 12 for the expression.

Similarly, in the expression

(let ([x 2] [y 3])
  (let ([f (lambda(x) (+ x y))])
    (f 5)))

we evaluate the call (f 5) by evaluating the body of f in an environment that first has x bound to 5, and then has the environment surrounding the definition of f.

You will see in later parts how this environment is created. At present we need to create the tools that will allow this.

Part 1: The environment data type

The two most important features of an environment are that we need to be able to look up symbols to get the values they are bound to, and we need to be able to extend an environment with new bindings to get a new environment. We’ll define an environment as either the empty environment, with no bindings, or an extended environment with a list of symbols, a corresponding list of the values those symbols are bound to, and a previous environment that is being extended. Here is the definition of the empty environment.

; The empty environment is null.
(define empty-env null)
(define empty-env? null?)

How will we build the environment? Well, we want a constructor such that the following works:

(env '(x y) '(1 2) empty-env)

Recognizer

(env? x)

And accessors.

(env-syms e)
(env-vals e)
(env-previous e)

Now, define the new data type in env.rkt. Use built-in Racket functionality to define this data type: do NOT implement the constructor, recognizers, or accessors yourself!

Note that env is the procedure you will use every time you need to extend an environment when evaluating a let-expression or a function call. For example, when evaluating the expression (let ([x 1] [y 2]) ...) we might use the following.

(define env-a
  (env '(x y) '(1 2) empty-env))

We can then further extend the environment as follows:

(define env-b
  (env '(x z) '(5 7) env-a))

Part 2: Implementing the environment lookup

After Part 1 above, env.rkt contains code which describes the environment data type. Part 2 asks you to implement env-lookup (35 points) and write thoughtful related tests (see details below) (15 points).

Add to env.rkt the procedure (env-lookup environment symbol), which takes an environment and a symbol and returns the first binding for that symbol in the environment. For example, with environments env-a and env-b defined as

(define env-a
  (env '(x y) '(1 2) empty-env))
(define env-b
  (env '(x z) '(5 7) env-a))

we should have the following behavior:

(env-lookup env-a 'x) should return 1
(env-lookup env-b 'x) should return 5
(env-lookup env-b 'y) should return 2
(env-lookup env-b 'foo) should cause an error

If env-lookup does not find a binding for the symbol you should throw an error as below:

(error 'env-lookup "No binding for ~s" symbol)

Make sure you (provide env-lookup) at the top of env.rkt so that modules you’ll write in future parts (and in the tests described below) can use that procedure.

The file env-tests.rkt contains a test-env environment that maps x to 1 and y to 2. In addition it defines an env-tests test suite that tests the basic behavior of the environment data type. Extend this test suite with additional tests for env-lookup. You will probably want to define new extended environments for your tests.

In particular, be sure to test at least the following situations:

Looking up a symbol that’s not bound in an environment throws an error. The test below depends on (env-previous empty-env) throwing an error. You’ll want something very similar and you should use the test-exn procedure:
```
(test-exn "Empty environment has no previous"
          exn:fail?
          (λ () (env-previous empty-env))
```
Looking up a symbol in an empty environment throws an error.
Looking up a symbol that’s bound in an environment returns its value.
Looking up a symbol that’s not bound in an environment but is bound in the environment’s previous environment returns the correct value.
Looking up a symbol that’s bound in an environment and also in the environment’s previous environment returns the correct value (see examples above).

All testing for MiniScheme runs through tests.rkt. Run this file to call the tests you wrote in env-tests.rkt.

Now that you have a working environment, it would be a great time to commit your code!

Overview of MiniScheme A & B

We will start with a very basic version of our MiniScheme language, version A, and gradually add language features. As we do so, we will update our parser and interpreter to implement the new language features.

As mentioned in lecture, the parser and interpreter for MiniScheme will reside in files parse.rkt and interp.rkt, respectively. After you complete each of the remaining parts, you should be sure to commit your code. That way, you can always retrieve an earlier version if you need to.

Here’s what you’ll have in each file.

parse.rkt: Tree data types and the parse procedure which converts the input to a tree to interpret;
interp.rkt: Initially just eval-exp procedure which takes a parse tree and an environment and produces a value and init-env which will hold the initial environment; and
env.rkt: Environment data type and env-lookup procedure

As you go, think about where to put new definitions. Racket has trouble with circular requirements (such as parse.rkt requiring interp.rkt and interp.rkt requiring parse.rkt). You probably want your parse.rkt and env.rkt files to not require any other module, and your interp.rkt file to require both parse.rkt and env.rkt.

In addition to those files, we’ll have some testing files.

parse-tests.rkt: This defines a test suite for the parser;
interp-tests.rkt: This defines a test suite for the interpreter;
env-tests.rkt: This defines tests for the environment; and
tests.rkt: This combines the other three test suites into a single all-tests test suite and runs it.

Start by creating files parse.rkt and interp.rkt. Make sure they include the #lang line and your name(s).

MiniScheme A: Numbers

Version A of MiniScheme is given by the following grammar.

EXP → number                      parse into lit-exp

Parsing

Our parse procedure will take an input expression and return a parse tree for an EXP. The only expressions in MiniScheme A are numbers.

In parse.rkt we need a data type to hold EXP nodes that represent numbers. We’ll call this data type lit-exp. You need to implement it so that it contains the numerical value being represented, has a constructor, recognizer, and getter (a transparent struct is recommended!).

In the instructions I will use the names lit-exp (constructor), lit-exp? (recognizer), and lit-exp-num (getter), although you can call them whatever you’d like.

The parse function (parse input) should create a lit-exp when it sees a number (and throw an error otherwise).

Write the parse function and and the code that implements the lit-exp data type, in parse.rkt. Make this file into a module by adding the appropriate provide lines to make the parse procedure and the procedures for lit-exp (or whatever you call it) available to other modules.

In parse-tests.rkt, add tests for parse. In particular, you should test that (parse 5) returns something that lit-exp? evaluates to true. The test-pred procedure, as shown below, will likely be helpful.

(test-pred "Literal"
           lit-exp?
           (parse 5))

Also test that when you parse a number, you can extract the number from the resultant parse tree using lit-exp-num.

Evaluating

For the interpreter, you know that Scheme evaluates all integers as themselves. Therefore, eval-exp is simple. Write (eval-exp tree e) to evaluate a lit-exp and throw an error otherwise. Save this code as interp.rkt. Make sure you require parse.rkt and (for the next step) env.rkt. Similarly, make sure you provide eval-exp for use by other modules.

If you run interp.rkt in DrRacket to load the interpreter and parser into memory, you can type the following:

> (define T (parse '23))
> (eval-exp T empty-env)

and it should print the value 23.

It quickly becomes tedious to always invoke your interpreter by specifically calling eval-exp after calling the parser on the quoted expression. It would be nice if we could write a read-eval-print loop for MiniScheme. This is precisely what minischeme.rkt does.

Running minischeme.rkt in DrRacket will give you an input box that allows you to type expressions and get back their value as determined by your parse and interp modules. For example, if you enter the MiniScheme expression 23 this evaluates it and prints its value 23.

Running into issues? The read-eval-print procedures assumes that your parse procedure is named parse that your evaluator is called eval-exp that takes as arguments a parse tree and an environment, in that order; and an initial environment named init-env. Please keep these names consistent so that testing & minischeme.rkt works throughout the project!

Now inside interp.rkt, define and provide an init-env. For the init-env, use your env constructor to create an environment mapping x to 23 and y to 42. These will be helpful in future parts, so its a good idea to define them now.

The last thing we need to do for MiniScheme A is to write some tests. In interp-tests.rkt, write a test like

(test-equal? "Number"
             (eval-exp (lit-exp 5) empty-env)
             5)

Note that we’re explicitly passing the parse tree (lit-exp 5) rather than calling (parse 5). This lets you separately test parsing from evaluating. In subsequent parts, you’ll be modifying parse. You don’t want a bug in parse to show up as a bug in eval-exp. By not using parse in your interpreter tests, you can keep the two separate.

Congratulations, at this point you should have a working interpreter for MiniScheme A! Make sure you commit your work.

MiniScheme B: Variables & Values

Let’s extend MiniScheme A by adding symbols. So, the grammar for MiniScheme B is:

EXP → number                      parse into lit-exp
    | symbol                      parse into var-exp

The parser is a simple modification to parse. You need to add a line to (parse input) to handle the case where (symbol? input) is #t. You will need a new data type to handle this (I call it var-exp).

To evaluate a variable expression, MiniScheme B needs to be able to look up references. We evaluate a var-exp tree node in a given environment by calling lookup in that environment on the var-exp-symbol. Since we asked you to include bindings for symbols x and y in the initial environment in Part 4, you should be able to evaluate the MiniScheme expressions x or y to get their values. Any other symbol at this point should give you an error message.

Make sure to add tests to both parse-tests.rkt and interp-tests.rkt to address MiniScheme B.

Common Questions

Should our environments handle more than 2 possible bindings? Yes! Even though the examples only give you two possible bindings, your environments should be robust enough to handle n bindings simultaneously.
Should we test the environment data type separately from env-lookup? No, tests for env-lookup should be sufficient.
If we have (define env-a (env '(x y x) '(1 2 3) empty-env)), what would (env-lookup env-a 'x) return? 1

Finishing up

Make sure you commit and push. And make sure your name is in all of the files you submit. Your friendly graders thank you!