Implementing Type-Checking, CSC430, Spring 2024
1 Goal
Extend the interpreter to include a type system.
2 Guidelines
For this and all remaining assignments, every function you develop must come with the following things:
A commented header line that expresses the result of the function in terms of its inputs, written in English. Be as precise as you can within the space of a line or two.
A type declaration (possibly inline), specifying the input and output types.
Test cases. A function without test cases is incomplete. Write the test cases first, please.
For this assignment, you must develop your solutions using the typed/racket language. If you haven’t seen them, you might be interested in these Hints on Using Typed Racket in CPE 430.
Your test cases must use the check-equal?, check-=, or check-exn forms.
Your solution should take the form of a single file.
Hand in your solution using the handin server. For help with the handin server, please see the course web page.
2.1 Handling Errors
All of your error messages must contain the string "ZODE". Essentially, this allows my test cases to distinguish errors correctly signaled by your implementation from errors in your implementation. To be more specific: any error message that doesn’t contain the string "ZODE" will be considered to be an error in your implementation.
Additionally, your error messages should be actually helpful. Since you are the primary consumer of your own error messages, making these error messages good in the first place should reduce your overall development time. There are two parts to this: first, the error message should include text that actually indicates what the programmer did wrong. Second, include the text of the user’s program, so they (actually you) can figure out how to fix it. See lab 3 for an example of how to do this. (Apologies in advance if I renumber the labs and fail to update this paragraph....)
2.2 Progress Toward Goal comment
Graders are happier when they know what to expect. Your final submission should start with a short one- or two-line comment indicating how far you got through the project. Ideally, this would just be: “Full project implemented.” But if you only implemented, say, squazz and blotz, and didn’t get to frob or dringo, please indicate this in the comment, so that we don’t spend all our time searching for bits that aren’t there.
3 The Assignment
This assignment will build on the Assignment 4 interpeter.
4 Syntax of ZODE7
A ZODE7 program consists of a single expression.
The concrete syntax of the ZODE7 expressions with these additional features can be captured with the following EBNFs.
| ‹expr› | ::= | ‹num› |
|
| | | ‹id› |
|
| | | ‹string› |
|
| | | { if : ‹expr› : ‹expr› : ‹expr› } |
|
| | | { locals : ‹clauses› : ‹expr› } |
|
| | | { local-rec : ‹id› = ‹lamb-def› : ‹expr› } |
|
| | | ‹lamb-def› |
|
| | | { ‹expr› ‹expr›* } |
| ‹lamb-def› | ::= | { lamb : [‹ty›‹id›]* -> ‹ty› : ‹expr› } |
| ‹clauses› | ::= | ‹ty› ‹id› = ‹expr› |
|
| | | ‹ty› ‹id› = ‹expr› : ‹clauses› |
| ‹ty› | ::= | num |
|
| | | bool |
|
| | | str |
|
| | | { ‹ty›* -> ‹ty› } |
| ‹top-level-constants› | ::= | true |
|
| | | false |
| ‹top-level-functions› | ::= | + |
|
| | | - |
|
| | | * |
|
| | | / |
|
| | | <= |
|
| | | num-eq? |
|
| | | str-eq? |
|
| | | substring |
| ‹keywords› | ::= | if |
|
| | | locals |
|
| | | local-rec |
|
| | | = |
|
| | | -> |
|
| | | lamb |
... where an id is not one of the keywords.
4.1 Primitives
procedure
(+ a b) → num
a : num b : num
procedure
(- a b) → num
a : num b : num
procedure
(* a b) → num
a : num b : num
procedure
(/ a b) → num
a : num b : num
procedure
(<= a b) → boolean
a : num b : num
procedure
(num-eq? a b) → boolean
a : num b : num
procedure
(str-eq? a b) → boolean
a : str b : str
procedure
(substring str begin end) → string
str : str begin : num end : num
value
true : boolean
value
false : boolean
4.2 Type Checking
Implement a type checker for your language. Note that since functions must now come annotated with types for arguments, you will need to have a type parser that parses types. You should definitely make a separate function for this. Note that the types of functions are extended to handle multiple arguments. So, for instance, the type {num str -> bool} refers to a function that accepts two arguments, a number and a string, and returns a boolean.
All type rules are standard.
4.2.1 Binops
Type-checking binops is more or less as you might expect. For instance, a + should receive two numbers, and will produce a number. The <= operator will take two numbers, and return a boolean.
The equal? operator is a bit different. Specifically, we don’t have subtyping, and we treat the equality operator as a function in the environment, so it must have a single type. In order to simplify our lives, we split it into two equality operators; one that only works for numbers, called num-eq?, with type {num num -> bool}, and one that only works for strings, called str-eq?, with type {str str -> bool}.
5 Recursion
This assignment will add recursion, using a rec form. Implement the rec form as described in the book. Note that you will need some kind of mutation in order to create the closure as a cyclic data structure. For this assignment, we’ll keep things simple and just use Typed Racket’s built-in mutation in order to eliminate the need for store-passing style. I recommend designing your (runtime) environment as a mapping from names to boxed values.
Here’s an example of a simple program that defines a recursive function to compute perfect squares in a kind of silly way:
{locals-rec : square-helper = {lamb : [num n] -> num : {if : {<= n 0} : 0 : {+ n {square-helper {- n 2}}}}} : {locals : {num -> num} square = {lamb : [num n] -> num : {square-helper {- {* 2 n} 1}}} : {square 13}}}
6 A note on desugaring of locals
In this assignment, the locals form can still be implemented using desugaring. This introduces a small wrinkle into type-checking, because locals expands into an AppC containing a LamC, and our new LamC will almost certainly include a return type, but in the case of a locals, we don’t know what that return type should be until we do type-checking.
To solve this, we suggest that the return type field in your LamC be of type (Option Type); in Typed Racket, (Option T) is just a shorthand for (U T False), the union of the given type with the type False, which contains only the false value.
In the type checker, this will work out fine; if there’s no specified return type to check, your type-checker can return the computed type for the body as the return type of the function.
7 Suggested Implementation Strategy
Here are some of the steps that I followed. I wrote test cases for every step before implementing it.
Add a data definition for Types.
Add a parser for the language of types.
Develop a type checker that only works for constants.
Add a definition for Type Environments, mapping symbols to types.
Extend your type checker to handle variables and if expressions.
Extend your type checker to handle applications. (Since you don’t have types for LamC terms yet, your test cases should just use primitive functions.)
Extend your AST nodes for LamC nodes to include types with the parameters.
Extend your type checker to handle LamC terms.
Extend your parser and type checker to handle the modified locals and local-rec terms.
Add your type checker to top-interp. Type checking should happen after parsing and before interpretation.
Update your environment to map names to boxed values.
Fix breakage by adding boxing and unboxing to environment operations.
As in the previous assignment, add code to recreate the top-env on every call to top-interp.
Add the rec form to the set of ExprCs.
Fix breakage temporarily by signaling an error in interp on instances of RecC.
Develop the parser clause for RecC. Test cases first!
Develop the interpreter clause for RecC. Test cases first!
8 Interface
Make sure that you include the following functions, and that they match this interface:
procedure
(parse s) → ExprC
s : Sexp
procedure
(parse-type s) → Ty
s : Sexp
procedure
(type-check e env) → Ty
e : ExprC env : TEnv
value
base-tenv : TEnv
procedure
(interp e env) → Value
e : ExprC env : Environment
procedure
(top-interp s) → string
s : s-expression
9 Hint
As on other assignments, uses of "cast" other than in the parser, when using a question-mark-pattern with ellipses, are likely to be bugs. The autograding framework highlights these as likely bug locations. It’s almost certainly not a good idea to use casting in your code.
10 Minimum Viable Product
Some of you may be thinking about submitting your solution to assignment 4 for this assignment. Good idea! It won’t actually do any type checking, but it will certainly still pass some of the tests.
If you’re thinking of doing this, the most important change you’ll have to make to your code is to handle the presence of types in lamb and locals terms; update your parser to just ignore these extra pieces, and **profit***!