Type Soundness
--------------

Today we're going to show that if we evaluate programs that type check,
they won't crash. We'll show this for a reduced language:

(types)        t ::= num | t1 -> t2
(expressions)  e ::= id | (lambda (t id) e) | (e e) | i | (+ e e)
(values)       v ::= i | (closure t id e env)

Evaluation rules
----------------

The first step is to define evaluation as an inductively defined
relation. This definition should exactly match the Scheme
implementation of evaluation.

-----------------
(eval i env) = i

(eval e1 env) = i1
(eval e2 env) = i2
i1 + i2 = i
------------------------
(eval (+ e1 e2) env) = i

-------------------------------
(eval id env) = (lookup env id)

---------------------------------------------------
(eval (lambda (t id) e) env) = (closure t id e env)

(eval e1 env) = (closure t id e1' env')
(eval e2 env) = v
(eval e1' (extend env' id v)) = v'
---------------------------------------
(eval (e1 e2) env) = v'

Note that the body e1' is evaluated in the closure's saved environment
env', extended with the argument, and not in the caller's env.

Typing rules
------------

------------
E |- i : num

E |- e1 : num
E |- e2 : num
--------------------
E |- (+ e1 e2) : num

---------------
E |- id : E(id)

E+{x:t1} |- e : t2
---------------------------------
E |- (lambda (t1 x) e) : t1 -> t2

E |- e1 : t1 -> t2
E |- e2 : t1
------------------
E |- (e1 e2) : t2

Note: your book writes some of these rules as

(type-of-expression <<e>> [x=t1]tenv) = t2
----------------------------------------------------------
(type-of-expression <<(lambda (t1 x) e)>> tenv) = t1 -> t2

What do we want to prove?

We would like to show that no errors happen during evaluation. This is
hard to talk about now because an error happening means that there just
isn't any derivation. (eopl:error is a weird construct.)

So let's change evaluation to make these errors explicit. In doing so,
evaluation will be a *total* function: no matter what expression and
environment we give it, we will get some result.

We add a new value, called *wrong*, to indicate the error case.
If (eval e env) = wrong then some bad thing happened.
We also add new rules propagating this error. (Incidentally, if you
didn't have errors or exceptions in your language, this is how you would
*have* to write your interpreter...)

(eval e1 env) = v   (v is not an integer i)
----------------------------
(eval (+ e1 e2) env) = wrong

(eval e2 env) = v   (v is not an integer i)
----------------------------
(eval (+ e1 e2) env) = wrong

(if x is not in env)
--------------------
(eval x env) = wrong

(eval e1 env) = v   (v is not a closure)
--------------------------
(eval (e1 e2) env) = wrong

We also change the correct application rule so that v cannot be wrong:

(eval e1 env) = (closure t id e1' env')
(eval e2 env) = v   (v is not wrong; don't add wrong to the environment)
(eval e1' (extend env' id v)) = v'
---------------------------------------
(eval (e1 e2) env) = v'

(eval e2 env) = wrong
--------------------------
(eval (e1 e2) env) = wrong

Lemma: (eval e env) is always defined, no matter what e is.

Proof: structural induction on e.

  case e = id
    - we have a result whether or not id is in env.
  case e = i
       e = (lambda (t x) e)
    - already had cases for these.
  case e = (+ e1 e2)
    - cases for when e1 evaluates to i and not i,
      and for when e2 evaluates to i and not i.
  case e = (e1 e2)
    - cases for when e1 evaluates to a closure and when it doesn't,
      and for when e2 evaluates to a non-wrong value and when it doesn't.

Type soundness (Milner 1978):

  If empty |- e : t and (eval e empty) = v, then v : t.

(This theorem is based on an invariant: e doesn't change type as it
executes....)

Notes:

What does it mean for values to have types? We need to define that.

-------
i : num

E+{id:t1} |- e : t2
--------------------------------
(closure t1 id e env) : t1 -> t2

What is E? Take the environment env, and if x maps to v in env, then add
x:t to E, where v has type t. This works as long as there are no
recursively defined values in the environment (such as those defined by
letrec). In that case, the environment will have to remember the types
or something.

The wrong value does not have a type.

What does that mean about the theorem?
If e is well typed, then e will evaluate to a value that has a type.
i.e. e will *not* evaluate to wrong.
i.e. no type errors will occur while evaluating e.
i.e. "Well-typed programs will not go wrong."

How do we prove this theorem? By induction.

Actually, we need to prove a stronger theorem:

  If E |- e : t and (eval e env) = v and E and env are compatible,
  then v : t.

What do we mean by E and env are compatible?
If E(x) = t, then x must be bound in env, and for (apply-env env x) = v
it had better be the case that v : t.

Ok, now we can do the proof. We have to consider all of the cases for
(eval e env) = v. Much to the annoyance of programming language
researchers, there are 10 of them. Luckily, some are quite easy.

case: (eval i env) = i
  Easy case! The only typing rule for i gives t = num, and i : num.

case: (eval id env) = (apply-env env id)
  If E |- id : E(id), where E is compatible with env, then we know that
  env(id) : E(id) from the definition of compatible.

case: (eval id env) = wrong   (if id is not in env)
  By assumption, E |- id : t for some E compatible with env. If this is
  true, then there must be a mapping for id in env; otherwise, E
  wouldn't be compatible. So this case couldn't happen!

case: (eval (lambda (t1 id) e) env) = (closure t1 id e env)
  So E |- (lambda (t1 id) e) : t.
  The only way for this to happen is if
    t = t1 -> t2
    E+{id:t1} |- e : t2
  Now can we show that (closure t1 id e env) : t1 -> t2 ?
  Look at the definition of value typing: E is compatible with env, so
  it describes the types of the variables in env, and
  E+{id:t1} |- e : t2 is exactly the premise that rule needs.

case: (eval (+ e1 e2) env) = i
  So assume that E |- (+ e1 e2) : t for some compatible E.
  It must be the case that t = num. And i : num. Good.

case: (eval (+ e1 e2) env) = wrong
      where (eval e1 env) = closure or wrong.
  So assume that E |- (+ e1 e2) : num for some compatible E.
  It must be the case that E |- e1 : num.
  By induction, the closure or wrong must be of type num.
  But that is not the case: a closure only has an arrow type and wrong
  has no type at all, so this is a contradiction.
  So this case cannot happen at all.
  *Note: we really have to trust induction here, but this is the
  correct reasoning.*

case:
  (eval e1 env) = (closure t1 id e1' env')
  (eval e2 env) = v   (v not wrong)
  (eval e1' (extend env' id v)) = v'
  ---------------------------------------
  (eval (e1 e2) env) = v'

  We assume E |- (e1 e2) : t, so it must be the case that
  E |- e1 : t1 -> t and E |- e2 : t1.
  By induction: (closure t1 id e1' env') : t1 -> t and v : t1.
  So we know that E'+{id:t1} |- e1' : t for some E' compatible with
  env'. This is from the definition of value typing.
  Since v : t1, E'+{id:t1} is compatible with (extend env' id v).
  So by induction, v' : t, which is what we wanted to prove.

case: The other three cases are analogous to the next-to-last case.
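
The pieces used in this proof (evaluation with an explicit wrong value, the typing rules, and value typing) can be tried out on small programs. The following is only a minimal sketch in Python rather than the course's Scheme, and everything in it is an assumption made for illustration: expressions are encoded as nested tuples, environments as dictionaries, and the names `type_of`, `evaluate`, `type_of_value`, and `check_soundness` are hypothetical, not taken from the book or the course interpreter.

```python
from dataclasses import dataclass

# Illustrative encoding (an assumption, not the course's Scheme datatypes):
#   types:       "num" or ("->", t1, t2)
#   expressions: int | str (identifier) | ("lambda", t, id, body)
#                | ("app", e1, e2) | ("+", e1, e2)
NUM = "num"
WRONG = "wrong"  # the distinguished error value; it has no type


def arrow(t1, t2):
    return ("->", t1, t2)


@dataclass
class Closure:
    t: object     # declared parameter type
    param: str
    body: object
    env: dict     # environment saved when the lambda was evaluated


def type_of(e, tenv):
    """Return t with tenv |- e : t, or None if no typing rule applies."""
    if isinstance(e, int):
        return NUM
    if isinstance(e, str):
        return tenv.get(e)
    if e[0] == "+":
        ok = type_of(e[1], tenv) == NUM and type_of(e[2], tenv) == NUM
        return NUM if ok else None
    if e[0] == "lambda":
        _, t1, x, body = e
        t2 = type_of(body, {**tenv, x: t1})
        return arrow(t1, t2) if t2 is not None else None
    if e[0] == "app":
        tf, ta = type_of(e[1], tenv), type_of(e[2], tenv)
        if isinstance(tf, tuple) and ta is not None and tf[1] == ta:
            return tf[2]
    return None


def evaluate(e, env):
    """Return an int, a Closure, or WRONG instead of raising an error."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return env.get(e, WRONG)           # unbound identifier goes wrong
    if e[0] == "+":
        v1, v2 = evaluate(e[1], env), evaluate(e[2], env)
        if isinstance(v1, int) and isinstance(v2, int):
            return v1 + v2
        return WRONG                       # an operand was not an integer
    if e[0] == "lambda":
        _, t, x, body = e
        return Closure(t, x, body, env)
    if e[0] == "app":
        f, v = evaluate(e[1], env), evaluate(e[2], env)
        if not isinstance(f, Closure) or v == WRONG:
            return WRONG
        # The body runs in the closure's saved environment, extended.
        return evaluate(f.body, {**f.env, f.param: v})
    return WRONG


def type_of_value(v):
    """Return t with v : t, or None (WRONG has no type)."""
    if isinstance(v, int):
        return NUM
    if isinstance(v, Closure):
        # Derive E from the closure's environment, as in the notes.
        tenv = {x: type_of_value(u) for x, u in v.env.items()}
        t2 = type_of(v.body, {**tenv, v.param: v.t})
        return arrow(v.t, t2) if t2 is not None else None
    return None


def check_soundness(e):
    """If empty |- e : t, check that the value of e has type t."""
    t = type_of(e, {})
    return t is None or type_of_value(evaluate(e, {})) == t


good = ("app", ("lambda", NUM, "x", ("+", "x", 1)), 41)
bad = ("+", 1, ("lambda", NUM, "x", "x"))
```

Here `good` type checks at num and evaluates to an integer, while `bad` has no type and evaluates to wrong. Running `check_soundness` on a handful of programs only tests the theorem on those programs; it is not a substitute for the inductive proof.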