ISemantics




We augment this operational semantics to keep track of metadata and potential memory errors. A Result can be RVal which is a value with metadata (the result of a right-hand-side), an address RLoc (the result of a left-hand-side), ROK which is the the result of a successful command, Abort when memory-safety check fails, or SystemError when frame allocation, frame deallocation, malloc or free fails.

Definition Data := (Value*Meta)%type.

Inductive Result: Set :=
| ROK : Result
| RLoc : (Loc*Key*LockAddr) -> Result
| RVal : Data -> Result
| Abort : Result
| SystemError : Result
.

Definition Error (R: Result) := (R = Abort) \/ (R = SystemError).

To fully capture the instrumentation performed by CETS, it is also necessary for the operational semantics to propagate some type information. For example, before doing a dereference, CETS must do the bounds check according to pointers' base and bound information and the size of their types. Similarly, when casting integers to pointers CETS sets pointers' metadata to zero.

At cast, and assignment, metedata should be updated w.r.t the source and the target types. Integers can be cast to pointers with the restriction that the base and end fields are set to zero. In conjunction with assertion_spatial_ptr checks, this restriction ensures that pointers obtained from Integers cannot be dereferenced. In reality, casts from pointers to integers and back to pointers are actually quite rare in C programs. CETS' implementation allows users to set base and bound for such a cast.
Definition dataCast (to:AType) (d:Data) (from:AType) : Data :=
  match (to, d, from) with
  | (A_Pointer p, (v, (b, e, r, ri)), A_Int) => (v, (0, 0,0,0))
  | _ => d
  end.

At dereference, we ensure that the location is safe to access w.r.t its metadata.
Definition assertion_spatial_ptr (v:Value) (be:Meta) (t:AType) :=
  match (v, be) with
  | (l, (b, e, r, ri)) =>
     match t with
     | A_Pointer (P_AType t) =>
         b <> 0 /\ b <= l /\ l + sizeOfAType t <= e
     | A_Pointer (P_Struct s) =>
         b <> 0 /\ b <= l /\ l + sizeOfStruct s <= e
     | A_Pointer (P_Name n) =>
       match (typeTable n) with
       | Some s =>
           b <> 0 /\ b <= l /\ l + sizeOfStruct s <= e
       | None => False
       end
     | A_Pointer P_Void => False
     | A_Int => True
     end
  end.

Definition assertion_temporal_ptr (be:Meta) (t:AType) RI :=
  match be with
  | (b, e, r, ri) =>
     match t with
     | A_Pointer (P_AType t) => r = RI ri
     | A_Pointer (P_Struct s) => r = RI ri
     | A_Pointer (P_Name n) =>
       match (typeTable n) with
       | Some s => r = RI ri
       | None => False
       end
     | A_Pointer P_Void => False
     | A_Int => True
     end
  end.

Definition assertion_ptr (v:Value) (be:Meta) (t:AType) RI :=
  assertion_spatial_ptr v be t /\ assertion_temporal_ptr be t RI.

These considerations lead to three large-step evaluation rules. Left-hand-sides evaluate to a result r (which must be an address l if successful) and its atomic type a. Such an evaluation has no effect on the environment, so we write the rule as: E |- lhs -->* r : a. Evaluating a right-hand-side expression also yields a typed result, but it may also modify the environment E via memory allocation, causing it to E': E |- rhs -->* r : a -| E' (where r must be v with metadata b, e, k, and la if successful). Commands simply evaluate to a result, which must be ROK if successful, and final environment: E |- c -->* r -| E', where assignment statements or free can update the environment.

Space precludes showing the full set of operational rules (most of which are completely standard or obvious). Instead, we highlight a few cases that illustrate the salient features. For instance, the following examples show how to evaluate a pointer dereference when the bounds check succeeds S_Deref and fails S_Deref_Abort:

     E |- lhs -->* l : a*
     [accessMemMeta] E.M l = l'_(b,e,k,la) 
     E.RI la  = k
     b <= l' <= e - sizeof(a) 
     --------------------------------------- :: S_Deref
     E |- *lhs -->* l' : a  
     E |- lhs -->* l : a*
     [accessMemMeta] E.M l = l'_(b,e,k,la) 
     E.RI la  <> k \/ (~ b <= l' <= e - sizeof(a))
     ----------------------------------------------- :: S_Deref_Abort
     E |- *lhs -->* Abort : a


Here, lhs is first evaluated to l, which is the address of a pointer with type a*. If the address l is allocated, the function accessMem (E.M) l returns l'(b,e,k,la) the value of that pointer l' and its metadata b, e, k, and la. Because pointers can be out-of-bounds due to pointer arithmetic, before doing the dereference, these rules check whether l' is within the bounds, and yield the error Abort when the bounds check fails. The checking of E.RI la = k ensures the memory pointed by this pointer is not deallocated or reallocated. Note that it is a memory violation if accessMem (E.M) l = None, in which case neither rule above applies. However, according to the type safety properties described later, accessMem will not fail at runtime.

As another example, the operational semantics performs bounds checking for access through a struct field using the following rule:

     E |- lhs -->* l : s*
     [accessMem] (E.M) l = l'_(b,e,k,la)
     b <= l' <= e - sizeof(s)
     [getStructOffSet] s id = offset
     [getStructType] s id = t'
     E.RI la = k
     ----------------------------------------------------------------------- :: S_StructPos
     E |- lhs->id -->* (l'+offset) : a


Here, getStructOffSet returns a field offset, which is less than or equal to sizeof(s), along with the atomic type a of the field in struct type s.

Other rules keep track of pointers' values and their metadata when evaluating pointer arithmetic and type casts:

     E |- ptr -->* l_(b,e,k,la) : p* -| E'
     E' |- i -->* n_(b',e',k',la') : int -| E''
     l' = l + n * sizeof(p)
     --------------------------------------------------------------------- :: S_Add_Ptr_Int
     E |- ptr+i -->* l'_(b,e,ca,ci) :p* -| E'
     E |- rhs -->* v_(b,e,k,la) : a -| E'
     (b', e', k', la') =  (a == int) ? (0,0,0,0) : (b,e,k,la)
     ---------------------------------------------------------- :: S_Cast
     E |- (a')rhs -->* v_(b',e',k',la') : a' -| E'


As shown above, the metadata on the results of a pointer arithmetic operation is just the metadata associated with the original pointer. Casts also propagate the metadata unchanged, except in the case of casting an integer to a pointer, in which case the base and bound are both set to zero. Subsequent bounds checks on the resulting pointers will fail, ensuring that pointers created from integers cannot be dereferenced by default. This rule follows the CETS implementation; as a consequence, when casting an integer to a pointer, the only sound choice is to zero-out the metadata. This approach is conservative, but in practice C programs rarely cast from pointers to integers and then back again, which is the case that might benefit from more accurate metadata propagation.

At runtime pointers can be temporarily out-of-bounds or with an inconsistent key. The operational semantics will not yield an error in such cases until the program attempts a dereference through the illegal pointer. Once an assertion check fails or a SystemError occurs, the rules propagate memory errors to the top level, analogously to raising an exception:

     E |- c1 -->* r -| E'
     r is [Abort] or [SystemError]
     -------------------------------- :: S_Seq_ErrorProp1
     E |- c1;c2 -->* r -| E'
     
     E |- c1 -->* [ROK] -| E'
     E' |-  c2 -->* r -| E''
     r is [RAbort] or [SystemError]
     -------------------------------- :: S_Seq_ErrorProp2
     E |- c1;c2 -->* r -| E''

Inductive s_lhs : Env -> c_lhs -> Result -> AType -> Prop :=
  
  | S_GVar : forall (E:Env) (v: Var) (l: nat) (t: AType),
     (Stack2frame (stack E)) v = None ->
     lookUpGlobal E.(global) v = Some (l, t) ->
     wf_AType t ->
     s_lhs E (C_Var v) (RLoc (l, 1, gri)) t

  | S_SVar : forall (E:Env) (v: Var) (l: nat) (t: AType) r ri,
     lookUpStack E.(stack) v = Some (l, t, r, ri) ->
     wf_AType t ->
     s_lhs E (C_Var v) (RLoc (l, r, ri)) t
  
  | S_Deref : forall E lhs loc r ri be' loc' r' ri' t,
     s_lhs E lhs (RLoc (loc, r, ri)) (A_Pointer (P_AType t)) ->
     accessMemMeta E.(mem) E.(mmeta) loc = Some (loc', (be', r', ri')) ->
     assertion_ptr loc' (be', r', ri') (A_Pointer (P_AType t)) E.(revInfo) ->
     s_lhs E (C_Deref lhs) (RLoc (loc', r', ri')) t
  | S_Deref_ErrorProp : forall E lhs R t t',
     s_lhs E lhs R t -> Error R ->
     s_lhs E (C_Deref lhs) R t'
  | S_Deref_Abort : forall E lhs loc r ri be' loc' t,
     s_lhs E lhs (RLoc (loc, r, ri)) (A_Pointer (P_AType t)) ->
     accessMemMeta E.(mem) E.(mmeta) loc = Some (loc', be') ->
     ~assertion_ptr loc' be' (A_Pointer (P_AType t)) E.(revInfo) ->
     s_lhs E (C_Deref lhs) Abort t
  
  | S_StructPos : forall E lhs id loc r ri s t' offset be' r' ri' loc',
     s_lhs E lhs (RLoc (loc, r, ri)) (A_Pointer (P_Struct s)) ->
     accessMemMeta E.(mem) E.(mmeta) loc = Some (loc', (be', r', ri')) ->
     getStructOffset s id = Some offset ->
     getStructType s id = Some t' ->
     assertion_ptr loc' (be', r', ri') (A_Pointer (P_Struct s)) E.(revInfo) ->
     s_lhs E (C_StructPos lhs id) (RLoc (loc'+offset, r', ri')) t'
  | S_StructPos_ErrorProp : forall E lhs id R t t',
     s_lhs E lhs R t -> Error R ->
     s_lhs E (C_StructPos lhs id) R t'
  | S_StructPos_Abort : forall E lhs id loc r ri s t' offset be' r' ri' loc',
     s_lhs E lhs (RLoc (loc, r, ri)) (A_Pointer (P_Struct s)) ->
     accessMemMeta E.(mem) E.(mmeta) loc = Some (loc', (be', r', ri')) ->
     getStructOffset s id = Some offset ->
     getStructType s id = Some t' ->
     ~ assertion_ptr loc' (be', r', ri') (A_Pointer (P_Struct s)) E.(revInfo)->
     s_lhs E (C_StructPos lhs id) Abort t'
  
  | S_NamePos : forall E lhs id loc r ri n s t' offset be' r' ri' loc',
     s_lhs E lhs (RLoc (loc, r, ri)) (A_Pointer (P_Name n)) ->
     accessMemMeta E.(mem) E.(mmeta) loc = Some (loc', (be', r', ri')) ->
     typeTable n = Some s ->
     getStructOffset s id = Some offset ->
     getStructType s id = Some t' ->
     assertion_ptr loc' (be', r', ri') (A_Pointer (P_Name n)) E.(revInfo) ->
     s_lhs E (C_NamePos lhs id) (RLoc (loc'+offset, r', ri')) t'
  | S_NamePos_ErrorProp : forall E lhs id R t t',
     s_lhs E lhs R t -> Error R ->
     s_lhs E (C_NamePos lhs id) R t'
  | S_NamePos_Abort : forall E lhs id loc r ri n s t' offset be' r' ri' loc',
     s_lhs E lhs (RLoc (loc, r, ri)) (A_Pointer (P_Name n)) ->
     accessMemMeta E.(mem) E.(mmeta) loc = Some (loc', (be', r', ri')) ->
     typeTable n = Some s ->
     getStructOffset s id = Some offset ->
     getStructType s id = Some t' ->
     ~ assertion_ptr loc' (be', r', ri') (A_Pointer (P_Name n)) E.(revInfo) ->
     s_lhs E (C_NamePos lhs id) Abort t'
  .

Inductive s_rhs : Env -> c_rhs -> Result -> AType -> Env -> Prop :=
  
  | S_Const : forall E n,
     s_rhs E (C_Const n) (RVal (n, (0,0,0,0))) A_Int E
  
  | S_Lhs : forall E lhs loc r ri t ds,
     s_lhs E lhs (RLoc (loc, r, ri)) t ->
     accessMemMeta E.(mem) E.(mmeta) loc = Some ds->
     s_rhs E (C_Lhs lhs) (RVal ds) t E
  | S_Lhs_ErrorProp : forall E lhs R t t',
     s_lhs E lhs R t -> Error R ->
     s_rhs E (C_Lhs lhs) R t' E
  
  | S_Ref : forall E lhs loc r ri t,
     s_lhs E lhs (RLoc (loc, r, ri)) t ->
     wf_AType (A_Pointer (P_AType t)) ->
     s_rhs E (C_Ref (P_AType t) lhs)
                   (RVal
                      ((loc, (loc, loc+sizeOfAType t, r, ri))))
                   (A_Pointer (P_AType t)) E
  | S_Ref_ErrorProp : forall E lhs R t,
     s_lhs E lhs R t -> Error R ->
     s_rhs E (C_Ref (P_AType t) lhs) R (A_Pointer (P_AType t)) E
  
  | S_Add_Int_Int : forall E E' E'' rhs1 rhs2 n1 n2 be1 be2,
     s_rhs E rhs1 (RVal (n1, be1)) A_Int E' ->
     s_rhs E' rhs2 (RVal (n2, be2)) A_Int E'' ->
     s_rhs E (C_Add rhs1 rhs2)
                   (RVal (n1+n2, (0,0,0,0)))
                   A_Int E''
  | S_Add_Int_Int_ErrorProp1 : forall E E' rhs1 R t rhs2 t',
     s_rhs E rhs1 R t E' -> Error R ->
     s_rhs E (C_Add rhs1 rhs2) R t' E'
  | S_Add_Int_Int_ErrorProp2 : forall E E' E'' rhs1 ds1 rhs2 R t t',
     s_rhs E rhs1 (RVal ds1) A_Int E' ->
     s_rhs E' rhs2 R t E'' -> Error R ->
     s_rhs E (C_Add rhs1 rhs2) R t' E''
  
  | S_Add_Ptr_Int : forall E E' E'' rhs1 rhs2 p n1 n2 be1 be2 size,
     s_rhs E rhs1 (RVal (n1, be1)) (A_Pointer p) E' ->
     s_rhs E' rhs2 (RVal (n2, be2)) A_Int E'' ->
     wf_AType (A_Pointer p) ->
     sizeOfPType p = Some size ->
     s_rhs E (C_Add rhs1 rhs2)
                   (RVal (n1+n2*size, be1))
                   (A_Pointer p) E''
  | S_Add_Ptr_Int_ErrorProp1 : forall E E' rhs1 R t rhs2 t',
     s_rhs E rhs1 R t E' -> Error R ->
     s_rhs E (C_Add rhs1 rhs2) R t' E'
  | S_Add_Ptr_Int_ErrorProp2 : forall E E' E'' rhs1 ds1 rhs2 R p t' t'',
     s_rhs E rhs1 (RVal ds1) (A_Pointer p) E' ->
     s_rhs E' rhs2 R t' E'' -> Error R ->
     s_rhs E (C_Add rhs1 rhs2) R t'' E''
  
  | S_Cast : forall E E' rhs ds t t' d,
     s_rhs E rhs (RVal ds) t E' ->
     wf_AType t' ->
     dataCast t' ds t = d ->
     s_rhs E (C_Cast t' rhs) (RVal d) t' E'
  | S_Cast_ErrorProp : forall E E' rhs R t t' t'',
     s_rhs E rhs R t E' -> Error R ->
     s_rhs E (C_Cast t' rhs) R t'' E'
  
  | S_Size : forall E t size,
     wf_PType t ->
     sizeOfPType t = Some size ->
     s_rhs E (C_Size t) (RVal (size, (0,0,0,0))) A_Int E
  
  | S_Alloc : forall E E' E'' rhs loc ri n be p size,
    s_rhs E rhs (RVal (n, be)) A_Int E'->
    wf_AType (A_Pointer p) ->
    sizeOfPType p = Some size ->
    size > 0 ->
    malloc E' n = Some (E'', loc, ri)->
    s_rhs E (C_Alloc p rhs)
                  (RVal (loc, (loc, loc+n, E'.(rev), ri)))
                  (A_Pointer p)
                  (MkEnv E''.(mem) E''.(mmeta) E''.(amb) E''.(global) E''.(stack) E''.(revInfo) E''.(rev))
  | S_Alloc_ErrorProp : forall E E' rhs R t t' p,
    s_rhs E rhs R t E'-> Error R ->
    wf_AType (A_Pointer p) ->
    s_rhs E (C_Alloc p rhs) R t' E'
  | S_Alloc_OutofMem : forall E E' rhs n be t' p size,
    s_rhs E rhs (RVal (n, be)) A_Int E'->
    wf_AType (A_Pointer p) ->
    sizeOfPType p = Some size ->
    size > 0 ->
    malloc E' n = None->
    s_rhs E (C_Alloc p rhs) SystemError t' E'
  .

Inductive s_cmd : Env -> c_cmd -> Result-> Env->Prop :=
  
  | S_Skip : forall E,
    s_cmd E (C_Skip) ROK E
  
  | S_Seq : forall E E' E'' c1 c2,
    s_cmd E c1 ROK E' ->
    s_cmd E' c2 ROK E'' ->
    s_cmd E (C_Seq c1 c2) ROK E''
  | S_Seq_ErrorProp1 : forall E E' R c1 c2,
    s_cmd E c1 R E' -> Error R->
    s_cmd E (C_Seq c1 c2) R E'
  | S_Seq_ErrorProp2 : forall E E' E'' c1 c2 R,
    s_cmd E c1 ROK E' ->
    s_cmd E' c2 R E'' -> Error R->
    s_cmd E (C_Seq c1 c2) R E''
  
  | S_Assign_Ptr : forall E E' lhs rhs loc r ri tl ds tr M'' MM'' d,
    s_lhs E lhs (RLoc (loc, r, ri)) tl ->
    s_rhs E rhs (RVal ds) tr E' ->
    isPtr tl ->
    dataCast tl ds tr = d ->
    storeMemMeta E'.(mem) E'.(mmeta) loc d = Some (M'', MM'') ->
    s_cmd E (C_Assign lhs rhs) ROK (MkEnv M'' MM'' E'.(amb) E'.(global) E'.(stack) E'.(revInfo) E'.(rev))
  | S_Assign_NPtr : forall E E' lhs rhs loc r ri tl ds tr M'' d,
    s_lhs E lhs (RLoc (loc, r, ri)) tl ->
    s_rhs E rhs (RVal ds) tr E' ->
    ~isPtr tl ->
    dataCast tl ds tr = d ->
    storeMem E'.(mem) loc (fst d) = Some M''->
    s_cmd E (C_Assign lhs rhs) ROK (MkEnv M'' E'.(mmeta) E'.(amb) E'.(global) E'.(stack) E'.(revInfo) E'.(rev))
  | S_Assign_ErrorProp1 : forall E lhs rhs tl R,
    s_lhs E lhs R tl -> Error R->
    s_cmd E (C_Assign lhs rhs) R E
  | S_Assign_ErrorProp2 : forall E E' lhs rhs loc tl R tr,
    s_lhs E lhs (RLoc loc) tl ->
    s_rhs E rhs R tr E' -> Error R->
    s_cmd E (C_Assign lhs rhs) R E'

  | S_Free : forall E lhs loc r ri p E',
    s_lhs E lhs (RLoc (loc, r, ri)) (A_Pointer p) ->
    free E loc = Some E' ->
    s_cmd E (C_Free lhs) ROK E'
  | S_Free_ErrorProp : forall E lhs tl R,
    s_lhs E lhs R tl -> Error R->
    s_cmd E (C_Free lhs) R E
  | S_Free_Error : forall E lhs loc r ri p,
    s_lhs E lhs (RLoc (loc, r, ri)) (A_Pointer p) ->
    free E loc = None ->
    s_cmd E (C_Free lhs) SystemError E
  | S_Call : forall E n f c R E' E'' E''',
    funTable n = Some (f, c) ->
    createFrame E f = Some E' ->
    s_cmd E' c R E'' ->
    ~ Error R ->
    destroyFrame E'' = Some E''' ->
    s_cmd E (C_Call n) R E'''
  | S_Call_ErrorProp : forall E n f c R E' E'',
    funTable n = Some (f, c) ->
    createFrame E f = Some E' ->
    s_cmd E' c R E'' ->
    Error R ->
    s_cmd E (C_Call n) R E''
  | S_Call_Error1 : forall E n f c,
    funTable n = Some (f, c) ->
    createFrame E f = None ->
    s_cmd E (C_Call n) SystemError E
  | S_Call_Error2 : forall E n f c R E' E'',
    funTable n = Some (f, c) ->
    createFrame E f = Some E' ->
    s_cmd E' c R E'' ->
    ~ Error R ->
    destroyFrame E'' = None ->
    s_cmd E (C_Call n) SystemError E''
  .