Sugar project

  1. Demo for pure untyped lambda calculus
  2. Grammar for simply-typed lambda calculus: lambda.s
  3. Grammar for a subset of Java: java.s
  4. Grammar for Apollo the security-typed language: apollo.s
  5. Grammar for Sugar itself: sugar.s

  6. Package download: sugar-0.1.tgz
  7. Requirements and resources
 

Sugar: a concrete syntax grammar for generating scanners, parsers,
printers and abstract-syntax definitions. 

The novelty is a compact notation for precedence, associativity and
repetition that can easily be translated to the standard LALR grammar
of YACC. This tool also automatically keeps track line and column
numbers and diagnoses parse errors.

Brief introduction to the concrete syntax grammar: 
- each rule is labelled with the name of its type constructor (App: e e)
- concrete tokens are enclosed in single quotes ('fun' id '.' e)
- options and repetitions are specified using
  - question mark ? for an optional token
  - star * for zero or more such tokens
  - plus + for one or more such tokens
  - star star and literal ** ',' for zero or more with separator
  - plus plus and literal ++ ',' for one or more with separator
- builtin lexical classes: id, num, str (c-style identifiers, numbers, strings)


----------------------------------------
Quick Start
----------------------------------------
> tar zxvf ocaml-3.07.2.tgz  # and install...

> tar zxvf sugar-0.1.tgz
> cd sugar-0.1
> make
> make jfull


----------------------------------------
Outline
----------------------------------------

- example (pure.s and p*)
- grammar rules (sugar.s)
- grammar items (sugar.s)
- location info (pure.s and psugar.ml)
- error diagnosis (lambda.s and lparse.mly)
- challenges (sugar.ml)
- resources


pure.s pure.ml pure.src
  Skeleton interpreter for pure lambda calculus

lambda.s lambda.ml lambda.src
  Skeleton interpreter for extended lambda calculus

java.s java.ml java.src
  Skeleton interpreter for Java

apollo.s apollo.ml apollo.src
  Skeleton interpreter for Apollo [2] the security-typed language

sugar.s sugar.ml sugar.src
  Interpreter for Sugar [1] the concrete syntax grammar


----------------------------------------
Example
----------------------------------------

1. Concrete-syntax definition (pure.s)

   e:
   > Fun: 'fun' id ++ ',' '.' e
   < App: e e
   =  Atom: '(' e ')'
   | Id: id
   ;
    
2. Abstract-syntax definition (psugar.ml)

   and e =
   | Fun of id list * e
   | App of e * e
   | Atom of e
   | Id of id
      
2. Scanner definition (pscan.mll)

   | "fun" { Kfun (info lexbuf) }
   | "," { Comma (info lexbuf) }
   | "." { Dot (info lexbuf) }
    

3. Parser definition (pparse.mly)

   e: e1 { $1 }
   
   e1:
   | e2 { $1 }
   | Kfun id_plus_comma Dot e1 { Fun ($1,$2,$4) }
   
   e2:
   | e3 { $1 }
   | e2 e3 { App ($1,$1,$2) }
   
   e3:
   | Lparen e Rparen { Atom ($1,$2) }
   | id { Id ($1,$1) }
   
   id_plus_comma:
   | id { [$1] }
   | id Comma id_plus_comma { $1 :: $3 }
    
4. Printer definition (pprint.ml)

   and se = function
   | Fun (id2,e3) -> sf "fun %s . %s" (slist sid "," id2) (se e3)
   | App (e2,e3) -> sf "%s %s" (se e2) (se e3)
   | Atom (e) -> sf "( %s )" (se e)
   | Id (id) -> sf "%s" id
    
5. TeX definition (lsugar.tex)

   \[\begin{array}{rcll}
   \N{e} &::>& \verb!fun!\ \verb!id!^{\N{+}}_{\verb!,!}\ \verb!.!\ \verb!e! 
     & \text{(Fun)} \\
   &::<& \verb!e!\ \verb!e! & \text{(App)} \\
   &::=& \verb!(!\ \verb!e!\ \verb!)! & \text{(Atom)} \\
   && \verb!id! & \text{(Id)}
   
   \end{array}\]
   

----------------------------------------
Grammar rules (sugar.s)
----------------------------------------

1. Alias: id ':' item+ ';'

2. Union: id ':' group+ ';'


group: assoc case ++ '|';
case: id ':' item*;


----------------------------------------
Grammar items (sugar.s)
----------------------------------------

1. Literal string (Literal): literal

2. Rule identifier (Id): id

3. Literal mark (Mark): literal '?'

4. Rule option (Option): id '?'

5. Star repetition (Star): id '*'

6. Plus repetition (Plus): id '+'

7. Star repetition with separator (Sstar): id '**' literal

8. Plus repetition with separator (Pplus): id '++' literal


----------------------------------------
Location info (pure.s and psugar.ml)
----------------------------------------

Each syntax type is extended with location info:

   e:
   > Fun: 'fun' id ++ ',' '.' e
   < App: e e
   =  Atom: '(' e ')'
   | Id: id
   ;

--->
   
   and e =
   | Fun of z * id list * e
   | App of z * e * e
   | Atom of z * e
   | Id of z * id
 
where z is of type "string", representing

  type info = (filename, linenum, colnum)


----------------------------------------
Error diagnosis (lambda.s and lparse.mly)
----------------------------------------

Each syntax rule is extended with error diagnosis:

   e:
   > If: 'if' e 'then' e 'else' e

---> 
   
   e:
   ....
   | Kif error { abort $1 [("If: 'if' e 'then' e 'else' e",8)] }
   | Kif e error { abort $1 [("If: 'if' e 'then' e 'else' e",10)] }
   | Kif e Kthen error { abort $3 [("If: 'if' e 'then' e 'else' e",17)] }
   | Kif e Kthen e error { abort $3 [("If: 'if' e 'then' e 'else' e",19)] }
   | Kif e Kthen e Kelse error { abort $5 [("If: 'if' e 'then' e 'else' e",26)] } 


Such that if the input file contains just the keyword "if", the
compiler complains about parse error and shows diagnosis of the prefix
and the full rule that the input violates.

test.src:05:01: Parse error
  prefix:  If: 'if'
  rule  :  If: 'if' e 'then' e 'else' e



----------------------------------------
Challenges
----------------------------------------
- support full java grammar
- generate template for typing and evaluation
- support hash consing for c/java
- output for lex/yacc of haskell, c/java (nominal)
- support lexical grammar to specify whitespace, comment, newline...
- more static check (type system for sugar?): 
  - duplicate rules
  - empty rules
  - occurence-check
  - no left recursion without a left non-terminal


----------------------------------------
Resources
----------------------------------------

[1] Sugar: a concrete syntax grammar 
    http://www.cis.upenn.edu/~stse/sugar

[2] Apollo: System F with subtyping and information flow
    http://www.cis.upenn.edu/~stse/apollo

[3] Ordered Context-Free Grammars, Ziemowit Laski 
    http://www.ziemas.net/download/ocfg_tr.pdf

[4] Labelled BNF grammar
    http://www.cs.chalmers.se/~markus/BNFC/

Stephen Tse 
July 30, 2004


Last modified: Sat Jul 31 09:43:19 EDT 2004