Sugar project
- Demo for pure untyped lambda calculus
- Grammar for simply-typed lambda calculus: lambda.s
- Grammar for a subset of Java: java.s
- Grammar for Apollo the security-typed language:
apollo.s
- Grammar for Sugar itself: sugar.s
- Package download: sugar-0.1.tgz
- Requirements and resources
Sugar: a concrete syntax grammar for generating scanners, parsers,
printers and abstract-syntax definitions.
The novelty is a compact notation for precedence, associativity and
repetition that can easily be translated to the standard LALR grammar
of YACC. This tool also automatically keeps track line and column
numbers and diagnoses parse errors.
Brief introduction to the concrete syntax grammar:
- each rule is labelled with the name of its type constructor (App: e e)
- concrete tokens are enclosed in single quotes ('fun' id '.' e)
- options and repetitions are specified using
- question mark ? for an optional token
- star * for zero or more such tokens
- plus + for one or more such tokens
- star star and literal ** ',' for zero or more with separator
- plus plus and literal ++ ',' for one or more with separator
- builtin lexical classes: id, num, str (c-style identifiers, numbers, strings)
----------------------------------------
Quick Start
----------------------------------------
> tar zxvf ocaml-3.07.2.tgz # and install...
> tar zxvf sugar-0.1.tgz
> cd sugar-0.1
> make
> make jfull
----------------------------------------
Outline
----------------------------------------
- example (pure.s and p*)
- grammar rules (sugar.s)
- grammar items (sugar.s)
- location info (pure.s and psugar.ml)
- error diagnosis (lambda.s and lparse.mly)
- challenges (sugar.ml)
- resources
pure.s pure.ml pure.src
Skeleton interpreter for pure lambda calculus
lambda.s lambda.ml lambda.src
Skeleton interpreter for extended lambda calculus
java.s java.ml java.src
Skeleton interpreter for Java
apollo.s apollo.ml apollo.src
Skeleton interpreter for Apollo [2] the security-typed language
sugar.s sugar.ml sugar.src
Interpreter for Sugar [1] the concrete syntax grammar
----------------------------------------
Example
----------------------------------------
1. Concrete-syntax definition (pure.s)
e:
> Fun: 'fun' id ++ ',' '.' e
< App: e e
= Atom: '(' e ')'
| Id: id
;
2. Abstract-syntax definition (psugar.ml)
and e =
| Fun of id list * e
| App of e * e
| Atom of e
| Id of id
2. Scanner definition (pscan.mll)
| "fun" { Kfun (info lexbuf) }
| "," { Comma (info lexbuf) }
| "." { Dot (info lexbuf) }
3. Parser definition (pparse.mly)
e: e1 { $1 }
e1:
| e2 { $1 }
| Kfun id_plus_comma Dot e1 { Fun ($1,$2,$4) }
e2:
| e3 { $1 }
| e2 e3 { App ($1,$1,$2) }
e3:
| Lparen e Rparen { Atom ($1,$2) }
| id { Id ($1,$1) }
id_plus_comma:
| id { [$1] }
| id Comma id_plus_comma { $1 :: $3 }
4. Printer definition (pprint.ml)
and se = function
| Fun (id2,e3) -> sf "fun %s . %s" (slist sid "," id2) (se e3)
| App (e2,e3) -> sf "%s %s" (se e2) (se e3)
| Atom (e) -> sf "( %s )" (se e)
| Id (id) -> sf "%s" id
5. TeX definition (lsugar.tex)
\[\begin{array}{rcll}
\N{e} &::>& \verb!fun!\ \verb!id!^{\N{+}}_{\verb!,!}\ \verb!.!\ \verb!e!
& \text{(Fun)} \\
&::<& \verb!e!\ \verb!e! & \text{(App)} \\
&::=& \verb!(!\ \verb!e!\ \verb!)! & \text{(Atom)} \\
&& \verb!id! & \text{(Id)}
\end{array}\]
----------------------------------------
Grammar rules (sugar.s)
----------------------------------------
1. Alias: id ':' item+ ';'
2. Union: id ':' group+ ';'
group: assoc case ++ '|';
case: id ':' item*;
----------------------------------------
Grammar items (sugar.s)
----------------------------------------
1. Literal string (Literal): literal
2. Rule identifier (Id): id
3. Literal mark (Mark): literal '?'
4. Rule option (Option): id '?'
5. Star repetition (Star): id '*'
6. Plus repetition (Plus): id '+'
7. Star repetition with separator (Sstar): id '**' literal
8. Plus repetition with separator (Pplus): id '++' literal
----------------------------------------
Location info (pure.s and psugar.ml)
----------------------------------------
Each syntax type is extended with location info:
e:
> Fun: 'fun' id ++ ',' '.' e
< App: e e
= Atom: '(' e ')'
| Id: id
;
--->
and e =
| Fun of z * id list * e
| App of z * e * e
| Atom of z * e
| Id of z * id
where z is of type "string", representing
type info = (filename, linenum, colnum)
----------------------------------------
Error diagnosis (lambda.s and lparse.mly)
----------------------------------------
Each syntax rule is extended with error diagnosis:
e:
> If: 'if' e 'then' e 'else' e
--->
e:
....
| Kif error { abort $1 [("If: 'if' e 'then' e 'else' e",8)] }
| Kif e error { abort $1 [("If: 'if' e 'then' e 'else' e",10)] }
| Kif e Kthen error { abort $3 [("If: 'if' e 'then' e 'else' e",17)] }
| Kif e Kthen e error { abort $3 [("If: 'if' e 'then' e 'else' e",19)] }
| Kif e Kthen e Kelse error { abort $5 [("If: 'if' e 'then' e 'else' e",26)] }
Such that if the input file contains just the keyword "if", the
compiler complains about parse error and shows diagnosis of the prefix
and the full rule that the input violates.
test.src:05:01: Parse error
prefix: If: 'if'
rule : If: 'if' e 'then' e 'else' e
----------------------------------------
Challenges
----------------------------------------
- support full java grammar
- generate template for typing and evaluation
- support hash consing for c/java
- output for lex/yacc of haskell, c/java (nominal)
- support lexical grammar to specify whitespace, comment, newline...
- more static check (type system for sugar?):
- duplicate rules
- empty rules
- occurence-check
- no left recursion without a left non-terminal
----------------------------------------
Resources
----------------------------------------
[1] Sugar: a concrete syntax grammar
http://www.cis.upenn.edu/~stse/sugar
[2] Apollo: System F with subtyping and information flow
http://www.cis.upenn.edu/~stse/apollo
[3] Ordered Context-Free Grammars, Ziemowit Laski
http://www.ziemas.net/download/ocfg_tr.pdf
[4] Labelled BNF grammar
http://www.cs.chalmers.se/~markus/BNFC/
Stephen Tse
July 30, 2004
Last modified: Sat Jul 31 09:43:19 EDT 2004