CIT 594 Assignment 4: Recognizer for Logo Grammar
Spring 2010, David Matuszek


General Idea

In this assignment I will define, using BNF, a small programming language. This language is a simplified and modified version of the language Logo.

Your assignment is to write a recognizer for programs in that language. A recognizer is a program that, given a stream of tokens, decides whether or not that stream of tokens constitutes a grammatically correct program in the language. The only result of a recognizer is true if the input is grammatically correct, and false if it is not. The “meaning” of the program, if any, is outside the scope of this assignment.

Supplied files


Here is the grammar you will be using. Terminals are quoted and black, to help distinguish them from the EBNF nonterminals and metasymbols. Reminder:|” means “or”, “[]” means optional, and “{}” means zero or more.

<program> ::= <command> { <command> } { <procedure> }     

<command> ::= <move> <expression> <eol>
            | "penup" <eol>
            | "pendown" <eol>
            | "color" <color> <eol>
            | "home" <eol>
            | "jump" <expression> <expression> <eol>
            | "set" <variable> <expression> <eol>
            | "repeat" <expression> <block>
            | "while" <condition> <block>
            | "if" <condition> <block> [ "else" <block> ]
            | "do" <name> { <expression> } <eol>
<move> ::= "forward"
         | "right"
         | "left"
         | "face"

<color> ::= "red" | "orange" | "yellow" | "green" | "blue"
          | "purple" | "violet" | "brown" | "black" | "white"
          | "gray" | "pink" | "rgb" <expression> <expression> <expression>

<block> ::= "{" <eol> { <command> } "}" <eol>

<condition> ::= <expression> <comparator> <expression>

<comparator> ::= "<" | "=" | ">"

<procedure> ::= "to" <name> { <variable> } <eol> { <command> }  "end" <eol>

<variable> ::= <name>

<eol> ::= <newline_character> { <newline_character> }

// I have already done the following rules for you:       
<expression> ::= [ <add_operator> ] <term> { <add_operator> <term> }

<term> ::= <factor> { <multiply_operator> <factor> }

<factor> ::= <name>
           | <number>
           | "x"
           | "y"
           | "(" <expression> ")"

<add_operator> ::= "+" | "-"

<multiply_operator> ::= "*"| "/"

// The following very basic definitions should be done by your Tokenizer,
// but are included here for completeness.

<name> ::= Any sequence of letters, digits, and underscores, beginning with a letter.

<number> ::= A literal number, either integer or double.

<newline_character> ::= "\n" 

The Tokenizer

The input to your Recognizer should be a sequence of tokens--words, numbers, mathematical symbols, etc. The Sun-supplied StreamTokenizer does almost everything you want--it seems to be designed for tokenizing C, C++, and Java programs. However, I think the interface to StreamTokenizer is just plain ugly, so I've written some façade classes for it. Here are brief descriptions of those classes; if you would like more detail, read the source code. ("Use the source, Luke!")

The TokenType class

The TokenType class is a simple enum which just lists the possible kinds of tokens:

The Token class

This class supplies the following methods; getType() (returns a TokenType), getValue() (returns a String), toString(), equals(Object), and hashCode().

The Tokenizer class

There are two constructors, Tokenizer(Reader) and Tokenizer(String); use whichever one you like.

There are two methods, next() to return the next Token, and pushBack() to return a Token (so that the next call to next() will return it again.) However, my supplied next() method does not distinguish between a NAME and a KEYWORD; it is your job to fix this by adding a HashSet of keywords, and testing each "name" against this set.

The SyntaxException class

Just a trivial class to give a good name to the kind of exceptions we have to deal with. Implements the three usual constructors for Exception classes, but that's all.

The test classes

Should be self explanatory. AllTests runs all tests listed in it.


Recommended approach

To get started, do the following:

  1. Download my files and Import... into a new project in Eclipse (but don't import Also Import... the Tokenizer files.
  2. To my Recognizer class, add stubs for each method you will need. This will be one method stub for every nonterminal (<command>, <move>, etc.). Note that you do not need method stubs for <name> or <number>, because your tokenizer provides these.
  3. Use Eclipse to generate a TestCase for the Recognizer class. Have it generate all the test method stubs for you.
  4. Copy my methods from my file into yours. This may involve replacing some constructors or method stubs.
  5. Create an test suite.
  6. Run JUnit to ensure everything is set up correctly.

Here's why I suggest the above. If you start by importing my class, Eclipse won't add test method stubs to it (or if it will, I haven't yet discovered how). This would mean you would have to write all the test method stubs by hand, which is boring and error prone. If instead you let Eclipse generate the class, it will be correct and complete, and you can add my code simply by cutting and pasting.

I believe you will find that the easiest way to do this assignment is as follows:

  1. Enhance the Tokenizer to distinguish keywords from other names.
  2. Pick some test method stub in and fill it with actual tests. This should be some test that doesn't depend on code you haven't yet written. For example, <comparator> doesn't depend on anything else, but <condition> depends on <comparator>, so write the test code for <comparator> first.
  3. Run the test. It should fail. Make certain that it does.
  4. Write the code for the method you are testing (that is, fill in the method stub in the Recognizer class).
  5. Test and debug, test and debug, until your test passes. If you wrote adequate tests, you should be done with this method. If your tests pass and you know the method isn't finished, add a test that fails.
  6. Repeat until all methods have been coded.

Due date

Monday, February 22, before midnight. Put your files in a single directory, including the Javadoc files, and zip up that directory.Turn in your program via Blackboard.