CIT 594 “RoboTalk” Recognizer
Spring 2007, David Matuszek


General Idea:

In this assignment I define a BNF grammar for a small programming language, which I am calling "RoboTalk."

Your assignment is to write a recognizer (class Recognizer) for productions (rules) in that grammar. A recognizer is a program that, given a stream of tokens and a nonterminal, decides whether or not that stream of tokens is an instance of that nonterminal. The meaning of the production, if any, is outside the scope of this assignment.

There will be no "main program." Instead, you should have JUnit tests for every nonterminal, and every definition of that nonterminal, defined in the grammar. When your test methods can accept every correct input, and reject every incorrect input (either by reporting false or throwing an exception), you are done.

For example, one of the nonterminals is <factor>, defined by

<factor> ::= <NAME> | <NUMBER> | "(" <expression> ")"

The stream of tokens SYMBOL:(  NUMBER:2  SYMBOL:+  NUMBER:3  SYMBOL:) (the tokenization of the String "(2+3)") is recognized by this production.

There is a nonterminal called <program>. For this assignment, there is nothing special about this nonterminal.

Supplied files:

This is a very simple class. It defines an enum Token.Type with values Token.Type.NAME, Token.Type.NUMBER, Token.Type.SYMBOL, Token.Type.EOL, and Token.Type.EOF; these denote words, integers, punctuation marks, end-of-line, and end-of-file, respectively. It has two fields, Token.Type type and String value which, for easy access, are not private.

This is starter code for your assignment. It recognizes <expression>, <factor>, <term>, <add operator>, and <multiply operator>. It has some helper classes which you should get familiar with (generate and read the Javadoc!), and, most importantly, it has methods Token nextToken() and pushBack(), which hide the ugly details of using a StreamTokenizer.

More starter code.

This is just a RuntimeException with a more specific name.



The complete grammar is on a separate page for easy reference.

For each nonterminal nt in the grammar (excluding <NAME>, <NUMBER>, <SYMBOL>, and <EOL>), you should have one public boolean isNt() method. Name your methods the same as the nonterminals they recognize, but with the is- prefix. See for examples, which has the required methods for <expression>, <term>, <factor>, <addOperator>, and <multiplyOperator>, plus some very useful (private) utility methods.

Provide complete JUnit tests for all public methods. Write Javadoc comments for all non-private methods.

Caution: Some definitions contains braces { and } as metasymbols, and <block> also uses "{" and "}" as terminals. Don't get these confused!

Recommended approach:

To get started, do the following:

  1. Download my files and Import... all but into a new project in Eclipse. Also Import... the files you wrote for the Tokenizer assignment.
  2. To my Recognizer class, add stubs for each method you will need. This will be one method stub for every nonterminal (<command>, <move>, etc.). Note that you do not need method stubs for <NAME>, <NUMBER>, <SYMBOL>, or <EOL>, because your tokenizer provides these. You also don't need stubs for my provided methods.
  3. Use Eclipse to generate a TestCase for the Recognizer class. Have it generate all the test method stubs for you.
  4. Copy my methods from my file into yours. This may involve replacing some constructors or method stubs.
  5. Run JUnit to ensure everything is set up correctly.

Here's why I suggest the above. If you start by importing my class, Eclipse won't add test method stubs to it (or if it will, I haven't yet discovered how). This would mean you would have to write all the test method stubs by hand, which is boring and error prone. If instead you let Eclipse generate the class, it will be correct and complete, and you can add code from the provided class simply by cutting and pasting.

The easiest way to do this assignment is as follows:

  1. Pick some unwritten method that doesn't depend on other unwritten methods, and turn the test method stub in into an actual test. For example, <comparator> doesn't depend on anything else, but <condition> depends on <comparator>, so write the test code for <comparator> first.
  2. Run the test. It should fail. Make certain that it does. (If it doesn't, there may be a problem with your test.)
  3. Write the code for the method you are testing (that is, fill in the method stub in the Recognizer class).
  4. Test and debug, test and debug, until your test passes. If you wrote adequate tests, you should be done with this method. If your tests pass and you know the method isn't finished, add a test that fails.
  5. Repeat until all methods have been coded.

Due date:

Tuesday, February 20, before midnight. Turn in your program via blackboard, and follow these conventions: