CIT 594 Assignment 6: Recognizer for "Bugs" Language
Spring 2015, David Matuszek

Purposes

General Idea

In this assignment I define a BNF grammar for a small programming language, which I am calling Bugs.

Your assignment is to write a recognizer (class Recognizer) for productions (rules) in that grammar. A recognizer is a program that, given a stream of tokens and a nonterminal, decides whether or not that stream of tokens is an instance of that nonterminal. The meaning of the production, if any, is outside the scope of this assignment.

There will be no "main program." Instead, you should have JUnit tests for every nonterminal, and every definition of that nonterminal, defined in the grammar. When your test methods can accept every correct input, and reject every incorrect input (either by reporting false or throwing an exception), you are done.

For example, if the nonterminal <factor> is defined by

<factor> ::= <function call> | <NAME> | <NUMBER> | "(" <expression> ")"

The stream of tokens SYMBOL:(  NUMBER:2  SYMBOL:+  NUMBER:3  SYMBOL:) (the tokenization of the String "(2+3)") is recognized by this production.

There is a nonterminal called <program>. Later this will be important, but for this assignment, there is nothing special about the <program> nonterminal.

Supplied files

The Eclipse project containing the following starter files can be cloned from https://github.com/DavidMatuszek/Bugs-Language.

Token.java

This class defines an enum Token.Type with values Token.Type.KEYWORD, Token.Type.NAME, Token.Type.NUMBER, Token.Type.SYMBOL,Token.Type.ERROR, Token.Type.EOL, and Token.Type.EOF; these denote keywords, words, numbers, punctuation marks (including operators), errors, end-of-line, and end-of-file, respectively. It has two fields, Token.Type type and String value which, for easy access, are not private.

Recognizer.java

This is starter code for your assignment. It recognizes <arithmetic expression>, <factor>, <term>, <add operator>, and <multiply operator>. It has some helper classes which you should get familiar with (generate and read the Javadoc!), and, most importantly, it has methods Token nextToken() and pushBack(), which hide the ugly details of using a StreamTokenizer. You should familiarize yourself with all the "helper" methods--they are useful!

RecognizerTest.java

More starter code.

SyntaxException.java

This is just a RuntimeException with a more specific name. The class is complete; you don't need to add anything to it.

Details

The complete grammar is on a separate page for easy reference.

For each nonterminal xxx in the grammar (excluding <NAME>, <NUMBER>, <SYMBOL>, and <EOL>), you should have one public boolean isXxx() method. Name your methods the same as the nonterminals they recognize, but with the is- prefix. The Recognizer.java class has the required methods for <arithmeticExpression> and several other nonterminals; you should use these as examples of how to write your is- methods.

You should have an "is-Method" for each additional nonterminal defined in the grammar. Specifically,

isAction
isAddOperator
isAllbugsCode
isArithmeticExpression
isAssignmentStatement
isBlock
isBugDefinition
isColorStatement
isCommand
isComparator
isDoStatement
isEol
isExitIfStatement
isExpression
isFactor
isFunctionCall
isFunctionDefinition
isInitializationBlock
isLineAction
isLoopStatement
isMoveAction
isMoveToAction
isMultiplyOperator
isParameterList
isProgram
isReturnStatement
isStatement
isSwitchStatement
isTerm
isTurnAction
isTurnToAction
isVarDeclaration

Each of these methods should be spelled correctly (according to the Grammar), should be public boolean, and should take no arguments. Our JUnit tests will depend on each of these methods being present and being spelled correctly.

The provided Recognizer also has some very useful (private) utility methods; redundant code is bad style, so get familiar with these utility methods before you waste time writing equivalent code.Each method should do one of three things:

Provide complete JUnit tests for all public methods.

Write Javadoc comments for all non-private methods. The Javadoc for each isXxx() method should include the EBNF that the method implements.

Caution: Some definitions contains braces { and } as metasymbols, and some definitions use "{" and "}" as terminals. Don't get these confused!

Recommended approach

The best way to do this assignment is as follows:

  1. Create stubs for all the unwritten Recognizer methods, and create JUnit tests for those methods.
  2. Pick some unwritten method that doesn't depend on other unwritten methods, and turn the test method stub for it into an actual test. For example, <comparator> doesn't depend on anything else, but <expression> depends on <comparator>, so write the test code for <comparator> first.
  3. Run the test. It should fail. Make certain that it does. (This can happen if there is a problem with your test.)
  4. Write the code for the method you are testing (that is, fill in the method stub in the Recognizer class).
  5. Test and debug, test and debug, until your test passes. If you wrote adequate tests, you should be done with this method. If your tests pass and you know the method isn't finished, add a test that fails.
  6. Repeat until all methods have been coded.

JUnit testing

Tokens form the foundation of the grammar. Low-level grammar rules, such as <term>, are built on these; higher-level rules, such as <expression>, are built on lower-level rules; and so on, all the way up to <program>. If <program> is to be correct, everything, all the way down, has to be correct. Hence, JUnit testing is essential.

You should test that each "is-Method" recognizes both simple and complex versions of the nonterminal it is intended to describe. In addition, you should test that each "is-Method" throws an exception when it should. However, you only need to check for SyntaxExceptions that should be thrown by the method you are testing, not by lower-level methods. For example, the isStatement() method checks for seven different statement types; you should check that isStatement() will recognize each of those statement types, but don't check for all the possible exceptions thrown by all seven statement types! Trust your JUnit tests of the seven statement types to do that for you.

Due date

Tuesday, February 24, before 6am. Zip up the package containing your files and submit the .zip file via Canvas.