CIT 594 Robot Grammar: Corrections, Suggestions, Starter Code
CIT 594, Spring 2005

Corrections

  1. The grammar rule

         <command> ::= "if" <condition> <block> [ "else" ] <block>

    should, of course, be

         <command> ::= "if" <condition> <block> [ "else" <block> ]

  2. On the lecture slides for Recognizers, the definitions for <expression> given on slides 22 and 23 are

         <expression> ::= <expression> "+" <term>
         <expression> ::= <term> "+" <expression>

    In each case the additional expression should be optional:

         <expression> ::= [ <expression> "+" ] <term>
         <expression> ::= <term> [ "+" <expression> ]

    The Java code on the slides matches these corrected definitions.
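As a rough illustration of why the bracketed part must be optional, here is a minimal sketch of the second corrected rule, <expression> ::= <term> [ "+" <expression> ]. It is not the code from the slides. The class name ExpressionSketch, the remaining() helper, the use of IllegalStateException, and the simplification that a <term> is a single unsigned integer are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

/**
 * Hypothetical mini-recognizer (not the course's Recognizer class).
 * Simplification: a <term> is a single unsigned integer token.
 */
class ExpressionSketch {
    private final List<String> tokens = new ArrayList<>();
    private int position = 0; // index of the next unconsumed token

    ExpressionSketch(String text) {
        // Tokens must be separated by whitespace in this sketch.
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
    }

    /** <expression> ::= <term> [ "+" <expression> ] */
    boolean expression() {
        if (!term()) {
            return false; // no <term>, so nothing was consumed
        }
        if (position < tokens.size() && tokens.get(position).equals("+")) {
            position++; // consume the "+"
            if (!expression()) {
                // Once a "+" has been consumed, an <expression> must follow.
                throw new IllegalStateException("expected <expression> after \"+\"");
            }
        }
        return true; // the bracketed part was optional
    }

    /** Simplified <term>: one token made up only of digits. */
    boolean term() {
        if (position < tokens.size() && tokens.get(position).matches("[0-9]+")) {
            position++;
            return true;
        }
        return false;
    }

    /** Number of tokens not yet consumed. */
    int remaining() {
        return tokens.size() - position;
    }
}
```

With the uncorrected rule, a lone term such as "7" could never be an expression, because every expression would have to contain a "+". In this sketch, new ExpressionSketch("7").expression() succeeds because the "+" branch is simply skipped.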

Notes and suggestions

  1. The StringTokenizer class does most of what you need, but not everything. You also need to be able to push back unwanted tokens and to classify tokens into integers, names, keywords, symbols, and end-of-lines. The best way to do this is with a Decorator class that stands "in front of" the StringTokenizer class and adds the desired functionality.

    The PushbackTokenizer class in my Recognizers slides is an example of such a Decorator class; it adds a pushBack(String token) method. This is half of what you need; the other half is to classify tokens. Hence, your nextToken() method should return not only the String value of each token, but also the type of token it found.

    This means that nextToken() needs to return two pieces of data, not one. This is very simple in Java: Define a Token class, and have your tokenizer return Tokens rather than Strings. The Token class can also define constants for each token type: Token.NUMBER, etc.

  2. Although there is a fairly obvious "top level" nonterminal, <program>, there is actually nothing special about it. Each nonterminal should have a corresponding public method that recognizes it, and each such method should be tested by your JUnit tests. For example:

        public void testAdd_operator() {
            Recognizer r = new Recognizer("+ - $");
            assertTrue(r.addOperator()); // +
            assertTrue(r.addOperator()); // -
            assertFalse(r.addOperator()); // $
            remainder(r, "$"); // my method to check what is left
        }

    Once again, I haven't asked for any kind of main method, so your recognizer is providing a capability rather than a complete program. (In this case, the capability is to be able to tell whether a given string is an expression, a command, etc.) In a later assignment, we will make use of this capability.

    Since the Recognizer is just an API, not a complete program, JUnit testing is especially important.

  3. The Tokenizer takes as input a sequence of characters, and produces tokens. The Recognizer takes as input a series of tokens (from the Tokenizer) and tries to apply a grammatical rule to recognize a (specified) nonterminal.

    Just as a tokenizer will "consume" as many characters as it needs from the input, leaving the remainder for a future call, a recognizer will "consume" as many tokens as it needs from its input, leaving the remainder for a future call.

    Because the Recognizer has the more complex job, it needs to be told what to look for. That is, a call to the Tokenizer is just nextToken(), but a call to the Recognizer is one of variable(), object(), program(), command(), or one of the other nonterminal types (I've defined 15 in all).
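The "consume what you need, leave the rest" behavior described in notes 1 and 3 might look roughly like the sketch below. It is only one possible shape, not the PushbackTokenizer from the slides. The class names PushbackSketch and AddOperatorSketch, the remainder() helper, and the definition <add_operator> ::= "+" | "-" are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.StringTokenizer;

/** Hypothetical Decorator: wraps a StringTokenizer and adds pushBack. */
class PushbackSketch {
    private final StringTokenizer inner;
    private final Deque<String> pushedBack = new ArrayDeque<>();

    PushbackSketch(String text) {
        inner = new StringTokenizer(text); // splits on whitespace
    }

    boolean hasMoreTokens() {
        return !pushedBack.isEmpty() || inner.hasMoreTokens();
    }

    /** Returns the most recently pushed-back token if any, else the next fresh one. */
    String nextToken() {
        return pushedBack.isEmpty() ? inner.nextToken() : pushedBack.pop();
    }

    void pushBack(String token) {
        pushedBack.push(token);
    }
}

/** Hypothetical recognizer with a single nonterminal, <add_operator>. */
class AddOperatorSketch {
    private final PushbackSketch tokenizer;

    AddOperatorSketch(String text) {
        tokenizer = new PushbackSketch(text);
    }

    /** Assumed rule: <add_operator> ::= "+" | "-" */
    boolean addOperator() {
        if (!tokenizer.hasMoreTokens()) {
            return false;
        }
        String token = tokenizer.nextToken();
        if (token.equals("+") || token.equals("-")) {
            return true; // the token has been consumed
        }
        tokenizer.pushBack(token); // not an add operator; leave it for a later call
        return false;
    }

    /** Drains the unconsumed tokens and joins them with single spaces. */
    String remainder() {
        StringBuilder sb = new StringBuilder();
        while (tokenizer.hasMoreTokens()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(tokenizer.nextToken());
        }
        return sb.toString();
    }
}
```

Given the input "+ - $", two calls to addOperator() succeed, a third fails and pushes "$" back, and remainder() then reports "$", which mirrors the JUnit example in note 2.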

Starter code

I've provided an initial version of Recognizer.java, which can recognize arithmetic expressions. It assumes the existence of some other classes: Token, Tokenizer, and RobotException. Also, the definition of <factor> is not quite complete, but you can fix that easily.
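Since the starter code assumes a Token class that you must write yourself, here is a minimal sketch of one shape it might take, following the suggestion in the notes that a token carries both a String value and a type. The field names, the classify() helper, and the regular expressions are assumptions, and the keyword check covers only "if" and "else" (the two keywords that appear in the corrections above), so the real class would need the full keyword list from the grammar.

```java
/** Hypothetical Token: a String value paired with a token type. */
class Token {
    // Token types, following the categories listed in the notes.
    static final int NUMBER = 0;
    static final int NAME = 1;
    static final int KEYWORD = 2;
    static final int SYMBOL = 3;
    static final int EOL = 4;

    final String value;
    final int type;

    Token(String value, int type) {
        this.value = value;
        this.type = type;
    }

    /** Classifies one raw token string (assumed rules; keyword list is incomplete). */
    static Token classify(String text) {
        if (text.equals("\n")) {
            return new Token(text, EOL);
        }
        if (text.matches("[0-9]+")) {
            return new Token(text, NUMBER);
        }
        if (text.equals("if") || text.equals("else")) {
            return new Token(text, KEYWORD);
        }
        if (text.matches("[A-Za-z][A-Za-z0-9_]*")) {
            return new Token(text, NAME);
        }
        return new Token(text, SYMBOL); // anything else is treated as a symbol
    }
}
```

A tokenizer built this way can return Token.classify(rawString) from nextToken(), and a recognizer can then test token.type == Token.NUMBER instead of re-examining the characters.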

A rudimentary RecognizerTest.java is also provided.

This code is a very slightly edited version of starter code that I provided to a previous CIT 594 class for a similar assignment, so don't expect it to be exactly what you need; but it should be very close.

Constructor Summary

    Recognizer(java.lang.String text)
        Constructs a Recognizer for the given string.

Method Summary

    boolean addOperator()
        Tries to recognize an <add_operator>.
    boolean expression()
        Tries to recognize an <expression>.
    boolean factor()
        Tries to recognize a <factor>.
    boolean multiplyOperator()
        Tries to recognize a <multiply_operator>.
    boolean term()
        Tries to recognize a <term>.