CIT 594 Getting Started on the Parser
Spring 2008, David Matuszek

Remember that a parser is made by taking a recognizer and adding code to it. A good way to begin is to make copies of your Recognizer.java and RecognizerTest.java files, and rename them Parser.java and ParserTest.java, respectively.

I am supplying some starter files, also named Parser.java and ParserTest.java. These have revised versions of all the expression handling methods (isExpression, isTerm, isParameterList and others). You can just replace your corresponding methods with these. The expression handling methods depend on a number of "helper" methods, which I have also supplied. So, the first thing you need to do is to "transplant" my methods into your code, and get the whole thing working again.


Again, you make a parser by adding code to a recognizer. The code you need to add is code to make trees and put them on a stack. I do this by calling a (supplied) makeTree method. You will need to either understand this method thoroughly, or else write something similar for yourself. Here's a brief explanation, but you really need to look at the code.

makeTree takes two or more integers, and uses those as indexes into the global stack, counting the top of the stack as 1. It then builds a tree, using the first indexed item as the root. It then removes the topmost several elements from the stack, and places the new tree on the stack in their place.

For example, if root, child1, child2, and child3 are the integers 1, 2, 3, 4 in some order, then makeTree(root, child1, child2, child3) creates a new tree with the root'th element from the stack as the root and the remaining three elements from the stack as its children; removes four elements from the stack; and pushes the new Tree onto the stack.

As makeTree is defined, its arguments must be consecutive positive integers 1..n, in any order. Thus, makeTree(3, 4, 2, 5, 1) would be fine, but makeTree(2, 4, 6) would behave incorrectly; it would build the tree from elements 2, 4, and 6, but it would remove elements 1, 2, and 3 from the stack and replace them with the new Tree.


The next step would be to take one of the simpler grammar rules and try to implement it. For instance,
     <move action> ::= "move" <expression> <eol>
is a good one to start with, because it's simple (and because <expression> is already defined.)

Here's a simple first test:

    @Test
    public void testIsMoveAction() {
        use("move 5 \n");
        assertTrue(parser.isMoveAction());
        assertStackTopEquals(createTree("move", "5.0"));
    }
If you now write isMove in the obvious way,
    public boolean isMoveAction() {
        if (keyword("move")) {
            if (isExpression()) {
                if (isEol()) {
                    makeTree(2, 1);
                    return true;
                }
            }
        }
        return false;
    }
(omitting error checks), you will quickly find that either (1) isEol() is not defined, or (2) you get a Tree with "5.0" as the value in the root, and "\n" as the value in its one child. This is because the above code put three things on the stack: "move", a Tree representing the expression, and a token for the end-of-line.

The Tree we are building is an abstract syntax tree (AST), with the emphasis on abstract. Some tokens go into the tree; some don't. The arithmetic operators (+, -, etc.) become part of the tree; parentheses don't. Tokens whose only purpose is to help describe the structure of the tree (commas, parentheses, braces, etc.) don't become part of the tree. In particular, an end-of-line is structural only (it separates statements) so it should not be part of the AST. In this example, the best solution is to write or modify your isEol() method so that it never ever leaves an end-of-line on the stack.

(By the way, it might help to know, as you are debugging, that the Stack class has a good toString() method, so System.out.println(stack); works fine.)


Next, of course, is to enhance your testing of the isMoveAction() method, possibly with a more complex expression, certainly to make sure it throws Exceptions as needed. Then update the method to do those things, and move on to the next method.

Like the Recognizer assignment, the Parser assignment is long, but once you get the first few methods working, the rest become relatively routine. If you make good use of TDD (Test Driven Design, remember?), you shouldn't have too many unpleasant surprises or hard-to-find bugs.