CIT 594 Assignment 1: Tokenizer
Spring 2006, David Matuszek

Purposes:

General Idea:

Write an API that, given a String, breaks it into a series of tokens. For our purposes, a token is one of: a word, a number, a punctuation mark, or an end-of-line.

Write your program using Eclipse, and document it fully with javadoc comments.

Details:

Write the following classes:

public enum Type

This enumeration should have the following values: NAME, NUMBER, SYMBOL, and EOL. (Hint: This is a very small class.)
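As a rough illustration, the enumeration might look like the sketch below. It is declared without the public modifier only so that the snippet compiles in a single file of any name; in the assignment it would be public, in its own file. The comments paraphrase the token definitions given for the Token class.

```java
// Sketch of the Type enumeration: one constant per kind of token.
enum Type {
    NAME,    // a letter followed by any number of letters, digits, or underscores
    NUMBER,  // one or more digits
    SYMBOL,  // any other non-whitespace character
    EOL      // end of line
}
```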

Token

A Token has public fields Type type and String value. The value holds the exact characters that make up the Token. The type tells what kind of thing the Token represents.

NAME
A name begins with a letter, and may contain any number of letters, digits, and/or underscores.
NUMBER
A number consists of one or more digits. A sign (+ or -), if present, is not part of a number.
SYMBOL
Any character that isn't whitespace and isn't part of a name or a number.
EOL
Denotes the "end of line." For this type, the value should be the empty string.

Note: To correctly recognize letters, digits, and whitespace, use the methods in java.lang.Character.
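The sketch below shows the three java.lang.Character predicates this note refers to: isLetter, isDigit, and isWhitespace. The class CharKinds and its method kindOf are hypothetical names used only for illustration; they are not part of the required API.

```java
// Hypothetical helper, not part of the required API: classifies one
// character using the predicates in java.lang.Character.
final class CharKinds {
    private CharKinds() {}  // static helper only; no instances

    /** Returns "letter", "digit", "whitespace", or "other" for the given character. */
    static String kindOf(char c) {
        if (Character.isLetter(c)) return "letter";
        if (Character.isDigit(c)) return "digit";
        if (Character.isWhitespace(c)) return "whitespace";
        return "other";  // punctuation, operators, underscore, and so on
    }
}
```

Note that Character.isLetter('_') is false, so a name scanner has to test for the underscore separately, and that a newline counts as whitespace under Character.isWhitespace.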

The Token class should also include the following constructor and methods:

  • public Token(Type type, String value) -- constructor.
  • @Override public boolean equals(Object o) -- tests whether two Tokens are equal.
  • @Override public String toString() -- returns a String representation of this Token.

Note: @Override tells the Java 5 compiler that you are trying to override an inherited method, so that it can warn you if you get the signature wrong.
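One possible shape for the Token class is sketched below. Several details are assumptions that the assignment does not specify: the fields are made final, the constructor rejects null arguments, two Tokens count as equal when both type and value match, hashCode is added to stay consistent with equals, and toString uses a made-up TYPE(value) format. The types are declared without public only so the snippet compiles as a single file.

```java
enum Type { NAME, NUMBER, SYMBOL, EOL }

// Sketch of a Token; see the lead-in for which details are assumptions.
class Token {
    public final Type type;     // what kind of thing this Token represents
    public final String value;  // the exact characters that make up the Token

    public Token(Type type, String value) {
        if (type == null || value == null) {
            throw new NullPointerException("type and value must be non-null");
        }
        this.type = type;
        this.value = value;
    }

    // The parameter must be Object. Writing equals(Token o) would define an
    // overload instead, and @Override would make the compiler reject it.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Token)) return false;
        Token other = (Token) o;
        return type == other.type && value.equals(other.value);
    }

    // Not required by the assignment; kept consistent with equals.
    @Override
    public int hashCode() {
        return 31 * type.hashCode() + value.hashCode();
    }

    // Assumed format, for example NAME(count) or EOL().
    @Override
    public String toString() {
        return type + "(" + value + ")";
    }
}
```

If the provided JUnit tests expect a particular toString format or a different notion of equality, those tests take precedence over this sketch.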

public class Tokenizer implements Iterator

The Tokenizer will have at least the following constructor and methods:

  • public Tokenizer(String input) -- constructor (sets the string to be tokenized).
  • public boolean hasNext() -- returns true if there are more tokens to be returned.
  • public Token next() -- returns the next token from the string.
  • public void remove() -- throws an UnsupportedOperationException.
  • public void putBack(int howMany) -- steps back past the previous howMany tokens (so they can be returned again by subsequent calls to next).
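The methods listed above could fit together roughly as in the following sketch. It rests on several assumptions that the assignment leaves open: the whole string is tokenized eagerly in the constructor, which makes putBack a simple index adjustment; each '\n' character produces one EOL token while all other whitespace only separates tokens; next() throws NoSuchElementException when no tokens remain; and putBack throws IllegalArgumentException when asked to step back further than the number of tokens already returned. The Token here is a trimmed stand-in without equals, and the types are non-public so the snippet compiles as one file.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

enum Type { NAME, NUMBER, SYMBOL, EOL }

// Trimmed stand-in for the Token class (no equals), to keep the sketch short.
class Token {
    public final Type type;
    public final String value;
    public Token(Type type, String value) { this.type = type; this.value = value; }
    @Override public String toString() { return type + "(" + value + ")"; }
}

class Tokenizer implements Iterator<Token> {
    private final List<Token> tokens = new ArrayList<Token>();
    private int position = 0;  // index of the token that next() will return

    // Assumption: the entire input is scanned once, up front.
    public Tokenizer(String input) {
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            int start = i;
            if (c == '\n') {                         // assumption: newline yields EOL
                tokens.add(new Token(Type.EOL, ""));
                i++;
            } else if (Character.isWhitespace(c)) {  // other whitespace is skipped
                i++;
            } else if (Character.isLetter(c)) {      // NAME: letter, then letters/digits/_
                while (i < input.length()
                        && (Character.isLetterOrDigit(input.charAt(i))
                            || input.charAt(i) == '_')) {
                    i++;
                }
                tokens.add(new Token(Type.NAME, input.substring(start, i)));
            } else if (Character.isDigit(c)) {       // NUMBER: one or more digits
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(Type.NUMBER, input.substring(start, i)));
            } else {                                 // SYMBOL: any other single character
                tokens.add(new Token(Type.SYMBOL, String.valueOf(c)));
                i++;
            }
        }
    }

    public boolean hasNext() { return position < tokens.size(); }

    public Token next() {
        if (!hasNext()) throw new NoSuchElementException("no more tokens");
        return tokens.get(position++);
    }

    public void remove() { throw new UnsupportedOperationException(); }

    // Steps back so the previous howMany tokens are returned again by next().
    public void putBack(int howMany) {
        if (howMany < 0 || howMany > position) {
            throw new IllegalArgumentException("cannot put back " + howMany + " tokens");
        }
        position -= howMany;
    }
}
```

Under these assumptions, the input "x1 = -42\n" yields NAME(x1), SYMBOL(=), SYMBOL(-), NUMBER(42), and EOL(), in that order; calling putBack(2) after the last of them makes the next call to next() return NUMBER(42) again. A lazy scanner that remembers previously returned tokens would also satisfy the interface.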

Comments:

You don't need a main method--all your testing can be done via JUnit. Here are my JUnit tests; you may add more if you like. If you add any public methods beyond those required above, you must provide JUnit tests for them.

Additional requirements:

Due date:

Tuesday, January 17, before midnight (zipped and submitted via Blackboard).