CIT 594 Assignment 1: Tokenizer
Spring 2004, David Matuszek
Purposes:
- To give you basic familiarity with Eclipse
- To give you basic familiarity with JUnit testing
General Idea:
Write a program that, given a String, breaks it into a series
of tokens. A token is one of: a name (word), a number, a symbol (punctuation
mark), an end-of-line, or an end-of-input.
Write your program using Eclipse. Provide complete JUnit tests for your program,
and document it fully with javadoc comments.
Details:
Write the following classes:
Token
A Token has a type and a value. The value is a
String containing the exact characters that make up the
Token. The type is an int that tells what kind of
thing the Token represents: a name, a number, or a symbol
(punctuation mark). End-of-lines and the (one) end-of-input are also
returned as tokens.
Your Token class should include the following public static final
ints:
NAME -- begins with a letter, consists of letters,
digits, and underscores
NUMBER -- one or more digits (only)
SYMBOL -- any single punctuation mark (not including
whitespace)
EOL -- an end-of-line character
EOI -- the end of input
ERROR -- an error (for now, trying to get another token
after getting an EOI)
The Token class should also include the following constructors/methods:
public Token(String value, int type) -- constructor
public String getValue() -- a getter method for the
token's value
public int getType() -- a getter method for the token's
type
public boolean equals(Object o) -- a test for equality
of tokens
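One possible shape for this class is sketched below. It is only an illustration: the numeric values of the constants, the decision that two tokens are equal when both type and value match, the added hashCode method, and the assumption that value is never null are all choices made for this sketch, not requirements stated in the assignment. The class is declared without public here so the snippet can be pasted into any file.

```java
/** Sketch of a Token: a type code plus the exact characters of the token. */
class Token {
    // The actual int values are arbitrary (an assumption); they only need to differ.
    public static final int NAME = 0;
    public static final int NUMBER = 1;
    public static final int SYMBOL = 2;
    public static final int EOL = 3;
    public static final int EOI = 4;
    public static final int ERROR = 5;

    private final String value;
    private final int type;

    /** Creates a token; this sketch assumes value is not null. */
    public Token(String value, int type) {
        this.value = value;
        this.type = type;
    }

    public String getValue() {
        return value;
    }

    public int getType() {
        return type;
    }

    /** Two tokens are treated as equal when both type and value match. */
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Token)) {
            return false;
        }
        Token other = (Token) o;
        return type == other.type && value.equals(other.value);
    }

    /** Not required by the assignment; kept consistent with equals. */
    public int hashCode() {
        return 31 * value.hashCode() + type;
    }
}
```

Defining equals this way lets a JUnit test compare an expected Token with the one actually returned in a single assertEquals call.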
Tokenizer
The Tokenizer will have at least the following constructors/methods:
public Tokenizer(String input) -- constructor (sets
the string to be tokenized)
public boolean hasNext() -- returns true if
there are more tokens to be returned.
public Token next() -- returns the next token
from the string.
TokenTest
A JUnit test class for Token. You will be graded on completeness.
TokenizerTest
A JUnit test class for Tokenizer. You will be graded on completeness.
Comments:
Probably the simplest way to write this program is to create an instance of
StringTokenizer inside your Tokenizer class, and adapt
its results to the requirements of this assignment.
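As a rough illustration of that approach, the sketch below shows how java.util.StringTokenizer behaves when asked to return its delimiters. The class name StringTokenizerDemo, the method pieces, and the particular delimiter string are hypothetical choices for this example; the assignment does not say which punctuation characters must be handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

/** Hypothetical helper showing StringTokenizer with returnDelims set to true. */
class StringTokenizerDemo {
    // Assumed delimiter set: whitespace plus some common ASCII punctuation.
    // The underscore is deliberately left out so that names like a_b stay whole.
    static final String DELIMS = " \t\r\n.,;:!?()[]{}<>+-*/=\"'";

    /** Splits input into raw pieces, keeping punctuation and newlines. */
    static List<String> pieces(String input) {
        // The third argument (true) makes each delimiter come back
        // as its own one-character string instead of being discarded.
        StringTokenizer st = new StringTokenizer(input, DELIMS, true);
        List<String> result = new ArrayList<>();
        while (st.hasMoreTokens()) {
            String piece = st.nextToken();
            // Spaces, tabs, and carriage returns are not tokens; drop them.
            if (piece.equals(" ") || piece.equals("\t") || piece.equals("\r")) {
                continue;
            }
            result.add(piece);
        }
        return result;
    }
}
```

With these assumptions, pieces("x1 = 42;\nend") yields the six strings "x1", "=", "42", ";", "\n", and "end". Each piece would still have to be classified as a NAME, NUMBER, SYMBOL, or EOL, and the single EOI token (followed by ERROR) would have to be produced once the pieces run out. Pieces such as "12ab" match neither the NAME nor the NUMBER definition, so how to treat them is left open here.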
You don't need a main method. All your testing can be done via
JUnit.
The Token and Tokenizer classes should be quite simple.
The difficult parts of this assignment are (1) getting used to Eclipse, and
(2) learning to construct JUnit tests. You will probably find the TokenizerTest
class much more challenging than the Tokenizer class itself. There
are two significant advantages to having the JUnit tests, though:
- If you have a thorough set of tests (and if you have read and followed
the assignment carefully), you will almost certainly get a perfect score on
this assignment.
- Later assignments will almost certainly require modifications to the
Token and Tokenizer classes, and the JUnit tests will be very helpful
in making those modifications.
Due date:
Tuesday, January 19, before midnight (zipped and submitted via Blackboard).