CIT 591 Eighth Java Assignment: Simple Language Translation
David Matuszek, Fall 2002
Language translation is hard. Words should be translated in different ways according to context. Word order carries part of the meaning, and varies from language to language. See Babel Fish for a state-of-the-art translation system.
However, word-by-word translation, with no attention to context or word order, is a simple mechanical procedure that is sometimes (but not always) sufficient to convey the gist of a message. This is the kind of translation we will do.
Your assignment:
Read in a "dictionary" from a text file, then read in a text passage from a second text file. Use the dictionary to perform a word-by-word translation, and display the result in a GUI.
Please read these requirements carefully.
Details:
1. The Dictionary.
Read in a "dictionary" from a text file and save it in a Hashtable
(described below). Each non-blank line of the dictionary file contains a source word, an equals sign (=), and the target word or words.
There may be whitespace before the source word, before and/or after the equals sign, and after the target word(s). This whitespace should be ignored. If the target consists of more than one word, the whitespace between words should be retained.
There may be blank lines in the dictionary file; these lines may contain whitespace characters, or they may be empty (contain no characters at all). Blank lines should be ignored.
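As a rough illustration of the whitespace and blank-line rules above, here is a minimal sketch of handling one dictionary line. The class name DictionaryLineParser and the method addEntry are hypothetical, not part of the assignment or of LineReader, and silently skipping a line with no equals sign is an assumption; the requirements do not say what to do with malformed lines.

```java
import java.util.Hashtable;

// Hypothetical helper; the class and method names are illustrative only.
class DictionaryLineParser {

    /**
     * Handles one line of the form "source = target words".
     * Returns true if a pair was stored, false if the line was skipped.
     */
    static boolean addEntry(String line, Hashtable<String, String> table) {
        // Blank lines may be empty or contain only whitespace; ignore both.
        if (line == null || line.trim().isEmpty()) {
            return false;
        }
        int eq = line.indexOf('=');
        if (eq < 0) {
            return false; // assumption: skip lines that have no equals sign
        }
        // trim() drops whitespace at the ends only, so the whitespace
        // between the words of a multi-word target is retained.
        String source = line.substring(0, eq).trim();
        String target = line.substring(eq + 1).trim();
        if (source.isEmpty() || target.isEmpty()) {
            return false; // assumption: skip lines with a missing side
        }
        table.put(source, target);
        return true;
    }
}
```

This version uses only String methods; a regular expression would work just as well here.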
I am providing a LineReader class
that you can use to read text files, one line at a time. It is your job to extract
the source word and the target word(s). Note that if the target consists of
more than one word, it can be left as a simple string--you don't need to break
it into separate words.
Once you get the source word and target word(s), you need to store them in
a Hashtable. A Hashtable is used to hold a set of
key-value pairs: keys are used to look up values. In this case,
you should use the source words as the keys, and the corresponding target words
as their values. When you read a translation pair from the dictionary file,
put it into the Hashtable. Later, you will use the source words
to look up the corresponding target words.
Use the Java API to figure out how to use a Hashtable (it's in
java.util)--but remember, all you need to do is (1) create a Hashtable,
(2) put key-value pairs into the Hashtable, and (3) use keys to
get values from the Hashtable.
Notes:
- There are other things you need to do to use a Hashtable for objects that you define, but as long as you only use it for Strings, you can ignore all the other stuff.
- Hashtables are case sensitive; if you put something in the Hashtable with the key "apple", you won't find it if you look for "Apple".
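The three Hashtable operations, and the case-sensitivity note, might look like the sketch below. The class name HashtableDemo and the sample word pairs are made up for illustration. Converting a word to lower case before looking it up is one possible way to deal with case sensitivity, and it assumes the keys were stored in lower case; it is not a required approach.

```java
import java.util.Hashtable;

// Hypothetical demo class; the entries are sample data only.
class HashtableDemo {

    static Hashtable<String, String> buildSample() {
        Hashtable<String, String> dictionary = new Hashtable<>(); // (1) create
        dictionary.put("apple", "pomme");                          // (2) put
        dictionary.put("dog", "chien");
        return dictionary;
    }

    // (3) get: returns null when the key is absent.
    // Assumption: all keys in the table are lower case.
    static String lookup(Hashtable<String, String> dictionary, String word) {
        return dictionary.get(word.toLowerCase());
    }

    public static void main(String[] args) {
        Hashtable<String, String> dictionary = buildSample();
        System.out.println(dictionary.get("apple"));   // found
        System.out.println(dictionary.get("Apple"));   // null: keys are case sensitive
        System.out.println(lookup(dictionary, "Apple")); // found after lower-casing
    }
}
```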
2. The text passage
The text passage will consist of one or more lines containing words and punctuation
marks. An apostrophe (') or a hyphen (-) should be
considered as part of the word; these characters will not be used as
punctuation. Words will not be broken across lines.
As you read the text, you should extract words, punctuation marks, and blank lines.
LineReader's readLine() method returns
a String with no '\n' at the end of it; a
blank line will be returned either as an empty String, or as a String
containing only whitespace characters. You should use methods both from the String class and from
the java.util.regex package. Use them in whatever proportions seem
most useful, but use both.
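One possible way to combine the two is sketched below: a String method detects blank lines, and a regular expression pulls out words and punctuation marks. The class name LineTokenizer is hypothetical, and the pattern assumes that words contain only the letters A-Z and a-z plus apostrophes and hyphens; any other non-whitespace character, digits included, is treated as a one-character punctuation mark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the name and the exact pattern are illustrative only.
class LineTokenizer {

    // Either a run of letters, apostrophes, and hyphens (a word),
    // or one non-whitespace character that is not part of a word.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z'-]+|[^\\sA-Za-z'-]");

    /** Returns the words and punctuation marks of one line, in order. */
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        // String methods: a blank line is empty or all whitespace.
        if (line.trim().isEmpty()) {
            return tokens; // the caller can treat an empty list as a blank line
        }
        // java.util.regex: find() skips the whitespace between tokens.
        Matcher matcher = TOKEN.matcher(line);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }
}
```

With this pattern, the line "Don't stop, well-known friend!" yields the six tokens Don't, stop, a comma, well-known, friend, and an exclamation point.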
The first word of every sentence should be capitalized. A word is the first word of a sentence if it is the first word in the file, or if it follows a period, question mark, or exclamation point. Capitalization in the source text should usually be ignored--but if a word cannot be translated, it should be copied to the translation exactly as it appears (capitalized or not) in the source file.
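The capitalization rules can be sketched as follows. The class name WordTranslator and its two methods are hypothetical, and the sketch assumes the dictionary keys are stored in lower case. The caller is expected to keep a flag that starts out true for the first word in the file and is set again whenever endsSentence returns true.

```java
import java.util.Hashtable;

// Hypothetical helper; names are illustrative only.
class WordTranslator {

    /** True for the three marks that end a sentence. */
    static boolean endsSentence(String token) {
        return token.equals(".") || token.equals("?") || token.equals("!");
    }

    /**
     * Translates one word. Capitalization in the source is ignored for the
     * lookup (assumption: keys are lower case). A word with no translation
     * is returned exactly as it appeared in the source.
     */
    static String translate(String word, Hashtable<String, String> dictionary,
                            boolean sentenceStart) {
        String target = dictionary.get(word.toLowerCase());
        if (target == null) {
            return word; // untranslatable: copy unchanged
        }
        if (sentenceStart && !target.isEmpty()) {
            // Capitalize only the first character of the translation.
            return Character.toUpperCase(target.charAt(0)) + target.substring(1);
        }
        return target;
    }
}
```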
3. The GUI
The GUI can be pretty simple: A Button to tell the program to
read in and translate a file (LineReader will provide the input
file dialog), and a TextArea with a vertical scrollbar to display
the results. And your name, too, of course, probably as a Label.
Each time you read a new file, you should replace the old translation in the
TextArea; don't just keep adding to the end.
4. General stuff
The program must be written as an application, not as an applet. (Why?)
Class structure is not a big deal in this program; you can get by with a single
class, if you like. However, you are welcome to use extra classes as needed.
For example, you might very well write a method to get the next "thing"
(word, punctuation, or blank line), and you might want to return both the String
you read and some indication of which of the three kinds of thing it was. But
a method can only return one thing, so you can't do that--unless you create
a Result class containing both the String and a flag telling what
the String represents.
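Such a Result class could be as small as the sketch below. The field names, the Kind enum, and the classify helper are all assumptions made for illustration; an int or boolean flag would serve equally well, and the word test assumes words consist of letters, apostrophes, and hyphens.

```java
// A minimal sketch of a Result class; every name here is illustrative.
class Result {

    // The "flag" telling what the String represents.
    enum Kind { WORD, PUNCTUATION, BLANK_LINE }

    final String text;
    final Kind kind;

    Result(String text, Kind kind) {
        this.text = text;
        this.kind = kind;
    }

    /** Hypothetical convenience method that picks the kind from the text. */
    static Result classify(String text) {
        if (text.trim().isEmpty()) {
            return new Result(text, Kind.BLANK_LINE);
        }
        if (text.matches("[A-Za-z'-]+")) {
            return new Result(text, Kind.WORD);
        }
        return new Result(text, Kind.PUNCTUATION);
    }
}
```

A method that gets the next "thing" can then return a single Result, and the caller can check its kind field before deciding whether to translate the text.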
Due date: Wednesday, November 20, before midnight.