CIT 591 Eighth Java Assignment: Simple Language Translation
David Matuszek, Fall 2002

Purposes of the assignment:

The basic idea:

Language translation is hard. Words should be translated in different ways according to context. Word order carries part of the meaning, and varies from language to language. See Babel Fish for a state-of-the-art translation system.

However, word-by-word translation, with no attention to context or word order, is a simple mechanical procedure that is sometimes (but not always) sufficient to convey the gist of a message. This is the kind of translation we will do.

Your assignment:

Read in a "dictionary" from a text file, then read in a text passage from a second text file. Use the dictionary to perform a word-by-word translation, and display the result in a GUI.

Please read these requirements carefully.

Details:

1. The Dictionary.

Read in a "dictionary" from a text file and save it in a Hashtable (described below). Each line of the dictionary file contains:

  1. One word from the "source" language (the language we are translating from),
  2. An equals sign, and
  3. One or more words in the "target" language (the language we are translating to).

There may be whitespace before the source word, before and/or after the equals sign, and after the target word(s). This whitespace should be ignored. If the target consists of more than one word, the whitespace between words should be retained.

There may be blank lines in the dictionary file; these lines may contain whitespace characters, or they may be empty (contain no characters at all). Blank lines should be ignored.

I am providing a LineReader class that you can use to read text files, one line at a time. It is your job to extract the source word and the target word(s). Note that if the target consists of more than one word, it can be left as a simple string--you don't need to break it into separate words.
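As a rough illustration of the extraction step, here is a minimal sketch that uses only String methods. The class name DictionaryLineParser and the method parseLine are hypothetical (they are not part of LineReader), and returning null for blank or malformed lines is an assumption, since the assignment only says that blank lines should be ignored.

```java
class DictionaryLineParser {

    /**
     * Splits one dictionary line into { source, target }.
     * Returns null for blank lines and (by assumption) for lines that
     * have no equals sign or are missing one side of it.
     */
    static String[] parseLine(String line) {
        if (line == null || line.trim().isEmpty()) {
            return null;                       // empty or whitespace-only line
        }
        int eq = line.indexOf('=');
        if (eq < 0) {
            return null;                       // assumption: skip lines without '='
        }
        // trim() drops whitespace at the ends only, so the spaces
        // between the words of a multi-word target are retained.
        String source = line.substring(0, eq).trim();
        String target = line.substring(eq + 1).trim();
        if (source.isEmpty() || target.isEmpty()) {
            return null;                       // assumption: skip incomplete pairs
        }
        return new String[] { source, target };
    }
}
```

Each String that you read from the dictionary file could be passed to parseLine, with null results simply skipped.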

Once you get the source word and target word(s), you need to store them in a Hashtable. A Hashtable is used to hold a set of key-value pairs: keys are used to look up values. In this case, you should use the source words as the keys, and the corresponding target words as their values. When you read a translation pair from the dictionary file, put it into the Hashtable. Later, you will use the source words to look up the corresponding target words.

Use the Java API to figure out how to use a Hashtable (it's in java.util)--but remember, all you need to do is (1) create a Hashtable, (2) put key-value pairs into the Hashtable, and (3) use keys to get values from the Hashtable.
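The three operations listed above might look something like the sketch below. The class DictionaryDemo and its sample entries are made up for illustration. Storing the keys in lowercase is an assumption that makes it easy to ignore capitalization in the source text. The type parameters in angle brackets need a compiler with generics; with an older compiler, omit them and cast the result of get to String.

```java
import java.util.Hashtable;

class DictionaryDemo {

    /** (1) Create a Hashtable and (2) put key-value pairs into it. */
    static Hashtable<String, String> buildSample() {
        Hashtable<String, String> dictionary = new Hashtable<String, String>();
        dictionary.put("chat", "cat");          // hypothetical sample entries
        dictionary.put("bonjour", "good day");  // a target may be several words
        return dictionary;
    }

    /** (3) Use a key to get a value; get returns null if the key is absent. */
    static String translate(Hashtable<String, String> dictionary, String word) {
        String target = dictionary.get(word.toLowerCase());
        // An untranslatable word is returned exactly as it appeared.
        return (target != null) ? target : word;
    }
}
```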

Notes:

2. The text passage

The text passage will consist of one or more lines containing words and punctuation marks. An apostrophe (') or a hyphen (-) should be considered as part of the word; these characters will not be used as punctuation. Words will not be broken across lines.

As you read the text, you should extract words, punctuation marks, and blank lines.

You should use methods both from the String class and from the java.util.regex package. Use them in whatever proportions seem most useful, but use both.
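One possible way to combine the two is sketched below: a String method detects blank lines, and a regular expression pulls out words and punctuation marks. The class name Tokenizer and the pattern are illustrative assumptions, not a required design. The pattern treats any run of letters, apostrophes, and hyphens as a word and every other non-whitespace character (including a digit) as a separate punctuation token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Tokenizer {

    // Either a word (letters, apostrophes, hyphens) or one single
    // non-whitespace character that cannot be part of a word.
    private static final Pattern TOKEN =
            Pattern.compile("[\\p{L}'-]+|[^\\s\\p{L}'-]");

    /** Returns the tokens of one line; a blank line yields an empty list. */
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<String>();
        if (line.trim().isEmpty()) {           // String method: blank-line check
            return tokens;
        }
        Matcher matcher = TOKEN.matcher(line); // regex: find successive tokens
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }
}
```

An empty list tells the caller that the line was blank, so the blank line can be reproduced in the translation.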

The first word of every sentence should be capitalized. A word is the first word of a sentence if it is the first word in the file, or if it follows a period, question mark, or exclamation point. Capitalization in the source text should usually be ignored--but if a word cannot be translated, it should be copied to the translation exactly as it appears (capitalized or not) in the source file.
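The capitalization rule can be handled with a single boolean that remembers whether the next word starts a sentence. The sketch below assumes the tokens of the passage have already been extracted and that the dictionary keys are stored in lowercase; the class SentenceTranslator and its methods are hypothetical names. The spacing (a space before each word, none before punctuation) is also just an assumption about how the output might be laid out.

```java
import java.util.Hashtable;

class SentenceTranslator {

    /** Returns s with its first character in upper case. */
    static String capitalize(String s) {
        if (s.isEmpty()) {
            return s;
        }
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    /** A period, question mark, or exclamation point ends a sentence. */
    static boolean endsSentence(String token) {
        return token.equals(".") || token.equals("?") || token.equals("!");
    }

    static String translateTokens(String[] tokens, Hashtable<String, String> dictionary) {
        StringBuilder out = new StringBuilder();
        boolean sentenceStart = true;          // the first word in the file starts a sentence
        for (String token : tokens) {
            if (token.matches("[\\p{L}'-]+")) {            // a word
                String target = dictionary.get(token.toLowerCase());
                String word;
                if (target == null) {
                    word = token;              // untranslatable: copy exactly as it appears
                } else {
                    word = sentenceStart ? capitalize(target) : target;
                }
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(word);
                sentenceStart = false;
            } else {                           // punctuation
                out.append(token);
                if (endsSentence(token)) {
                    sentenceStart = true;
                }
            }
        }
        return out.toString();
    }
}
```

Note that an untranslatable word still uses up the "first word of the sentence" position, so the word after it is not capitalized.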

3. The GUI

The GUI can be pretty simple: A Button to tell the program to read in and translate a file (LineReader will provide the input file dialog), and a TextArea with a vertical scrollbar to display the results. And your name, too, of course, probably as a Label. Each time you read a new file, you should replace the old translation in the TextArea; don't just keep adding to the end.
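A bare-bones layout along these lines might look like the sketch below, which uses AWT components. The class name TranslatorFrame, the window title, and the translateFile method are all hypothetical; translateFile is only a stub standing in for your own code that reads the file with LineReader and builds the translation.

```java
import java.awt.BorderLayout;
import java.awt.Button;
import java.awt.Frame;
import java.awt.Label;
import java.awt.TextArea;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.awt.event.WindowAdapter;
import java.awt.event.WindowEvent;

class TranslatorFrame extends Frame {

    // A TextArea with only a vertical scrollbar.
    final TextArea output =
            new TextArea("", 20, 60, TextArea.SCROLLBARS_VERTICAL_ONLY);

    TranslatorFrame(String authorName) {
        super("Simple Translator");
        setLayout(new BorderLayout());
        Button translateButton = new Button("Translate a file");
        add(new Label(authorName), BorderLayout.NORTH);
        add(output, BorderLayout.CENTER);
        add(translateButton, BorderLayout.SOUTH);

        translateButton.addActionListener(new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                showTranslation(translateFile());
            }
        });
        addWindowListener(new WindowAdapter() {
            public void windowClosing(WindowEvent e) {
                dispose();
                System.exit(0);                // quit when the window is closed
            }
        });
        pack();
    }

    /** setText replaces the old translation instead of adding to the end. */
    void showTranslation(String text) {
        output.setText(text);
    }

    /** Hypothetical stub: read and translate a file, return the result. */
    String translateFile() {
        return "(translation goes here)";
    }
}
```

To try it out, a main method could create the frame with new TranslatorFrame("Your Name") and then call setVisible(true) on it.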

4. General stuff

The program must be written as an application, not as an applet. (Why?)

Class structure is not a big deal in this program; you can get by with a single class, if you like. However, you are welcome to use extra classes as needed. For example, you might very well write a method to get the next "thing" (word, punctuation, or blank line), and you might want to return both the String you read and some indication of which of the three kinds of thing it was. But a method can only return one thing, so you can't do that--unless you create a Result class containing both the String and a flag telling what the String represents.
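Such a Result class could be as small as the sketch below. The field names, the Kind enum, and the classify helper are illustrative assumptions; any flag that distinguishes the three kinds of thing (three int constants, for example) would serve equally well.

```java
class Result {

    /** The three kinds of "thing" that can be read from the text. */
    enum Kind { WORD, PUNCTUATION, BLANK_LINE }

    final String text;   // the String that was read
    final Kind kind;     // what that String represents

    Result(String text, Kind kind) {
        this.text = text;
        this.kind = kind;
    }

    /** Hypothetical helper: wraps an extracted token with its kind. */
    static Result classify(String token) {
        if (token.trim().isEmpty()) {
            return new Result(token, Kind.BLANK_LINE);
        }
        if (token.matches("[\\p{L}'-]+")) {    // letters, apostrophes, hyphens
            return new Result(token, Kind.WORD);
        }
        return new Result(token, Kind.PUNCTUATION);
    }
}
```

A method that gets the next thing can then return a single Result, and the caller can inspect both its text and its kind.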

Due date: Wednesday, November 20, before midnight.