Notes on Syntax Coloring Assignment
CIT 591, David Matuszek, Fall 2001

If you would like to know more about HTML entities, look at any full list of them. The important point to notice is that there are a lot of them.
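For this assignment only a few entities matter. The sample output at the end of these notes replaces <, >, and & with &lt;, &gt;, and &amp;. Here is a minimal sketch of that replacement, assuming those three are the only characters you need to escape; the class name HtmlEscaper and its method names are made up for illustration.

```java
final class HtmlEscaper {
    /** Returns the HTML-safe form of a single character of source text. */
    static String escape(char c) {
        switch (c) {
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '&': return "&amp;";
            default:  return String.valueOf(c);
        }
    }

    /** Escapes every character of the given text. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            out.append(escape(text.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: if (i &lt; 5 &amp;&amp; i &gt; 0) {
        System.out.println(escape("if (i < 5 && i > 0) {"));
    }
}
```

Escaping one character at a time fits a state machine that also reads one character at a time: whatever state you are in, the character goes through escape on its way to the output.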

The assignment mentions a number of things that syntax coloring might be used for. Your assignment is only to do these four:

I described five main states that your machine should have. My implementation used ten states. Why? So I could process the input one character at a time. For instance, when I encountered a / (slash), I went to a new state that tested whether the following character was another slash, an asterisk, or something else. If you don't understand this, try to figure it out yourself by drawing a state machine that handles the input one character at a time.
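The slash example might look roughly like the sketch below. It is not my implementation, only one way to do it: it handles comments and nothing else (no strings, no keywords, no HTML escaping), and the state names, the class name SlashStates, and the green color are assumptions chosen for illustration.

```java
final class SlashStates {
    enum State { NORMAL, SAW_SLASH, LINE_COMMENT, BLOCK_COMMENT, SAW_STAR }

    static final String OPEN = "<font color=green>";
    static final String CLOSE = "</font>";

    /** Wraps // and slash-star comments in font tags, one character at a time. */
    static String colorComments(String input) {
        StringBuilder out = new StringBuilder();
        State state = State.NORMAL;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (state) {
                case NORMAL:
                    if (c == '/') state = State.SAW_SLASH;   // hold the slash; decide on the next char
                    else out.append(c);
                    break;
                case SAW_SLASH:
                    if (c == '/') {
                        out.append(OPEN).append("//");
                        state = State.LINE_COMMENT;
                    } else if (c == '*') {
                        out.append(OPEN).append("/*");
                        state = State.BLOCK_COMMENT;
                    } else {
                        out.append('/').append(c);           // it was just a division sign
                        state = State.NORMAL;
                    }
                    break;
                case LINE_COMMENT:
                    if (c == '\n') {
                        out.append(CLOSE).append(c);         // comment ends with the line
                        state = State.NORMAL;
                    } else {
                        out.append(c);
                    }
                    break;
                case BLOCK_COMMENT:
                    out.append(c);
                    if (c == '*') state = State.SAW_STAR;
                    break;
                case SAW_STAR:
                    out.append(c);
                    if (c == '/') {
                        out.append(CLOSE);
                        state = State.NORMAL;
                    } else if (c != '*') {
                        state = State.BLOCK_COMMENT;         // false alarm, still in the comment
                    }
                    break;
            }
        }
        // End of input: flush a held slash, or close a comment that never ended.
        if (state == State.SAW_SLASH) out.append('/');
        else if (state != State.NORMAL) out.append(CLOSE);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(colorComments("a / b // c\nx /* y */ z"));
    }
}
```

Notice that SAW_SLASH and SAW_STAR exist only because the machine looks at one character at a time. That is where the extra states come from.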

It's a good idea to draw the state machine before you begin writing Java code.

Use my LineReader class. I said this already. Use my LineReader class. Or, at least, look at it for an example of what you need to do. You may want to change the calls to System.err.println to be calls to System.out.println.

I strongly recommend that you write a method that returns individual characters from the input file. There is no reason to clutter up your state machine by trying to figure out what the next character should be. Your method can use my LineReader class. Remember that the newline, '\n', is a valid character. Your output file should preserve the original line structure: Except for the initial and final HTML tags, your output should contain the same number of lines as your input.
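Here is a sketch of such a method. To keep the example self-contained it reads from java.io.BufferedReader rather than LineReader; the idea is the same for any reader whose readLine() returns one line without its newline. It assumes readLine() returns null at end of file, which is true of BufferedReader. The class name CharSource and the EOF constant are invented for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class CharSource {
    static final int EOF = -1;

    private final BufferedReader reader;
    private String line = "";   // current line, with its '\n' re-inserted
    private int pos = 0;        // index of the next character to return

    CharSource(BufferedReader reader) {
        this.reader = reader;
    }

    /** Returns the next character of the input (including '\n'), or EOF. */
    int nextChar() {
        // Fetch lines until one has an unread character or the input ends.
        while (line != null && pos >= line.length()) {
            String next;
            try {
                next = reader.readLine();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            line = (next == null) ? null : next + '\n';
            pos = 0;
        }
        if (line == null) return EOF;
        return line.charAt(pos++);
    }

    public static void main(String[] args) {
        CharSource src = new CharSource(new BufferedReader(new StringReader("ab\ncd")));
        for (int c = src.nextChar(); c != EOF; c = src.nextChar()) {
            System.out.print((char) c);
        }
    }
}
```

The state machine then just calls nextChar() in a loop and never has to think about lines. Returning an int leaves room for an end-of-file value that cannot be confused with a real character.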

If you detect an error--an unclosed String literal or an unclosed char literal--do not go to an error state. It would be nice if you called attention to the error somehow, but your state machine must finish processing all the input. Your program is supposed to do syntax coloring, and if it encounters an error it must recover somehow and keep on coloring.

While it is possible to use a state machine to recognize keywords, doing so will result in a very large number of states. Here is a better solution: when you encounter a letter, go into a state that collects all the letters of the word; then check whether it is a keyword. Your textbook has a list of keywords on page 30. If you do this, do not write extra code for each keyword--just have an array containing all the possible keywords, and check to see if the word you just encountered is in the array.
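The array approach can be sketched as follows. The keyword list here is deliberately partial; use the full list from your textbook. The class name Keywords, the blue color, and the method names are assumptions for illustration. This sketch collects digits and underscores along with letters, so that a name like int2 is not mistaken for the keyword int.

```java
final class Keywords {
    // Partial list for illustration only.
    static final String[] KEYWORDS = {
        "class", "else", "if", "int", "public", "return", "static", "void", "while"
    };

    static final String OPEN = "<font color=blue>";
    static final String CLOSE = "</font>";

    /** Returns true if the word appears in the keyword array. */
    static boolean isKeyword(String word) {
        for (String keyword : KEYWORDS) {
            if (keyword.equals(word)) return true;
        }
        return false;
    }

    /** Collects each word, then colors it if it turns out to be a keyword. */
    static String colorKeywords(String text) {
        StringBuilder out = new StringBuilder();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isJavaIdentifierPart(c)) {
                word.append(c);          // still inside a word
            } else {
                flush(word, out);        // the word just ended
                out.append(c);
            }
        }
        flush(word, out);                // a word may end at end of input
        return out.toString();
    }

    /** Writes the collected word, colored if it is a keyword, and resets it. */
    private static void flush(StringBuilder word, StringBuilder out) {
        if (word.length() == 0) return;
        String s = word.toString();
        if (isKeyword(s)) out.append(OPEN).append(s).append(CLOSE);
        else out.append(s);
        word.setLength(0);
    }

    public static void main(String[] args) {
        // prints: <font color=blue>int</font> i = 10;
        System.out.println(colorKeywords("int i = 10;"));
    }
}
```

Adding a keyword means adding one string to the array; no code changes.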

Once your program is producing output, you should open your output file in a browser, to see if it's colored as you expect. It should look very much the way it does in BlueJ, except that your colors will probably be different. If you just look at the output file in a text editor, you will probably fail to see many problems. We are going to look at your output file in a browser--you should, too.

Help! My output is all on one line!

If you are not recognizing newline characters, '\n', it's probably because LineReader.readLine() doesn't return newline characters. It just returns one line at a time. If you want newline characters, you have to re-insert them:

line = lineReader.readLine() + '\n';

Test File

I have been asked whether I will provide a test file. No, I will not provide a test file for this program. Part of programming is figuring out all the strange things that might happen, and being prepared for them. We will test your program for certain strange things (for example, a string literal that isn't closed by the end of the line), but it's your job to test your program thoroughly before you hand it in.

If something illegal occurs in the program,
          You do not have to provide error messages.
          You do have to recover from the error.
For example, a quoted string cannot extend across more than one line; so if you find the end of line while you are stepping through a quoted string, you must end the special coloring for a string. You don't have to print any error messages or anything like that; just get back to the "normal" state as soon as possible.
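One way to recover is sketched below: the string state treats a newline exactly like a closing quote, except that the font tag is closed before the newline is written. This sketch handles only double-quoted strings (no comments, char literals, or HTML escaping). The class and state names are made up for illustration; the color is the one used for the string in the sample output.

```java
final class StringRecovery {
    enum State { NORMAL, IN_STRING, ESCAPE }

    static final String OPEN = "<font color=\"#990099\">";
    static final String CLOSE = "</font>";

    /** Colors string literals, ending the coloring at end of line if unclosed. */
    static String colorStrings(String input) {
        StringBuilder out = new StringBuilder();
        State state = State.NORMAL;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (state) {
                case NORMAL:
                    if (c == '"') {
                        out.append(OPEN).append(c);
                        state = State.IN_STRING;
                    } else {
                        out.append(c);
                    }
                    break;
                case IN_STRING:
                    if (c == '\n') {
                        out.append(CLOSE).append(c);   // unclosed string: recover here
                        state = State.NORMAL;
                    } else if (c == '"') {
                        out.append(c).append(CLOSE);   // normal end of the string
                        state = State.NORMAL;
                    } else {
                        out.append(c);
                        if (c == '\\') state = State.ESCAPE;
                    }
                    break;
                case ESCAPE:
                    if (c == '\n') {
                        out.append(CLOSE).append(c);   // line ended right after a backslash
                        state = State.NORMAL;
                    } else {
                        out.append(c);                 // escaped char, e.g. \" does not end the string
                        state = State.IN_STRING;
                    }
                    break;
            }
        }
        if (state != State.NORMAL) out.append(CLOSE);  // input ended inside a string
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(colorStrings("s = \"abc\nx = 1;"));
    }
}
```

No error state and no error message: the machine drops back to NORMAL and the next line is colored as usual.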

While I will not provide a test file for your program, I will provide a sample. For a trivial program with no syntax errors, your output should look something like the following. This is a sample, not an adequate test; this program doesn't have examples of all the things that might be wrong. The purpose of this sample is to clarify what your output should look like.

Input file:

/**
 * This is an example of syntax coloring.
 */
public class Example {
    // this is a one-line comment
    public static void main(String args[]) {
        int i = 10;
        if (i < 5 && i > 0) {
            System.out.println("Oops!");
        }
    }
}

Output file:

<TITLE>Syntax coloring assignment by John Doe</TITLE>
<PRE><font color=green>/**
 * This is an example of syntax coloring.
 */</font>
public class Example {
    <font color="#990099">// this is a one-line comment</font>
    public static void main(String args[]) {
        int i = 10;
        if (i &lt; 5 &amp;&amp; i &gt; 0) {
            System.out.println(<font color="#990099">"Oops!"</font>);
        }
    }
}
</PRE>