Notes on Syntax Coloring Assignment
CIT 591, David Matuszek, Fall 2001
If you would like to know more about HTML entities, look at http://www.w3schools.com/html/html_entitiesref.asp. The important point to notice is that there are a lot of them.
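Of all those entities, the ones most likely to matter for this program are the few that stand for characters HTML itself treats specially. Here is a minimal sketch, assuming a hypothetical helper named `escape` (the class and method names are illustrative, not part of the assignment handout) that is applied to each character before it is written to the output file:

```java
final class HtmlEscape {
    // Hypothetical helper: returns the HTML text to write for one input character.
    static String escape(char c) {
        switch (c) {
            case '<': return "&lt;";   // would otherwise start an HTML tag
            case '>': return "&gt;";
            case '&': return "&amp;";  // would otherwise start an entity
            default:  return String.valueOf(c);
        }
    }

    public static void main(String[] args) {
        String source = "if (i < 5 && i > 0) {";
        StringBuilder out = new StringBuilder();
        for (int k = 0; k < source.length(); k++) {
            out.append(escape(source.charAt(k)));
        }
        System.out.println(out); // prints: if (i &lt; 5 &amp;&amp; i &gt; 0) {
    }
}
```

Without something like this, a browser would try to read `<` in the Java source as the start of a tag.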
The assignment mentions a number of things that syntax coloring might be used for. Your assignment is only to do these four:
/* ... */ comments
// comments
I described five main states that your machine should have. My implementation
used ten states. Why? So I could process the input one character at a time.
For instance, when I encountered a / (slash), I went to a new state
that tested whether the following character was another slash, an asterisk,
or something else. If you don't understand this, try to figure it out yourself
by drawing a state machine that handles the input one character at a time.
It's a good idea to draw the state machine before you begin writing Java code.
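To make the slash example concrete, here is one possible sketch of a character-at-a-time machine that handles only the two kinds of comments. It is not the ten-state implementation described above. The state names, the `color` method, and the `OPEN`/`CLOSE` tag strings are all assumptions made for illustration, and strings, char literals, keywords, and HTML entities are deliberately left out.

```java
final class CommentMachine {
    // Illustrative state names; SLASH is the "just saw a /" state.
    enum State { NORMAL, SLASH, LINE_COMMENT, BLOCK_COMMENT, BLOCK_STAR }

    // Assumed tags; use whatever markup and color your output requires.
    static final String OPEN = "<font color=\"green\">";
    static final String CLOSE = "</font>";

    static String color(String input) {
        StringBuilder out = new StringBuilder();
        State state = State.NORMAL;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (state) {
                case NORMAL:
                    if (c == '/') state = State.SLASH; // hold the slash until we see what follows
                    else out.append(c);
                    break;
                case SLASH:
                    if (c == '/') { out.append(OPEN).append("//"); state = State.LINE_COMMENT; }
                    else if (c == '*') { out.append(OPEN).append("/*"); state = State.BLOCK_COMMENT; }
                    else { out.append('/').append(c); state = State.NORMAL; } // just a division sign
                    break;
                case LINE_COMMENT:
                    if (c == '\n') { out.append(CLOSE); state = State.NORMAL; }
                    out.append(c);
                    break;
                case BLOCK_COMMENT:
                    out.append(c);
                    if (c == '*') state = State.BLOCK_STAR; // might be the start of */
                    break;
                case BLOCK_STAR:
                    out.append(c);
                    if (c == '/') { out.append(CLOSE); state = State.NORMAL; }
                    else if (c != '*') state = State.BLOCK_COMMENT;
                    break;
            }
        }
        // End of input: flush a held slash, or close a comment that never ended.
        if (state == State.SLASH) out.append('/');
        else if (state != State.NORMAL) out.append(CLOSE);
        return out.toString();
    }
}
```

For example, `CommentMachine.color("x // hi\ny")` puts the tags around `// hi` only and leaves the newline and `y` uncolored. Because this sketch knows nothing about string literals, it would wrongly color a `//` that appears inside a quoted string; a full machine needs additional states for that.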
Use my LineReader class. I said this already. Use my LineReader class. Or,
at least, look at it for an example of what you need to do. You may want to
change the calls to System.err.println to be calls to
System.out.println.
I strongly recommend that you write a method that returns individual
characters from the input file. There is no reason to clutter up your state
machine by trying to figure out what the next character should be. Your method
can use my LineReader class. Remember that the newline, '\n',
is a valid character. Your output file should preserve the original line structure:
except for the initial and final HTML tags, your output should contain the same
number of lines as your input.
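One way such a character-returning method might look is sketched below. Since the LineReader source is not reproduced in these notes, the sketch uses `java.io.BufferedReader` as a stand-in; the class name `CharSource` and the method name `next` are hypothetical. The one property assumed of the underlying reader is that its `readLine()` returns a line without its newline, so the sketch puts the `'\n'` back.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;

final class CharSource {
    private final BufferedReader reader; // stand-in for LineReader (an assumption)
    private String line = "";            // current line, with '\n' re-appended
    private int pos = 0;                 // index of the next character in line

    CharSource(BufferedReader reader) {
        this.reader = reader;
    }

    /** Returns the next character of the input, or -1 at end of input. */
    int next() {
        // Fetch lines until there is an unread character or the input ends.
        while (line != null && pos >= line.length()) {
            try {
                String raw = reader.readLine();
                line = (raw == null) ? null : raw + '\n';
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            pos = 0;
        }
        if (line == null) return -1;
        return line.charAt(pos++);
    }
}
```

With this in place, the state machine only ever asks for the next character and never has to think about lines. Blank lines come through as a single `'\n'`, so the line count of the input is preserved.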
If you detect an error--an unclosed String literal or an unclosed
char literal--do not go to an error state. It would be
nice if you call attention to the error somehow, but your state machine must
finish processing all the input. Your program is supposed to do syntax coloring,
and if it encounters an error it must recover somehow and keep on coloring.
While it is possible to use a state machine to recognize keywords, doing so will result in a very large number of states. Here is a better solution: when you encounter a letter, go into a state that collects all the letters of the word; then check whether it is a keyword. Your textbook has a list of keywords on page 30. If you do this, do not write extra code for each keyword--just have an array containing all the possible keywords, and check to see if the word you just encountered is in the array.
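A rough sketch of the array approach follows. The class and method names are illustrative, and the array holds only a handful of keywords to keep the example short; a real solution would list all of them.

```java
final class Keywords {
    // Partial list for illustration only; fill in the complete list from the textbook.
    private static final String[] KEYWORDS = {
        "class", "else", "if", "int", "public", "return", "static", "void", "while"
    };

    // Returns true if the collected word is in the keyword array.
    static boolean isKeyword(String word) {
        for (int i = 0; i < KEYWORDS.length; i++) {
            if (KEYWORDS[i].equals(word)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "while (count < 10)";
        // Collect the letters of the word that starts at position 0.
        int end = 0;
        while (end < text.length() && Character.isLetter(text.charAt(end))) {
            end++;
        }
        String word = text.substring(0, end);
        System.out.println(word + " -> " + isKeyword(word)); // prints: while -> true
    }
}
```

Adding or removing a keyword then means changing one array entry, with no new states and no new code.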
Once your program is producing output, you should open your output file in a browser, to see if it's colored as you expect. It should look very much the way it does in BlueJ, except that your colors will probably be different. If you just look at the output file in a text editor, you will probably fail to see many problems. We are going to look at your output file in a browser--you should, too.
Help! My output is all on one line!
If you are not recognizing newline characters, '\n', it's probably
because LineReader.readLine() doesn't return newline characters.
It just returns one line at a time. If you want newline characters, you have
to re-insert them:
line = lineReader.readLine() + '\n';
Test File
I have been asked whether I will provide a test file. No, I will not provide a test file for this program. Part of programming is figuring out all the strange things that might happen, and being prepared for them. We will test your program for certain strange things (for example, a string literal that isn't closed by the end of the line), but it's your job to test your program thoroughly before you hand it in.
If something illegal occurs in the program:
You do not have to provide error messages.
You do have to recover from the error.
For example, a quoted string cannot extend across more than one line; so if
you find the end of line while you are stepping through a quoted string, you
must end the special coloring for a string. You don't have to print any error
messages or anything like that; just get back to the "normal" state
as soon as possible.
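The recovery rule for strings can be sketched as follows. This is only an illustration under stated assumptions: the state names, the `color` method, and the `OPEN`/`CLOSE` tag strings are invented for the example, and comments, char literals, and keywords are left out.

```java
final class StringRecovery {
    // ESCAPE means "just saw a backslash inside a string".
    enum State { NORMAL, STRING, ESCAPE }

    // Assumed tags; use whatever markup and color your output requires.
    static final String OPEN = "<font color=\"red\">";
    static final String CLOSE = "</font>";

    static String color(String input) {
        StringBuilder out = new StringBuilder();
        State state = State.NORMAL;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\n' && state != State.NORMAL) {
                // Unclosed string: no error state and no message, just stop
                // the string coloring and carry on in the normal state.
                out.append(CLOSE).append(c);
                state = State.NORMAL;
                continue;
            }
            switch (state) {
                case NORMAL:
                    if (c == '"') { out.append(OPEN); state = State.STRING; }
                    out.append(c);
                    break;
                case STRING:
                    out.append(c);
                    if (c == '\\') state = State.ESCAPE;
                    else if (c == '"') { out.append(CLOSE); state = State.NORMAL; }
                    break;
                case ESCAPE:
                    out.append(c);        // an escaped character, such as \" , stays in the string
                    state = State.STRING;
                    break;
            }
        }
        if (state != State.NORMAL) out.append(CLOSE); // input ended inside a string
        return out.toString();
    }
}
```

Given the input `a "bad` followed by a newline and `int x;`, the coloring ends at the end of the first line and `int x;` comes out uncolored, so one mistake does not spoil the rest of the file.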
While I will not provide a test file for your program, I will provide a sample. For a trivial program with no syntax errors, your output should look something like the following. This is a sample, not an adequate test; this program doesn't have examples of all the things that might be wrong. The purpose of this sample is to clarify what your output should look like.
| Input file: | Output file: |
/**
 * This is an example of syntax coloring.
 */
public class Example {
    // this is a one-line comment
    public static void main(String args[]) {
        int i = 10;
        if (i < 5 && i > 0) {
            System.out.println("Oops!");
        }
    }
}
|
<HTML> |