| CIT
594 Assignment 10: Automata Spring 2006, David Matuszek |
In this assignment you will implement a very simple type of "computer" with a complex name, a nondeterministic finite-state automaton (NFA). NFAs are graph structures.
Name your main class NFA. Use your very own Graph API (from the
previous assignment) to read in the NFA graph and "walk" it to process
input. Do not make any changes to your Graph API unless absolutely necessary
(to fix bugs, or to correct serious inadequacies). Do not use the Plex
class in your NFA classes; they should not "know" that this is how
your graphs are implemented.
An automaton can be represented as a graph, with the nodes representing "states" of the automaton, and the edges representing "transitions" between states. Here are the usual conventions for drawing automata:
![]() |
![]() |
![]() |
| The (one) start state is indicated by an arrow pointing to it. This arrow does not indicate an edge in the graph! | Transitions are represented by labeled edges. The label on the edge indicates which input will cause the transition to occur. | Final states are represented by double circles. (Think of these as "goal nodes.") |
Here's how a deterministic finite-state automaton (DFA) works. For each new input sequence,
In a DFA, at each node there will be at most one outpointing edge with any given label. That means that the next state is completely determined by the input.
In an NFA,
EMPTY, meaning that you may
choose to follow the edge, but you aren't required to. This does not use up
an input symbol; andYou will need to read in a description of an NFA and construct the corresponding graph. Here is an example of the input:
STATE q0 q1 q2
SYMBOL + - 9
TRANSITION q0 + q1 q0 - q1 q0 EMPTY q1
q1 9 q2 q2 9 q2
START q0
FINAL q2
INPUT 9 9 9
INPUT + 9 + 9
|
![]() |
Here is a more precise definition of the required input:
STATE, SYMBOL, TRANSITION, START,
FINAL, and INPUT denote sections of the input. They
must all occur, and must occur in this order. These words cannot be used as
the names of nodes or the labels on edges.STATE is the name of a node on the graph.SYMBOL is an allowable edge label. Edge
labels may be used on as many edges as desired.TRANSITION must be in groups of three, and
each group describes a directed edge on the graph, in the order: Name of source
node, edge label, name of destination node.
EMPTY may be used to denote
an "empty transition." START; it is the name of
the start state (node).FINAL; these are the final
states. (Note: It is possible for the start state to also be a final state.)INPUT denotes a series of zero or more SYMBOLs
to be given to the automaton to recognize.INPUT may occur more than once; each occurrence provides
a separate, independent sequence of symbols to be given to the automaton.
(In other words, finish processing the old input, and start again at the start
state with the new input.) STATE. When you see this word, throw away the old graph and reinitialize
everything. I'm trying to be very precise about the form of the input file, because your input method will need to be able to read our test cases. I will also supply some sample input data. We will test with correct input, so checking the input for errors is not necessary.
Use a JFileChooser to choose an input file to execute. Allow the
user to do this as many times as he/she likes without having to restart your
program.
Use Java 5's Scanner class to read each file. Scanner
is a fairly powerful Sun-supplied tokenizer, but for this assignment the only
"tokens" you need are "words" (sequences of printing characters)
separated by whitespace. You can construct a Scanner to read from
a String (as in the following example) or from a File
(which will be more useful in this assignment).
Program Output import java.util.Scanner; public class SimpleScanner { public static void main(String[] args) { String test = " This is an \t input\nString "; Scanner s = new Scanner(test); while (s.hasNext()) { System.out.println("\"" + s.next() + "\""); } } }"This"
"is"
"an"
"input"
"String"
After reading in the graph, print it out.
For each input, test whether the input is accepted by the NFA, and print the result (whether accepted or rejected). Do the test with a recursive backtracking algorithm.
If the input is accepted, print a trace of the successful path taken. A trace consists of a sequence of node and edge labels (or equivalently, a sequence of node labels alternating with inputs). For example:
Input:
+ 9 9Trace:
q0 + q1 9 q2 9 q2
If you start printing the trace at the lowest level of a successful recursion, and print at each level as you return, the trace will come out backwards. The simplest solution is this: Instead of printing as you return, put the result in a global stack, and print the stack after you exit from the recursion.
As a reminder, here is how backtracking works:
boolean explore(N) {
If N is a goal node, return success
If N is a leaf node, return failure
For each child C of N,
If explore(C) return success
Return failure
}
To simplify the assignment, you can assume that there will be no cycles in the graph that contain only empty transitions. In other words, you don't have to worry about getting into an infinite loop; the only way you can get from a node back to itself is by consuming at least one input symbol.
In Eclipse, create a new package named graph. Drag all
your Java files (both Plex and graph) into this new package. (This
will add a package directive as the first line of each file). Next,
change every method in the Plex class from public
to protected (the only thing left public should be
the value field.) Put your NFA classes in a separate
package named nfa, and import graph.* into each NFA
class as needed.
Monday, April 24, before midnight. No extensions except for serious
reasons. Turn in the complete project (including both nfa and graph
packages) in the usual way.