A KWIC (Key Word In Context) index is an old, pre-digital way of looking things up, somewhat similar to a biblical concordance.
The basic idea is that there are two kinds of words in English: stop words, which do not convey any information about the content of an article (for example, "the", "and", "of"), and keywords, which are everything else.
Your program will read in an arbitrary number of text files, and write out (to file) a KWIC index of all the keywords that it finds.
You should provide two (or more) files,
Your program should start by reading in a list of stop words from a file named stop_words.txt (provided), in the same directory as your program.
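As a rough sketch of this first step, the following assumes that stop_words.txt holds one stop word per line and that comparisons should ignore case; neither is stated above, so adjust to the actual file. The names StopWords, parse, and load are made up for illustration.

```scala
import scala.io.Source

object StopWords {
  // Normalise raw lines into a set: trim, lower-case, drop blank lines.
  // Assumption: one stop word per line.
  def parse(lines: Iterator[String]): Set[String] =
    lines.map(_.trim.toLowerCase).filter(_.nonEmpty).toSet

  // Read the stop-word file. The default path is resolved against the
  // working directory, which is assumed to be the program's directory.
  def load(path: String = "stop_words.txt"): Set[String] = {
    val source = Source.fromFile(path)
    try parse(source.getLines()) finally source.close()
  }
}
```

Keeping parse separate from load means the parsing logic can be tested without touching the file system.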
Next, your program should ask the user for some text files. The user may enter any number of file names (or paths); stop reading in file names when the user enters an empty string. (Here are some I/O Examples that may be helpful.)
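One possible way to gather the file names is sketched below. The prompt text and the names FileNamePrompt, collect, and fromConsole are hypothetical; the only behavior taken from the description is that reading stops at the first empty string.

```scala
import scala.io.StdIn

object FileNamePrompt {
  // Call `readName` repeatedly, stopping at the first empty string
  // (or at null, which StdIn.readLine returns at end of input).
  def collect(readName: () => String): List[String] =
    Iterator.continually(readName())
      .takeWhile(name => name != null && name.trim.nonEmpty)
      .map(_.trim)
      .toList

  // Interactive version; the prompt wording is only an example.
  def fromConsole(): List[String] =
    collect(() => StdIn.readLine("File name (blank to finish): "))
}
```

Passing the reader in as a function lets a test supply canned input instead of typing at the console.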
Next, for each file,
'). For simplicity, we will consider every apostrophe as part of a word, even if it is used to quote something.
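To illustrate the apostrophe rule, here is a small sketch that assumes a word is a maximal run of letters and apostrophes, and that stop words are matched without regard to case. Both are my assumptions, and Tokenizer, words, and keywords are illustrative names only.

```scala
object Tokenizer {
  // Assumed definition of a word: one or more letters or apostrophes.
  private val Word = "[A-Za-z']+".r

  // Every word on the line, apostrophes included, in order of appearance.
  def words(line: String): List[String] =
    Word.findAllIn(line).toList

  // The words that are not stop words (stopWords is assumed lower-case).
  def keywords(line: String, stopWords: Set[String]): List[String] =
    words(line).filterNot(w => stopWords.contains(w.toLowerCase))
}
```

Note that a quoted word such as 'recursion' keeps both of its apostrophes under this rule, which matches the simplification described above.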
Use Scalatest to test some or all of your functions. You don't need to thoroughly test everything, but I'd like to see evidence that you could do thorough testing if you wanted to.
461       If you ever, even once, recur with the same (or harder) pro
623 ler array. So we will plan to recur only with smaller arrays, and
406 o the question of when to use recursion is simply, when
415  good rule of thumb is to use recursion when you're processing
621                   We will use recursion to find the maximum value
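A sketch of how one such output line might be produced follows. It assumes the number is a line number printed in three columns, and that up to 30 characters of context are shown on each side of the keyword; these widths are guesses based on the sample, not requirements. KwicLine, Context, and format are hypothetical names.

```scala
object KwicLine {
  // Assumed amount of context on each side of the keyword.
  val Context = 30

  // Format one entry. `start` is the index of the keyword within `line`
  // and `length` is the keyword's length.
  def format(lineNo: Int, line: String, start: Int, length: Int): String = {
    val end   = start + length
    val left  = line.substring(math.max(0, start - Context), start)
    val right = line.substring(end, math.min(line.length, end + Context))
    // Right-align the left context so that keywords line up in a column.
    ("%3d %" + Context + "s%s%s").format(lineNo, left, line.substring(start, end), right)
  }
}
```

With these widths the keyword always begins 34 characters into the output line, which is what makes the column of keywords easy to scan.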
The above description refers to the "map" and "list" data structures. These are meant to be general terms, not specific data structures. Use whichever Scala data structures you feel are most appropriate. Do, however, use the Scala versions, not the Java versions.
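For instance, an immutable Scala Map from keyword to a List of occurrences is one reasonable choice. The sketch below is only one option, and it assumes that occurrences are recorded by line number and that keywords are stored in lower case; KwicIndex, Hit, and build are invented names.

```scala
object KwicIndex {
  // One occurrence of a keyword: the 1-based line number and the line text.
  final case class Hit(lineNo: Int, line: String)

  // Build keyword -> occurrences using only Scala collections.
  def build(lines: Seq[String], stopWords: Set[String]): Map[String, List[Hit]] = {
    val hits = for {
      (line, i) <- lines.zipWithIndex
      word      <- "[A-Za-z']+".r.findAllIn(line).toList
      key        = word.toLowerCase
      if !stopWords.contains(key)
    } yield key -> Hit(i + 1, line)
    // Group the (keyword, hit) pairs by keyword, keeping only the hits.
    hits.groupBy(_._1).map { case (k, pairs) => k -> pairs.map(_._2).toList }
  }
}
```

A sorted map, or sorting the keys just before writing the output file, would give the alphabetical order one expects from an index.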
Strings in Scala are exactly the same as Java Strings, and all the usual Java methods apply. The Scala class StringOps contains a large number of additional methods, some of which you may find useful. In particular, the Scala format method uses java.util.Formatter.format (similar to printf in C).
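As a quick illustration of format, the following pads a number and a word to fixed widths. The values and the name FormatDemo are arbitrary.

```scala
object FormatDemo {
  // %5d right-aligns the integer in 5 columns;
  // %-8s left-aligns the string in 8 columns.
  val row: String = "%5d|%-8s|".format(42, "recur")
}
```

Here row is "   42|recur   |". Width specifiers like these are handy for lining up columns of output.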
The latest version of Scala appears not to contain some jars necessary for using Scalatest. I use the Scala IDE based on Eclipse, and added the following external jars:
I think these are the most recent versions of everything, and in any case they seem to be mutually compatible. You may need a slightly different set of jars. Because this is essentially a configuration issue rather than a language issue, the use of Scalatest will be only 10% of the grade on this assignment.
Here is some sample code: Fraction.scala and ExampleTests.scala.
There are some things you should know about Scala style. While we will not be grading on Scala-specific style, you will find that your program is easier to write and debug if you make some attempt to follow these suggestions.
null where you can use
null only in calls to Java methods).
Turn your assignment in to Canvas before 6am Monday, December 8. Important note: Canvas will be set to disallow submissions after 12:01 am, December 10. This should give us enough time to finish grading and submit final grades before our deadline.