Scala Assignments 2 and 3: Book Indexing
Fall 2011, David Matuszek

Purposes of this assignment

General idea of the assignment

Create an index for a book.

Since we are studying Scala, the obvious book to use is the first edition of Martin Odersky's Programming in Scala, which is freely available on the web. (Caution: Scala’s collections have been significantly revised since the first edition.) Of course, this book already has a perfectly good index, but this is a class project, not an actual job. I am providing Odersky-text.zip, which is a collection of 33 text files representing the 33 chapters of the book.

Your assignments are (1) create the index using the tools supplied by Java, and (2) create the index using Scala’s actors. We are doing (1) because, if you find yourself in a position where you have to deal with concurrent code, it will probably be in Java, and (2) because actors are supposed to be better; they certainly have become popular recently.

These assignments are combined because a lot of the text processing code can be the same in both versions. However, please turn them in as two separate assignments.

Here's the approach that I have in mind (each box indicates a separate Thread or Actor):

[Figure: process organization]

Since we will be dealing with text files, not book pages, our index will link to (chapter number, line number) instead of to page numbers.
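To make the (chapter number, line number) idea concrete, here is one possible shape for the index data, offered as a minimal sketch rather than the required design. The names `IndexSketch`, `Location`, `words`, and `indexChapter` are hypothetical, and the word splitter (lowercase, split on anything that is not a letter) is an assumption: it indexes every word.

```scala
import scala.collection.immutable.{SortedMap, SortedSet}

object IndexSketch {
  // A location in the book: (chapter number, line number), line numbers 1-based.
  type Location = (Int, Int)
  type Index = SortedMap[String, SortedSet[Location]]

  // Hypothetical word splitter: lowercases the line and splits on non-letters.
  def words(line: String): Seq[String] =
    line.toLowerCase.split("[^a-z]+").filter(_.nonEmpty).toSeq

  // Builds the index for one chapter, given that chapter's lines in order.
  def indexChapter(chapter: Int, lines: Seq[String]): Index = {
    val entries = for {
      (line, i) <- lines.zipWithIndex
      word <- words(line)
    } yield (word, (chapter, i + 1))

    // Collect every location of each word into a sorted set.
    entries.foldLeft(SortedMap.empty[String, SortedSet[Location]]) {
      case (index, (word, loc)) =>
        index.updated(word, index.getOrElse(word, SortedSet.empty[Location]) + loc)
    }
  }
}
```

For example, `IndexSketch.indexChapter(3, Seq("Actors send messages", "messages are immutable"))` maps "messages" to the locations (3, 1) and (3, 2). Sorted collections are used so that the entries come out in alphabetical order and the locations in book order.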

While approach (1) will be using Java tools, both programs should be written in Scala. Scala has excellent access to Java libraries.
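As an illustration of calling Java's concurrency library from Scala, the sketch below submits one task per chapter to a `java.util.concurrent` thread pool and collects the results. It assumes that "Java tools" includes the executor framework, which the assignment does not specify. The names `ExecutorSketch`, `countMentions`, and `countInAllChapters`, and the pool size of 4, are hypothetical.

```scala
import java.util.concurrent.{Callable, Executors}

object ExecutorSketch {
  // Hypothetical per-chapter task: count the lines that mention a term.
  def countMentions(term: String, lines: Seq[String]): Int =
    lines.count(_.toLowerCase.contains(term.toLowerCase))

  // Runs one task per chapter on a Java thread pool; results are in chapter order.
  def countInAllChapters(term: String, chapters: Seq[Seq[String]]): Seq[Int] = {
    val pool = Executors.newFixedThreadPool(4) // pool size is an arbitrary assumption
    try {
      // Submit every chapter first, so the tasks can run concurrently.
      val futures = chapters.map { lines =>
        pool.submit(new Callable[Int] {
          def call(): Int = countMentions(term, lines)
        })
      }
      // Future.get blocks until the corresponding task has finished.
      futures.map(_.get())
    } finally {
      pool.shutdown()
    }
  }
}
```

The explicit `Callable` is used instead of a function literal so that the sketch does not depend on the Scala version's support for converting lambdas to Java interfaces.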

There are two unresolved questions:

Without good answers to these questions, our programs will produce terrible indices. For now, that's okay. We will have class discussion on these issues, and see if we can come up with some reasonably simple solutions.

Due date

Wednesday, December 7, 6am. Zip each project and turn it in to Blackboard. The usual late penalty of 5 points/day will apply, except that at some (unspecified) date we will get the programs graded, and no further submissions will be accepted.