Homework assignments for CIS 455 / 555

Resources

For most assignments, we will provide a virtual machine image that contains all the necessary tools. To use this image, you will need VirtualBox, which we recommend downloading here.

Development will be in Java. We recommend the use of Git, a version control system, for maintaining your project code; if you are not familiar with Git, please have a look at the documentation. As a development environment, you may want to use Eclipse, possibly in combination with the EGit plug-in.

Assignment 0 Getting started

This short assignment shows you how to use the virtual machine image we have prepared for you. To complete it, you will also need to download the VM image.

Assignment 1 Web and application server

Some useful URLs:

Assignment 2 (Milestone 1 and Milestone 2) Web crawling, XPath, XQuery

For testing, we have set up a sandbox that you can safely crawl. In preparation for Milestone 2, you may want to read about StormLite, our Apache Storm stream processing emulator.

Assignment 3 Storm and MapReduce

If you are planning to use EC2 for this assignment, you should have a look at our Getting Started Guide.

Final Project Distributed web crawler and search engine

In addition to the PDF, you may find the following useful: