594 Hash Tables
Spring 2006, David Matuszek
In this assignment you are to
reinvent the wheel implement your own
version of a bucket hash. You will measure how fast it performs, both in an
absolute sense and in a Big-O sense, and you will compare it against a Sun-provided
hash table class.
A bucket hash, you should recall, is a hash table that consists of an array of linked lists (buckets). Your hash code tells you which list an item belongs in, and you insert or retrieve the item from that list.
Create the following classes:
public class HashTable implements Map
static class Pair-- a static inner class of HashTable; holds a
static class Cell-- a static inner class of HashTable; holds a
static class List-- a static inner class of HashTable; holds a header to a linked list of
public class ListTest extends TestCase
public class HashTableTest extends TestCase
public class Thing-- a class of objects to use as test data
public class HashTableTimer-- used to time your
Here's the overall structure:
HashTablewill be an array of a reasonable size, preferably a prime number (101 is good).
Listwill contain a
headerto a linked list of
Cellwill contain, in its
Pairwill contain a
value, both of which are
To put something in your hash table, you call
This method should (1) get the
hashCode() of the
(2) use this hash code to find an index into the array, and (3) search the list
at this index location for a
Cell containing a
with a matching
key field. If you find such a
then (4a) replace the
value field in the
value; but if you don't find such a
then (4b) create a new
Cell containing a new
and add it to this
Retrieving something from your hash table will require some of the same operations.
One slightly tricky bit is that there is no use for the classes
List outside of the
class, so it's bad style to expose these to the user (especially
which will be an incomplete implementation). For this reason, they should be
inner classes of
HashTable. Ideally, they should be private
inner classes, but this gets in the way of testing
List, so give
them default (package) access. Finally, to avoid some tricky syntax, they should
static inner classes; this lets you refer to them as
Start with the provided code, which contains:
public class HashTable implements Map-- You need to implement most of this
static class Pair-- I've implemented this for you
static class Cell-- I've implemented this for you
static class List-- You need to implement this
public class Thing-- I've implemented all but one method for you
I've marked with TODO flags the things you need to add to these classes. (In Eclipse, select Window → Show View → Other... → Basic → Tasks to get tab that shows a list of these.) Of course, you should delete these flags when you have completed the indicated tasks.
public class Thing
HashTable class should, of course, be able to use (almost)
any kind of
Object as a key, and any kind of
as a value. I'd like you to use
Things for your testing, both as
keys and as values.
Thing class is almost complete, but lacks a
method. Look up
java.lang.Object.hashCode() in the Java API, and
be sure you satisfy the "general contract." Also be sure you are actually
overriding the corresponding inherited methods in both cases (use the
tag). No other methods are needed; we are only going to use this class to test
static class List(inside
List class next, since the
class depends on it.
List class has a single instance variable,
which references (points to) the first
Cell in the list. If the
list is empty,
You need the following methods:
public void add(Pair pair)
public Pair find(Object key)
public int size()
public Set keySet()
As usual, I recommend that you start by writing the JUnit tests for these methods.
public class HashTable implements java.util.Map
This class will implement an interface that declares lots of methods. Let Eclipse
add the required methods for you (Add unimplemented methods). You
will only implement a few of these methods:
equals. Leave the rest as stubs. Add Javadoc comments to all
public methods and instances; for the unimplemented methods, just say they're
The generated code does not use generics. For this assignment, that's OK.
Write JUnit tests for the following methods (let Eclipse generate the test stubs for you):
If you are not clear about what these methods should do, see the Java API for
put are fundamental to all hash
tables. The methods
are simple to write and convenient to have. The
keySet method returns
Set, and a
Set has an
which you can use when you write
will be very useful in your testing.
In your testing, I recommend the following: In addition to your
java.util.HashMap. Whatever you do (get, put, etc.) to
HashTable, also do the the
HashMap, then test
if the results are equal (if you override
assertEquals method will use it).
Do your testing with
Things for both keys and the values.
public class HashTableTimer
When you have your HashTable class working, time how long it takes to do the
put methods.Time these separately.
Here's how you time code:
long startTime = new GregorianCalendar().getTimeInMillis(); // Coded to be timed goes here long endTime = new GregorianCalendar().getTimeInMillis(); long elapsedTime = endTime - startTime;
Since any given execution should take far less than a millisecond, the code
you are timing should be a loop that does numerous calls to
get). Do multiple timings, with differing numbers of calls,
so that you can see how time increases as you add more and more things to your
hash table (I used various values from 100 to 100000). Print out your results.
Since you are trying to time particular methods, the less additional work done
within the timed part, the better. In particular, you should create an array
Things to use before you enter the loop.
If you carefully follow the DRY principle, your timer method should be quite short (Mine is <70 lines, including comments). You don't need lots of loops.
After you have timed your own get and put methods, time the same methods of
java.util.HashMap. Since both
Maps, you can use the same timer method for both, as long as
it takes a
Map as a parameter.
HashTable methods will be much slower than the
methods. That's OK.
Write up your results, including your timing measurements. Try to answer the following questions:
getmethods ought to be?
getmethods really? Do the results agree with what you think they should be?
HashMapis using the same basic algorithm (bucket hash) as your
Note: Your writeup will be part of your grade on this assignment.
Please zip up all the files in this project, and submit them via Blackboard by midnight Thursday, February 16.