CIT 594 Assignment 7: Dating service
Spring 2004, David Matuszek


To give you experience with:

General Idea:

You are a programmer for a dating service.

Your company has collected questionnaires from some number w of women and some number m (probably different from w) of men, and you have used these questionnaires to compute, for each possible female/male couple, a "compatibility score" between 0 and 100, inclusive (larger numbers are better).

As our story opens, you have these numbers in an array of w rows and m columns. Your job is to write a program to match up couples so as to maximize happiness (and return business). Of course, if there are more men than women, there will be m-w men who don't get dates. Similarly, if there are more women than men, w-m women will go dateless. Tough, but there's nothing you can do about that.

You decide that the value you want to maximize is the total compatibility: the compatibility of the first couple, plus the compatibility of the second couple, etc. You suspect that it may be impossible to find an optimum solution in a reasonable amount of time, but you want to find the best solution you can.

Currently you have only a few dozen couples, but if you do a good job and make enough people happy, your business will grow to serve hundreds of people. Hence you need an algorithm that scales well.


Write a class named Compatibility with the following three methods:

Each of these creates and returns a "compatibility array." All values in the array should be between 0 and 100, inclusive. (In the absence of real data, you can generate random numbers.) The only difference between the methods is in the size of the returned array. The "small" array should be small enough that you can easily check the results: maybe three to five couples. The "large" array should be enough to make your program work really hard--maybe 25 or 30 couples. The "medium" array should be intermediate in size, perhaps ten to fifteen couples. Don't forget that the array might not be square--that is, it might or might not have the same number of rows as columns.

These methods are in a separate class so that we can easily replace them with our own class. In fact, don't even turn in the Compatibility class. However, your main class, DatingService, should call each of these methods and compute the results for each of three data sets (that is, each "compatibility array"): one small, one medium, and one large.

For each data set, print out:

As a rough guideline, your program should be able to complete a large array in less than a minute on a 1 GHz computer. This is very rough guideline, because computers vary so much in capability. We will have a lot of these programs to grade, so for the protection of the TA any myself, any program that takes us more than 5 minutes to execute is subject to a deduction.

Your output should be neat and self-explanatory.

Along with the program, turn in a readme.txt file. In this file,

I expect that the actual programming will be a relatively small part of this assignment; deciding on the algorithm to use, and determining its time complexity, will be more difficult. I suggest that you start with a dumb but fast algorithm (for example, always match woman i with man i), just to test out the rest of the program (timing, averages, printing, etc.), and replace it with the "real" algorithm later.

Bonus points will be awarded to the best-performing programs (probably about 10 points, possibly more for truly exceptional programs).

You can do this assignment by yourself or with a partner. I recommend having a partner, because it can be very helpful to argue about possible algorithms--but no more than two people on a team, please.

Due date:

Thursday, April 15, before midnight. Turn in, via Blackboard, (and any other classes your program uses) and readme.txt.