CIT 590/591 Assignment 3: License Plate Lookup
Fall 2012, David Matuszek

Purposes of this assignment

Backstory

The city of Pseudopolis has had a rash of hit-and-run accidents. Most recently, an 8-year old girl was hit, and is currently in intensive care; the driver fled the scene in a dark-colored car of unknown make. There was one witness, who got a glimpse of the license plate of the car. The police would very much like your assistance in helping to catch the driver, in this and in similar cases.

The problem the police have is that witnesses are not very good at reading or remembering license plate numbers. Sometimes they misread letters as digits, or vice versa. Sometimes they remember too many letters or digits, and sometimes remember too few. Sometimes they just get a letter or digit wrong.

The police cannot check every car in the city, but if they have a relatively small number of possible cars, they can check those. (Checking a car involves sending an officer out to find the car and look at it for possible collision damage.) Your job is to find those license plate numbers that most closely resemble the number reported by a witness.

General idea of the assignment

A spell checker reads your document and compares the words in it to a dictionary. If it doesn't find a word in the dictionary, it suggests a replacement word (typically, more than one).

In this assignment you will write a program that uses, not a dictionary of words, but a list of license plate numbers. Your checker will return the license plates that best match a given (possibly incorrect) license plate.

You will use a random number generator to create the list of "actual" license plate numbers. Numbers to be searched for will be entered by the user.

Details

A license plate in Pseudopolis consists of two capital letters followed by three digits, for example, AB123. All entries in the list of known Pseudopolis license plate numbers will be of this form. Since we don't have "real" data, you will use a random number generator to create this "database" of license plate numbers.

Your program needs to find license plate numbers that are "most similar" to a given number. There is no single, obvious measure of similarity. Here are the rules that the chief of police wants you to follow in deciding how different two numbers are from each other:

You should thoroughly test all functions with unittest. Please follow the sequence recommended in class. That is, write the test functions first, along with stubs; then run the tests; then start filling in the stubs with the actual code, testing frequently. I can't watch over your shoulder to make sure you do things this way, but please give it a try.

Functions

Write (at least) the following functions:

def generate_license_plate_numbers(size)
Creates and returns a list of size random license plate numbers (see below). Make sure that the list does not contain duplicate license plate numbers.
def create_database_from_list(list_of_licenses)
Creates a "database" of license plates numbers. Actually, all it really does is make a copy of the list that is given to it, and returns this copy. This is so that you can set up a "fake" database, containing known license plate numbers, for testing purposes.
def make_simple_corrections(license)
Makes simple corrections to the license plate number that is given to it, and returns the corrected license plate number. Here are the corrections to be made:
def change_digits_to_letters(license)
If digits occurs where there should be letters in the license plate number, they are converted to the most similar letters: 1I, 2Z, 3B, 4A, 5S, 6G, 7T, 8B, 9P, 0O.
def change_letters_to_digits(license)
If letters occurs where there should be digits in the license plate number, they are converted to the most similar digits: A4, B8, C0, D0, G6, I1, O0, Q0, S5, T7, Z2. Letters not specified in this list are left unchanged.
def has_exact_match(license, database)
Returns True if and only if the database (a list of license plate numbers) contains exactly the license plate number given. (Note: This is a one-line function.) If the license parameter needed some simple corrections, they have already been applied, and an exact match is still possible.
def count_nonmatching_characters(string_1, string_2)
Returns the number of characters that are different in the two strings. The two strings should already have been adjusted to be the same length (use assert to make sure this is the case), so you need only compare characters in the same positions.
def count_differences(license_from_user, license_from_database)
Returns the number of differences (as defined by the chief of police) in the two license plate numbers, starting with adjusting for length and applying simple corrections, as detailed above. This is the function that computes the complete number of differences, start to finish. The Simple corrections have already been applied to the license_from_user, and don't add to the count. The license_from_user may be as short as four characters or as long as six characters. The license_from_database is, of course, the correct length.
def find_best_matches(license, database, max_differences)
Searches the dictionary and returns a list of lists. The first sublist contains those license plate numbers that exactly match the given license; this sublist should be either empty or contain exactly one element, license. The second sublist contains those license plate numbers that differ by 1 from license; the third sublist, those that differ by two; and so on, up to and including those that differ by max_differences. This function should, of course, use the count_differences function.
def answer_yes_or_no(question)
Get a response from the user, using the question as a prompt. If the response begins with a 'y' or 'Y' (after removing spaces), return True; if it begins with an 'n' or 'N', return False; otherwise print Please answer with 'yes' or 'no.' and recursively call the function. (This is a generally useful function that you can use in other programs.)
def main()
Uses generate_license_plate_numbers to create a random dictionary of 10000 license plate numbers, then repeatedly gets a license plate number from the user and searches the database for the best match or matches. Given a possibly incorrect license plate number, as reported by a witness, your program should: After each result is printed, ask the user if he/she wants to enter another license plate number; if not, print "Done" and quit.
Please put the following at the end of your program:
 if __name__ == '__main__':
    main()
This says:
  1. If this file is loaded directly into the REPL, the function main() will be called automatically.
  2. If this file is used by another file (such as the unit test file), the function main() will not be called automatically.

Write unit tests for all the above functions, except ask_yes_or_no and main. Because these two functions do input/output, they must be tested manually.

If you find it convenient to write additional functions, start by writing unit tests for them.

Grading

It would be a good idea to read How Assignments Are Graded, but here's a summary: Do exactly what the assignment says to do, use good programming style, test your program carefully, zip up your program (if it's more than a single file), and turn it in via Blackboard. Late programs will lose points.

As this is a pair programming assignment, only one of you should submit the assignment, with both your names prominently displayed.

Due date

Your program is due before .