CIT 594 Assignment 5: Algorithms for Sorting
Spring 2012, David Matuszek

# Purposes of this assignment

• To begin the study of analysis of algorithms
• To learn some basic sorting algorithms
• To learn a little about loop invariants

# General idea of the assignment

• Implement five sorting algorithms: Bubble sort, insertion sort, selection sort, step sort, and priority sort.
• Provide JUnit 4 tests for each algorithm.
• Time each algorithm for different sizes of arrays.
• Use JRat to profile your program

# In more detail

## Implementing the algorithms

• Name your project `SortingAlgorithms`.
• Name your package `sortingAlgorithms`.
• Use the provided abstract class `SortingAlgorithm`.
```package sortingAlgorithms;

public abstract class SortingAlgorithm {

/**
* Identifies the algorithm.
* @return The name of the implementing sorting algorithm.
*/
public abstract String getName();

/**
* Sorts the given array into ascending order.
* @param array The array to be sorted.
*/
public abstract void sort(int[] array);

/**
* Swaps the values of two elements of an array.
* @param array The array.
* @param j One index.
* @param i The other index.
*/
public static void swap(int[] array, int j, int i) {
int temp = array[i];
array[i] = array[j];
array[j] = temp;
}
}```
(`SortingAlgorithm` actually started out as an interface. Most sorting algorithms swap array locations, but rather than copying and pasting the `swap` method, I changed `SortingAlgorithm` to an abstract class so I could define `swap` in it. This is an application of the DRY principle, which can be used to great effect in this assignment.)
• Implement the following classes:
• `public class BubbleSortAlgorithm extends SortingAlgorithm`
• `public class SelectionSortAlgorithm extends SortingAlgorithm`
• `public class InsertionSortAlgorithm extends SortingAlgorithm`
• `public class StepSortAlgorithm extends SortingAlgorithm`
• `public class PrioritySortAlgorithm extends SortingAlgorithm`

You do not need to write your own bubble sort, selection sort, or insertion sort methods; you can find these in any data structures book, or on the web. Find a good source, and copy the code from there. You must, however:

• Reference your source; that is, tell (in a Javadoc comment) exactly where you found the code.
• Thoroughly test the sort method. You might have copied it wrong; and you might be surprised just how easy it is to find incorrect implementations.

People have invented a lot of sorting algorithms, so for this assignment I thought I'd invent a couple myself. Probably somebody somewhere has also invented these, and named them something else. I don't care. You get to implement these two.

### Step sort

I thought about giving you pseudocode for this one, then decided I'd just implement it in Python and let you translate it. This version has been tested for all permutations of arrays containing 1..N for arrays of sizes 0 to 11.

```def stepsort(array):
step = len(array) // 2
# Note: Integer division is / in Python 2, // in Python 3
second_time = False
while step > 0:
i = 0
while i < len(array) - step:
if array[i] > array[i + step]:
array[i], array[i + step] = array[i + step], array[i] # swap
if i > step:
i = i - step - 1
i += 1
if second_time:
step = step // 2
second_time = not second_time```

### Priority sort

This one is embarassingly easy to implement in Java. Here's the algorithm:

1. Copy every value in the array into a `java.util.PriorityQueue`.
2. Copy the `PriorityQueue` back into the array.

## Testing the algorithms

• Provide JUnit 4 tests for each of the sorting methods. Be sure to test extreme cases (arrays of length 0 and length 1). For some of the algorithms, it's important to test both even-length arrays and odd-length arrays.
• Generate the Javadoc files, and look them over (you will probably find things you forgot to document).

## Timing the algorithms

Time each of your algorithms on sorting random arrays of size N, 2N, 4N, and 8N, for some large value of N.

Important note: Sorting algorithms behave differently on arrays that are already sorted, so be sure to do your timings on arrays in which the contents are in random order.

Here's the basic technique for timing code:

```long startTime = System.nanoTime();
// Code to be timed goes here
long elapsedTime = System.nanoTime() - startTime;```

The call `System.nanoTime()` requests the system to use the most accurate timer available. Many systems still use a millisecond clock, which means your nanosecond count may be rounded to the nearest 1000 nanoseconds. If something appears to take zero time, that probably means it took less than a millisecond.

Before you turn your program in, please be sure that we can run your complete timing tests in under a minute. (You may have to adjust array sizes.) The reason for this is simple--we have a lot of programs to grade.

### Timing tips

Here are some tips for helping to get reasonable accuracy.

• Close all other applications, so you don't have the computer doing too many other things during the timings.
• Reason: `System.nanoTime()` gives "wall clock" time, not just time spent in your program.
• "Warm up" your program by executing all your methods a few times before you start timing.
• Reason: The compiler may delay actually compiling your code until it is needed, so the first execution may be much slower.
• Call `System.gc()` before you begin timing.
• Reason: So garbage collection doesn't happen during your timing runs.
• Run the code multiple times, and take the average.
• For best accuracy, throw out the best and worst times, and take the average of the remaining times.
• As much as possible, try to include in your timing only the code you want to time.
• A few assignments, comparisons, etc., aren't worth worrying about.
• Filling the array with random numbers does take enough time to skew your results. Here are two possible strategies:
1. `repeat X times { randomize; start timing; sort; stop timing }`
• This strategy results in a lot of less accurate individual timings.
1. `start timing; repeat X times { randomize; sort } stop timing`
• With this strategy, you need to time the randomization method separately, and subtract it from the results.

## Profiling the program

Download and learn to use JRat. JRat is easy, but it is a command-line program. Here's what I used:

```cd workspace/SortingAlgorithms/bin
java -javaagent:shiftone-jrat.jar sortingAlgorithms/TimeSortingAlgorithms
java -Xmx256M -jar shiftone-jrat.jar```

Use JRat to profile your program, and compare the results with the timings you made. (Warning: A profiler like JRat makes your program run much more slowly, so either use smaller arrays or be prepared to wait for quite a while.)

## Write up the results

Include a `readme` file (`.txt`, `.doc`, or `.rtf`) with your submission. This file should briefly describe the details of your tests (array sizes, timing strategy, anything else you think is relevant), and conclusions.

For each of the sorting techniques,

• Provide a table of the array sizes you used and the time it took to sort those arrays. You should be able to just copy some of your program output and paste it into your `readme.txt` file.
• Tell what its Big-O running time appears to be. Use numbers from your timings to support your conclusion.

What kind of information does JRat provide? Compare this to the information provided by your timing tests.

The step sort algorithm uses two loops. Write, in English or formal logic notation or both, what the loop invariant is for each of these loops. Which of the standard sorting algorithms (bubble, insertion, selection) is most similar to step sort, and why?

You don't need to write an essay! Anything from half a page to a page should be enough.

# Due date

Your program is due before 6am February 21. Zip up the entire directory for this project, and submit via Blackboard. Only assignments submitted via Blackboard will be accepted--any assignments emailed to me will be discarded without comment. A late penalty of 5 points per day (out of 100 points) will apply.

Because many of you are interviewing this semester, a limited number of 48-hour extensions will be available. To get an extension, email me before 5pm Friday, stating the reason you need the extension. No extensions will be granted after Friday. If you get an extension and fail to get the project in by the extended due date, late points will be counted from the original due date.