CIS 110 Summer 2013 - Introduction to Computer Programming

CIS 110 Home | CIS Dept Home | SEAS Home | www.upenn.edu
Grading | Late Submission/Extensions | Collaboration
Staff | Schedule | Exams | Booksite | Coding Style | Piazza | Office Hours | Tutoring | Compiler Error Messages | Run-Time Error Messages
Traveling Salesperson Problem Programming Assignment


Goals

  • Create and use (circular) linked lists
  • Learn the intricacies of looping over (circular) linked lists
  • Learn about the Traveling Salesman Problem, considered one of the most important theoretical problems in computer science.

Submission

Submit Tour.java and your readme_tsp.txt file, and answer the questions. You may also submit an extra.zip file containing your extra credit implementation and any additional files it requires. Like the NBody extra credit, this must be a zip archive (not any other compression format), even if it only contains one .java file. Our grading scripts will not work properly with any other type of submission, and you will not receive extra credit.

Background

Given N points in the plane, the goal of a traveling salesperson is to visit all of them (and arrive back home) while keeping the total distance traveled as short as possible. Implement two greedy heuristics to find good (but not optimal) solutions to the traveling salesperson problem (TSP). In doing this assignment, You should (i) learn about the notorious traveling salesperson problem, (ii) learn to use linked lists, and (iii) get more practice with data types.
          1000 points in the plane                          optimal tour (we think) - 15476.51924889754
1,000 points optimal tour

Perspective. The importance of the TSP does not arise from an overwhelming demand of salespeople to minimize their travel distance, but rather from a wealth of other applications such as vehicle routing, circuit board drilling, VLSI design, robot control, X-ray crystallography, machine scheduling, and computational biology.

Greedy heuristics. The traveling salesperson problem is a notoriously difficult combinatorial optimization problem, In principle, one can enumerate all possible tours and pick the shortest one; in practice, the number of tours is so staggeringly large (roughly N factorial) that this approach is useless. For large N, no one knows an efficient method that can find the shortest possible tour for any given set of points. However, many methods have been studied that seem to work well in practice, even though they are not guaranteed to produce the best possible tour. Such methods are called heuristics. Your main task is to implement the nearest neighbor and smallest increase insertion heuristics for building a tour incrementally. Start with a one-point tour (from the first point back to itself), and iterate the following process until there are no points left.

  • Nearest neighbor heuristic:  Read in the next point, and add it to the current tour after the point to which it is closest. (If there is more than one point to which it is closest, insert it after the first such point you discover.)

  • Smallest increase heuristic:  Read in the next point, and add it to the current tour after the point where it results in the least possible increase in the tour length. (If there is more than one point, insert it after the first such point you discover.)

Getting Started:

  • Download data.zip archive and unzip it in the folder where you will complete your assignment, just as you did for NBody. This archive contains Point.java, Node.java, client programs for each of the heuristics, the readme_tsp.txt template and a range of data files. There are several small data files in the zip archive for debugging purposes, as well as the larger data files that you need for the readme_tsp.txt questions. There is also a client program TSPTimer.java to help you do the performance analysis part of the assignment.
  • Study the Point.java API. Its main() method reads in data from standard input in the TSP input format and draws the points to standard draw.
  • Point.java represents a point in the plane, as described by the following API:
    public class Point (2D point data type)
    ---------------------------------------------------------------------------------------
           Point(double x, double y)    // create the point (x, y)
    String toString()                   // return string representation
      void draw()                       // draw point using standard draw
      void drawTo(Point that)           // draw line segment between the two points
    double distanceTo(Point that)       // return Euclidean distance between the two points
    
    Each Point object can return a string representation of itself, draw itself to standard draw, draw a line segment from itself to another point using standard draw, and calculate the Euclidean distance between itself and another point.
  • Study the Node.java represents one of these nodes, and contains a Point and a reference to the next Node in the tour:
    public class Node {
        public Point p;
        public Node next;
    }

Part I: Tour Data Type

Your task is to create a Tour data type that represents the sequence of points visited in a TSP tour. Represent the tour as a circular linked list of nodes, one for each point. Your Tour data type must implement the following API exactly:
public class Tour
----------------------------------------------------------------------------------------------
       Tour()                                   // create an empty tour
       Tour(Point a, Point b, Point c, Point d) // create a 4 point tour a->b->c->d->a
  void show()                                   // print the tour to standard output
  void draw()                                   // draw the tour to standard draw
   int size()                                   // number of points on tour
double distance()                               // return the total distance of the tour
  void insertNearest(Point p)                   // insert p using nearest neighbor heuristic
  void insertSmallest(Point p)                  // insert p using smallest increase heuristic

  • Each Tour object should be able to print its constituent points to standard output, draw its points to standard draw, count its number of points, compute its total distance, and insert a new point using either of the two heuristics. The first constructor creates an empty tour; the second constructor creates a 4-point tour and is intended to assist with debugging.
  • Warning: You will lose a substantial number of points if your methods have different behaviors or signatures than those above. You may not add public methods to the API; however, you may add private methods if you see fit. Also remember one of the main goals of this assignment is to gain experience writing and using linked structures, so you CANNOT use Java's built in LinkedList class.
  • Create a file Tour.java. Include one instance variable, say first, of type Node that is a reference to the "first" node of the circular linked list.

  • For debugging purposes only, make a constructor that takes four points as arguments, and constructs a circular linked list using those four Point objects. First, create four nodes and assign one point to each. Then, link the nodes one to another in a circle.

  • Implement the method show(). It should traverse each Node in the circular linked list, starting at first, and printing each Point using StdOut.println(). This method requires only a few lines of code, but it is important to think about it carefully, because debugging linked-list code is notoriously difficult and frustrating. Start by just printing out the first Point. With circular linked-lists the last node on the list points back to the first node, so watch out for infinite loops. If the tour has 0 points, you should not write anything to standard output.

  • Test your method by writing a main() function that defines four points, creates a new Tour object using those four points, and calls its show() method. Below is a suggested four point main(). Use it for testing.
      // main method for testing
      public static void main(String[] args) {
        // define 4 points forming a square
        Point a = new Point(100.0, 100.0);
        Point b = new Point(500.0, 100.0);
        Point c = new Point(500.0, 500.0);
        Point d = new Point(100.0, 500.0);
    
        // Set up a Tour with those four points
        // The constructor should link a->b->c->d->a
        Tour squareTour = new Tour(a, b, c, d);
    
        // Output the Tour
        squareTour.show();
      }
    
    If your method is working properly, you will get the following output for the 4 city problem above.
    (100.0, 100.0)
    (500.0, 100.0)
    (500.0, 500.0)
    (100.0, 500.0)
    
    Test your method show() on tours with 0, 1 and 2 points to check that it still works. You can create such instances by modifying the 4-node debugging constructor to only link 0, 1 or 2 of the four points to the Tour. (If you don't define first, you will have an empty Tour. If you define first and link it back to itself, you will have a 1 point Tour.)

    Put the debugging constructor back to the original four point Tour before continuing.

  • Implement the method size(). It is very similar to show(). If the tour has 0 points, you should return 0.

  • Implement the method distance(). It is very similar to show(), except that you will need to invoke the method distanceTo() in the Point data type. Add a call to the Tour distance() method in the main() and print out the distance and size. The 4-point example has a distance of 1600.0. If the tour has 0 points, distance() should return 0.0.

  • Implement the method draw(). It is also very similar to show(), except that you will need to invoke the method drawTo() in the Point data type. If the tour has 0 points, you should not draw anything to standard draw. You will also need to include the statements
     
    StdDraw.setXscale(0, 600);
    StdDraw.setYscale(0, 600);
     
    in your main() before you call the Tour draw() method. The four point example above should produce a square.
After arriving at this point, you should feel a sense of accomplishment: working with a linked list is quite difficult at first!

Part II: Creating the Tour

  • For this crucial step, you must write a method to insert one point p into the tour. As a warmup, implement a simpler method named insertInOrder() to insert a point into the tour after the point that was added last. (It is OK to leave this particular extra method in your Tour code, even though it is not part of the API and the submission script will complain.) To do this, write a loop to find the last point, and insert the new point into a new Node after that last point. It's ok (and expected) that you'll have a special case for the very first point ever inserted, but you shouldn't need any other special cases. To test this method, use the supplied client program, OrderInsertion.java and one of the shorter input files. The order of the points output should match the order of the points in the input file.

  • Implement insertNearest(). To determine which node to insert the point p after, compute the Euclidean distance between each point in the tour and p by traversing the circular linked list. As you proceed, store the node containing the closest point and its distance to p. After you have found the closest node, create a node containing p, and insert it after the closest node. This involves changing the next field of both the newly created node and the closest node. As a check, tsp10-nearest.ans cotains resulting tour for the 10 city problem which has distance 1566.1363. Note that the optimal tour tsp10-optimal.ans has distance 1552.9612 so this rule does not, in general, yield the best tour. Look below to get instructions on how to test insertNearest().

  • After doing the nearest insertion heuristc, you should be able write the method insertSmallest() by yourself, without any hints. The only difference is that you want to insert the point p where it will result in the least possible increase in the total tour length. As a check, tsp10-smallest.ans contains the result tour, which has distance 1655.7462. In this case, the smallest insertion heuristic actually does worse than the nearest insertion heuristic (although this is not typical).

  • Input and testing. The input format will begin with two integers w and h, followed by pairs of x- and y-coordinates. All x-coordinates will be real numbers between 0 and w; all y-coordinates will be real numbers between 0 and h. Many test data files are available. As an example, tsp1000.txt contains the following data:
    % more tsp1000.txt
    775 768
    185.0411 457.8824
    247.5023 299.4322
    701.3532 369.7156
    563.2718 442.3282
    144.5569 576.4812
    535.9311 478.4692
    383.8523 458.4757
    329.9402 740.9576
    ...
    254.9820 302.2548
    
  • After implementing Tour.java, use the client program NearestInsertion.java to read in the points from standard input, run the nearest neighbor heuristic; print the resulting tour, its distance, and its number of points to standard output; and draw the resulting tour to standard draw. SmallestInsertion.java is analogous but runs the smallest insertion heuristic.


             
    % java NearestInsertion < tsp1000.txt
    (185.0411, 457.8824)
    (198.3921, 464.6812)
    (195.8296, 456.6559)
    (216.8989, 455.126)
    (213.3513, 468.0186)
    (241.4387, 467.413)
    (259.0682, 473.7961)
    (221.5852, 442.8863)
    ...
    (264.57, 410.328)
    Tour distance = 27868.7106
    Number of points = 1000
    
                       
    % java SmallestInsertion < tsp1000.txt
    (185.0411, 457.8824)
    (195.8296, 456.6559)
    (193.0671, 450.2405)
    (200.7237, 426.3461)
    (200.5698, 422.6481)
    (217.4682, 434.3839)
    (223.1549, 439.8027)
    (221.5852, 442.8863)
    ...
    (186.8032, 449.9557)
    Tour distance = 17265.6282
    Number of points = 1000
    
    1000 points nearest 1000 points smallest

  • If you want to see your heuristic in action, just redraw the tour after each insertion. See the instructions in SmallestInsertion.java. It could take a while on a big input file, so you might want to modify it so that it redraws only after every 20 insertions or so.
  • Note: Remove or comment out any debugging print statements before testing the large input files or your program will take forever to run. Large inputs will produce so much debug output you won't be able to interpret it anyway.

  • If you'd like to check your work, for usa13509.txt we get distances of 77449.9794 and 45074.7769 for nearest insertion and smallest insertion, respectively. For circuit1290.txt we get 25029.7905 and 14596.0971.

Part III: Analysis

Estimate the running times of your programs as a function of the number of points N.
  • Using TSPTimer.java, run the two heuristics for N = 1,000, and repeatedly double N until the execution time exceeds 100 seconds. Document the times in your readme file.
  • When timing your programs, don't animate the results on the screen or this may become the bottleneck.
  • Find formulas of the type a × Nb that predict the amount of time your programs will takes for different numbers of cities (review how we counted the number of steps to sort an array using different algorithms), and explain it in your readme file.

Contest and extra credit.

Implement a better heuristic. For example, observe that any tour with paths that cross can be transformed into a shorter one with no crossing paths: add that improvement to your program. Here are some other ideas. The speed of your program may vary depending on your computer, but in general it should be at most one minute for tsp1000.txt. You are also not required to use the Tour data type or linked lists for the extra credit. If you are shooting for an A or A+, this is a good opportunity to impress us. However, be warned, this is a difficult extra credit. Answer the relevant questions in the readme_tsp.txt file. We will award a special prize to the student who finds the shortest tour around the 1,000-point set.

Enrichment.

  • The best known tour for tsp1000.txt is a solution of length 15476.519, which was found using the Concorde TSP solver.
  • Georgia Tech's TSP site contains a wealth of interesting information including many applications of the TSP and two TSP games.
  • Here's a 13,509 city problem that contains each of the 13,509 cities in the continental US with population over 500. The optimal solution was discovered in 1998 by Applegate, Bixby, Chvatal and Cook using theoretical ideas from linear and integer programming. The following 15,112 city problem was solved to optimality in April, 2001, and is the current world record. It took 585,936,700 CPU seconds (along with alot of cleverness) to find the optimal tour through 15,112 cities in Germany.
  • Some folks even use the TSP to create and sell art. Check out Bob Bosch's page. You can even make your own TSP artwork.
  • Here is a New York Times article on finding an optimal traveling politician tour in the state of Iowa.
  • Here's a survey article on heuristics for the TSP.
This assignment was developed by Bob Sedgewick and Kevin Wayne.
Copyright © 2000 Robert Sedgewick