Traveling Salesman Problem aka Bicycle Thief

Assignment Goals Given N cities, the goal of a traveling salesman is to visit each of them exactly once (and arrive back home) while keeping the total distance traveled as short as possible. Stated more abstractly, the goal is to find a path connecting N points in a plane that passes through each point exactly once. Implement three greedy heuristics to find good (but not optimal) solutions to the traveling salesman problem (TSP). In doing this assignment, You should (i) learn about the notorious traveling salesman problem, (ii) get more practice with linked lists, and (iii) learn to adapt them to your problem.
          1000 points in the plane                          optimal tour (we think) - 15476.51924889754
1,000 points optimal tour

Perspective. The importance of the TSP does not arise from an overwhelming demand of salespeople to minimize their travel distance, but rather from a wealth of other applications such as vehicle routing, circuit board drilling, VLSI design, robot control, X-ray crystallography, machine scheduling, and computational biology.

The traveling salesman problem is a notoriously difficult combinatorial optimization problem, In principle, one can enumerate all possible tours and pick the shortest one; in practice, the number of tours is so staggeringly large (roughly N factorial) that this approach is useless. For large N, no one knows an efficient method that can find the shortest possible tour for any given set of points. However, many methods have been studied that seem to work well in practice, even though they are not guaranteed to produce the best possible tour. Such methods are called heuristics. Your main task is to implement the in-order, nearest neighbor and smallest detour insertion heuristics for building a tour incrementally. Start with a one-point tour (from the first point back to itself), and iterate the following process until there are no points left. Download data.zip archive and unzip it in the folder where you will complete your assignment, just as you did for NBody. This archive contains Point.java, TourInterface.java, client programs for each of the heuristics, the readme_tsp.txt template and a range of data files. There are several small data files in the zip archive for debugging purposes, as well as the larger data files that you need for the readme_tsp.txt questions. There is also a client program TSPTimer.java to help you do the performance analysis part of the assignment.

Study the Point.java API. Its main() method reads in data from standard input in the TSP input format and draws the points to standard draw.

Point.java represents a point in the plane, as described by the following API:

public class Point (2D point data type)
---------------------------------------------------------------------------------------
       Point(double x, double y)    // create the point (x, y)
String toString()                   // return string representation
  void draw()                       // draw point using standard draw
  void drawTo(Point that)           // draw line segment between the two points
double distanceTo(Point that)       // return Euclidean distance between the two points
Each Point object can return a string representation of itself, draw itself to standard draw, draw a line segment from itself to another point using standard draw, and calculate the Euclidean distance between itself and another point.

Study the TourInterface.java API, which your Tour class must implement:


public interface TourInterface
------------------------------------------------------------------------------------------------------
  public String toString()        // create a String representation of the Tour.
  void draw(Point p)              // draw the tour to standard draw
                                  // any edge starting or ending at p should be in a distinct color
  int size()                     // number of points on tour
  double distance()               // return the total distance of the tour
  void insertInOrder(Point p)     // insert p at the end of the tour
  void insertNearest(Point p)     // insert p using nearest neighbor heuristic
  void insertSmallest(Point p)    // insert p using smallest increase heuristic

Remember that to implement an interface, you have to write all the above methods with the exact same method signature and return types.
Your task is to create a Tour data type that represents the sequence of points visited in a TSP tour. It must implement the TourInterface interface. That means it must have every single method in TourInterface with the same return type and the same method signature.

It is easiest done via the following steps

Create your Tour class, with empty implementations of each method in the TourInterface class and appropriate header and method comments. Your class should implement TourInterface and should compile, but none of the methods should do anything. This gives you a skeleton that you can fill in and test piece by piece. Represent the tour as a linked list, where each node contains a Point. At the end of the list, include one additional node that contains the first point on the tour. In other words, your list should contain two different nodes containing the first point on the tour, one at the beginning of the list and one at the end. See the diagram below for clarification. You may design your linked list data structure however you like, but you may not use Java's built-in LinkedList data type or anyone else's linked list code.

Our strong suggestion is to create a Node inner class and keep an instance variable (field) called Node head. For an example of this look at LinkedStackOfStrings.java which can be found on booksite at link.

Constructors

Write the following two constructors, both public:
  • public Tour(); // Create an empty Tour
  • public Tour(Point a, Point b, Point c, Point d); // Create a Tour with the four points a, b, c, d in that order. This constructor is useful for debugging. Your linked list structure for this case should match the following figure:

Implement the method toString(). It should return each Point in tour in order using the toString() method of Point. Also to add a line break between points, use the special character "\n". The first point's repeat node at the end of the list is an implementation detail and should not be counted. Once you have gotten this working, you are encouraged to modify it and/or add additional tests to main() as you add functionility to your Tour class.


  // main method for testing
  public static void main(String[] args) {
    // define 4 points forming a square
    Point a = new Point(100.0, 100.0);
    Point b = new Point(500.0, 100.0);
    Point c = new Point(500.0, 500.0);
    Point d = new Point(100.0, 500.0);

    // Set up a Tour with those four points
    // The constructor should link a->b->c->d->a
    Tour squareTour = new Tour(a, b, c, d);

    // Output the Tour
    System.out.println(squareTour);
  }

If your method is working properly, you will get the following output for the 4 city problem above.


(100.0, 100.0)
(500.0, 100.0)
(500.0, 500.0)
(100.0, 500.0)

Implement the method size() to return the total number of points in the tour. The first point's repeat node at the end of the list is an implementation detail and should not be counted. If the tour has 0 points, you should return 0. Implement the method distance(). It is somewhat similar to toString(), except that you will need to invoke the method distanceTo() in the Point data type. Add a call to Tour's distance() method in the main() and print out the distance and size. The 4-point example has a distance of 1600.0. If the tour has 0 points, distance() should return 0.0. Implement the method draw(Point p). It is also very similar to toString(), except that you will need to invoke the method drawTo() in the Point data type. If the tour has 0 points, you should not draw anything. Any edge that starts or ends at the point p should be drawn in a different color from the rest of the tour. (For example, our reference implementation draws those edges in red instead of black.) If p is null, then all edges should be drawn in the usual color. This is useful to animate which point and edges were just added when building up the tour. Refer to the documentation for StdDraw.setPenColor() on the StdDraw reference page to see how to change colors in StdDraw. You will need to include the statements


StdDraw.setXscale(0, 600);
StdDraw.setYscale(0, 600);

in your main() before you call the Tour draw() method. The four point example above should produce a square.
Implement insertInOrder() to add a point to the end of your tour. To test this method, use the supplied client program, OrderInsertion.java and specify one of the shorter input files (e.g. tsp10.txt) as a command-line argument. The order of the points output should match the order of the points in the input file. Points will get added to your tour at a rate of 1 per second. The edges leading in and out of the last point should be drawn in a different color if you implement Tour.draw() correctly. (You can speed up the animation by reducing the wait time to something smaller than 1000ms in the call to StdDraw.show() in OrderInsertion.)

You should see the following output when running insertInOrder() on tsp10.txt and tsp1000.txt (you should comment out the animation lines or speed the animation up a lot in OrderInsertion before attempting tsp1000.txt. You will also need to change a conditional in the main function that does not print tours of size greater than 20. )


         
> java OrderInsertion tsp10.txt
(110.0, 225.0)
(161.0, 280.0)
(325.0, 554.0)
(490.0, 285.0)
(157.0, 443.0)
(283.0, 379.0)
(397.0, 566.0)
(306.0, 360.0)
(343.0, 110.0)
(552.0, 199.0)
Tour distance =  2586.6759
Number of points = 10
                   
> java OrderInsertion tsp1000.txt
(185.0411, 457.8824)
...
(334.9344, 524.7824)
(396.0952, 150.0943)
(462.1732, 486.3889)
(193.5595, 527.8311)
(331.8716, 533.5717)
(748.401, 462.4281)
(94.0231, 544.9736)
(254.982, 302.2548)
Tour distance =  327693.7622
Number of points = 1000
10 points ordered 1000 points ordered
Implement insertNearest(). To determine where to insert the point p, compute the Euclidean distance between each point in the tour and p. As you proceed, store the node containing the closest point and its distance to p. After you have found the closest node, create a node containing p, and insert it after the closest node. If p is equally close to more than one point in the tour, insert it after the first such point. As a check, tsp10-nearest.ans cotains resulting tour for the 10 city problem which has distance 1566.1363. Note that the optimal tour tsp10-optimal.ans has distance 1552.9612 so this rule does not, in general, yield the best tour. Look below to get instructions on how to test insertNearest(). Implement insertSmallest(). You should add p where it will result in the shortest detour from the current tour, i.e. where it will result in the smallest increase in total tour length. If you add p between points s and t in the current tour, the length of the detour is the distances from s to p and from p to t, less the distance from s to t.

As a check, tsp10-smallest.ans contains the result tour, which has distance 1655.7462. In this case, the smallest insertion heuristic actually does worse than the nearest insertion heuristic (although this is not typical)

The input format will begin with two integers w and h, followed by pairs of x- and y-coordinates. All x-coordinates will be real numbers between 0 and w; all y-coordinates will be real numbers between 0 and h. Many test data files are available. As an example, tsp1000.txt contains the following data:


775 768
185.0411 457.8824
247.5023 299.4322
701.3532 369.7156
563.2718 442.3282
144.5569 576.4812
535.9311 478.4692
383.8523 458.4757
329.9402 740.9576
...
254.9820 302.2548

  • After implementing Tour.java, use the client program NearestInsertion.java to read in the points from standard input, run the nearest neighbor heuristic; print the resulting tour, its distance, and its number of points to standard output; and draw the resulting tour to standard draw. SmallestInsertion.java is analogous but runs the smallest insertion heuristic.


             
    > java NearestInsertion tsp1000.txt
    (185.0411, 457.8824)
    (198.3921, 464.6812)
    (195.8296, 456.6559)
    (216.8989, 455.126)
    (213.3513, 468.0186)
    (241.4387, 467.413)
    (259.0682, 473.7961)
    (221.5852, 442.8863)
    ...
    (264.57, 410.328)
    Tour distance = 27868.7106
    Number of points = 1000
    
                       
    > java SmallestInsertion tsp1000.txt
    (185.0411, 457.8824)
    (195.8296, 456.6559)
    (193.0671, 450.2405)
    (200.7237, 426.3461)
    (200.5698, 422.6481)
    (217.4682, 434.3839)
    (223.1549, 439.8027)
    (221.5852, 442.8863)
    ...
    (186.8032, 449.9557)
    Tour distance = 17265.6282
    Number of points = 1000
    
    1000 points nearest 1000 points smallest

  • Note that NearestInsertion and SmallestInsertion only print out the tour if it contains fewer than 20 points. This makes them a bit more practical for large data sets.

  • By default, NearestInsertion and SmallestInsertion will both add points at a rate of 4 points/second, and will animate the tour as it grows. If you implemented Tour.draw() correctly, the edges that were just added at each step will be a different color from your other edges. Change the delay in the call to StdDraw.show() to change the animation speed, or comment out the four lines controlling animation entirely to diable it. These lines are clearly commented in each program.

  • Note: Remove or comment out any debugging print statements before testing the large input files or your program will take forever to run. Large inputs will produce so much debug output you won't be able to interpret it anyway. If you want to animate large input files, you might also want to modify the animation to only redraw every 20 insertions.

  • If you'd like to check your work, for usa13509.txt we get distances of 77449.9794 and 45074.7769 for nearest insertion and smallest insertion, respectively. For circuit1290.txt we get 25029.7905 and 14596.0971.
Estimate the running times of your programs as a function of the number of points N.

Submit Tour.java and your readme_tsp.txt file, and answer the questions. You may also submit an extra.zip file containing your extra credit implementation and any additional files it requires. Implement a better heuristic in a class TourEC.java. For example, observe that any tour with paths that cross can be transformed into a shorter one with no crossing paths: add that improvement to your program. Here are some other ideas. The speed of your program may vary depending on your computer, but in general it should be at most five minutes for tsp1000.txt. You are also not required to use the Tour data type or TourInterface interface for you extra credit solution. Any code beyond TourEC.java necessary to run your extra credit should be included in extra.zip. You do not have to use Point.java or any of your code from the main part of the assignment. (You may also modify Point.java in any way you wish for the extra credit.) However, if you do use any of that code, you should include it in your extra.zip file or your TA may not be able to compile your program. Be warned that this is a relatively difficult extra credit, although it gives an opportunity to learn a great deal about an extremely important problem. Answer the relevant questions in the readme_tsp.txt file. We will award a special prize to the student who finds the shortest tour around the 1,000-point set.