Assignment 5: Game Theory II Fall 2006, David Matuszek

Purposes of this assignment:

• To give you more experience with arrays
• To give you more experience with JUnit

Corrections to the assignment:

• The algorithm has to be modified, as given below. The correct algorithm is more complex (of course!), but not a lot more.
• You will find you have no need of the `chooseRandomly` method. Write and test it anyway; it's worth knowing how to do.
• The methods `chooseBestRowForMax` and `chooseBestColumnForMin` now take, as parameter, an array giving the other player's mixed strategy.

General idea of the assignment:

Combine this assignment with the previous assignment to create a fairly complete GameTheory program.

Your program should read in arrays, one at a time. For each array, the program should determine (and print out):

• The array,
• Whether the array has a saddle point, and
• if so, where the saddle point is (row and column) and what its value is, or
• if not, the mixed strategy that should be followed by each player.

You will have a different partner from the last assignment, so you should have two different versions of the `SaddlePoint` class to work with. Choose one of them--if they both work, it shouldn't make much difference which one you use.

I will provide you a class containing a method that will make it easy to read in arrays. Just use `print` and `println` statements to do the output.

Details:

In some games there is a saddle point--that is, a "best" row and a "best" column that should always be chosen. In other games, the best move depends on how the opponent moves.

If one player plays predictably, the other player can take advantage of that fact. For example, if the row player always chooses the second row (hoping to get the 12), the column player can choose the first column (containing the -5). Hence, both players should use a mixed strategy, sometimes making one choice and sometimes another. However, the rows and columns shouldn't be chosen equally often; in this game, the row player should choose the first row more than twice as often as the second row, and the column player should more frequently choose the first column. How do we know this?

In this assignment we will not be playing the game, but instead we will compute how frequently the players should choose each row and each column, for any size array.

Finding a mixed strategy

Min

0
1
2
Max 0
 1 -2 3 -6 5 -4 7 -8 9
1
2

Suppose there is no saddle point, as in this game. Then players must choose between two or more rows or columns. Also, the other player shouldn't be able to predict their moves, so they need to play somewhat randomly, but still choosing the best rows or columns more often than the worse ones. This is called a mixed strategy.

For example, a mixed strategy for Max might be `[20, 50, 30]`, meaning that Max chooses row zero 20% of the time, row one 50% of the time, and row two 30% of the time (and similarly for Min).

By the way, these numbers don't have to add up to exactly 100. If Max's strategy is ` [2, 5, 3]` or ` [400, 1000, 600]`, these are both equivalent to `[20, 50, 30]` (and to each other).

You can compute an approximate best mixed strategy by simply (sort of) playing the game a large number of times (say, a thousand). Here's how. Each player starts with a strategy of `[1, 1, 1]` (or ```[1, 1, 1, 1]```, or however many rows or columns there are to choose from). Then repeatedly do the following:

Note: Corrected algorithm from here on down

• Max looks at Min's current strategy, chooses a row based on that, and updates his own strategy accordingly.
• Min looks at Max's current strategy, chooses a column based on that, and updates her own strategy accordingly.

In the above example, suppose Max and Min both start with a strategy of `[1, 1, 1]`.

• Let's look at Max first:
• Max knows that Min's strategy is `[1, 1, 1]` -- that is, equal probabilities.
• If Max plays row `0`, he will get `1` or `-2` or `3` with equal probabilities. So ```(1*1 + 1*(-2) + 1*3)/3 = 2/3``` is his expected payoff.
• If Max plays row `1`, he will get `-6` or `5` or `-4` with equal probabilities. So ```(1*(-6) + 1*(5) + 1*(-4))/3 = -5/3``` is his expected payoff.
• If Max plays row `2`, he will get `7` or `-8` or `9` with equal probabilities. So ```(1*7 + 1*(-8) + 1*9)/3 = 8/3``` is his expected payoff.
• Max prefers the largest of these, `8/3`, so his choice is row `2`. He updates his strategy to `[1, 1, 2]`.
• Note that we are always dividing by the same number (the total of Min's `[1, 1, 1]`, which is `3`). Since we just want to know which number is the largest, we can skip the division, and just compare the numbers `2`, `-5`, and `8`.
• Now we do basically the same thing for Min:
• Min knows that Max's strategy is now `[1, 1, 2]`.
• If Min plays column `0`, she will get `1`, `-6`, or `7` with odds `1:1:2`. So `(1*1 + 1*(-6) + 2* 7)/4 = 9/4`. Or, skipping the division, `9`.
• If Min plays column `1`, she will get `-2`, `5`, or `-8` with odds `1:1:2`. So` (1*(-2) + 1*5 + 2*(-8))/4 = -13/4`; or, just use `-13`.
• If Min plays column `2`, she will get `3`, `-4`, or `9` with odds `1:1:2`. So `(1*3 + 1*(-4) + 2* 9)/4 = 17/4`. Or, just use `17`.
• Min prefers the smallest of these, `-13`, so her choice is column `1`. She updates her strategy to `[1, 2, 1]`.
• Now it's Max's turn again:
• Max knows that Min's strategy is now `[1, 2, 1]`.
• Etc.

Extended example

We will have the two players take turns (again, we are not playing the game, we are figuring out the best mixed strategy.) For each turn, the player whose turn it is chooses the best move based on how the opponent has been playing.

To implement this, we need two additional arrays; one to keep track of how many times the row player has chosen each row, and one to keep track of how many times the column player has chosen each column. To get things started, each of these arrays will be filled with `1`s. In Figure 1 (below), we start by having the row player choose row 0, as indicated by the 1 in the array on the right.

Min

 0 1

Max
 0 1
 4 -3 -5 12
 1 1

 1 1

Min

 0 1

Max
 0 1
 4 -3 -5 12
 1 2

 1 1

Min

 0 1

Max
 0 1
 4 -3 -5 12
 1 2

 2 1

Min

 0 1

Max
 0 1
 4 -3 -5 12
 2 2

 2 1

Figure 1. The initial setup (all rows and columns have equal probability). Figure 2. Max knows Min's strategy is 1:1, so he will do better with
row 1 (1*(-5)+1*12)=7 than
row 0 (1*4+1*(-3))=1.
Figure 3. Now Min knows Max's strategy is 1:2, so she will do better with
column 0 (1*4+2*(-5)) than
column 1 (1*(-3)+2*12).
Figure 4. Max's turn:
row 0: (2*4+1*(-3))=5
row 1: (2*(-5)+1*12)=2
so choose row 0.

Min

 0 1

Max
 0 1
 4 -3 -5 12
 2 2

 3 1

Min

 0 1

Max
 0 1
 4 -3 -5 12
 3 2

 3 1

Min

 0 1

Max
 0 1
 4 -3 -5 12
 3 2

 4 1

Min

 0 1

Max
 0 1
 4 -3 -5 12
 4 2

 4 1

Figure 5. Min's turn:
col 0: (2*4+2*(-5))=-2
col 1: (2*(-3)+2*12)=18
so choose column 0.
Figure 6. Max's turn:
row 0: (3*4+1*(-3))=9
row 1: (3*(-5)+1*12)=-3
so choose row 0.
Figure 7. Min's turn:
col 0: (3*4+2*(-5))=2
col 1: (3*(-3)+2*12)=15
so choose column 0.
Figure 8. Max's turn:
row 0: (4*4+1*(-3))=13
row 1: (4*(-5)+1*12)=-8
so choose row 0.

This procedure eventually results in the "best" strategy for each player; the side and bottom arrays give the relative number of times each row and each column, respectively, can be played. In the above game, if the results are given in percentages, they should be(approximately): Max, [70.8%, 29.2%]; Min, [62.5%, 37.5%].

Classes and methods

Create a class `Game`. This class should have a constructor `public Game(int[][] array)`. The constructor will save this array in a local variable, and many of the methods will use this "game."

Notice that at each turn, one of the players chooses a move randomly, according to his or her current mixed strategy. The other player then chooses the best response to that move, and updates his/her mixed strategy. To implement this, you should have the following methods:

`static int chooseRandomly(int[] weights)`
Returns a number in the range `0..weights.length`, chosen randomly according to the ratios in the `weights` array. For example, if the `weights` array is `{2, 5, 3}`, then the method will return: `0`, 20% of the time; `1`, 50% of the time; and `2`, 30% of the time.
Hint: This is most easily done with one call to `random.nextInt(int)`. Can you figure out how?
``` int chooseBestRowForMax(int[] minsStrategy) // Note change in parameter ```
Returns the row number that is best for Max, when Min uses the given strategy.
``` int chooseBestColumnForMin(int[] maxsStrategy)`````` // Note change in parameter ```
Returns the column number that is best for Min, when Max uses the given strategy.
``` void computeMixedStrategies()```
Computes the best mixed strategy for Max, and the best mixed strategy for Min, for this game. These strategies are not returned by the method, but are instead kept in `int[] `fields of this `Game` object.
``` double[] getStrategyForMax()```
If `computeMixedStrategies()` has been called for this game, return an array of `double`s containing the best mixed strategy for Max. These `double`s should sum to `1.0`. (For example, if the int array contains `{2, 7, 5, 6}`, the array returned by this method should be` {0.10, 0.35, 0.25, 0.3}`. If `computeMixedStrategies()` has not been called, this method returns `null`.
``` double[] getStrategyForMin()```
If `computeMixedStrategies()` has been called for this game, return return an array of `double`s containing the best mixed strategy for Min. These `double`s should sum to `1.0`. If `computeMixedStrategies()` has not been called, this method returns `null`.

JUnit tests

Write comprehensive JUnit tests for each of the above methods. Put these in a class named `GameTest`.

Since it's unwise to test floating-point numbers for equality (and since you won't get exact results in any case), you will need to use the following JUnit method:
```static void assertEquals(double expected, double actual, double delta) ```
where the `expected` value and the `actual` value should be within `delta` of each other in order to be considered approximately equal.

Include all your JUnit tests for the previous program, which should be in a class named `SaddlePointTest`. If you had to make any changes in your `SaddlePoint` methods, make sure your unit tests still work.

Main program

I am providing an `ArrayReader` class, in ArrayReader.java. You can use it to read in an array like this:
```     int myArray[][];      myArray = ArrayReader.getArray();```

(I'm also providing a simple ArrayReaderTest.java class with a `main` method, so you can see what `ArrayReader` does.)

Your program should use `getArray()` to read in an array. Print out the array. If the array has a saddle point, print out its row, column, and value. If the array doesn't have a saddle point, print out the mixed strategies for each player. Repeat as many times as the user desires.

If the user cancels array input (by clicking the Cancel button in the `ArrayReader`), the `getArray()` method will return `null`. Exit the program when this happens.

Before you leave the lab today, you should estimate how much time you will spend on this project; your partner should do the same. As you work on the project, keep track of the amount of time you actually spend. Include these numbers (estimated and actual time for each of you) in a file named `readme.txt`.