CIT 591 Assignment 4: Mixed-Strategy Games
Fall 2003, David Matuszek

Purposes of this assignment:

General idea of the assignment:

In Game Theory, a two-person zero-sum game is a two-dimensional matrix. The maximizing player chooses a row, and at the same time, the minimizing player chooses a column. Neither player knows in advance what the other player is going to choose. The entry at the chosen row and column represents the payoff from the minimizing (column) player to the maximizing (row) player--this payoff may be negative.

In some games there is a "best" row and a "best" column that should always be chosen. In other games, such as in the following small example, the best move depends on how the opponent moves.

   
min
   
0
1
max 0
4
-3
-5
12
1

If either player always plays the same row or column, the opposing player can take advantage of that fact. For example, if the row player always chooses the second row (hoping to get the 12), the column player can choose the first column (containing the -5). Hence, both players should use a mixed strategy, sometimes making one choice and sometimes another. However, the rows and columns shouldn't be chosen equally often; in this game, the row player should choose the first row more than twice as often as the second row, and the column player should more frequently choose the first column.

In this assignment we will not be playing the game, but instead we will compute how frequently the players should choose each row and each column, for any size array.

The math (sorry!):

We will have the two players take turns (again, we are not playing the game, we are figuring out the best mixed strategy.) To get started, the first player chooses any move--it doesn't matter which. For each following turn, the player whose turn it is chooses the best move based on how the opponent has been playing.

To implement this, we need two additional arrays; one to keep track of how many times the row player has chosen each row, and one to keep track of how many times the column player has chosen each column. In Figure 1 (below), we start by having the row player choose row 0, as indicated by the 1 in the array on the right.

   
min
 
   
0 1
 
max
0
1
4
-3
-5
12
1
0
   
0
0
 
   
min
 
   
0 1
 
max
0
1
4
-3
-5
12
1
0
   
0
1
 
   
min
 
   
0 1
 
max
0
1
4
-3
-5
12
5
5
   
6
4
 
   
min
 
   
0 1
 
max
0
1
4
-3
-5
12
5
6
   
6
4
 
Figure 1. Max arbitrarily chooses row 0. Figure 2. Min responds with column 1. Figure 3. Later in the simulation, it is Max's turn. Figure 4. Right after Max's turn, it is Min's turn.

Since the maximizing player has chosen row 0, the min player clearly should choose column 1, as shown in Figure 2.

In the general case, suppose (Figure 3) that after a while the min player has chosen column 0 six times and column 1 four times. If max had chosen row 0 every time, the payoff would be (6 * 4) + (4 * -3) = 24 - 12 = 12; but if max had chosen row 1 each time, the payoff would be (6 * -5) + (4 * 12) = -30 + 48 = 18. Since 18 > 12, max should choose row 1 this time. We record this by adding 1 to the corresponding element in the array on the right, giving Figure 4.

In the next move (min's turn, Figure 4), max has chosen row 0 five times and row 1 six times. If min had chosen column 0 every time, the payoff (to max) would be (5 * 4) + (6 * -5) = 20 - 30 = -10; but if min had chosen column 1 every time, the payoff would be (5 * -3) + (6 * 12) = -15 + 72 = 57. Since 57 > -10, min clearly prefers -10, and should choose column 0. We record this by adding 1 to column 0 in the array at the bottom, giving 7 (not shown).

This procedure eventually results in the "best" strategy for each player; the side and bottom arrays give the relative number of times each row and each column, respectively, can be played. These can be converted into percentages. In the above game, the results should be (approximately): max, [70.8%, 29.2%]; min, [62.5%, 37.5%].

Your assignment:

In this assignment you will need several classes, as follows:

Game

This class is the "coordinator"--it contains the main method that arranges for everything else to happen.

The main method should accept two numbers from the command line (or the BlueJ dialog box): the number of rows and the number of columns, and should check to make sure the numbers are "reasonable." It should then create the game array and fill it with random integers in the range -10 to 10. (You already have code to do this--probably two different versions. Use whichever version is "better", that is, easier to adapt.) Print out this array.

The main method should also create a maximizing player and a minimizing player. After that, the rest of the work can either be done in the main method, or it can use the same run() method trick as in the previous assignment.

Next, the two players have to be called alternately to do the above computations--say, 1000 times to get a reasonable approximation. Finally, the results should be converted into percentages and printed out.

MaxPlayer

This class represents a maximizing player. It should extend Player, and should have:

MinPlayer

This class represents a minimizing player. It also extends Player, and should be very similar to the MaxPlayer, except that its strategy is for columns, and it prefers smaller payoffs rather than larger ones.

Player

Methods for MaxPlayer and MinPlayer are very similar. Unfortunately, it is important to avoid, as much as possible, duplicating code.

Due date:

Your program is due before midnight, Thursday October 9. Zip up all files (.java, .class, and any extra files produced by BlueJ, such as .pkg and .pkh files), and submit via Blackboard. Please turn in one program for your partnership.