CIT 591 Assignment 4: Mixed-Strategy Games Fall 2003, David Matuszek

Purposes of this assignment:

• To give you more practice with classes, objects, methods, and arrays
• To introduce the notion of inheritance

General idea of the assignment:

In Game Theory, a two-person zero-sum game is a two-dimensional matrix. The maximizing player chooses a row, and at the same time, the minimizing player chooses a column. Neither player knows in advance what the other player is going to choose. The entry at the chosen row and column represents the payoff from the minimizing (column) player to the maximizing (row) player--this payoff may be negative.

In some games there is a "best" row and a "best" column that should always be chosen. In other games, such as in the following small example, the best move depends on how the opponent moves.

min

0
1
max 0
 4 -3 -5 12
1

If either player always plays the same row or column, the opposing player can take advantage of that fact. For example, if the row player always chooses the second row (hoping to get the 12), the column player can choose the first column (containing the -5). Hence, both players should use a mixed strategy, sometimes making one choice and sometimes another. However, the rows and columns shouldn't be chosen equally often; in this game, the row player should choose the first row more than twice as often as the second row, and the column player should more frequently choose the first column.

In this assignment we will not be playing the game, but instead we will compute how frequently the players should choose each row and each column, for any size array.

The math (sorry!):

We will have the two players take turns (again, we are not playing the game, we are figuring out the best mixed strategy.) To get started, the first player chooses any move--it doesn't matter which. For each following turn, the player whose turn it is chooses the best move based on how the opponent has been playing.

To implement this, we need two additional arrays; one to keep track of how many times the row player has chosen each row, and one to keep track of how many times the column player has chosen each column. In Figure 1 (below), we start by having the row player choose row 0, as indicated by the 1 in the array on the right.

min

 0 1

max
 0 1
 4 -3 -5 12
 1 0

 0 0

min

 0 1

max
 0 1
 4 -3 -5 12
 1 0

 0 1

min

 0 1

max
 0 1
 4 -3 -5 12
 5 5

 6 4

min

 0 1

max
 0 1
 4 -3 -5 12
 5 6

 6 4

Figure 1. Max arbitrarily chooses row 0. Figure 2. Min responds with column 1. Figure 3. Later in the simulation, it is Max's turn. Figure 4. Right after Max's turn, it is Min's turn.

Since the maximizing player has chosen row 0, the min player clearly should choose column 1, as shown in Figure 2.

In the general case, suppose (Figure 3) that after a while the min player has chosen column 0 six times and column 1 four times. If max had chosen row 0 every time, the payoff would be (6 * 4) + (4 * -3) = 24 - 12 = 12; but if max had chosen row 1 each time, the payoff would be (6 * -5) + (4 * 12) = -30 + 48 = 18. Since 18 > 12, max should choose row 1 this time. We record this by adding 1 to the corresponding element in the array on the right, giving Figure 4.

In the next move (min's turn, Figure 4), max has chosen row 0 five times and row 1 six times. If min had chosen column 0 every time, the payoff (to max) would be (5 * 4) + (6 * -5) = 20 - 30 = -10; but if min had chosen column 1 every time, the payoff would be (5 * -3) + (6 * 12) = -15 + 72 = 57. Since 57 > -10, min clearly prefers -10, and should choose column 0. We record this by adding 1 to column 0 in the array at the bottom, giving 7 (not shown).

This procedure eventually results in the "best" strategy for each player; the side and bottom arrays give the relative number of times each row and each column, respectively, can be played. These can be converted into percentages. In the above game, the results should be (approximately): max, [70.8%, 29.2%]; min, [62.5%, 37.5%].

In this assignment you will need several classes, as follows:

`Game`

This class is the "coordinator"--it contains the main method that arranges for everything else to happen.

The `main` method should accept two numbers from the command line (or the BlueJ dialog box): the number of rows and the number of columns, and should check to make sure the numbers are "reasonable." It should then create the game array and fill it with random integers in the range `-10` to `10`. (You already have code to do this--probably two different versions. Use whichever version is "better", that is, easier to adapt.) Print out this array.

The `main` method should also create a maximizing player and a minimizing player. After that, the rest of the work can either be done in the `main` method, or it can use the same` run() `method trick as in the previous assignment.

Next, the two players have to be called alternately to do the above computations--say, 1000 times to get a reasonable approximation. Finally, the results should be converted into percentages and printed out.

`MaxPlayer`

This class represents a maximizing player. It should extend `Player`, and should have:

• A constructor, to create the maximizing player. The constructor should take, as an argument, the array representing the game, and should save it in an instance `int[][]` variable.
• A `strategy` array, to keep track of the moves that this player has made. This should have the same number of elements as the game array has rows.
• An` chooseMove() `method, to do the arithmetic indicated above and add one to the correct location of the `strategy` array. [Suggestion: Get the structure of the program right before you attempt this part. You can start with a "stub" method that always chooses row 0, and fix this when everything else works.]
• A` getStrategy()` method that returns the current `strategy` array. This is needed by the minimizing player for its computations, and also at the end so that percentages can be computed.
• Any other methods that you find to be useful.

`MinPlayer`

This class represents a minimizing player. It also extends `Player`, and should be very similar to the `MaxPlayer`, except that its strategy is for columns, and it prefers smaller payoffs rather than larger ones.

`Player`

Methods for `MaxPlayer` and `MinPlayer` are very similar. Unfortunately, it is important to avoid, as much as possible, duplicating code.

• If `MaxPlayer` and `MinPlayer` have an identical method, move that method to the `Player` class let it be inherited by `MaxPlayer` and `MinPlayer`.
• If `MaxPlayer` and `MinPlayer` have very similar methods that can be made identical by adding a parameter, add that parameter and move the (now identical) method to the `Player` class.
• If otherwise different methods in `MaxPlayer` and `MinPlayer` have the same "chunk" of code in each, especially if that code is at all complex (more than a couple of lines), make it into a method and move that method to the `Player` class.
• Don't try to move everything to the Player class. There are good reasons for `MaxPlayer` and `MinPlayer` to be different.

Due date:

Your program is due before midnight, Thursday October 9. Zip up all files (`.java`, `.class`, and any extra files produced by BlueJ, such as `.pkg` and `.pkh` files), and submit via Blackboard. Please turn in one program for your partnership.