| CIT 591 Assignment 4: Mixed-Strategy Games Fall 2003, David Matuszek |
Purposes of this assignment:
General idea of the assignment:
In Game Theory, a two-person zero-sum game is a two-dimensional matrix. The maximizing player chooses a row, and at the same time, the minimizing player chooses a column. Neither player knows in advance what the other player is going to choose. The entry at the chosen row and column represents the payoff from the minimizing (column) player to the maximizing (row) player--this payoff may be negative.
In some games there is a "best" row and a "best" column that should always be chosen. In other games, such as in the following small example, the best move depends on how the opponent moves.
|
min
|
|||||||
|
0
|
1
|
||||||
| max | 0 |
|
|||||
| 1 | |||||||
If either player always plays the same row or column, the opposing player can take advantage of that fact. For example, if the row player always chooses the second row (hoping to get the 12), the column player can choose the first column (containing the -5). Hence, both players should use a mixed strategy, sometimes making one choice and sometimes another. However, the rows and columns shouldn't be chosen equally often; in this game, the row player should choose the first row more than twice as often as the second row, and the column player should more frequently choose the first column.
In this assignment we will not be playing the game, but instead we will compute how frequently the players should choose each row and each column, for any size array.
The math (sorry!):
We will have the two players take turns (again, we are not playing the game, we are figuring out the best mixed strategy.) To get started, the first player chooses any move--it doesn't matter which. For each following turn, the player whose turn it is chooses the best move based on how the opponent has been playing.
To implement this, we need two additional arrays; one to keep track of how many times the row player has chosen each row, and one to keep track of how many times the column player has chosen each column. In Figure 1 (below), we start by having the row player choose row 0, as indicated by the 1 in the array on the right.
|
|
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Figure 1. Max arbitrarily chooses row 0. | Figure 2. Min responds with column 1. | Figure 3. Later in the simulation, it is Max's turn. | Figure 4. Right after Max's turn, it is Min's turn. |
Since the maximizing player has chosen row 0, the min player clearly should choose column 1, as shown in Figure 2.
In the general case, suppose (Figure 3) that after a while the min player has
chosen column 0 six times and column 1 four times. If max had chosen row 0 every
time, the payoff would be
In the next move (min's turn, Figure 4), max has chosen row 0 five times and
row 1 six times. If min had chosen column 0 every time, the payoff (to max)
would be
This procedure eventually results in the "best" strategy for each player; the side and bottom arrays give the relative number of times each row and each column, respectively, can be played. These can be converted into percentages. In the above game, the results should be (approximately): max, [70.8%, 29.2%]; min, [62.5%, 37.5%].
Your assignment:
In this assignment you will need several classes, as follows:
Game
This class is the "coordinator"--it contains the main method that arranges for everything else to happen.
The
mainmethod should accept two numbers from the command line (or the BlueJ dialog box): the number of rows and the number of columns, and should check to make sure the numbers are "reasonable." It should then create the game array and fill it with random integers in the range-10to10. (You already have code to do this--probably two different versions. Use whichever version is "better", that is, easier to adapt.) Print out this array.The
mainmethod should also create a maximizing player and a minimizing player. After that, the rest of the work can either be done in themainmethod, or it can use the samerun()method trick as in the previous assignment.Next, the two players have to be called alternately to do the above computations--say, 1000 times to get a reasonable approximation. Finally, the results should be converted into percentages and printed out.
MaxPlayer
This class represents a maximizing player. It should extend
Player, and should have:
- A constructor, to create the maximizing player. The constructor should take, as an argument, the array representing the game, and should save it in an instance
int[][]variable.- A
strategyarray, to keep track of the moves that this player has made. This should have the same number of elements as the game array has rows.- An
chooseMove()method, to do the arithmetic indicated above and add one to the correct location of thestrategyarray. [Suggestion: Get the structure of the program right before you attempt this part. You can start with a "stub" method that always chooses row 0, and fix this when everything else works.]- A
getStrategy()method that returns the currentstrategyarray. This is needed by the minimizing player for its computations, and also at the end so that percentages can be computed.- Any other methods that you find to be useful.
MinPlayer
This class represents a minimizing player. It also extends
Player, and should be very similar to theMaxPlayer, except that its strategy is for columns, and it prefers smaller payoffs rather than larger ones.
Player
Methods for
MaxPlayerandMinPlayerare very similar. Unfortunately, it is important to avoid, as much as possible, duplicating code.
MaxPlayer and MinPlayer have an identical method,
move that method to the Player class let it be inherited
by MaxPlayer and MinPlayer.MaxPlayer and MinPlayer have very similar methods
that can be made identical by adding a parameter, add that parameter and move
the (now identical) method to the Player class.MaxPlayer and MinPlayer
have the same "chunk" of code in each, especially if that code is
at all complex (more than a couple of lines), make it into a method and move
that method to the Player class.MaxPlayer and MinPlayer to be different.Due date:
Your program is due before midnight, Thursday October 9. Zip up all files (.java,
.class, and any extra files produced by BlueJ, such as .pkg
and .pkh files), and submit via Blackboard. Please turn in one
program for your partnership.