00:00:15.830
Hi everybody, I'm Adam and this talk is called "Beating Mastermind." I'm a Group Engineering Lead and Community Lead at Braintree Payments.
00:00:24.449
The code that I'm talking about here, the slides I'm going to show you, and my speaker notes will all be posted to GitHub after the talk.
00:00:36.980
So this talk is about a game called Mastermind, but really, it's about being comfortable with algorithms and comfortable with taking a paper that describes an algorithm and turning it into code.
00:00:48.120
I'm going to talk a little bit about one specific algorithm called minimax. So what is Mastermind? It is a two-player code-breaking board game. This picture you see on the screen is exactly what the one I had as a kid looked like.
00:00:56.190
One player, who is the code maker, makes up a pattern of colored pegs and keeps it hidden. You can see that behind a screen at the bottom left of that picture.
00:01:08.070
The other player is the code breaker, and they try to guess that code, getting feedback in the form of little black and white pegs after each guess. You can see a bunch of guesses in the center of the screen and black and white pegs next to them, indicating how close to correct that guess is.
00:01:20.100
It's based on a game that is at least a hundred years old called Bulls and Cows. No one knows quite how old that game is, but it dates back to at least the late 1800s.
00:01:39.390
Here's an example game I'm going to walk you through. The secret pattern is at the bottom of the screen, hidden from us. While playing the game, you wouldn’t actually know what it was anyway.
00:01:55.829
The guesses start from the top. In the very first guess, you can see there's no circles to the right, meaning it got no score.
00:02:01.799
So my guess of four red pegs means none of those guesses were correct. The second guess of four yellow pegs has two black circles to the right, indicating that two of the pegs in the correct answer are yellow pegs.
00:02:15.360
However, we don't have any information about where those pegs are yet because we guessed yellow in every position, so any two of those could be correct.
00:02:36.270
In the third guess, I switched to guessing two yellow and two green pegs. Receiving one black peg and one white peg as feedback means that one of those is of the correct color and in the correct position, while one is of the correct color but in the wrong position.
00:02:44.700
Since I already know that exactly two of the yellow pegs are right, both greens must be wrong. One of the yellows is in the correct position, while the other yellow is in the wrong position.
00:03:16.500
So in my fourth guess, I switched from green to blue pegs and moved one of the two yellow ones. Now, I received three black pegs, meaning that three of these are both correct and in the correct position. We know that both yellow pegs are now in the right position, and one blue peg is correct.
00:03:39.390
However, we don't know which one is correct; all we know is that one is right and one is wrong. Then we take out one of the blue pegs and add an orange peg, again getting back two black pegs. This indicates that the two yellow pegs are both correct in color and position.
00:03:54.000
The white one means that the blue peg is the right color but is in the wrong spot. We can conclude that the orange peg does not receive a peg, indicating that it must be wrong.
00:04:10.530
Finally, in my last guess, I guessed yellow, purple, yellow, blue and received four black pegs in response, meaning I was correct. Hopefully, that gives you an idea of how the game plays out.
00:04:21.090
You try to progressively get closer and closer to the right answer. Sometimes you might take a step backward and receive a worse score, but you gather more information that helps you achieve a better score next time.
00:04:36.120
The screenshot I showed is from Simon Tom's Portable Puzzle Collection, which is a wonderful collection of logic puzzles and games that are either downloadable or playable online.
00:04:44.500
You can find them and play them in JavaScript right in the browser. Seeing Mastermind on the page, it was called 'Guess Four' for trademark reasons, which inspired this work and this talk.
00:05:05.320
I encourage everyone to check this site out; there are plenty of fun games for programmers here, including a few you might recognize on the screen, along with many others that are unique and that I have never seen elsewhere.
00:05:16.800
While I initially wanted to show you more pictures of the game from this site because they are colorful and easy to read, I decided to showcase my implementation running in the console. This will allow you to see the kind of feedback I was getting as I wrote the code we're discussing.
00:05:30.580
Let's talk about the paper that I referenced in the talk's title. It was written by Donald Knuth and published in the Journal of Recreational Mathematics in 1976.
00:05:44.500
Donald Knuth is known as the father of the analysis of algorithms. He created the TeX typesetting system and wrote The Art of Computer Programming, a multi-volume series focused on algorithmic programming.
00:05:59.860
In this paper, he proposes a solution that will always win at Mastermind in five moves or less. The issue is that the answer is presented merely as a giant lookup table, referred to as Figure 1, and this isn't even the entire thing.
00:06:14.680
This is only the first half. While the paper includes a brief description of how to use this lookup table, it doesn't thoroughly explain how or why it works.
00:06:20.710
Looking at this doesn't give me a better understanding of Mastermind and its structure. It might be possible for a person to memorize this, but I wouldn't want to try doing so for two pages of it. We're here to understand what's going on.
00:06:33.760
We want to analyze this for programmers and know how it works. The paper basically has seven parts. The first part describes the game, the second explains how to use the lookup table, and the third part provides several examples.
00:06:45.400
The fourth part is particularly interesting, as it describes why the algorithm in the paper picked a certain guess in the previous example. I'm going to quote from the paper here, and I'll read it aloud: 'Incidentally, the fourth move 1462 in this example is really a brilliant stroke—a crucial play. If a win in five moves is going to be guaranteed, none of the seven code words satisfying the first three patterns could successfully be used as the fourth test pattern.'
00:07:13.599
A code word that cannot possibly win in four is required here in order to guarantee a victory in five.
00:07:18.970
What this is saying is that sometimes we must guess combinations of colors that we know are impossible in order to eliminate enough potential answers. This enables us to win within five moves: a code word which cannot possibly win in four is necessary in order to win in five. The number 1462 represents colors—not colors like blue, red, or green, but numerical representations—so 1462 is simply a representation of four pegs using the numbers one to six.
00:07:50.590
This idea fascinates me. I never consciously applied this strategy in Mastermind myself until I read this paper. The concept that an impossible move might actually be more beneficial than one that could be the correct answer is intriguing.
00:08:01.210
Next, I want to touch on Part 5 of the paper, which is a core component. I'm going to read it as well, so please try to follow along; it's a bit dense. 'Figure 1 was found by choosing at every stage a test pattern that minimizes the maximum number of remaining possibilities, overall 15 responses by the code maker. If this minimum can be achieved by a valid pattern making for black hits possible, a valid one should be used.'
00:08:53.480
The first such test pattern in numeric order was selected, and fortunately, this procedure guarantees a win in five moves. Let’s break this down.
00:09:08.570
There are 15 possible responses from the code maker associated with the potential guesses. You want to minimize the maximum remaining possibilities, which means choosing the score that leads you to the least likely scores. We don't want numerous leftover scores that are possible outcomes, as that wouldn't help in our next choice.
00:09:37.270
So, if there are 16 different possibilities and one score could be valid for 15 of those outcomes, that isn't the smart guess. We want to choose a guess where the very worst-case score minimizes possibilities remaining from 16 to approximately 8. Therefore, we want to prefer valid patterns over invalid ones when reasonable.
00:10:01.750
Just to summarize, if two different guesses are equally likely to leave us with scores reducing from 16 to 15, we prefer the guess that itself could be a right answer. This way, we aim for a potential win in four moves rather than five.
00:10:18.150
The second condition is to simply use the first test pattern in numeric order, preferring one, one, one, one over six, three, four, five. Using numbers allows us to represent our various combinations of colors, which is also a great way to break ties.
00:10:35.860
Now, we need to figure out how to actually implement the minimax algorithm.
00:10:50.850
The sustainable answer is to search for it on Google, like everyone here does millions of times in their work. In this case, the fifth result is a Wikipedia article titled 'Minimax.'
00:11:00.690
For those who might not love using Wikipedia, don't worry! One of the top answers will also be a Stack Overflow question, which can generally be very useful. Because I find the Wikipedia article somewhat more helpful, that's the one I'll reference.
00:11:14.680
If you're trying to understand this as a programmer, it’s important to feel comfortable using Google to find the right information. However, this doesn't mean you should do it blindly. Pay attention to what you read and seek clarification as necessary.
00:11:26.400
Before I introduce the exact definition of minimax, it’s worth noting that practically any combination of keywords relevant to this section of the paper can lead you to helpful explanations.
00:11:34.830
So I'm not going to go into the exact definition of minimax right now—you're welcome to refer to Wikipedia to find more details. As a programmer, I often find it more beneficial to see pseudocode.
00:11:52.390
So let's jump to that. The algorithm looks promising. Let's walk through it together.
00:12:03.760
The function minimax takes a node, a depth, and a boolean indicating whether you're the maximizing player.
00:12:09.593
The node represents our current location in the game and the possibilities we've already excluded, as well as the score from our last guess.
00:12:16.480
The depth shows how much further ahead in the game we want to look to see which guess is best. We'll discuss that last parameter in a moment.
00:12:27.930
To understand the depth: remember what it says in the paper; we want to evaluate which guess is best based on the scores we might receive in responses.
00:12:36.920
So, when we first call this function, we look one guess and one score ahead.
00:12:42.240
If the depth is zero or if the node is a terminal node, then return the heuristic value for that node.
00:12:51.640
If we've already examined ahead for the guess and answer, or we reach a terminal node (which we won't), we return the heuristic value.
00:13:00.430
This heuristic value stems from the minimum of the maximum possibilities. The goal here is to determine if the guess is valid or invalid.
00:13:06.870
For whatever numbers we used to represent our colors, we need to sort numerically to break the ties.
00:13:21.800
Now we will examine the middle portion of the code. Note that we identify who the maximizing player is and who the minimizing player is.
00:13:30.890
As the codebreakers, our goal is to minimize the number of remaining possibilities to get closer to the correct answer.
00:13:39.320
However, the code maker aims to maximize the number of remaining possibilities, ensuring as many unknowns as possible.
00:13:52.210
If we're the maximizing player, we go through the various children nodes of the current position to find the maximum score.
00:14:07.780
We compute the maximum by recursing one level deeper into the function, where the depth is now zero.
00:14:16.610
We've also defined 'negative infinity' as our baseline, comparing pairwise maximums of values in the recursive calculations.
00:14:26.240
By utilizing Ruby, or other programming languages, we can leverage built-in methods to simplify calculations. Move into the minimizing player aspect, which is just the opposite.
00:14:38.220
The minimizing player starts with positive infinity, considering minimums from recursing into the same function.
00:14:51.020
Simply put, we're taking minimum values as the minimizing player while recursively seeking the heuristic.
00:15:02.310
Throughout this recursion, you're returning the heuristic value based on the player currently represented.
00:15:09.380
To summarize the process: we initiate the function call with a depth of two while minimizing the maximum possibilities.
00:15:18.790
In each call designed for the maximizing player, we again determine maximum values and return those values back.
00:15:26.590
The minimizing player will then look at all the maximums and compute the overall minimum. Therefore, we compute the minimized maximum potential worst-case outcomes.
00:15:49.240
At this point, we’ve completed the hard work, having analyzed the paper and derived an understanding of the algorithm. We've walked through its basic implementation based on what we found.
00:16:01.760
So now it's time to translate our understanding into code. This can condense down to about 60 or 100 lines of code.
00:16:08.840
Some lines might look terse for slides, while others are more expanded to convey the message.
00:16:16.500
Don’t assume this presentation reflects the best coding practices; it’s optimized for visual clarity during this talk.
00:16:29.330
I’ll begin by demonstrating the basic logic needed for Mastermind. First, when we start a new game, we initialize our parameters. We need to define a randomly generated correct answer.
00:16:43.350
This will be concealed from the player; we pick a random number between one and six to represent the various colors and concatenate them to form a string, which could look something like '2134'.
00:17:03.779
We also will convert lists of all possible scores and answers, allowing us to reference valid options later.
00:17:11.430
Next, we create the main structure for the Mastermind game. As long as we haven’t made ten guesses and there are guesses left to make, we call the function to make a guess.
00:17:28.580
If the list of possible answers contains our last guess, suggesting it's a valid guess, increment the guess counter.
00:17:41.830
Next, we calculate the score for that guess. If the score equals four black pegs, our guess was correct. We'll print what the answer was and how many guesses it took.
00:17:52.150
It’s worth noting that the code currently doesn’t handle losing, as our algorithm ensures a win within five moves or less.
00:18:03.300
I promise I considered the scenario where I wouldn't guess 'bbbb'; should you implement this yourself, ensure you handle those situations.
00:18:19.980
Another essential part of the function is calculating the score resulting from a given combination of guess and answer.
00:18:27.640
I set up various variables to gather the score, including lists for the pegs that are incorrect in the guess and the ones from the answer.
00:18:46.590
By pairing each peg from the guess and answer, I can compare them. If they match, I score a black peg.
00:19:01.540
If not, I record any mismatches that occur.
00:19:12.370
I then review the list of incorrect pegs to see if any answer pegs match a color in a different spot, granting me a white peg.
00:19:33.740
For example, if my guess was blue in the first slot but the actual blue is in the third slot, I would score a white peg.
00:19:43.450
Finally, I return the calculated score. That encapsulates the game: evaluate the guess, calculate the score, and proceed until the answer is accurate.
00:19:54.550
Before we jump into the specific algorithm from the paper, I want to show you what this interactive game looks like.
00:20:18.270
This function allows guesses directly from the command line. As an example, I'll run the program, which will present the current game state.
00:20:38.300
After making a guess, the program prints how many guesses are left, and provide feedback about score from the last guess.
00:21:02.870
For instance, with one type of guess, I reveal how many of those colors match the colors from the answer, allowing me to revise my guess accordingly.
00:21:18.780
With each guess, I'll receive corresponding scores indicating how many colors match, which allows a progressive approach to deduction.
00:21:33.410
Unfortunately, in instances like these, we may get lucky and immediately find our correct guess. It's about using process of elimination effectively.
00:21:44.010
After a few tries, we can figure out the secret code, and while I may occasionally win in six, the algorithm consistently aims to win in five.
00:21:56.830
So now that we have a grasp of how this works, let’s switch back to the algorithm specified in the paper.
00:22:13.040
Before implementing the minimax algorithm, we'll initialize relevant structures, as building them up in advance will be more efficient.
00:22:24.110
Start by constructing a list of all possible answers, derived from the Cartesian product of the numbers 1 to 6 repeated four times.
00:22:37.730
This encompasses everything from '1 1 1 1' to '6 6 6 6'. Then, we'll establish a mapping between each possible guess and response to the score that corresponds to its combination.
00:23:05.880
Next, we will create a set from our list of possible answers for faster lookups. This is crucial as we’ll need to execute fast checks later.
00:23:19.950
Finally, let's detail the algorithm itself. We again create the make guess function and evaluate whether this is our first guess.
00:23:30.000
If we’ve already executed prior guesses, we will filter through our initial list of possible answers using the method to eliminate non-matching responses.
00:23:46.920
Every time we receive a valid score from previous guesses, we incrementally include possible answers by matching the scores.
00:24:06.270
The objective here is to minimize the potential guesses, passing only the scores we know align with possible answers.
00:24:19.220
The scores from various answers are grouped together. This way, we limit the number of remaining possibilities to refine accuracy.
00:24:33.400
Whenever computing scores, we’ll categorize each outcome. The aim is to investigate which guess will yield the least number of leftover scores.
00:24:50.660
Ultimately, from those gathered scores, we’ll seek to find their maximum; the worst case becomes our determining factor.
00:25:03.840
Taking this information further, we implement a mechanism to check if the guess we're evaluating is possible or impossible.
00:25:19.780
Again, we’ll prioritize valid guesses and minimize the maximum number of remaining possibilities.
00:25:37.930
Based on our heuristics, we’ll be equipped to navigate the system strategically, identifying possible next moves.
00:25:53.040
To summarize this function: we evaluate possible scores and filter them down based on previous guesses, allowing us to choose the optimal remaining outcomes.
00:26:07.440
After calculating our could-be scores, our mission lies in minimizing theoretical losses while maximizing potential wins.
00:26:28.080
Finally, if it’s our first guess, we will try '1 1 2 2'. Analysis revealed that this is the only guess that guarantees a win within five moves.
00:26:48.740
This brute force method was uncovered through comprehensive exploration of all potential first moves, and we determined this one works best for success.
00:27:06.230
With that understanding, calling the play method will automate the game process. Let’s switch back to the running program.
00:27:19.240
If I run the interactive Mastermind, you can play it manually, but if you run Mastermind, it will build up data structures for every guess and answer.
00:27:36.160
The design ensures speed. After initializing, it will start playing automatically, demonstrating its capability to win consistently within five moves.
00:27:48.940
Occasionally the program will run slow—typographical errors happen, so I may have to fix issues quickly.
00:28:02.160
With complete running and potential bug issues resolved, the code will be accessible for review post-talk, and now I'll accept any questions.
00:28:44.430
You.