Beating Mastermind: Winning with the help of Donald Knuth

Ruby

Adam Forsyth

#programming-techniques

#computer-science

Beating Mastermind: Winning with the help of Donald Knuth

by Adam Forsyth

In the talk titled "Beating Mastermind: Winning with the help of Donald Knuth," Adam Forsyth presents a detailed exploration of the board game Mastermind through the lens of algorithm design, specifically focusing on the minimax algorithm. The session, part of RubyConf 2018, aims to help programmers convert theoretical algorithmic concepts from academic papers into practical code implementations. Forsyth begins with an overview of Mastermind, a code-breaking game with a rich history, where players attempt to guess a hidden code based on feedback. He demonstrates gameplay mechanics, illustrating how players progressively narrow down possibilities through logical deductions and strategic guessing.

Key Points Discussed:
- Introduction to Mastermind: Forsyth explains the rules of the game, where two players engage in code-making and code-breaking, providing visual examples of possible guesses and scores received.
- Historical Reference: He references a foundational paper by Donald Knuth from 1976, which asserts that a player can guarantee a win within five moves using optimized strategies.
- Minimax Algorithm Explanation: Forsyth elaborates on the minimax algorithm, describing its application for making optimal decisions in game theory, particularly in scenarios like Mastermind where there are competing objectives.
- Algorithm Implementation: The talk includes a walkthrough of implementing the minimax algorithm, which assesses potential guesses and their outcomes to minimize the maximum number of remaining possibilities.
- Interactive Demonstration: Forsyth shows a Ruby implementation of Mastermind that utilizes the minimax algorithm to automate gameplay, demonstrating how it adapts based on previous guesses and scores.

Significant Examples:
- Forsyth highlights specific example games to emphasize the iterative guessing process, where players must balance logical deductions with strategic risk-taking. He illustrates how an impossible guess can sometimes yield beneficial results in eliminating potential scores, thus leading to victory.

Conclusions and Takeaways:
- The central message is the importance of understanding the underlying algorithmic principles in effectively solving problems encountered in games like Mastermind.
- Forsyth encourages programmers to take academic concepts seriously by translating them into practical applications, thereby gaining confidence in both coding and algorithmic thinking.
- The session concludes with Forsyth inviting the audience to explore the source code available on GitHub post-talk, reinforcing the practice of sharing knowledge and fostering community learning among developers.

00:00:15.830 Hi everybody, I'm Adam and this talk is called "Beating Mastermind." I'm a Group Engineering Lead and Community Lead at Braintree Payments.

00:00:24.449 The code that I'm talking about here, the slides I'm going to show you, and my speaker notes will all be posted to GitHub after the talk.

00:00:36.980 So this talk is about a game called Mastermind, but really, it's about being comfortable with algorithms and comfortable with taking a paper that describes an algorithm and turning it into code.

00:00:48.120 I'm going to talk a little bit about one specific algorithm called minimax. So what is Mastermind? It is a two-player code-breaking board game. This picture you see on the screen is exactly what the one I had as a kid looked like.

00:00:56.190 One player, who is the code maker, makes up a pattern of colored pegs and keeps it hidden. You can see that behind a screen at the bottom left of that picture.

00:01:08.070 The other player is the code breaker, and they try to guess that code, getting feedback in the form of little black and white pegs after each guess. You can see a bunch of guesses in the center of the screen and black and white pegs next to them, indicating how close to correct that guess is.

00:01:20.100 It's based on a game that is at least a hundred years old called Bulls and Cows. No one knows quite how old that game is, but it dates back to at least the late 1800s.

00:01:39.390 Here's an example game I'm going to walk you through. The secret pattern is at the bottom of the screen, hidden from us. While playing the game, you wouldn’t actually know what it was anyway.

00:01:55.829 The guesses start from the top. In the very first guess, you can see there's no circles to the right, meaning it got no score.

00:02:01.799 So my guess of four red pegs means none of those guesses were correct. The second guess of four yellow pegs has two black circles to the right, indicating that two of the pegs in the correct answer are yellow pegs.

00:02:15.360 However, we don't have any information about where those pegs are yet because we guessed yellow in every position, so any two of those could be correct.

00:02:36.270 In the third guess, I switched to guessing two yellow and two green pegs. Receiving one black peg and one white peg as feedback means that one of those is of the correct color and in the correct position, while one is of the correct color but in the wrong position.

00:02:44.700 Since I already know that exactly two of the yellow pegs are right, both greens must be wrong. One of the yellows is in the correct position, while the other yellow is in the wrong position.

00:03:16.500 So in my fourth guess, I switched from green to blue pegs and moved one of the two yellow ones. Now, I received three black pegs, meaning that three of these are both correct and in the correct position. We know that both yellow pegs are now in the right position, and one blue peg is correct.

00:03:39.390 However, we don't know which one is correct; all we know is that one is right and one is wrong. Then we take out one of the blue pegs and add an orange peg, again getting back two black pegs. This indicates that the two yellow pegs are both correct in color and position.

00:03:54.000 The white one means that the blue peg is the right color but is in the wrong spot. We can conclude that the orange peg does not receive a peg, indicating that it must be wrong.

00:04:10.530 Finally, in my last guess, I guessed yellow, purple, yellow, blue and received four black pegs in response, meaning I was correct. Hopefully, that gives you an idea of how the game plays out.

00:04:21.090 You try to progressively get closer and closer to the right answer. Sometimes you might take a step backward and receive a worse score, but you gather more information that helps you achieve a better score next time.

00:04:36.120 The screenshot I showed is from Simon Tom's Portable Puzzle Collection, which is a wonderful collection of logic puzzles and games that are either downloadable or playable online.

00:04:44.500 You can find them and play them in JavaScript right in the browser. Seeing Mastermind on the page, it was called 'Guess Four' for trademark reasons, which inspired this work and this talk.

00:05:05.320 I encourage everyone to check this site out; there are plenty of fun games for programmers here, including a few you might recognize on the screen, along with many others that are unique and that I have never seen elsewhere.

00:05:16.800 While I initially wanted to show you more pictures of the game from this site because they are colorful and easy to read, I decided to showcase my implementation running in the console. This will allow you to see the kind of feedback I was getting as I wrote the code we're discussing.

00:05:30.580 Let's talk about the paper that I referenced in the talk's title. It was written by Donald Knuth and published in the Journal of Recreational Mathematics in 1976.

00:05:44.500 Donald Knuth is known as the father of the analysis of algorithms. He created the TeX typesetting system and wrote The Art of Computer Programming, a multi-volume series focused on algorithmic programming.

00:05:59.860 In this paper, he proposes a solution that will always win at Mastermind in five moves or less. The issue is that the answer is presented merely as a giant lookup table, referred to as Figure 1, and this isn't even the entire thing.

00:06:14.680 This is only the first half. While the paper includes a brief description of how to use this lookup table, it doesn't thoroughly explain how or why it works.

00:06:20.710 Looking at this doesn't give me a better understanding of Mastermind and its structure. It might be possible for a person to memorize this, but I wouldn't want to try doing so for two pages of it. We're here to understand what's going on.

00:06:33.760 We want to analyze this for programmers and know how it works. The paper basically has seven parts. The first part describes the game, the second explains how to use the lookup table, and the third part provides several examples.

00:06:45.400 The fourth part is particularly interesting, as it describes why the algorithm in the paper picked a certain guess in the previous example. I'm going to quote from the paper here, and I'll read it aloud: 'Incidentally, the fourth move 1462 in this example is really a brilliant stroke—a crucial play. If a win in five moves is going to be guaranteed, none of the seven code words satisfying the first three patterns could successfully be used as the fourth test pattern.'

00:07:13.599 A code word that cannot possibly win in four is required here in order to guarantee a victory in five.

00:07:18.970 What this is saying is that sometimes we must guess combinations of colors that we know are impossible in order to eliminate enough potential answers. This enables us to win within five moves: a code word which cannot possibly win in four is necessary in order to win in five. The number 1462 represents colors—not colors like blue, red, or green, but numerical representations—so 1462 is simply a representation of four pegs using the numbers one to six.

00:07:50.590 This idea fascinates me. I never consciously applied this strategy in Mastermind myself until I read this paper. The concept that an impossible move might actually be more beneficial than one that could be the correct answer is intriguing.

00:08:01.210 Next, I want to touch on Part 5 of the paper, which is a core component. I'm going to read it as well, so please try to follow along; it's a bit dense. 'Figure 1 was found by choosing at every stage a test pattern that minimizes the maximum number of remaining possibilities, overall 15 responses by the code maker. If this minimum can be achieved by a valid pattern making for black hits possible, a valid one should be used.'

00:08:53.480 The first such test pattern in numeric order was selected, and fortunately, this procedure guarantees a win in five moves. Let’s break this down.

00:09:08.570 There are 15 possible responses from the code maker associated with the potential guesses. You want to minimize the maximum remaining possibilities, which means choosing the score that leads you to the least likely scores. We don't want numerous leftover scores that are possible outcomes, as that wouldn't help in our next choice.

00:09:37.270 So, if there are 16 different possibilities and one score could be valid for 15 of those outcomes, that isn't the smart guess. We want to choose a guess where the very worst-case score minimizes possibilities remaining from 16 to approximately 8. Therefore, we want to prefer valid patterns over invalid ones when reasonable.

00:10:01.750 Just to summarize, if two different guesses are equally likely to leave us with scores reducing from 16 to 15, we prefer the guess that itself could be a right answer. This way, we aim for a potential win in four moves rather than five.

00:10:18.150 The second condition is to simply use the first test pattern in numeric order, preferring one, one, one, one over six, three, four, five. Using numbers allows us to represent our various combinations of colors, which is also a great way to break ties.

00:10:35.860 Now, we need to figure out how to actually implement the minimax algorithm.

00:10:50.850 The sustainable answer is to search for it on Google, like everyone here does millions of times in their work. In this case, the fifth result is a Wikipedia article titled 'Minimax.'

00:11:00.690 For those who might not love using Wikipedia, don't worry! One of the top answers will also be a Stack Overflow question, which can generally be very useful. Because I find the Wikipedia article somewhat more helpful, that's the one I'll reference.

00:11:14.680 If you're trying to understand this as a programmer, it’s important to feel comfortable using Google to find the right information. However, this doesn't mean you should do it blindly. Pay attention to what you read and seek clarification as necessary.

00:11:26.400 Before I introduce the exact definition of minimax, it’s worth noting that practically any combination of keywords relevant to this section of the paper can lead you to helpful explanations.

00:11:34.830 So I'm not going to go into the exact definition of minimax right now—you're welcome to refer to Wikipedia to find more details. As a programmer, I often find it more beneficial to see pseudocode.

00:11:52.390 So let's jump to that. The algorithm looks promising. Let's walk through it together.

00:12:03.760 The function minimax takes a node, a depth, and a boolean indicating whether you're the maximizing player.

00:12:09.593 The node represents our current location in the game and the possibilities we've already excluded, as well as the score from our last guess.

00:12:16.480 The depth shows how much further ahead in the game we want to look to see which guess is best. We'll discuss that last parameter in a moment.

00:12:27.930 To understand the depth: remember what it says in the paper; we want to evaluate which guess is best based on the scores we might receive in responses.

00:12:36.920 So, when we first call this function, we look one guess and one score ahead.

00:12:42.240 If the depth is zero or if the node is a terminal node, then return the heuristic value for that node.

00:12:51.640 If we've already examined ahead for the guess and answer, or we reach a terminal node (which we won't), we return the heuristic value.

00:13:00.430 This heuristic value stems from the minimum of the maximum possibilities. The goal here is to determine if the guess is valid or invalid.

00:13:06.870 For whatever numbers we used to represent our colors, we need to sort numerically to break the ties.

00:13:21.800 Now we will examine the middle portion of the code. Note that we identify who the maximizing player is and who the minimizing player is.

00:13:30.890 As the codebreakers, our goal is to minimize the number of remaining possibilities to get closer to the correct answer.

00:13:39.320 However, the code maker aims to maximize the number of remaining possibilities, ensuring as many unknowns as possible.

00:13:52.210 If we're the maximizing player, we go through the various children nodes of the current position to find the maximum score.

00:14:07.780 We compute the maximum by recursing one level deeper into the function, where the depth is now zero.

00:14:16.610 We've also defined 'negative infinity' as our baseline, comparing pairwise maximums of values in the recursive calculations.

00:14:26.240 By utilizing Ruby, or other programming languages, we can leverage built-in methods to simplify calculations. Move into the minimizing player aspect, which is just the opposite.

00:14:38.220 The minimizing player starts with positive infinity, considering minimums from recursing into the same function.

00:14:51.020 Simply put, we're taking minimum values as the minimizing player while recursively seeking the heuristic.

00:15:02.310 Throughout this recursion, you're returning the heuristic value based on the player currently represented.

00:15:09.380 To summarize the process: we initiate the function call with a depth of two while minimizing the maximum possibilities.

00:15:18.790 In each call designed for the maximizing player, we again determine maximum values and return those values back.

00:15:26.590 The minimizing player will then look at all the maximums and compute the overall minimum. Therefore, we compute the minimized maximum potential worst-case outcomes.

00:15:49.240 At this point, we’ve completed the hard work, having analyzed the paper and derived an understanding of the algorithm. We've walked through its basic implementation based on what we found.

00:16:01.760 So now it's time to translate our understanding into code. This can condense down to about 60 or 100 lines of code.

00:16:08.840 Some lines might look terse for slides, while others are more expanded to convey the message.

00:16:16.500 Don’t assume this presentation reflects the best coding practices; it’s optimized for visual clarity during this talk.

00:16:29.330 I’ll begin by demonstrating the basic logic needed for Mastermind. First, when we start a new game, we initialize our parameters. We need to define a randomly generated correct answer.

00:16:43.350 This will be concealed from the player; we pick a random number between one and six to represent the various colors and concatenate them to form a string, which could look something like '2134'.

00:17:03.779 We also will convert lists of all possible scores and answers, allowing us to reference valid options later.

00:17:11.430 Next, we create the main structure for the Mastermind game. As long as we haven’t made ten guesses and there are guesses left to make, we call the function to make a guess.

00:17:28.580 If the list of possible answers contains our last guess, suggesting it's a valid guess, increment the guess counter.

00:17:41.830 Next, we calculate the score for that guess. If the score equals four black pegs, our guess was correct. We'll print what the answer was and how many guesses it took.

00:17:52.150 It’s worth noting that the code currently doesn’t handle losing, as our algorithm ensures a win within five moves or less.

00:18:03.300 I promise I considered the scenario where I wouldn't guess 'bbbb'; should you implement this yourself, ensure you handle those situations.

00:18:19.980 Another essential part of the function is calculating the score resulting from a given combination of guess and answer.

00:18:27.640 I set up various variables to gather the score, including lists for the pegs that are incorrect in the guess and the ones from the answer.

00:18:46.590 By pairing each peg from the guess and answer, I can compare them. If they match, I score a black peg.

00:19:01.540 If not, I record any mismatches that occur.

00:19:12.370 I then review the list of incorrect pegs to see if any answer pegs match a color in a different spot, granting me a white peg.

00:19:33.740 For example, if my guess was blue in the first slot but the actual blue is in the third slot, I would score a white peg.

00:19:43.450 Finally, I return the calculated score. That encapsulates the game: evaluate the guess, calculate the score, and proceed until the answer is accurate.

00:19:54.550 Before we jump into the specific algorithm from the paper, I want to show you what this interactive game looks like.

00:20:18.270 This function allows guesses directly from the command line. As an example, I'll run the program, which will present the current game state.

00:20:38.300 After making a guess, the program prints how many guesses are left, and provide feedback about score from the last guess.

00:21:02.870 For instance, with one type of guess, I reveal how many of those colors match the colors from the answer, allowing me to revise my guess accordingly.

00:21:18.780 With each guess, I'll receive corresponding scores indicating how many colors match, which allows a progressive approach to deduction.

00:21:33.410 Unfortunately, in instances like these, we may get lucky and immediately find our correct guess. It's about using process of elimination effectively.

00:21:44.010 After a few tries, we can figure out the secret code, and while I may occasionally win in six, the algorithm consistently aims to win in five.

00:21:56.830 So now that we have a grasp of how this works, let’s switch back to the algorithm specified in the paper.

00:22:13.040 Before implementing the minimax algorithm, we'll initialize relevant structures, as building them up in advance will be more efficient.

00:22:24.110 Start by constructing a list of all possible answers, derived from the Cartesian product of the numbers 1 to 6 repeated four times.

00:22:37.730 This encompasses everything from '1 1 1 1' to '6 6 6 6'. Then, we'll establish a mapping between each possible guess and response to the score that corresponds to its combination.

00:23:05.880 Next, we will create a set from our list of possible answers for faster lookups. This is crucial as we’ll need to execute fast checks later.

00:23:19.950 Finally, let's detail the algorithm itself. We again create the make guess function and evaluate whether this is our first guess.

00:23:30.000 If we’ve already executed prior guesses, we will filter through our initial list of possible answers using the method to eliminate non-matching responses.

00:23:46.920 Every time we receive a valid score from previous guesses, we incrementally include possible answers by matching the scores.

00:24:06.270 The objective here is to minimize the potential guesses, passing only the scores we know align with possible answers.

00:24:19.220 The scores from various answers are grouped together. This way, we limit the number of remaining possibilities to refine accuracy.

00:24:33.400 Whenever computing scores, we’ll categorize each outcome. The aim is to investigate which guess will yield the least number of leftover scores.

00:24:50.660 Ultimately, from those gathered scores, we’ll seek to find their maximum; the worst case becomes our determining factor.

00:25:03.840 Taking this information further, we implement a mechanism to check if the guess we're evaluating is possible or impossible.

00:25:19.780 Again, we’ll prioritize valid guesses and minimize the maximum number of remaining possibilities.

00:25:37.930 Based on our heuristics, we’ll be equipped to navigate the system strategically, identifying possible next moves.

00:25:53.040 To summarize this function: we evaluate possible scores and filter them down based on previous guesses, allowing us to choose the optimal remaining outcomes.

00:26:07.440 After calculating our could-be scores, our mission lies in minimizing theoretical losses while maximizing potential wins.

00:26:28.080 Finally, if it’s our first guess, we will try '1 1 2 2'. Analysis revealed that this is the only guess that guarantees a win within five moves.

00:26:48.740 This brute force method was uncovered through comprehensive exploration of all potential first moves, and we determined this one works best for success.

00:27:06.230 With that understanding, calling the play method will automate the game process. Let’s switch back to the running program.

00:27:19.240 If I run the interactive Mastermind, you can play it manually, but if you run Mastermind, it will build up data structures for every guess and answer.

00:27:36.160 The design ensures speed. After initializing, it will start playing automatically, demonstrating its capability to win consistently within five moves.

00:27:48.940 Occasionally the program will run slow—typographical errors happen, so I may have to fix issues quickly.

00:28:02.160 With complete running and potential bug issues resolved, the code will be accessible for review post-talk, and now I'll accept any questions.

00:28:44.430 You.

RubyConf 2018