Pivorak Conf 2.0

Ruby Us Hagrid: Writing Harry Potter With Ruby

We all know that Ruby can give us superpowers, but can we use it to do something truly magical - write a brand new Harry Potter completely automatically?
It turns out that Ruby and the dark arts of Natural Language Processing are a match made in heaven! Using some basic NLP techniques, a dash of probability, and a few lines of simple Ruby code, we can create a virtual author capable of generating a very convincing Potter pastiche. And if the life of an author’s not for you, don’t worry. In the last part of the talk, we'll explore how we can apply what we've learned to everyday coding problems.

00:00:08.230 All right, okay, cool. Hi everyone! Yes, so as Anna said, the idea behind this talk is quite a crazy one when you first hear it. Can we use just regular old Ruby to write a brand new Harry Potter book completely automatically? When you hear this idea, maybe some questions immediately spring to mind. First of all, you might wonder, why would we want to do that? You might think, what would that actually look like if we wrote a program to generate a Harry Potter book in Ruby? What would the end result be? Then, the big question—and where we'll spend most of our time today—is how on earth we actually do that.
00:00:40.390 Okay, so first of all, on the question of why we would want to do this: there are probably two different kinds of people in the audience right now. First, there are people like me who love Harry Potter. Could I see a show of hands? Ah, I think more than half of you! For us, I think answering this question is really easy; we can just imagine a nice big beautiful pile of brand-new Harry Potter books. We could stay in the Wizarding World forever, which is our motivation. Now, hands up if you don’t like Harry Potter. Who’s willing to admit to it? Okay, whew! For those of you who don’t like Harry Potter, what you need to visualize instead is a nice big pile of money, because that’s, of course, what you’ll get if you can find a way to please people like me who love Harry Potter and will pay anything for more of it.
00:01:09.399 Here’s a test, by the way. If you’re undecided about which category you fit into, look at this picture and go by your reaction to it. That’ll sort you into the right category. So that's the 'why', but on a more serious note, there are some other reasons. One is that this will introduce us to some concepts around natural language processing (using machines to work with language), which is a rapidly developing field. I think it will also demonstrate a lot of the power of Ruby, and when we see the finished version of this program, we’ll realize how simple Ruby code can accomplish some pretty amazing tasks.
00:01:40.390 So what does this actually look like? I’m going to show you the final story that we’re going to generate as we build up this Ruby program. Spoiler alert for the end of the talk: this is what we’re going to end up with. (Reads aloud) 'Neville, Seamus, and Dean were muttering but did not speak when Harry had told Fudge, me weeks ago, that Malfoy was crying, actually crying, tears streaming down the sides of their heads. "They revealed a spell to make your bludger," said Harry, anger rising once more.' Now, it’s not perfect by any means, but it kind of has the feel of a Harry Potter story. It more or less makes sense, it’s more or less sensible English, and hopefully, when you see how little code it takes to generate a story like this, you’ll be impressed with what we can do with just a bit of effort.
00:02:59.730 So now, the big question is: how do we use Ruby to generate Harry Potter stories or any kind of language? This isn’t immediately obvious. I mean, if we take just one sentence from that story, how do I write a Ruby program to create a new sentence like this? Where do I start? It can be a bit intimidating at first. The first key idea, which sounds really obvious but is very important, is that when we're telling stories, we just focus on one word at a time. This is true anytime we generate language; usually, we focus on one word at a time. The second key idea is that we all have a great source of inspiration in our pockets right now—our smartphones. Our phones are capable of doing a lot of this already.
00:04:05.430 What do I mean by this? Well, many of you are probably familiar with the autocomplete feature. We usually use it to speed up our typing as we write, but there are some interesting things about autocomplete that we can leverage. For example, I’m just pressing the middle button of my phone and not doing anything else, and interestingly, it starts to generate a sentence without me doing anything. Now, the other interesting thing about this is that if you all try this on your phones—assuming your phones aren’t all in Ukrainian, which is likely—you will end up with different results. Your phones learn your style of speaking. They know what words you use and tailor suggestions based on your conversation style.
00:05:56.340 So how does my phone know how to predict or imitate the way I talk? Well, somewhere in the phone’s memory, it keeps track of the words I use and which words I use after them. For instance, it knows that after 'birthday' I have typed 'party' thirty times and 'cake' twenty times. It can then rank suggestions based on this memory. Now, what's interesting is that we can use a similar idea for any kind of language. If we take the Harry Potter books, we can pick a word like 'golden' and look at which words appear after it. For example, 'egg' is the most common word to follow 'golden', giving 'golden egg' thirteen times, while 'Golden Snitch' (a Quidditch ball) appears eleven times.
00:06:49.280 Now, if we created an imaginary Harry Potter phone and typed in the word 'golden,' our suggestions would likely be 'egg' and 'snitch,' and then perhaps 'plates,' and so on. A couple of terms I’ll continue using are 'head word' for the word we start from (here, 'golden') and 'continuations' for the words that can follow it. This big idea allows us to start generating stories, which we need to do in two steps. First, we need to get our program to learn how J.K. Rowling used language. Once we've learned her style, we can generate more language in that same style.
00:08:06.060 Basically, we need to replicate what I just showed with the word 'golden' for every word in the Harry Potter series. In total, there are 22,000 unique words used in the Harry Potter books. Now, looking at this slide, we can consider how to translate this idea into Ruby. It can be stored very nicely in a Ruby hash. For every individual head word like 'golden,' we have another hash, and that inner hash maps each potential continuation to how often it appears. This hash is the memory that lets our Ruby program learn.
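As a rough illustration of that nested hash, here is a minimal sketch. The variable name `stats` and the helper `suggestions` are assumptions made for this example, not names confirmed by the talk. The counts for 'egg' and 'snitch' are the ones quoted earlier; every other continuation is left out.

```ruby
# Outer hash: head word => inner hash.
# Inner hash: continuation => number of times it followed the head word.
stats = {
  "golden" => { "egg" => 13, "snitch" => 11 }
}

# Hypothetical helper: rank continuations the way an autocomplete bar would.
def suggestions(stats, head, limit = 3)
  stats.fetch(head, {})
       .sort_by { |_word, count| -count } # most frequent first
       .first(limit)
       .map(&:first)                      # keep the words, drop the counts
end

p suggestions(stats, "golden") # prints ["egg", "snitch"]
```

Looking up a head word is a single hash access, which is what makes this shape convenient for the generation step.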
00:09:01.759 Now, how do we write a program that generates this memory? The first thing we’ll need is the data: the Harry Potter books in some text format. There are various places to get this. If you want to try this yourself, I’ve included links where you can download these text files. We'll start off by loading the text file and doing a bit of data cleaning to prepare it. First, we want to perform something called tokenization, where we strip out punctuation and capitalization, which simplifies everything and makes our task easier.
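A minimal sketch of the tokenization step described above might look like this. The method name `tokenize` and the choice to keep apostrophes (so that "didn't" stays one token) are assumptions for illustration; the talk does not spell out its exact rules.

```ruby
# Hypothetical helper: lowercase the text, then keep only runs of letters
# and apostrophes. Everything else (punctuation, digits) acts as a separator.
def tokenize(text)
  text.downcase.scan(/[a-z']+/)
end

# In practice the input would come from a file, for example
# File.read("some_book.txt"), where the filename is a placeholder.
p tokenize("The Golden Snitch! Harry didn't see the golden egg.")
# prints ["the", "golden", "snitch", "harry", "didn't", "see", "the", "golden", "egg"]
```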
00:10:17.069 Once we’ve completed tokenization, building our memory is straightforward and can be done with just a few lines of Ruby code. I’ll break down what’s going on here and don’t worry about remembering it; I’ll provide a link at the end with all the code if you want to check it out. We’re going to use the 'each_cons' method built into Ruby, which will take each consecutive pair of words in our text. We’ll start with an empty hash to collect our statistics and will add a new entry for this combination of head word and subsequent word, then increment the count for how many times we’ve seen that pair of words appear.
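The steps just described can be sketched roughly as follows. `each_cons` is a standard Ruby method; the method name `build_stats` and the variable names are assumptions, and the real code from the talk may differ in detail.

```ruby
# Count how often each word follows each head word.
def build_stats(tokens)
  stats = {} # start with an empty hash
  tokens.each_cons(2) do |head, continuation|
    stats[head] ||= Hash.new(0)      # add an entry for a new head word
    stats[head][continuation] += 1   # count this head/continuation pair
  end
  stats
end

tokens = %w[the golden egg and the golden snitch and the golden egg]
stats = build_stats(tokens)
puts stats["golden"]["egg"]    # prints 2
puts stats["golden"]["snitch"] # prints 1
```

Using `Hash.new(0)` for the inner hash means a continuation seen for the first time starts from zero, so the increment needs no special case.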
00:11:58.780 This will allow us to keep track of every head word and how often each continuation appears. We can let this run on the entire Harry Potter text file, and we’ll be done! We will have learned the style of the Harry Potter books. So how do we use this to generate a new Harry Potter story? There are several different approaches we can take, but I’ll start with the simplest: the greedy algorithm. The greedy algorithm always picks the most likely continuation without any self-control.
00:12:49.510 For example, if we find that the word 'golden' is mostly followed by 'egg,' this algorithm will always choose 'egg' after 'golden.' Implementing this in Ruby is straightforward. We’ll take our stats, which is a nested hash with our head words, continuations, and counts, and we’ll select the continuation with the highest count. We will repeat this process to build out a complete story. However, there’s one tricky thing to note: while using the previous word to predict the next word in our story is great, we need something different for the very first word, which has no previous word. We can just select any word at random from the roughly 20,000 unique words used in Harry Potter.
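As a hedged sketch of the greedy choice, the following picks the continuation with the highest count. The method name `greedy_next` is hypothetical, and the sample counts are the ones quoted earlier for 'golden'.

```ruby
# Greedy choice: always return the most frequent continuation of `head`.
# Returns nil when the head word has no recorded continuations.
def greedy_next(stats, head)
  continuations = stats[head]
  return nil if continuations.nil? || continuations.empty?

  continuations.max_by { |_word, count| count }.first
end

stats = { "golden" => { "egg" => 13, "snitch" => 11 } }
puts greedy_next(stats, "golden") # prints egg

# The very first word has no head word, so one option is a random key:
first_word = stats.keys.sample
```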
00:14:20.390 Let’s put all of these pieces together and generate our first story. To recap, we’re going to pick a random word to start, then generate a 50-word story by continually picking the most likely next word based on our collected statistics, and finally, we’ll put it all together into a story. So, does this work? I ran the greedy algorithm and this is the first story I got: 'Oh no,' said Harry, '... and the door and the door and the door...' followed by more repetitions. This doesn’t seem great. Hopefully, my initial word choice was just unlucky. So, I tried again, and it gave me: 'Surreptitiously several of the door and the door and the tour...' This isn't working great either!
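The recap above can be sketched as a short loop. The method name `generate_greedy` and the toy counts are made up for illustration and are not real statistics from the books; they are chosen only to show how easily the greedy choice starts repeating itself.

```ruby
# Generate up to `length` words, always taking the most frequent continuation.
# `start` defaults to a random head word, as in the recap.
def generate_greedy(stats, length, start = stats.keys.sample)
  story = [start]
  (length - 1).times do
    continuations = stats[story.last]
    break if continuations.nil? # dead end: nothing ever followed this word

    story << continuations.max_by { |_word, count| count }.first
  end
  story.join(" ")
end

# Toy statistics (hypothetical counts).
stats = {
  "and"  => { "the" => 3 },
  "the"  => { "door" => 2, "tour" => 1 },
  "door" => { "and" => 2 }
}

puts generate_greedy(stats, 7, "and") # prints and the door and the door and
```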
00:15:49.760 What’s happening here? Basically, we keep getting stuck in a loop. After every word, based on the greedy algorithm, we always pick the most likely next word, leading us to phrases that repeat indefinitely. The most I can get is a 20-word story before falling into this loop. So, we can’t use the greedy algorithm; we need a different approach. Another possibility is the completely random approach, which we call the uniform random algorithm. With this, when we have a head word and its potential continuations, we will randomly select one without considering frequency, just picking one at random.
00:16:50.890 The first time I ran this algorithm, the story generated was: 'Debris from boys or accompany him bodily from Ron,' followed by some nonsensical phrases. It’s definitely a slight improvement over the previous attempt, but it’s still far from sounding like a Harry Potter story. The problem with the random algorithm is it doesn’t imitate the style we’re targeting. For instance, when looking at the word 'house,' 'elf' appears often, whereas 'prices' only shows up once, but a uniformly random selection treats both options equally, leading to poor imitation of the text's tone.
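A minimal sketch of the uniform random choice follows. The method name `uniform_next`, the optional `random:` argument, and the counts for 'house' are illustrative assumptions. Note that the counts are ignored entirely, which is exactly the weakness described above.

```ruby
# Uniform random choice: every continuation is equally likely,
# regardless of how often it actually followed the head word.
def uniform_next(stats, head, random: Random.new)
  stats.fetch(head, {}).keys.sample(random: random)
end

# Illustrative counts: 'elf' is common after 'house', 'prices' is rare.
stats = { "house" => { "elf" => 100, "prices" => 1 } }

# Either word comes back about half the time.
puts uniform_next(stats, "house")
```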
00:18:30.060 Instead, we want some element of randomness, but we also want it to be smarter. For example, rather than choosing fully at random, we can weight our words by how frequently they occur. If 'house' occurs 700 times, and 'elf' follows 'house' 100 times, while 'prices' follows only once, 'elf' should be 100 times more likely than 'prices.' So, we can implement a raffle-like system, where each possible next word has entries proportional to its frequency. Although this is slightly more complicated in Ruby, we can do it by adjusting our algorithm to use a weighted sample method, where frequency influences selection.
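One way to sketch the raffle is shown below. This is an assumption about how a weighted sample could be written, not necessarily the method used in the talk; the name `weighted_next` and the optional `random:` argument are hypothetical. Each continuation holds as many raffle tickets as its count, and one ticket is drawn.

```ruby
# Weighted random choice: a continuation seen 100 times holds 100 tickets,
# one seen once holds a single ticket.
def weighted_next(stats, head, random: Random.new)
  continuations = stats.fetch(head, {})
  return nil if continuations.empty?

  ticket = random.rand(continuations.values.sum) # integer in 0...total
  continuations.each do |word, count|
    return word if ticket < count # the ticket falls in this word's range

    ticket -= count               # otherwise skip past this word's tickets
  end
end

# Counts from the example above: 'elf' 100 times, 'prices' once.
stats = { "house" => { "elf" => 100, "prices" => 1 } }
puts weighted_next(stats, "house") # almost always elf, very occasionally prices
```

Walking the hash and subtracting counts avoids building a large array of repeated words, which matters once the counts get big.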
00:19:53.390 When I ran this, I got a more convincing story: 'Springing forward as though they had a bite of the Hippogriff, he staggered blindly, retorting Harry, some pumpkin tart.' It’s starting to resemble a Harry Potter story. However, we can take it one step further. Going back to our autocomplete example, we find that if we type a phrase like 'fish,' we get much more relevant suggestions. This tells us our phones look at more than just the last word we typed, and we can do the same. Instead of only looking at the last word, we can consider the last two or three words to improve our results. This method of using more of the previous context is known as a bigram or trigram model.
00:21:37.220 Switching to a bigram model allows us to consider every two-word phrase in the Harry Potter books. So, instead of having keys that are individual words, we now have keys that are two-element arrays representing two-word phrases. This approach requires more processing time and memory, as the number of keys grows from 20,000 individual words to 300,000 unique phrases, but the power of modern programming makes this feasible. Moreover, changing our existing code to support this is very simple. We can leverage the splat operator so our head can represent a variable number of elements, specifying how many words to consider.
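The splat-based change can be sketched as follows. The method name `build_stats` and the parameter `context_size` are assumptions for this example. With a splat in the block parameters, `head` collects every word except the last one in each window, so the keys become arrays.

```ruby
# Count continuations for every phrase of `context_size` consecutive words.
def build_stats(tokens, context_size)
  stats = {}
  tokens.each_cons(context_size + 1) do |*head, continuation|
    stats[head] ||= Hash.new(0)    # head is an array such as ["the", "golden"]
    stats[head][continuation] += 1
  end
  stats
end

tokens = %w[the golden egg and the golden snitch and the golden egg]
stats = build_stats(tokens, 2)

p stats[%w[the golden]].keys    # prints ["egg", "snitch"]
puts stats[%w[the golden]]["egg"] # prints 2
```

Ruby arrays hash by content, so a freshly built `["the", "golden"]` finds the same entry. With `context_size` set to 1 the keys are one-element arrays rather than plain strings.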
00:23:33.580 As we increase the number of words of context we’re using, the quality of our stories improves. This is particularly true if we use the trigram model instead of the bigram one. Ultimately, the final program we’ll be using can generate a brand new Harry Potter story with only about ten or twelve lines of Ruby code. None of those lines are overly complicated. Some people who have heard this talk before have asked whether it’s still relevant if they aren’t interested in Harry Potter or in finding creative uses of Ruby. This leads me to consider how developers tackle hard problems.
00:25:30.140 Initially, writing a Harry Potter story with Ruby seemed like a daunting task, but breaking it down led to realizations that made it quite manageable. Whether you’re a new or experienced programmer, addressing hard problems effectively can be challenging. For newer developers, the first tough problem can feel overwhelming, while experienced programmers may find it difficult to articulate their thought process. Reflecting on this, I’ve identified three key insights from tackling hard problems. First, breaking down complex issues is essential; determine what constitutes 'one bite' of the task. Second, examining failures closely often yields valuable insights: seek to understand why something did not work to inform future attempts.
00:27:38.230 Lastly, finding a good metaphor for a problem can be very beneficial. For example, picture building a timetable system for Hogwarts where no two classes a student is enrolled in occur at the same time. Solving this challenge directly can be tricky; however, thinking of it as a graph coloring problem, where dots represent classes and lines denote student overlaps, makes it more approachable. If you can draw a connection to your problem that allows exploration—something you can play around with—this can provide new perspectives on finding solutions. Always ask yourself, 'What is a relatable metaphor for this problem?'
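To make the graph coloring metaphor concrete, here is a small sketch under stated assumptions. Classes are the dots, two classes are joined by a line when some student takes both, and a "color" is a time slot. The method name `assign_slots`, the student and class names, and the simple greedy strategy are all illustrative; a greedy coloring avoids clashes but does not promise the fewest possible slots.

```ruby
require "set"

# enrolments: student name => list of classes that student takes.
# Returns: class name => time slot number, with no clash for any student.
def assign_slots(enrolments)
  # Build the conflict graph: an edge joins two classes that share a student.
  conflicts = Hash.new { |hash, klass| hash[klass] = Set.new }
  enrolments.each_value do |classes|
    classes.each { |klass| conflicts[klass] } # make sure every class is a node
    classes.combination(2) do |a, b|
      conflicts[a] << b
      conflicts[b] << a
    end
  end

  # Greedy coloring: give each class the lowest slot its neighbours don't use.
  slots = {}
  conflicts.keys.sort.each do |klass|
    taken = conflicts[klass].map { |other| slots[other] }.compact
    slots[klass] = (0...conflicts.size).find { |slot| !taken.include?(slot) }
  end
  slots
end

enrolments = {
  "Harry"    => %w[Potions Divination],
  "Hermione" => %w[Potions Arithmancy],
  "Ron"      => %w[Divination]
}

slots = assign_slots(enrolments)
puts slots["Potions"]    # prints 1
puts slots["Divination"] # prints 0
```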
00:29:53.070 That’s my recap of tackling hard problems and some key insights that emerged from this talk. Additionally, I have made slides and code available online. Links to download Harry Potter books in text format are also provided if you'd like to experiment with this yourself. Thank you very much for your attention!
00:31:52.470 Now we have time for a couple of questions. Please raise your hands, say your name, and then ask your question. (Some audience members take photos.) Yes, you will get copies of the slides on YouTube. Audience question: The Ruby community is quite large and enthusiastic about Test-Driven Development (TDD). Have you considered how to test the quality of the generated results? Answer: That’s a really good question! The benefits of using tests in this kind of NLP application in Ruby cannot be overstated, but establishing automated tests to measure quality is complex. In the industry, we often rely on metrics like mean sentiment score or simply ask users for subjective evaluations of the results. Nevertheless, various components can and should be tested to ensure integrity and performance.