Let's Write an Interpreter!

by Ryan Davis

In the video titled "Let's Write an Interpreter!", speaker Ryan Davis delivers a fast-paced presentation on creating a Turing-complete interpreter in just 30 minutes using Ruby, Ruby Parser, and S-expression processing. The session's tone is both humorous and informative, aimed at challenging the audience's understanding while providing practical coding insights.

Key Points:

Expectations Set: Ryan warns attendees that he has 168 slides planned for the next 30 minutes, setting a brisk delivery pace with an expectation of some cognitive overload.
Why Write an Interpreter?: Ryan emphasizes the enjoyment and learning value of writing an interpreter, asking the audience to consider why they might want to undertake such a project outside of academic settings.
Ubi Interpreter: The core of the talk revolves around constructing the Ubi interpreter, which supports branching logic, function definitions, and local state management.
Parsing: He describes using Ruby Parser to generate an Abstract Syntax Tree (AST) for input code, explaining the concept of S-expressions (sexp) briefly, while opting not to delve deep into parsing mechanics, humorously deeming parsing 'boring' compared to semantics.
Runtime and Evaluation: Ryan explains the evaluation process through a test-driven methodology, illustrating how to handle operations like addition and cognitive processes involved in interpreting code.
Adding Complexity: As he progresses, Ryan introduces conditionals, local variables, and function recursion, demonstrating each with relevant code examples and testing practices.
Final Implementation: The culmination of his work leads to a functional interpreter capable of executing basic arithmetic, conditionals, loops, and user-defined functions, encapsulated in concise code.

Conclusions:

Ryan reflects on the small implementation size of the interpreter (approximately 200 lines of code) and suggests potential expansions for further study, such as richer data types and different function calling conventions.
He shares a personal recommendation for foundational programming resources, encouraging the audience to explore texts like "Structure and Interpretation of Computer Programs" and "The Little Schemer".
In closing, he humorously advocates for 'Seattle style' coding conventions, emphasizing clarity and minimalism in syntax throughout the development process.

Overall, this engaging presentation not only illustrates the mechanics of building an interpreter but also highlights the joy inherent in programming and problem-solving.

00:00:20.160 Our last speaker is Ryan Davis. Ryan started doing Ruby back in 2000, and he is the founder of Seattle RB, the first Ruby Brigade, which was established in 2001. Of course, I like him because he used to do Smalltalk and has excellent taste in restaurants. Ryan asked to be the last talk because he wants to be able to break your brain, so you won't be able to use it for anything more for the rest of the night. Stop drinking, except for drinking. So, let's welcome Ryan!

00:00:36.000 Thank you very much. I'm going to be talking today about writing an interpreter. First, I want to say thank you for having me, and thank you all for sticking it out. It's been an incredible conference with massive signal to noise, and even the noise was fantastic. The Rubik's Cube thing was just mind-boggling. My brain certainly hurts, and I hope yours does too because I'm only going to make it worse. Thank you very much to Josh, Leah, Jim, and all the Goku helpers for putting in tons of hours and effort to make this conference work.

00:00:56.000 You guys just don't know how much blood, sweat, and tears goes into something like this. And thanks to all the sponsors for helping to make this possible. Let's set some expectations because I'm a big fan of clear communication. I have 168 slides in 30 minutes; that's about 5.6 slides per minute. There will be loads of content, almost all of it will be code, and I'm hoping to hurt your brain. I'm going really, really fast. Much of what I go over is a pattern that's going to repeat many times, so you don't have to understand the first thing to understand the third. Bear with me; I'll get you through this.

00:01:19.040 I’m really excited about giving this talk, and I think I can give it really fast. So, what I’m going to give my talk on is how to solve a Rubik's Cube in 27 minutes and do an interpreter in three. Not really! So the first obvious question should be: why on earth would you want to write an interpreter if you’re long out of college? It’s because I'm a pervert—now, my minority perversions have little to do with this talk. It's simply because we can.

00:01:30.000 Quick show of hands: how many of you have a computer science background where you took formal language theory, or otherwise wrote interpreters in school? It's about a third of who's left in here, so this talk is not necessarily for you. However, I may be introducing tools that you're unfamiliar with. We’re going to be writing the Ubi interpreter, which will be Y combinator compatible, or at least that’s the theory. It will allow for branching logic, while loops, local state, and functions. This is the most complex piece of code that I want you to be able to understand. Anything longer than this, you can either gloss over, or I’m going to be going into it in detail.

00:02:00.000 Specifically, this has local variables, what are called primitive function calls (stuff that’s built into the language itself), and then user function calls that can be recursive. This is the roadmap for what we’re going to cover: we’re going to be going over really basic parsing and creating a runtime environment. On top of that, we’ll build those three layers—loops, variables, and conditionals—and, on top of that, functions for our five-layer burrito.

00:02:35.440 So, really quickly about parsing: I use Ruby Parser to do all my parsing for Ubi because it is a subset of Ruby to begin with. That is the bottom layer of our burrito. Ruby Parser takes source like this and conceptually outputs something that looks like this, which is an abstract syntax tree. It is confusing at first, but eventually, you’ll see how the components are structured and how they relate to each other.

00:03:05.840 We'll pull apart clusters of subtrees one piece at a time until we can start to see how they're structured. Eventually, you will see something like this: we've got one, two, three, four, five boxes—with the if condition, the true case, the false case, and the false case which has the sum of two recursive calls in it. The code that represents this looks just like this: it’s got one, two, three, four, five boxes—the if, the true, the false, and the sum of two recursive calls that map one-to-one between the boxes and the code.

00:03:35.440 Let’s do some vocabulary real quick. This is an S-expression, henceforth known as ‘sexp.’ They have a head (which, in our case, is always going to be the node type of the thing we’ve parsed) and it has a rest, which can contain sub-sexps. For the whispers out there, that’s the car and the cdr, but we’re not going to worry about that too much. We get all of this for free using Ruby Parser, so I’m not going to talk about parsing at all anymore. Parsing is hard; it’s its own talk in and of itself, and honestly, parsing is boring when compared to semantics.

00:04:08.640 So, if you want to study parsers further, Michael Jackson gave a fantastic talk titled "Parsing Expressions in Ruby" at Mountain West 2011. It is available on Confreaks, and it’s a fantastic half-hour to spend. Now, let’s take it up one step further and go into the runtime. How do you get seven from three plus four? The first thing is, you start with a source, parse it into an AST, and then process everything by type, starting with the inner values first.

00:04:43.440 Now, this is how most languages do it; it’s called applicative order evaluation. It doesn’t have to be that way—some languages choose not to do it that way, and it creates a lot of flexibility. You can defer the evaluation of the arguments until later and get a lot of power out of that—how you do it in your language is entirely up to you. We’re going to do it with our values first. So we interpret lit three into three, lit four into four, and then we move on to the outer node with our process call.

00:05:09.760 We have a receiver of three, a message of plus, and a single argument this time of four. And we might pass that to a proc that we built for dealing with addition. When passed two args, we would get the value seven. Again, this is just one way to implement it. For the sake of time, I'm going to use Ruby itself to deal with a lot of the semantic internal workings of the language. This is sometimes referred to as a meta-circular interpreter.

00:05:36.320 The S-expression processor dispatches to process underscore node type methods. So we would have lit three automatically dispatch to process lit, pass in this sexp. We would grab the last item from that and we would get the value of three. We’re also going to be doing this entirely test-driven. That adds an extra rainbow which does show up and spans the entire implementation of this.

00:06:05.600 So our unit tests might look like this: we would have an interpreter, we would call a val with three plus four, and we would expect the assertion to equal seven. For the sake of time, I’m going to do some premature refactoring. I’m going to write and assert a val that does exactly that, allowing us to simply assert a val of seven from three plus four, and it’ll do all the internals.

00:06:33.760 Let’s move on to our sanity test—we’re in the runtime. We implement some test infrastructure, subclass many test four, and instantiate the Ubi interpreter, assigning it to an accessed ivar and put in our last assertion. Our first test is going to look like this: the new code is going to be bold, and the old code is going to be omitted so that you can focus on it. To start, we’re going to interpret just the atom three, and then we’re going to move on to three plus four.

00:06:59.680 This is a standard sanity test that goes all the way back to Smalltalk days. If I remember the story right, it’s because three has a primitive op code and four does not. So, we write that, run our test, and get our first error. It’s simply that we haven’t implemented anything. We go ahead and implement. We’re going to be subclassing Sexp interpreter and calling it the Ubi interpreter, and we’re going to put in a version constant so that my rake tasks shut up and move on from that.

00:07:25.840 The next failure we get on run is an unfortunately slightly confusing error. It’s because we called the method val. The val is defined in Kernel, so instead of simply getting an error that says, 'I don’t know what the hell you’re talking about,' we get an error saying, 'You’re trying to call a private method of val.' We don’t want that, so we go ahead and implement an initialize method where we instantiate a Ruby parser, define a val, which parses the text first and then passes that off to the process method, inherited from Sexp interpreter. That method knows how to dispatch based on node type, and we gain our power from that.

00:08:04.000 Finally, we get the error that I want you to see: 'Unknown node type lit to Ruby interpreter.' That’s because we do not have a process lit defined. This is the type of error we’re going to see again and again. So we go ahead and define that as we previously discussed: we simply grab the last item out of it, and we move on to the second assertion where we have an unknown node type call.

00:08:39.520 We go ahead and define process call. Because it’s more complex, I’m going to take a step back and go a more roundabout way of defining it. I’m going to make it raise with the arguments that it's passed so that we can see what’s going on. We can see that we raise and call and this is color-coded just to help you connect the dots to the next slide. We’ve got the call node type, the receiver of three, the message of plus, and the single argument of four.

00:09:10.240 So we go ahead and unpack that using multiple assignment non-destructively. We grab, we ignore the node type, we grab the receiver, the message, and we splat out the args because it could be multiple. We’re going to go ahead and leave the raise in there so that we can see that we unpacked it properly.

00:09:39.040 Our failure now looks like this where we can see the three pieces. So we go ahead and process it. Well, all we’re going to do is recurse: we’re going to keep calling process all the way down until we’re done with the bottom nodes of our tree and then we’re going to boil back up. We call process on the receiver and assign it back to the receiver because we don’t care about the old value anymore. We’re going to map bang the args over, calling process on those. We’re going to leave the raise in there, and we should be able to see that indeed we did it correctly, so we have three plus four.

00:10:16.560 At this stage, I’m going to cheat: replace the raise with send on the receiver, given the message and the args that we have. I’m going to mark that as a big old hack because we’re going to revisit that in a bit. Finally, we have 19 minutes to go—that’s not bad! So what have we done? We’ve defined parse, eval, process lit, and process call.

00:10:46.560 Basically, at this stage, we have an overblown calculator. Let’s give it some more meat; we’re going to add conditionals and truthiness. I’m going to borrow from Ruby and define truthiness the same way. We’re going to have nil and false be our falsey values, and everything else will be truthy. That means that the implementation of the handling of boolean values is a little cleaner; otherwise, we just have to put some extra wrappers around that.

00:11:16.560 So we go ahead and add some if tests. We define test if and test if falsie, given the code: `if true then 42 else 24` and we expect 42, and given the nil on false variance, we expect 24. The first thing that happens is what we're expecting already: unknown type if. We go through our usual pattern of defining a generic handler for it.

00:11:58.560 We look at the components and see the node type if, the condition code, the true code, and the false code. We go ahead and unpack that into ctnf, we process the condition, and then based on the condition, we process either the true side or the false side. Now, why evaluate the condition before the true and false? After all, in the call we evaluated all the args for a call.

00:12:36.000 If you evaluate code like this and did both sides of the condition, you might be a little unhappy. The next thing that happens is not that big of a deal. We get the sub-sexprs fail on true and nil; they’re very straightforward. I’m going to go ahead and use the Ruby values of them and we get a third one for the second assertion in the falsey side. Define it as well, and voila—more dots!

00:13:05.280 This is what the test-driven process for writing an interpreter looks like, and indeed, this is exactly how I wrote this code. I decided what I wanted to do for a talk, sat down, and did this test-driven step by step, then produced these slides from that. So we had a failing test, we get an error on a node type, we add a generic node type handler, we get a failure on the sub-sexprs components, we process the sub-sexprs and our tests pass. The development really is this fast.

00:13:35.920 I probably got this many dots in about the same amount of time for the first dot. I do this personally with autotest—you might use guard or something else. When I'm coding, I have Emacs fullscreen on my left, either my implementation or my test, and I’ve got a hotkey that toggles between those intelligently based on file naming. On my right, I’ve got my autotest pane, which intelligently reruns the tests every time I save.

00:14:10.080 As I have failures, it reruns just the failures until they whittle down, and then it reruns a full pass to make sure I didn’t break anything else. Your brain hurt yet? A little bit? Good! Because I haven't even gotten warmed up. Let’s add local variables.

00:14:36.320 So now we’re going to do state; state is a little different. We’re going to start off with a simple test that assigns 42 to x, accesses it again, and expects the value of 42. But if you notice that little semicolon, there’s a hint that something weird is happening. We get an 'unknown node type block' and that’s because we had two statements in there.

00:15:06.560 Because everything is tree-based, you need some sort of container to be able to handle multiple statements, and that’s the block node. So we go ahead and define process block with a generic handler. We look at its pieces and we can see, 'Ah, okay, well, there’s an lsi node and an elvar node in there,' and they’re just wrapped up in this block. We’re going to do the same thing that Ruby does.

00:15:38.760 We’re going to define a result variable, walk over the sub-exprs, and process each of them, assigning each one to the result. Whoever's the last one wins as the result of the expression of the block. Next thing we move on to is the first sub-sexp: lassign. We define it generically, we look at its pieces, we see the lsi node type, we see the name x as a plain symbol, and then we see whatever value it’s supposed to be.

00:16:09.840 So, we need some place to put this. I’m going to add a simple table. I know this upfront that it’s going to be bad, so we define m. We put a plain hash in there, we mark it as horrible, and throw our hands up and walk away. Then, we define process lassign; we get the name and the value, process the value, and store it in that name in that hash. And it looks something like that—only a boxier version.

00:16:40.320 So the next thing that happens is we get an unknown node type elvar—that's the reader. Um, that's really simple; we just put it in there, so we’re just going to pull it back out. We get the name, we pull it out, and we've got more dots. Now we’re getting to the meaty stuff: functions.

00:17:10.000 We’re at the top of the burrito! We’re going to have the defin test, and for the sake of argument, I’m going to say the result of a def is nil. That’s arbitrary; I could have done the name, I could have done all sorts of stuff, but in this case, it’s going to be nil. So, we’re going to define double of n to be two times n, and then we’re going to call it with 21 and expect 42.

00:17:45.120 First thing that happens is exactly what we expect. So we define a generic implementation; we look at its components. We see the defin subtype, node type, the name double, an args node of n (which we're not going to process further), and then any number of expressions afterwards for the body. Now, because this is a function and we’re defining it, we’re not evaluating it at this time, so we’re going to take the args and the body that we’ve extracted.

00:18:23.680 We’re going to store it off in the m f along with everything else, and that’s going to look like this. Does that work? That does not work, and I’m not going to try. Where you have an array being stored instead of a regular array—uh, a Ruby value— we’re going to start off an array with a tuple of the args and the code to be executed.

00:18:36.400 So, the next thing that happens is we move on to the next line where we call double and we get this strange error. This is not like what we’ve seen before, so let’s look at it. It’s 'undefined method double for nil.' If we look at the stack trace, we can see that we're in process call and we're on that send line that I marked as a big old hack.

00:19:08.480 So we look at that, and we say there is no receiver because we're just calling double; the receiver is implied. So it's nil in our case. We only want to use send for primitive methods that Ruby's going to define for us, like anything mathematical or anything binary. So in this case, we’re going to use that fact and isolate based on the receiver being nil or not.

00:19:34.480 We say if receiver, then go ahead and do the send; otherwise, we're going to raise. We rerun the test and we see that we have isolated that case. So now we've got a place to put the implementation for our user-defined functions, because they are more complex.

00:20:03.120 When we’re calling a user method, we have to set up our environment for any local variables that that method defined, give them their values, and then we need to execute the code by calling process block and recycling that code so that the last value in that executed function is the result of the function.

00:20:33.920 I know you’re probably thinking args, deck, zip. So let’s step through this one step at a time. The first thing we do is take the names of the arguments out of that array and zip them against the processed values for the call itself. So we have decals.rest zip args. Each do key in value, and then we assign stuff into the environment.

00:21:05.680 Now, in our case, that’s the array of the name n zipped against the value of 21. We’re going to walk into the environment and assign 21 to the name n. You may have noticed that n was already in the environment; we just assigned 21 to it. What happened to that value? Well, it got us dots for this test.

00:21:36.400 But, uh, my favorite thing that I've ever been told by Kent Beck is 'just getting green' means you don’t have enough tests yet. To answer your question of when do you know you have enough tests: it’s a gut feeling. Not the best answer I've gotten from Kent Beck.

00:22:06.160 So let's move on to something more complex—we're going to buffer this with Fibonacci. That's going to touch a lot of places, and specifically because it has recursion, we need to ensure that we're treating our environment right. So we go ahead and define and assert that nil with the definition of Fibonacci. We call it with fib six and expect eight. It runs. We get the wrong value; this is exactly what we expect because we know we’re trashing that table.

00:22:35.680 That stupid table. Let’s say we have the value 8 in there, and we go ahead and call fib six. So, fib defines n as well; that means that we wrote six in it, and eight is now gone. We recurse and call fib four, trashing again, and so on. What we really need is something more intelligent as a data structure to be able to store things as we go deeper into our scope.

00:23:05.920 What we want is something like a stacked hash. We need something that is the way it is on the left, but looks the way it is on the right so that as we go into scopes, we get this nice flat hash with the current local value for any given name. We’re going to call that environment, and it’s going to have the ability to add a new scope, which just puts a new hash on top.

00:23:36.560 But when you access it, it’s going to flatten and get the value out. That implementation looks like this: you do not need to understand this code. What you do need to know is that when we do the getter, it calls all so that it reads from everything. When you do the write, it calls last and writes into the last (the most recent scope.) All flatten everything such that the newest value wins, and we have a utility method called scope which pushes a new scope on yields and ensures that it gets popped back out.

00:24:07.920 We go ahead and replace that stupid hash with a new environment. We add the scoping call around the call to the user function, and finally, we have dots for Fibonacci. We know that we’ve actually backtracked and fixed the other test case as well. So just to make this Turing complete, we’re going to add while loops.

00:24:43.600 This section was something I added last minute. It took me about ten minutes to implement and 30 minutes to do the slides. The fact that the slides are harder than the implementation really says something about the way this process works. So, let’s refactor Fibonacci in the test; we’re going to add it, pull it out, and put it into defined fib.

00:25:16.320 That lets us reuse it in this new test where we calculate the sum of the fibs from one to ten. As you can see here, we're going to be using local variables, primitive calls, recursive user calls, and heavily using the environment across all those things. This is going to be our sanity stress test across this implementation.

00:25:49.520 First thing is exactly what we expect: unknown node type while. We do our usual, we go ahead and define a generic implementation. We look at the components, get the while node type, we get the condition code, and then we get this little confusing part: we get the body code in red, and then this artifact true at the end, which is something that comes from Ruby Parser, but we’re going to go ahead and ignore it.

00:26:20.960 What it is, is the pre or post condition flag—whether or not in Ruby you say while blah do something versus do something while blah. We don’t care about it, so what we do is just pop it off as soon as we splat out the body. We process; we use the Ruby while to process the condition and then the Ubi’s while condition code.

00:26:56.560 Then we use the process block to keep calling the body code as long as process condition is truthy. For the first time ever, we get dots on our first test! So, what have we done? We’ve implemented val and parse; we’ve processed blocks, calls, defs, false, if, else, and elvar, nil, true, and while.

00:27:21.280 We’ve defined a stacked environment class for nested scopes. We’ve implemented tests for all of this; we’ve called it the Ubi language. It has basic numeric types, true, false, and nil. It supports conditional branching and looping. It supports primitive and user-defined functions, including recursion. It has local variables and variable scoping; it’s test-driven, extensible, and pattern-based.

00:27:57.360 Most importantly to me is that it took just two hours—it’s about 130 lines of implementation, about 70 lines of tests—so really just about 200 lines of code that fits in one head, and that’s really important for me. So what else can we do with this? We can add richer types, we can add strings, arrays, hashes, any other data structure you want.

00:28:20.560 We can enforce different scoping rules, like Ruby has an opaque def where there are no local variables available from outside in, or we can go with Scheme’s transparent lambdas. We could implement recursive tail calls, or we could implement closures. I was trying to do that on the airplane on the way here, and I definitely can do it; I just can’t fit it into this time slot.

00:28:49.760 We can add an object system to it or we can leave it completely functional. We can change the way functions are called. Like I said, we could make it so that the applicative order evaluation is not done in this language, or whatever we want for further study.

00:29:20.320 I’ve really enjoyed working through Structure and Interpretation of Computer Programs (SICP). Tender Love, Aja, and I wound up working through it. I wish we had known that MIT didn't actually assign every exercise as their homework, but we did. Afterwards, you can say, 'I know SICP.' I highly recommend—more than SICP—the Little Schemer and the Seasoned Schemer. They are fantastic books.

00:29:51.680 For no other reason than the fact that they teach from first principles using Socratic dialogue. It’s mind-boggling; it’s so good. There’s a follow-up book called the Reasoned Schemer that implements a Prolog-like inference system, and I got three minutes on top of Scheme and it’s really mind-bending how Prolog works once you finally work through it.

00:30:27.840 There are no time for questions, so please grab me at the after party. Not one more thing. Since this is the last talk, I’m looking at you, Josh. I get to correct an injustice from Twitter: 'He who made it defines it.' So we’re going to go over what exactly is Seattle style.

00:30:30.880 Seattle style is warning-free. We use -W always, and we shoe libraries that poop on our test output. It has no extra syntax—neither on def lines, nor on calls, nor anywhere. We don’t do this; this is not Seattle style. Paren, paren, paren, paren, paren, paren—this is Seattle style.

00:31:01.440 And you will notice that we do use parens anywhere that -W lets us, because the forward slash could be confused with division. You need parens to say, 'No, no, this is an argument,' and we’re going to have regex inside that is everything. Thank you very much.