Vertical Assignment in Ruby

by Kevin Kuchta

The video titled "Vertical Assignment in Ruby" by Kevin Kuchta at RubyConf 2021 explores the concept of vertical assignment in Ruby, a feature that has not yet been included in the language despite the existence of leftward and recently added rightward assignment. During this talk, Kuchta humorously argues for the need for a vertical assignment operator, dubbed 'vequals', and demonstrates a metaprogramming approach to implement it using Ruby's tools like TracePoint and Ripper.

Key points discussed in the video include:
- Overview of Assignment in Ruby: Ruby originally supported leftward assignment and has introduced rightward assignment. However, the absence of vertical assignment was identified as a significant oversight.
- Introducing the 'vequals' operator: The speaker humorously proposes the need for 'vertical assignment' and starts experimenting with various ways to create this operator.
- Using Ruby's Metaprogramming Tools: Kuchta showcases how to utilize Ruby's metaprogramming features, such as defining methods with unusual Unicode characters to act as operators, thereby pushing the boundaries of typical coding practices.
- TracePoint for Dynamic Behavior: He explains the use of TracePoint to run code on every line of Ruby code executed, allowing for actions just before any line is evaluated. This was key for implementing the vequals operator functionality.
- Implementation Details: Through a series of coding examples, he elaborates on how to dynamically define variables and how to interpret lines of code effectively using the Ripper tool to parse Ruby. The process involves emulating a 'just-in-time' variable declaration that automatically assigns values to variables found in the code.
- Hilarious Outcomes and Risks: The speaker emphasizes the inherent risks and potential confusion resulting from such unconventional programming techniques, targeted mostly for fun rather than production use.
- Encouragement to Experiment: The conclusion of the talk encourages developers to explore Ruby's flexibility and try unconventional methods to learn more deeply about the language. Rather than producing practical code for everyday use, the talk advocates for creativity in coding as a valuable habit.

Overall, Kevin Kuchta's presentation is a humorous yet insightful exploration of an outlandish idea to enhance Ruby's assignment capabilities while also illuminating the power of the language's metaprogramming tools. The light-hearted tone coupled with technical depth makes the topic engaging.

00:00:10 As many of you know, Ruby has what's called leftward assignment. That is, the thing on the right gets shoved into the thing on the left. Now, Ruby also has what's called rightward assignment, where the thing on the left gets shoved into the thing on the right. This was a relatively recent addition to the language, around Ruby 2.7. It was later merged into Ruby's pattern matching support. However, when it first came out, a lot of people were understandably upset; it's clear to see why. It makes no sense whatsoever to have leftward and rightward assignment in the same language without also introducing the obvious downward assignment.
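For reference, a minimal sketch of the two existing forms. It assumes Ruby 3.0 or later, where `=>` is accepted as a standalone rightward assignment; the variable names are illustrative only.

```ruby
# Leftward assignment: the value on the right lands in the name on the left.
x = 3

# Rightward assignment: the value on the left lands in the name on the right.
3 => y

puts x + y
```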

00:00:34 As far as I know, the Ruby core team has made no moves to add this obvious feature. In a lesser language, we might be stuck with this heinous oversight. Thankfully, this is Ruby, the language whose unofficial motto is, of course, 'That's possible.' And so, I give you this talk: Vertical Assignment in Ruby.

00:00:47 My name is Kevin Kuchta, and because my parents never taught me right from wrong, I'm going to walk you through implementing the language feature that no one asked for. In fact, several people have begged me not to do this. Thank you for coming!

00:01:08 It's really a good thing that you're all already wearing masks because some of the code in this conference talk is so bad it might be contagious. Please do not try this at home or in your production code bases unless it's funny. In which case, tweet me with the disclaimers. With that out of the way, the first thing you have to figure out when you're trying to create a new awful operator in Ruby is, what is that operator? What goes here?

00:01:29 My first thought is that maybe it could be some sort of rotated equals sign. We might be able to use two vertical bars, but that's already in use as the boolean `or` operator in Ruby. Perhaps we could override it somehow; after all, we can override the addition operator. Here I am overriding the addition operator on the Integer class so that all integer addition now returns three because three is the best number, and why would you ever need any others?
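The slide itself is not part of this transcript, so the following is only a sketch of what such an override might look like. The alias name `sane_plus` is a hypothetical helper added here so that normal arithmetic can be restored afterwards.

```ruby
class Integer
  alias_method :sane_plus, :+   # keep the real implementation around

  # Every integer addition now returns three.
  def +(_other)
    3
  end
end

broken_sum = 2 + 2              # dispatches to the override above

class Integer
  alias_method :+, :sane_plus   # put normal arithmetic back
  remove_method :sane_plus
end

restored_sum = 2 + 2
puts "overridden: #{broken_sum}, restored: #{restored_sum}"
```

Restoring the original method right away keeps the damage contained to a single expression.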

00:01:54 Unfortunately for us, although probably fortunately for the rest of the world, the `or` operator is not one of the ones we can easily override. But we are not without options. A cool thing about Ruby is that it's very permissive regarding what you can put in identifiers. By identifiers, I mean class names, variable names, method names— that sort of thing.

00:02:13 For example, did you know you could put emojis in identifiers? Here I have defined a method whose name, instead of being three characters, is three tacos, which is obviously strictly better. Now, if you can put emojis somewhere, you can generally put other sorts of Unicode nonsense in there. So all we need to do is find some Unicode nonsense that looks like a rotated equals sign.
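A small sketch of the idea, assuming a UTF-8 source file (Ruby's default). The return value is made up for illustration.

```ruby
# A method whose name is three taco emojis instead of three letters.
def 🌮🌮🌮
  "three tacos"
end

puts 🌮🌮🌮
```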

00:02:44 After a little bit of Googling, I found this little character: the double vertical line. It looks suspiciously like a rotated equals sign, which is what we want, right? Now, I'm told in real life this character is used to indicate the norm of a matrix. I have no idea what that means, but we are going to use it to indicate our own moral turpitude because we are going to create a functioning vertical assignment operator in Ruby.

00:03:28 Fair enough. If Ruby thinks this is an undefined method, let's define it here. I've defined a globally available method whose name just happens to be the vertical assignment operator. Sorry, I'm jumping ahead of myself. I'm getting tired of calling it the vertical assignment operator; let's call it the vequals operator, short for vertical equals. Because maybe if that's clever and catchy enough, we can convince someone to sneak this horror into Ruby core.

00:04:12 I have defined the vequals method; it takes any number of arguments and does nothing with them. Great! Ruby will hit the second line and say, 'Alright, you're calling a method. The method is one character; you're passing no arguments to it, and it does nothing. Great! Let's move on to the third line.' Then we'll hit the variable, 'x,' and say, 'Well crap, this is not defined either,' and this is a problem for us.
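Roughly, the state of things at this point could be sketched as below. It assumes the character in question is U+2016 (DOUBLE VERTICAL LINE); the exact slide code is not in the transcript.

```ruby
# A globally available no-op method whose name is the vequals character.
# It accepts any number of arguments and ignores them.
def ‖(*)
end

7
‖

# Nothing has declared x yet, so Ruby knows nothing about it.
puts defined?(x).inspect
```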

00:04:32 In Ruby, leftward assignment doesn't just assign three to `x`; it also declares `x` as a new variable in the current scope. The same thing occurs with rightward assignment; it creates `x` out of nothing. We need to do the same thing for our vertical assignment. We need to declare `x` somehow. To do that, we are going to use a tool that is wholly inappropriate for the task. If this problem were a twig, we're going to use a chainsaw— it's time for TracePoint.

00:05:02 TracePoint is a really cool tool built into the Ruby standard library. It lets you react to other things happening in your Ruby program. Let me give you an example: Here, I'm setting up a trace point to react to any time an error is raised. I do this by calling `TracePoint.new`, passing in the symbol :raise, and then giving it a block with something I want to do. After enabling this trace point, anytime I call `raise`, this block will execute.
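A minimal sketch of a `:raise` trace point. The `messages` array is an illustrative name used here to make the effect visible.

```ruby
messages = []

# The block runs every time an exception is raised while the trace is enabled.
trace = TracePoint.new(:raise) do |tp|
  messages << tp.raised_exception.message
end

trace.enable
begin
  raise "boom"
rescue RuntimeError
  # swallow the error; we only care that the trace point saw it
end
trace.disable

puts messages.inspect
```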

00:05:27 There are other cool things you can do too. In addition to just listening for raise events, you can also listen for any time a method is called. You do this by calling `TracePoint.new` and passing in the symbol :call. Now, anytime you call any method anywhere in Ruby after enabling this trace point, this block will be evaluated. There are several other sorts of things you can have TracePoint listen for besides just method calls and raises. You can have your TracePoint block run every time you start a new class, every time a method returns, or any time a thread finishes. There’s a huge list of options, but if, like us, you're trying to really rip open the fabric of Ruby's space-time, you can run TracePoint on every single line.

00:06:03 You do this by calling `TracePoint.new`, giving it the symbol :line, and providing some block. Now, if you enable it, every line of Ruby, no matter what, will trigger this block. One important thing to note is that the TracePoint block runs just before the line that triggers it. So, on this first line `x = 3`, just before 3 gets assigned to `x`, it will print out `tracing line 6 in file t.rb`.

00:06:36 The TracePoint block is provided with a variable, which I’ll call `tp`, as a nod to how messy our code will be. `tp` has a lot of information on it. It contains the line number and file path of the line that triggered the TracePoint. There is some additional information it provides, specifically the binding of that line, allowing us to capture the context.

00:06:58 If you've not encountered bindings before, they're these intriguing but nearly magical objects in Ruby that act as a reference or pointer to the scope at a specific point in a Ruby program. The usual way to get bindings, without using TracePoint, is to call the globally available `binding` method, which creates a new binding at that point in the program.

00:07:16 Let me show you how this works: I've got some method with a variable inside it, and it merely creates a new binding and returns it. If I try to print out `x` after this method, it will say it's undefined because `x` is only defined within that method. However, if I grab a reference to the binding returned from this method, then call `binding.local_variable_get`, it will actually give it to me, providing access to this `x` variable. I find this to be extremely powerful; it lets us throw aside the usual rules of scope visibility and access variables that would otherwise be totally inaccessible.
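A sketch of that demonstration. The method name `make_binding` and the value 42 are placeholders, not taken from the slides.

```ruby
def make_binding
  x = 42
  binding          # hand back a reference to this method's scope
end

captured = make_binding

# x is not visible out here, but the binding still remembers it.
outside_view = defined?(x)
inside_view  = captured.local_variable_get(:x)

puts "outside: #{outside_view.inspect}, via binding: #{inside_view}"
```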

00:07:44 So, let's return to our vequals nonsense. We're up to the point where the first two lines work, but `x` is still undefined. What if we had a way to define it just before Ruby hits this undefined variable? Like a train building the track in front of it, so it never quite crashes? We can implement just-in-time variable declaration.

00:08:03 Think about it: We already have a way to run code before every line of Ruby and to manipulate the scope of each of those lines. This could look something like this: We can register a TracePoint to run on every single line, and when it does, if that line includes `x`, we're going to define `x` in the local binding.

00:08:27 Unfortunately, there are a couple of problems with this approach beyond the obvious ones. The first real problem is that the `tp.line` method does not exist; I made that up. Although TracePoint provides a lot of useful information about each line that it triggers, it does not give us the raw source code of that line. However, we have everything we need. We have the line number of the line that triggered the TracePoint, the file path, and most importantly, no reservations against doing terrible things in Ruby.

00:09:02 So, to that end, we're going to read the file from disk on every single line of Ruby and extract the line we need. Why not? I can think of nothing wrong with this, aside from the fact that it's incredibly slow and it completely breaks REPLs. But we're going to charge ahead because it works. It solves our problem—at least in the loosest sense of the word.
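One way this could look, sketched with a temporary file so the example is self-contained. The helper name `source_line` and the two-line demo script are assumptions made for illustration.

```ruby
require "tempfile"

# Re-read the whole file on every traced line and pick out the one we need.
def source_line(path, lineno)
  File.readlines(path)[lineno - 1]
end

script = Tempfile.new(["demo", ".rb"])
script.write("a = 1\nb = a + 1\n")
script.close

captured = []
trace = TracePoint.new(:line) do |tp|
  # Only look at lines coming from the demo script.
  next unless File.basename(tp.path) == File.basename(script.path)
  captured << source_line(tp.path, tp.lineno).chomp
end

trace.enable { load script.path }
script.unlink

puts captured.inspect
```

This also makes the REPL problem visible: code typed into a REPL has no file on disk to read back.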

00:09:26 I did mention that there were other problems. The second problem is with the `local_variable_set` method, which does exist on the binding object. It works as you might expect; if you have a variable `x` defined in a certain scope, you create a new binding in that scope and call that binding's `local_variable_set` method to update that value dynamically. That's pretty neat! However, it does not allow you to create a new variable within a binding.
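A sketch of the limitation, modelled on the behaviour described in Ruby's documentation for `Binding#local_variable_set`: a brand-new name is stored only inside the binding object and never becomes a variable that ordinary code in that scope can read. The names `demo` and `fresh` are illustrative.

```ruby
def demo
  x = 1
  scope = binding

  scope.local_variable_set(:x, 2)       # updates the existing local
  scope.local_variable_set(:fresh, 10)  # exists only inside the binding

  plain_lookup =
    begin
      fresh                             # ordinary code still cannot see it
    rescue NameError
      :undefined
    end

  [x, scope.local_variable_get(:fresh), plain_lookup]
end

updated_x, through_binding, plain_lookup = demo
puts [updated_x, through_binding, plain_lookup].inspect
```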

00:09:49 If I had to guess, this restriction was meant to prevent exactly the kinds of shenanigans we are trying to pull. Imagine if anyone with a handle on a binding could just insert new variables willy-nilly; it would be chaos! Although, to be clear, we are going to do just that regardless.

00:10:12 Now, I'd love to tell you that I know a secret method to declare variables within a binding, but I don't. So, I'll take the coward's way out and define a method that hardcodes the value we want as its return. And look at this line at the bottom: `puts x`. It certainly looks like `x` is a variable, right? And hey, I can still totally assign a value to `x` and it will keep appearing as a variable. So, as long as you all don't tell anyone it's a method and I don't tell anyone it's a method, we can continue pretending it's a variable with everyone else none the wiser.
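A minimal sketch of the trick. The names `before` and `after` are illustrative; the point is that a bare `x` is indistinguishable from a variable read until something is actually assigned to it.

```ruby
# Not a variable at all: a method that happens to return the value we want.
def x
  3
end

before = x   # no local named x exists yet, so this calls the method

x = 10       # from here on, a real local variable shadows the method
after = x

puts "before: #{before}, after: #{after}"
```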

00:10:45 So, we have two fixes: we know how to read lines of source code inside the TracePoint, and we know how to declare new variables (wink wink, nudge nudge) within a binding. Yes, both are terrible hacks, and no, we will not stop there. Now we have two wrongs; let's see if we can make a right.

00:11:06 This works! This is a success! On each line of Ruby, we check if that line includes `x`, and if it does, we define `x`. It's just-in-time variable declaration! Let's try to make it a little bit more generic. Here, I'm taking each line, splitting it apart, and looking for anything that resembles a variable. If I find something that looks like a variable, I define it. This works: true just-in-time variable declaration. We need never again fear the tyranny of undefined variables or method missing errors. They're a thing of the past!
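Putting the two hacks together might look something like the sketch below. It is not the code from the talk: defining the fake variable with `define_singleton_method` on the binding's receiver, the global `$jit_result`, and the one-line demo script are all assumptions made to keep the example self-contained.

```ruby
require "tempfile"

# A one-line script that reads x without ever declaring it.
script = Tempfile.new(["jit", ".rb"])
script.write("$jit_result = x\n")
script.close

trace = TracePoint.new(:line) do |tp|
  next unless File.basename(tp.path) == File.basename(script.path)

  # Hack 1: read the raw source line back from disk.
  line = File.readlines(tp.path)[tp.lineno - 1]

  # Hack 2: "declare" x by defining a method just before the line runs.
  tp.binding.receiver.define_singleton_method(:x) { 3 } if line.include?("x")
end

trace.enable { load script.path }
script.unlink

puts $jit_result
```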

00:11:29 However, you can probably already see some limitations, and not just that the basic idea of automatically defining all undefined variables is ludicrous on the face of it. The part that really stands out to me is that this line-splitting process where I just break apart the line on spaces is incredibly brittle. There are a billion different kinds of Ruby lines that would totally break that, such as a line that includes a pound sign.

00:11:53 This will split that line apart and produce numerous elements, trying to define them as variables; most of them are not variables. Yes, I could teach the code to ignore anything after the pound sign or keywords or to account for parentheses and brackets, but that starts to feel like a lot of work.
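To make the brittleness concrete, here is a small sketch with a made-up line of source. The regular expression is only a naive guess at "looks like a variable".

```ruby
line = "puts x # print the value of x"

# Naive tokenization: break the line apart on whitespace.
words = line.split

# Anything that looks like a lowercase identifier gets treated as a variable,
# including every word of the comment.
candidates = words.grep(/\A[a-z_]\w*\z/)

puts words.inspect
puts candidates.inspect
```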

00:12:13 And anyone who knows me will tell you that I am extremely lazy. My goal in life is to do the least amount of effort for the worst possible outcome with Ruby. What we really want here is some way to take a line of Ruby code, explode it apart, and identify which parts are identifiers, comments, literals, and so on.

00:12:22 Luckily, there are tools for precisely that. Let me introduce you to Ripper. No, not the Victorian-era murderer, although we will be using it to commit crimes against code.

00:12:44 Ripper is quite a nifty tool built into the Ruby standard library, designed for parsing Ruby scripts into data structures. The typical way to use this is by calling `Ripper.sexp`, which will produce s-expressions. An s-expression is a fancy computer-sciencey term for a nested data structure representing the syntax of our code. If I pass a piece of code to `sexp`, I get this blob of data back. It’s pretty dense, and there's no need to look at it intricately, but you can see that it resembles what we put in.

00:13:09 However, this isn't quite useful for our purposes. The issue with `Ripper.sexp` is that it expects a complete program. By that, I mean one where every 'def' has an end, every open parenthesis has a closed parenthesis. If you don't give it that, it will just sit there forever waiting for the end to come, then return nil. To work around this, we need to peek at what Ripper does under the hood.
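A short sketch of both behaviours, using made-up snippets as input: a complete program comes back as a nested array, while an unfinished one comes back as `nil`.

```ruby
require "ripper"
require "pp"

# A complete program parses into a nested s-expression.
tree = Ripper.sexp("x = 3")
pp tree

# A `def` without its `end` is not a complete program.
incomplete = Ripper.sexp("def foo")
p incomplete
```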

00:13:34 Ripper is a parser, and very roughly, parsers have two major steps. The first one is taking that top line and breaking it apart into chunks that are meaningful to the programming language— in this case, Ruby. Then, they take these chunks and convert them into a deeply nested structure. This middle line resembles what we want: breaking down the line of code into its individual components.

00:13:57 Thankfully Ripper exposes a method to get exactly that with `Ripper.lex`. Lexing refers to breaking something into meaningful chunks (a process often referred to as tokenization). Remember how earlier I was just splitting out lines using spaces? What I really produced was a bad version of tokenization, so let’s replace that with proper tokenization.

00:14:20 We can call `Ripper.lex`, which returns a large list of tokens. Then we iterate through these tokens, exempting those we don't care about and defining everything else as a variable. Alright, we're on the right track, but I still have to manually handle things like `puts`, whitespace, and presumably a ton of other tokens.

00:14:46 If I were to lex a simple line of Ruby code, it gives us this large list of tokens. For each token, it provides four pieces of information: the first is the position of the token (its line and column number); the second gives us the type of token it is; then there's the token itself as a string; and finally, some information about the tokenizer state. We're going to ignore that last bit for our particular purposes.
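As a sketch, lexing a simple line and printing the first three pieces of each token could look like this. The input strings are illustrative.

```ruby
require "ripper"

# Each token is [[line, column], type, text, lexer_state].
Ripper.lex("x = 3").each do |(line, col), type, text, _state|
  puts "#{line}:#{col} #{type} #{text.inspect}"
end

token_types = Ripper.lex("x = 3").map { |token| token[1] }

# Unlike Ripper.sexp, lexing copes with a line that is not a complete program.
partial_types = Ripper.lex("def foo").map { |token| token[1] }
```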

00:15:01 However, what I actually want to focus on is the type of token Ripper thinks it is. If it identifies a token as an identifier, we can process it. So now we're back at our TracePoint where we lex each line into tokens and check if each is an identifier.

00:15:17 If it is an identifier, we define it to be three because three is the best number. This nearly works, really close! Although it also completely destroys the built-in `puts` method because now it'll recognize `puts` as an identifier too and redefine that as well. Well hey, who needs to print things out, right? Right, we do sometimes, so let’s make this a bit smarter.

00:15:37 Here I'm doing the same thing by finding identifiers and checking if that identifier already exists in the current scope. If it does, I ignore it, but if it doesn't, I define it. This works: a truly generic just-in-time variable declaration without the hassle of our previous tokenization process.
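The "already exists in the current scope" check might be sketched as follows. The helper name `undefined_identifiers` and the use of `respond_to?` to detect existing methods such as `puts` are assumptions, not the talk's actual code.

```ruby
require "ripper"

# Return the identifiers on a line that are neither local variables in the
# given binding nor methods its receiver already responds to.
def undefined_identifiers(line, scope)
  Ripper.lex(line)
        .select { |token| token[1] == :on_ident }
        .map    { |token| token[2].to_sym }
        .reject do |name|
          scope.local_variable_defined?(name) ||
            scope.receiver.respond_to?(name, true)
        end
end

known = 1
from_comment_line = undefined_identifiers("puts mystery_value # not_this", binding)
from_mixed_line   = undefined_identifiers("known + other_thing", binding)

puts from_comment_line.inspect
puts from_mixed_line.inspect
```

Because the lexer tags the comment as a single comment token, nothing inside it is mistaken for a variable, and `puts` is left alone.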

00:15:58 Alright, let’s get back to our horrific goal: the vequals operator. We have reached a point where everything runs. Our vequals operator works since we defined it as a method, and `x` works because we just-in-time defined it. While it doesn't do what we want yet, it also doesn’t crash. That's a step in the right direction for us, and a misstep for anyone who deals with our metaprogramming nonsense later.

00:16:13 Now, we want to make sure that seven gets assigned to `x`. This will be a little more complicated and I can’t fit it all on one slide, so let’s look at our high-level approach. We want something like this to function; let's break it down by lines and columns because that's how TracePoint and Ripper will perceive it.

00:16:47 The TracePoint will trigger on each of these lines, starting with the top one. However, I must clarify that TracePoint will skip over the first line because it's just a literal expression that doesn't call anything; it will fire instead on this line.

00:17:13 When we hit this line with our TracePoint, we will lex it for any identifiers that match our vequals operator. If we find any, we will backtrack a line and lex that line for expressions we find related to that operator. We'll stash the column position and the expression we find. For example, we locate the vequals operator in column one, pointing to the seven.

00:17:36 However, instead of retaining the expression itself, we need the value being assigned, so we’ll evaluate the string. For example, the string `"7"` evaluates to the integer seven, and voilà! We're done with this line. Moving on to the next line, we will again lex it for any identifiers.

00:18:03 We find the identifier `foo`, which spans column zero to column two. We’ll check if this identifier is located below the vequals operator from the previous line. Since we confirmed it is, we will proceed with the assignment, defining that identifier to the value it points to, which, in this case, will be `seven`.

00:18:27 This effectively captures the entire process. For each line, we check for any vequals operators that need processing, and then we look for identifiers on the current line that line up with vequals from the previous line. Here's the combined code; there’s no need to scrutinize it too intently; it's just the previous code wrapped in a class, adhering to our standards of avoiding pollution of the global namespace.
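The slide code is not reproduced in this transcript. As a rough sketch of the column-matching idea only, here is a version that works on an array of source lines instead of a live TracePoint. The names `spans` and `vertical_assignments` are hypothetical, the vequals character is assumed to be U+2016, and because Ripper reports columns in bytes, the sketch converts them to character columns first.

```ruby
require "ripper"

VEQUALS = "‖" # assumed character: U+2016 DOUBLE VERTICAL LINE

# Lex a line into [first_col, last_col, type, text] using character columns.
def spans(line)
  Ripper.lex(line).map do |(_, byte_col), type, text, _state|
    first = line.byteslice(0, byte_col).length
    [first, first + text.length - 1, type, text]
  end
end

def vertical_assignments(lines)
  assigned = {}
  pending = {} # column of a vequals on the previous line => value above it

  lines.each_with_index do |line, index|
    current = spans(line)

    # Step 1: identifiers sitting below a pending vequals receive its value.
    current.each do |first, last, type, text|
      next unless type == :on_ident && text != VEQUALS
      col = pending.keys.find { |c| c.between?(first, last) }
      assigned[text.to_sym] = pending[col] if col
    end

    # Step 2: for each vequals here, evaluate the token directly above it.
    pending = {}
    next if index.zero?
    above = spans(lines[index - 1])
    current.each do |first, _last, type, text|
      next unless type == :on_ident && text == VEQUALS
      source = above.find { |a_first, a_last, _, _| first.between?(a_first, a_last) }
      pending[first] = eval(source[3]) if source
    end
  end

  assigned
end

single = vertical_assignments(["7", "‖", "foo"])
double = vertical_assignments(["7 8", "‖ ‖", "a b"])
puts single.inspect
puts double.inspect
```

Returning a hash keeps the sketch side-effect free; the version described in the talk instead defines each identifier in the traced line's scope as it goes.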

00:18:58 This code has the two main steps: checking for vequals operators on the current line and processing those from the previous line, along with a few helper methods. Here it is, in all its haunting glory. I won't make you pore over it; you might hurt your eyes.

00:19:19 But the takeaway is that it works! Once we enable the vequals operator, we can assign seven to `foo` and print the result, and it will behave as expected. However, we’re not stopping there because this is far too rational! We can, for instance, have multiple vequals on the same line; so long as every other line is syntactically valid Ruby, we should be good to go.

00:20:01 Heck, we can mix and match our equals operators; why not? The real strength emerges when we combine it with other forms of assignment. Witness the fully armed and operational vequals operator as we create waterfalls of assignments! This works, I promise! `foo` will see its value updating to `seven`. And, heck, we could use this as some absurd spread operator!

00:20:45 The fun continues as we can leave delightful puzzles for our colleagues, who will undoubtedly thank us for brightening their day when they discover our nonsense three months down the line in the production code base. Can you guess what will be assigned to `z` at the end? I sure as heck can't!

00:21:21 And obviously, the logical conclusion of this? I'm going to tell my kids that this is what software architecture consists of— the only limit is line length and taste, and hopefully at this point in the talk, it’s clear which of those I possess!

00:22:01 Now, obviously we must push forward towards upward assignment. Unfortunately, I don't think it's feasible; it appears it would violate causality. But of course, I foresee one of you figuring out how to make it happen before I conclude this talk, so I leave some of these challenges as exercises for the reader.

00:22:29 To be clear, I'm only half-joking. If any of these seem reasonably possible or nearly so, give it a shot! Projects like this are not about crafting production-ready vertical assignment operators for daily use; they're about exploring the boundaries of what is achievable in a language. Pushing these limits is one of the best ways to deepen your understanding of it.

00:23:06 It allows you to utilize tools like Ripper and TracePoint that you may not typically encounter in your day-to-day work. I certainly hadn’t until I undertook this project. This experience broadens your toolbox of techniques, making you a better developer.

00:23:31 Additionally, it lets me relieve the urge to write this kind of terrible nonsense during my free time, so I return to work prepared to write reliable, less chaotic code. But yes, in your spare time, let your creativity run wild, because Ruby is the one language that allows you to get as weird as you want.

00:23:52 Now, I would like to do Q&A. But because I love the sound of my own voice entirely too much, I’ll pose the first question myself.

00:24:04 The first question I always get when I do this sort of absurd coding is, 'Who are you, and why have you done this?'

00:24:12 Well, okay, my name is Kevin Kuchta. You can find me in various places online. Feel free to reach out; I love chatting, especially about other people's crazy coding endeavors!

00:24:31 Because people can't help but rubberneck at a car crash, I often get asked, 'Where can I see this code in greater detail?' You can find it on my GitHub; it's the same code with all the bad ideas, but with more comments.

00:24:52 And the question that always surprises me yet somehow arises frequently is, 'Should I use this in production?' By now the answer should be clear: absolutely! Please do, and tell me how it goes; it will be hilarious!

00:25:08 Unless you are one of my coworkers!

00:25:16 However, if you’ve grown tired of hearing me speak to myself and have a legitimate question, feel free to hit me up after my talk!

00:25:28 I will be around, also on the conference Discord or online, wherever you can track me down.

00:25:38 Finally, I know there's one more question burning in every heart, and that is, 'Kevin, can you please give me a 20 to 30 second pitch for your employer?' Well, yes, thank you for asking!

00:25:44 I work for a company called Daybreak Health, specializing in mental health for teens. It is easily the most fulfilling job I’ve ever had. We have saved lives in the short five months I've been there.

00:26:03 We are a nine-person company with two engineers and three code bases. That’s too many code bases! Please come work with us. We're looking for a senior full stack or back-end engineer.

00:26:28 That is the talk. I hope you enjoy the rest of the conference. I hope I have inspired all of you to pursue greater avenues of Ruby hackery. Thank you for attending!

RubyConf 2021