Investigative Metaprogramming

by Betsy Haibel

The video from RubyConf 2019 titled "Investigative Metaprogramming" by Betsy Haibel explores strategies for using metaprogramming as a debugging tool in Ruby, particularly when dealing with elusive errors. Haibel opens with a relatable anecdote about a challenging debugging experience while upgrading a Rails application, highlighting how traditional search methods often fall short when facing unique errors that are not well-documented online.

Key points discussed include:

- Definition of Metaprogramming: Haibel delves into the definitions of metaprogramming, contrasting academic definitions with practical, everyday usage in Ruby. She emphasizes how metaprogramming in Ruby differs significantly from other programming languages, primarily due to Ruby's dynamic nature.
- Investigation and Debugging Techniques: Haibel outlines various techniques for inserting debugging hooks through metaprogramming, which include:
- Using reflection to understand the structure of objects.
- Recording behavior through dynamic language features, such as the TracePoint API.
- Reviewing recorded telemetry to trace back errors.
- Example of a Debugging Process: Haibel details the process her team undertook to identify why a hash was freezing unexpectedly during a Rails upgrade. By using metaprogramming to dynamically insert debugging logic, they were able to pinpoint where the issue occurred, drastically reducing their debugging time from days to just hours.
- Importance of Contextual Understanding: The talk stresses that the goal of debugging with metaprogramming is not merely to understand every aspect of the program but to effectively resolve issues and fix bugs by observing behaviors and responses in real-time.

Ultimately, Haibel concludes that using metaprogramming enables developers to create their own tools to inspect and understand code better, leading to a more empowered programming experience in Ruby. This approach promotes flexibility and a deeper relationship with a language that actively involves the developer in the debugging process, encouraging a shift away from rigid frameworks and embracing Ruby's dynamic attributes.

Throughout her presentation, Haibel seeks to inspire developers to leverage metaprogramming for investigative purposes, transforming frustrating debugging encounters into moments of efficient learning and problem-solving.

00:00:11.830 All right, so today we're going to be talking a bit about using metaprogramming to debug issues that would otherwise be extremely difficult to figure out. Now, speaking of difficult to figure out, I want to say up front that there are a few bugs in my keynote animations.

00:00:21.279 How many of you have done a lot of CSS or a lot of SQL? We know that when programming declaratively, like in CSS or SQL, it can be incredibly efficient about 95% of the time. However, there’s that 5% where you and the interpreter have a very different opinion about reality. I'm bringing that up not just as a disclaimer and an apology, but actually as a content warning. If you have cognitive issues or other sensitivities that mean that seeing stuttery animations might be difficult or dangerous for you, I unfortunately cannot promise that my talk will be entirely free of that.

00:00:44.650 So continuing on in the spirit of talking about bugs, it’s time for another scary debugging story. I know it's like three weeks after Halloween, and we’re supposed to be over that now, but bear with me. About a year ago, I was working with a consulting client on a Rails major version upgrade. In the course of this project, we encountered complicated metaprogramming that interacted strangely with the Rails autosave internals.

00:01:19.070 This metaprogram code had been written about a decade before by contractors who'd worked on the original Greenfield, many of whom left the project shortly after and had not been present at all in the intervening decade. It all started when the tests failed with this one obscure error deep into the Rails internals, complete with a stack trace that was about three screens long.

00:01:58.520 We tried googling, but StackOverflow was completely useless. There was this one GitHub issue for path debris that kind of looked right, and we thought maybe because we had just switched from mechanized to factory taught, since mechanized didn't work with Rails 5. However, following up on the advice we received didn't pan out either. So we were in an unhappy place.

00:02:34.290 At this point, you’re probably guessing in your heads about the epic road ahead of us. Maybe it took us a few days to solve this, a few weeks? No, it was actually just two hours. There wasn't that much to it; even we knew the problem we were seeing was related to 'can't modify frozen hash.' Yet, we realized that understanding that was not very useful in the moment. What we really needed to know instead was where the hash froze. You can’t rewind time from an error, but you can add debugging logic when replicating a bug. We were able to use some simple light metaprogramming to insert debugging code and identify where the hash was freezing.

00:03:41.209 We followed a pretty simple process that fits on a slide. We added some telemetry, and then later on, we looked at the relevant data that the telemetry had surfaced. This would honestly be a very boring story if it weren't for the fact that it involved metaprogramming and that it turned a two-week bug into a two-hour one.

00:04:06.570 Now, currently, the more pedantic among you may be wondering where the metaprogramming comes into play. So I’d like to take a moment to define metaprogramming. You may be worried that this is a horrible detour and that this talk is careening off the rails.

00:04:19.490 When I was practicing this talk in front of my team at work, I hadn't written my speaker notes yet and was kind of improvising. I mentioned the ‘off the rails’ bit entirely straight-faced without intentionally making a pun, but then my teammates laughed as if I’d done it on purpose, so we’re all stuck with that joke now. Anyway, this talk concerns itself with three questions. First off, how do we define metaprogramming? In the context of Ruby, it turns out that the way that metaprogramming works in Ruby is pretty different than how some other languages handle it.

00:04:59.060 That's actually related to why Ruby can sometimes be hard to debug with other language tools. After that, we’re going to look at some peculiar, Ruby-like ways that we can use metaprogramming to investigate what's going on inside the Ruby VM. Finally, we’ll return to the story I just told you and examine it in more depth. We'll talk about how we can apply these investigative metaprogramming techniques in real life to identify bug root causes much more quickly than we could otherwise.

00:05:34.169 So for starters, what is metaprogramming? When we talk about metaprogramming, we tend to superimpose two definitions in our heads. There’s the academic definition, which we’re taught to use in technical interviews, and then there’s a slightly more pragmatic definition that reflects how most of us actually think about metaprogramming. One of the common academic definitions is code that modifies the structure of the program at runtime.

00:06:03.610 According to some extensive Twitter research on my part, the most common pragmatic definition is code that I can't analyze using a full-text search. What's interesting to me is the way that these definitions conflict; there's a hole you can drive a Mack truck through, and that hole is the existence of 'send' in Ruby. We don't really call methods on objects so much; that’s just syntactic sugar, which means it's a distortion we created to make Ruby look more like other languages. In Ruby, we send messages to objects, and those objects decide how they want to react to those messages.

00:06:54.310 Whenever you call a method in Ruby, it's really just an alias for calling 'send.' This means that if part of your code base calls 'send' with a variable value rather than a static value, you can't use full-text search to determine all the places that methods on that object are used. Using 'send' with variables falls firmly within most people's practical definition of metaprogramming, even though it doesn't do anything like modifying the structure of the program at runtime.

00:07:30.620 Now, maybe we can get out of this hole by trying to modify the academic definition. Perhaps it’s not just code that modifies program structure at runtime, but also code that determines the method to be invoked. But then, oops, that definition is even more vulnerable to scrutiny. Is there any practical distinction between determining the method to be invoked at runtime by using 'send' with a variable and determining the exact same thing with an if-statement? Not to me, at least not from a theoretical perspective. Yes, you can full-text search for one and not the other, and that is a very real difference. But it’s not easily captured by framing this in terms of what happens when.

00:08:58.440 And while I'm playing the 'well actually' game in my own talk, I'd like to zoom in on this 'at runtime' part. Ruby is an interpreted language, and sometimes the distinction between load-time and runtime breaks down. Some aspects of Ruby’s structure make that difference especially meaningless.

00:09:30.109 Let's take a zoom into the Ruby interpreter and how it loads a basic class. We've all written code like this before, a simple class with one getter and one setter. After our Ruby file has been parsed, the interpreter goes through the generated abstract syntax tree node by node. It identifies a node that defines a class and builds a class from the virtual machine. Every time we define a method, it adds that method to the list of instance methods for that class.

00:10:29.900 But under the hood, it's actually doing something a little more like this: it’s building Ruby objects in Ruby. When we define a class, what we're actually doing is instantiating a class object. When we define instance methods, we're adding method objects to the array of instance methods that that object tracks. Let’s look at an equally simple Ruby class but one that uses the 'attr_accessor' macro to define its getter-setter pair instead of defining the same pair manually.

00:11:11.020 Class macros are definitely something that many people think of as metaprogramming, but because 'attr_accessor' is so common—we type it every day—it’s easier for us to not consider it 'this is metaprogramming' in a big, scary capital M sense. So we started getting by defining a class, but then we hit the macro call to 'attr_accessor.' When we call it, we are sending the 'attr_accessor' message to the class.

00:11:52.470 Now we need to look that method up. 'Attr_accessor' isn’t defined on the yarn class itself, so we have to go into its ancestor chain to look up the method. 'Yarn' is an instance of 'Class,' and 'attr_accessor' is defined as an instance method on 'Class.' So, that's the definition of a direct accessor, which we use to define a getter-setter pair. 'Attr_accessor,' then, calls 'define_method' twice to define that getter and setter for the attribute name—in this case, 'length,' which we passed into 'attr_accessor.'

00:12:19.090 This definitely feels a little like metaprogramming; we’re calling 'define_method.' We’re using Ruby methods like 'define_method' to define the structure of our program. My question is: is this happening at runtime or at load time? My answer is that this distinction isn’t meaningful in Ruby.

00:12:55.860 We're using Ruby to load our classes, and so it's all a big mismatch. It's all happening at once. The definition we're taught to use for metaprogramming—code that modifies the structure of the program at runtime—works in other languages like Java where program structure is different from the data passing through the program, and where a defined compilation phase must complete before you can run the program.

00:13:36.190 In Ruby, neither of those assumptions underlying this definition are true. Ruby uses Ruby to build up the underlying structure of classes and methods that we invoke and treats classes and methods as just other kinds of objects. In Ruby, the best academic definition of metaprogramming I can come up with is: code that modifies the thing we would call program structure if we were using another language, just like it modifies the data passing through the program.

00:14:03.779 This definition is much more compatible with the pragmatic definition that all of us use when we observe Ruby magic. That’s because, at its core, the pragmatic definition isn't about full-text search. Instead, it’s about feelings—our feelings in the face of code that resists analysis via full-text search. And these feelings are, at least when I have them, approximately, 'Oh my god, I don't know what’s going on! How am I ever going to debug this? Oh god!'

00:14:49.470 These feelings are also why people from strongly typed language backgrounds often find Ruby a bit tricky initially. When you’re accustomed to having a compiler warn you about certain types of bugs, it’s intimidating to program in a language that doesn't provide the same assistance. These feelings stem from not having the same tools available to us from other languages.

00:15:35.720 This reality holds true even if Ruby was the first programming language you were taught, or even if Ruby is the only language you know. That’s because all of us were ultimately educated within a paradigm that believes program structure and program data are different. This paradigm exists because it was true in languages like Pascal and Java, the traditional first languages that most individuals learned.

00:16:03.930 Now, two cases arise from this teaching. The first would be victim-blaming, implying that real programmers—whoever they are—don’t need tools to understand what their program is doing. I’m not about that. I believe we still need tools, but they should be Ruby tools. Additionally, these tools should be applicable to Ruby codebases in their original forms.

00:16:43.909 I've worked on many decades-old codebases, allowing you to date any piece of code by the style in which it was written. Our community’s opinions on how to write Ruby continue to evolve, and if my toolset only works with a specific dialect of Ruby that I’ve defined as 'good Ruby,' it becomes restrictive as a developer. It would mean I could only work within an arbitrary, personal notion of 'good code' and lock myself to a particular point in time.

00:17:12.720 The correct tool, therefore, is to apply more Ruby to the problem—essentially countering other people's metaprogramming with metaprogramming of our own. I mean this both in the vague sense of Ruby magic, where code violates our expectations about static program structure, and in the stricter sense of code that treats structural elements like current stack traces the same way it treats data and functions.

00:17:45.340 If we can manipulate structure like we manipulate data, we can assign the structure to a variable and print it out during debugging. That’s cool. One of the most powerful ideas in computing is letting go of the desire to predict the computer's behavior and instead simply asking the computer what's happening and observing the results. So now we’re going to look at three basic ways we can leverage Ruby’s dynamism to help us understand what's going on in our programs at runtime.

00:18:23.440 The first of these is reflection, which is computer science lingo for asking a program about its structure. The second is using dynamic language features to record what our programs do and when. The third is using those same features to help us more efficiently review what has transpired. The most basic way we can use reflection in Ruby is by calling 'self' to understand what the current object is, and then we can proceed from there.

00:19:13.550 We can ask ourselves what class an instance is of. In this case, a 'wool yarn' instance turns out to be of class 'WoolYarn.' This information isn’t necessarily very interesting by itself, but it does allow us to move on to some much more interesting places. For example, once we have a class, we can look up its ancestor tree. Once we have an ancestor tree, we can see which of the class's ancestors define a given method.

00:19:58.090 Furthermore, we can print out all of the possible method definitions. For example, we can learn that a subclass 'WoolYarn' completely overrides the definition from the fiber class, which is beneficial because that definition would have raised an error. Like I said, some bugs have occurred in my animations, but it's fine. Now, I’m only scratching the surface here. There are countless aspects to explore about any given object or binding in Ruby, and I encourage you to open up IRB or a debugger and start experimenting.

00:20:33.820 Call '.methods' on various objects, which tells you what methods are available, and then call some of those seemingly metaprogramming methods to see what happens. After a while, however, you will likely get tired of typing 'Yarn.methods' and other similar commands to discover what ‘normal-looking’ methods are available. In these moments, you may feel tempted to either locate a gem or to build one. Thankfully, Pry exists and does a lot of work for you.

00:21:25.590 Using Pry can be quite beneficial. Rather than instinctively searching for gem solutions, I suggest storing one-off methods you use for analysis in an IRB or Pry configuration file. This way, they’ll be loaded anytime you’re in a relevant context. The ultimate goal of using metaprogramming techniques to explore your code is to expand your understanding of Ruby’s inner workings. You can import a library for making HTTP connections more efficiently and successfully utilize it without deeply understanding the intricacies of the HTTP protocol.

00:22:08.390 The goal in using most gems is to shield you from needing to understand specific details. And truthfully, I love that, but if you're shielding yourself from understanding these essential components, it becomes quite difficult to learn about them. You cannot simply type 'require' and suddenly grasp Ruby's object model.

00:22:44.310 So anyway, moment of philosophy over. Let’s discuss recording. Recording is about understanding your code over time—it involves taking snapshots of what has happened so you can grasp later behavior better. There are two primary methods for inserting recording logic. The first way is to use 'prepend' to attach additional logic into a class, and the second is to employ the TracePoint API.

00:23:12.670 TracePoint is a vast topic that could easily consume an entire talk on its own, so I’m just going to provide a basic example. TracePoint organizes itself around events, with one of the most useful ones being when a line of code just got executed. Being able to trace this is incredibly powerful. For instance, in this code, we insert a simple if statement that compares a variable value to a tracking variable already storing its previous value. If that variable changes, we output where it changed.

00:23:41.120 Now, as important as how we insert that tracking logic is where we store the telemetry results. The simplest form of this is by using a variable, whether it’s an instance variable or a global variable. Yes, by the way, Ruby does have global variables, although we seldom see them in production code, as they're typically a poor idea. However, since this is debugging code, who cares?

00:24:09.710 You can also store information into a logger, which becomes incredibly powerful if it’s a smart logger that formats output into a searchable or filterable format. One question I’ve often faced is where to place code like this, as it doesn’t neatly fit into typical project structures. My most honest answer is: do what works for you. Personally, I either create a file in the initializer directory of a Rails project I call 'debugging.rb' or make something similar in a test support directory.

00:24:48.850 I never check these files into version control, and often I delete them after solving the problem. If a similar technique works for you, great! If something else works better, then go with that. I don't believe there's a right or wrong way to do this. Finally, let’s discuss reviewing the items we record. There are two primary methods for doing this: one is inserting a debugger breakpoint and the other is adding a print statement.

00:25:31.390 Many of you might be groaning internally right now because you are already doing these things. However, Ruby's dynamic nature means you can implement these actions conditionally, which is pretty cool and efficient. How many people here have inserted a debug statement in their code only to immediately remove it when it turned out that code was called far more often than they expected?

00:26:06.360 I see a lot of hands—there’s laughter! In other languages, you insert debugger breakpoints in your IDE, and they always trigger regardless of the context. But in Ruby, calling 'binding.pry' or invoking 'debugger' is just calling a method. Thus, we can set debug breakpoints if and only if a certain flag is true or similarly add print statements at sites of errors.

00:26:34.980 So let’s now contextualize these actions regarding real problems, like the bug I mentioned at the start of this talk. It’s important to remember that our goal isn't to understand everything our program is doing; it’s to fix bugs. This means we should uncover the root cause of the bug as bugs typically arise when our assumptions about what is happening deviate from the reality of what is actually occurring—essentially encountering a conflict. To isolate the cause of a bug, we need to systematically identify what we believe is true or what might be true and then prove or disprove them.

00:27:44.810 In this context, investigative metaprogramming serves as a technique for widening the set of questions we can answer in a reasonable timeframe. Here, the question we're trying to answer is: Did the hash freeze correctly or did it freeze improperly? To address that, we need to figure out when the hash froze. A naive way to do this would be to simply print out a message whenever the hash freezes, which we definitely tried first. However, ‘Rails’ traces hashes a lot, resulting in a huge volume of print statements. Sorting through that haystack looking for a specific error needle would take considerable time, and we had a reviewing problem.

00:28:41.930 In other words, we were not filtering enough while reviewing the data. To resolve this issue, it can be helpful to specify vague aspects in the questions we ask. Instead of asking, 'when is a hash freezing?', we should properly frame it as, 'when is the specific hash that is throwing the error freezing?' Therefore, the correct approach to reviewing this information is to implement a solution where we can observe the error site. While you're unsure about what to do next, that's fine, because we can think about recording problems through actions we intend to document.

00:29:32.360 In this case, our interest lies in when the hash is freezing, highlighting the fact that we should instrument the 'freeze' method. The question now becomes: with what should we instrument that? You guessed it—this involves reflection, where we consider the data we aim to collect.

00:29:46.760 In this case, the most useful piece of information would be the stack trace. By using this technique, we can pinpoint the exact location where an association was unexpectedly deleted, providing potential insights into why it shouldn't have been. We might not have solved the root cause, but we effectively detached a single piece of code from a troublesome old subsystem, allowing us to unblock a Rails upgrade. I think that's a significant accomplishment for two hours of work.

00:30:19.780 I emphasize the time factor because duration often represents our feelings. When programming tasks consume more time than anticipated, it triggers feelings of frustration, defeat, or anxiety. We then must endure these feelings throughout our challenging programming tasks. Hence, there’s been a backlash against free metaprogramming, monkey-patching, and all the elements that make Ruby unique compared to a conventional interpreter like Java—something we don’t want.

00:30:54.250 When I gave an early draft of this talk at work, my boss later pointed out that my underlying thesis was that in other languages, you fix problems by establishing libraries. In Ruby, we address issues by creating metaprograms. Much of the contemporary developer discourse focuses on prevention and protecting ourselves from our own mistakes, but there are two ways of executing that.

00:31:25.010 One way is by making frameworks so restrictive that we cannot possibly mess up, leading to the writing of boilerplate. This ultimately reflects an impulse to control not only ourselves but others. It transforms ourselves and our teammates into mere cogs in a machine. Ruby's greatest strength is that it allows us to rescue ourselves in contrast to that reality: we can save ourselves from anxiety and the influence of other people's metaprogramming by empowering ourselves, and I find that narrative much more appealing.

00:32:05.670 So, I'm Betsy. I currently work as an advisor. I used to be there full time but recently took a full-time position at the Long-Term Stock Exchange. I'm an advisor at Cohere, a consulting company dedicated to making you a better programmer. If you want a two-hour staring session with one of my co-founders, similar to the one I had while solving the bug I discussed, email them at weekohere.com, and I’m sure they’d be thrilled to arrange a time to work with you.

00:32:51.198 If you're less commercially oriented, we have a newsletter that features insights from my co-founder, Jennifer, on engineering, engineering management, and how to become a better person or teammate. This is not shameless self-promotion; Jennifer is amazing and you should read what she writes. Recently Jennifer, Marlena Compton, and I co-authored a series called 'Let's Pair,' which you can purchase at letspair.guide. It's about being a pair programmer in a more friendly context, even when pairing across skill or experience gaps. Thank you!