Your Bright Metaprogramming Future: Mistakes You'll Make (and How to Fix Them)

by Betsy Haibel

In the video titled "Your Bright Metaprogramming Future: Mistakes You'll Make (and How to Fix Them)" by Betsy Haibel, the speaker explores the nuances of Ruby metaprogramming, its applications, and common pitfalls. Betsy Haibel provides a structured journey through the stages of learning metaprogramming, namely: being scared and tentative, feeling empowered, experiencing regret, and achieving maturity.

With a mix of personal experience and practical advice, Betsy Haibel's talk equips Ruby developers with the knowledge to navigate the complex world of metaprogramming effectively.

00:00:18.240 Hi everyone, my name is Betsy Haibel, and we are here today to talk about metaprogramming. Let's start off by getting some consensus about what metaprogramming is.

00:00:30.560 There are a lot of different definitions that people use. When I was explaining this talk to my parents, I used one of the more common definitions: code that writes code. I think this definition is both inaccurate and overly broad, to the point of being not just meaningless but also misleading. It can apply to code generators, it can apply to quines, and it can apply to a lot of things that output strings.

00:00:54.160 Also, it makes your parents think you're building Skynet. Then there's the definition of metaprogramming that I think a lot of people instinctively use when they're intuitively thinking about it, which is stuff that's magical and hard to grasp. That definition also describes Perl, so again, we reject it.

00:01:18.119 I think the most useful definition you can have for metaprogramming is code that treats the structure of the program as just another data structure to manipulate using code. What does this mean in the wild? Let's start with the `send` method, which is a pretty basic building block of Ruby, most frequently used in the wild to invoke private methods, with Ruby's privacy model being more of a suggestion than any real binding thing.

00:01:48.079 Under the hood, what the `send` method does is take Ruby's message-passing model of method invocation and make it explicit. Ordinarily, this is hidden behind syntactic sugar. The dot in `object.method` sends the method name, as well as any arguments you include with the method invocation, to the object you're sending the message to.

00:02:05.000 But using `object.send` makes this a little more explicit. `a.bar` and `a.send(:bar)` are semantically equivalent, but when you use `send`, you're making it really clear that what you're actually passing is a method name and some arguments, packed together into a bag of arguments, the first of which just happens to be the method name, which must be a symbol or a string.

00:02:30.440 This first argument has some special status as it directly determines what block of code is called next. It is almost as if an object is just a giant case statement saying, 'Hey, do this if that.' However, all the arguments in the bag of arguments are just ways of passing information to an object about what you want it to do next.
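
A minimal sketch of the equivalence described above. The `Greeter` class and its methods are hypothetical, invented only for illustration; `send` and `public_send` are standard Ruby.

```ruby
# Hypothetical class, used only to illustrate message passing.
class Greeter
  def greet(name)
    "Hello, #{name}!"
  end

  private

  def secret
    "shh"
  end
end

g = Greeter.new

# The dot syntax and `send` deliver the same message to the object.
dot_result  = g.greet("Ruby")
send_result = g.send(:greet, "Ruby")

# The method name is just the first item in the bag of arguments,
# so it can come from a variable, as a symbol or as a string.
message = "greet"
dynamic_result = g.send(message, "Ruby")

# `send` ignores privacy; `public_send` respects it.
private_result = g.send(:secret)
```

When privacy should be respected, `public_send` is the safer default.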

00:03:00.000 So moving on, let's talk about how it can be used and misused in the wild. When I was a very inexperienced coder, literally on the Friday of my first week at my first Ruby job, I discovered `object.send` and what it could do. I had been working with Delayed Job all day to create an asynchronous transactional email. For those of you not familiar with this library, at its core is the method `send_later`, which does pretty much what it sounds like.

00:03:44.280 This method queues a job that invokes the sent method name along with any arguments passed. I was looking at it in the context of a method that looked a little like this: a straightforward Rails observer that looked for attribute changes and either triggered emails or queued the email for sending if appropriate.

00:04:11.280 I didn’t like this code all that much. It got the job done, but I didn’t like how it got it done. I trained in C, Java, and Perl, so I was used to method names being invocable things that were entirely separate from the data that the methods acted upon. Coming from that mindset, I accepted a certain amount of repetition and boilerplate as inevitable. You write 30 getter and setter methods for class attributes, each of them one line long and as formulaic as a Garfield strip.

00:04:55.840 I resented this, but I didn't know how to improve upon it until I looked at the `send_later` method and realized that if there was a `send_later` method, it must be playing off of the name of something called `send`. I started Googling, and about 20 or 30 minutes later, the code looked a little more like this.

00:05:29.760 I identified a pattern expressed by repetition and captured that pattern with code, namely by invoking methods dynamically using `send` rather than statically with copy and paste. This was just another view of the same code, and it felt a lot more like my expectations of what Ruby should be. If you have issues with animated GIFs, close your eyes for a second. It felt a lot more like this. And you can open your eyes again if you closed them. This is how it's supposed to work.
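
The slides are not part of this transcript, so the following is only a sketch of the kind of change described, under invented names. `NoticeMailer`, its `deliver_*` methods, and `WATCHED_ATTRIBUTES` are hypothetical; the mailer records calls instead of sending anything.

```ruby
# Hypothetical stand-in for a mailer; it records calls instead of sending mail.
class NoticeMailer
  def self.deliveries
    @deliveries ||= []
  end

  def self.deliver_status_changed(record)
    deliveries << [:status_changed, record]
  end

  def self.deliver_owner_changed(record)
    deliveries << [:owner_changed, record]
  end
end

# Before: one copy-pasted branch per watched attribute.
def notify_statically(record, changed)
  NoticeMailer.deliver_status_changed(record) if changed.include?(:status)
  NoticeMailer.deliver_owner_changed(record)  if changed.include?(:owner)
end

# After: the repetition is captured by building the method name and sending it.
WATCHED_ATTRIBUTES = [:status, :owner].freeze

def notify_dynamically(record, changed)
  (WATCHED_ATTRIBUTES & changed).each do |attribute|
    NoticeMailer.send("deliver_#{attribute}_changed", record)
  end
end
```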

00:06:30.960 I came from a background where I couldn't do this thing, and then I learned more of Ruby, and I could, and it was awesome. I felt really great because I had just done something impressive at 4:30 PM on the Friday of my first Ruby job. It was a contract because I had no experience, so they were all, 'Prove yourself.' So, I felt great because I had done something substantial for my job security.

00:07:05.680 I also learned the wrong lesson. The next code I'm about to show you is an amalgam of code that I wrote back then and code I've seen other people write. All names have been changed to protect the guilty. So we have some code that searches within a set of models and then sends their data onto the formatter.

00:07:31.680 As I mentioned earlier, a lot of the time, when you're using dynamic method calls, you're using them to express a repetition that had previously been expressed using copy and paste with some variable names swapped out. This is definitely what's going on here. Reflexively reaching for dynamic method calls to fix this problem presents three traps, however.

00:08:34.640 The first is that it breaks searchability. Ordinarily, when I'm skimming through new code, I search for method names to see where they are defined and invoked. But I can’t find where `format whatever results for some user type` is defined because I don't know what method name is being sent.

00:09:40.120 Similarly, if I'm looking at the definition for `format pants results for anonymous users`, I'm not going to see that it’s being called here, so I won't get an accurate understanding of the context in which the method is used. The second issue is that dynamic method calls like this introduce cyclomatic complexity.

00:10:16.480 Cyclomatic complexity refers to the number of paths that data can take through your code, and high cyclomatic complexity is problematic because it means you need to keep track of more potential outcomes for how your code will behave. Conditional statements like `if` and `case` statements increase this, and if you’re invoking methods dynamically, since there’s a limited number of methods you can actually be invoking, you’re essentially introducing a gigantic case statement into your code.

00:11:13.120 Hiding this implicit conditional complexity under the guise of dynamic method invocation doesn’t eliminate the cyclomatic complexity; it merely obscures it. The third thing is that using a dynamic method call to reduce repetition at the call site often just masks some far uglier repetition that's taking place elsewhere. This is also what's happening here.
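
To make the first two traps concrete, here is a sketch with hypothetical names that are loosely modeled on the method names mentioned in the talk. It shows a dynamic call site next to the conditional it implicitly contains.

```ruby
# Hypothetical formatter; only the naming pattern comes from the talk.
module Formatter
  def self.format_sweater_results_for_user(results)
    results.map { |r| "sweater: #{r}" }
  end

  def self.format_hat_results_for_user(results)
    results.map { |r| "hat: #{r}" }
  end
end

# The dynamic call site: searching the code base for
# "format_hat_results_for_user" will never find this line.
def format_dynamically(garment, user_type, results)
  Formatter.send("format_#{garment}_results_for_#{user_type}", results)
end

# The conditional it hides: one branch per method the string could name.
def format_explicitly(garment, user_type, results)
  case [garment, user_type]
  when [:sweater, :user] then Formatter.format_sweater_results_for_user(results)
  when [:hat, :user]     then Formatter.format_hat_results_for_user(results)
  else raise ArgumentError, "no formatter for #{garment}/#{user_type}"
  end
end
```

Both functions have the same branches; the dynamic version just does not show them.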

00:12:06.080 So let’s fix it. I want to give a quick warning that this is a refactoring where things are going to get a little uglier before they look nicer, so bear with me. The first thing we need to do is change the method signature a bit. When you do a dynamic method call, you’re essentially cheating message passing by sending multiple pieces of data with the method name argument.

00:12:50.440 So instead, we can change to a method that explicitly lists the garment and user type as arguments, promoting them to first-class arguments. The quickest way to accomplish this is to switch it up in the call site and move the `send` call into the formatter, reducing the number of lines of code we need to change at once.

00:13:37.679 While this seems like a trivial refactoring, it is not, because moving the dynamic method call into the class puts it right next to the methods it can invoke, which makes it easier to understand. The second step is to recognize the repetition of results arguments. We have many method calls that are all being passed results, and we can eliminate this redundancy by turning the formatter into a real class instead of just a collection of class methods.

00:14:35.880 Next, we should look for more repetition. There is some low-hanging fruit here, as both the `format_sweater_results_for` methods share a lot of the returned data. So, we can move that out into a `sweater_attributes` method and ask the `sweater` class for that data rather than asking it to return the data right away.

00:15:37.680 This change will make the somewhat ridiculous indirection we have here a bit more obvious; we’re asking the `sweater` class what data we want from it instead of just asking the `sweater` object to return data in the first place. So we switch that up, and we do the same on `hat`, just for good measure.

00:16:20.480 We also update the formatter to reflect this. The formatter class looks much smaller and prettier now. Since it’s smaller, we can also see other repetitions that were harder to see before, which is one of the nice things about incremental refactoring; you always discover something new.

00:16:57.440 For example, both the `format_whatever_results_for_user` methods are identical. By using a dynamic method call earlier in the chain, we were obscuring some repetition, which means it wasn’t really necessary.

00:17:22.120 Thus, we can collapse those methods into just `format_for_user` and `format_for_admin`, completely eliminating the `garment` argument as it is superfluous. Again, we couldn’t see before that it was unnecessary, but it is now that we’ve refined our code.
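
A sketch of roughly where the refactoring steps above could end up. Everything here is hypothetical (`ResultsFormatter`, `public_attributes`, and the `Sweater` and `Hat` structs are illustrative stand-ins for the slide code, which is not in the transcript).

```ruby
# Hypothetical garment classes; each one knows which of its own data to expose.
Sweater = Struct.new(:name, :size, :cost) do
  def public_attributes
    { name: name, size: size }
  end
end

Hat = Struct.new(:name, :cost) do
  def public_attributes
    { name: name }
  end
end

class ResultsFormatter
  def initialize(results)
    @results = results # held once instead of being passed to every method
  end

  # The one remaining dynamic call, kept next to the methods it can reach.
  def format_for(user_type)
    public_send("format_for_#{user_type}")
  end

  # No garment argument: each result is asked for its own data.
  def format_for_user
    @results.map(&:public_attributes)
  end

  def format_for_admin
    @results.map(&:to_h)
  end
end
```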

00:18:15.840 If we want, we can also get rid of that last dynamic method call for `user_type` by instead moving to a polymorphic approach with formatters. Here, we’re using `const_get` to pick out the right formatter. If you dislike `const_get` or dynamic calls in general, you can achieve a similar effect by using a hash to store explicit class names.

00:19:07.760 In all honesty, I think that’s overkill. We’re not fully escaping from dynamism either in the const_get version or even the hash version; we’re merely determining which class to instantiate dynamically.

00:19:49.200 However, the dynamism is contained within a single file—more importantly, within a single family of classes—and this means that its impact on the cyclomatic complexity of the application is reduced to that family of classes rather than being spread throughout the whole application, which eliminates the need for you to consider it all the time.
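
Both variants can be sketched as follows. The `Formatters` module, its classes, and the `REGISTRY` constant are hypothetical names; `Module#const_get` is standard Ruby, and passing `false` as its second argument is a choice made here to keep the lookup inside the module.

```ruby
module Formatters
  class User
    def initialize(results)
      @results = results
    end

    def format
      @results.map { |r| r[:name] }
    end
  end

  class Admin
    def initialize(results)
      @results = results
    end

    def format
      @results
    end
  end

  # Dynamic version: the class is looked up by name at run time.
  # `false` stops the lookup from falling back to top-level constants.
  def self.for(user_type, results)
    const_get(user_type.to_s.capitalize, false).new(results)
  end

  # Explicit version: a hash lists every class that can be chosen.
  REGISTRY = { user: User, admin: Admin }.freeze

  def self.explicit_for(user_type, results)
    REGISTRY.fetch(user_type).new(results)
  end
end
```

Either way, the dynamism stays inside one family of classes, which is the containment the talk describes.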

00:20:38.760 Similarly, it greatly minimizes its impact on searchability since you no longer need to search for these methods except when you’re in the same file where you can just observe them. You may isolate the negative consequences of the dynamism while still enjoying its benefits. When I discuss when you shouldn't metaprogram in this talk, I don't want to lose track of the fact that Ruby's capacity for dynamism is one of the best things about the language.

00:21:40.160 I mean, look at how concise and pretty this nice little `const_get` call is. You'd need to write many more lines to achieve the same outcome in a less dynamic language, which is why I had that gigantic case tree for the mailer code earlier. It’s not only faster and easier to understand, but it's also just more delightful.

00:22:31.440 Next up, let's discuss `method_missing` and brace for a little less delight. So, let's look at these methods again. Like most of ActiveRecord, they are defined using `method_missing`, which allows us to dynamically respond to arbitrary method names. After all, a method name is just part of the bag of arguments sent to an object.

00:23:11.440 To explain how `method_missing` works, we first need to examine Ruby's default lookup chain. When you send a bag of arguments to an object, Ruby first looks at the object itself, or rather, technically the object's eigenclass, to see if any singleton methods matching the sent method name have been defined. Then, it checks the object's class, followed by the object's inheritance chain, including all the modules included or extended in this lookup chain.

00:24:03.440 If it finds nothing, it moves to the superclass and continues this process until it reaches the most basic ancestor, `BasicObject`. If it still finds nothing, it gives up on the original name and calls `method_missing` instead, whose default implementation raises a `NoMethodError`. So in the wild, `method_missing` functions somewhat like this.
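
The lookup order can be inspected directly with `Module#ancestors`. The classes below are hypothetical and exist only to show the order in which Ruby searches.

```ruby
module Purring
  def speak
    "purr"
  end
end

class Animal
  def speak
    "..."
  end
end

class Cat < Animal
  include Purring
end

# The lookup order for instances of Cat: the class, then its included
# modules, then each superclass in turn.
chain = Cat.ancestors

felix = Cat.new
before = felix.speak # found in Purring, ahead of Animal

# A singleton method lives on the object's eigenclass and is checked first.
def felix.speak
  "meow"
end
after = felix.speak

# Nothing in the chain answers :bark, so the default method_missing raises.
error = begin
  felix.bark
rescue NoMethodError => e
  e
end
```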

00:24:35.840 This is an oversimplification of how ActiveRecord uses it; however, it conveys the general structure. `method_missing` receives the bag of arguments that you've sent to the object and decides what to do based on them. This oversimplified version resembles many beginner implementations of `method_missing`, including my own, which represents a common mistake.

00:25:11.200 The first error is that it’s not including a call to `super`. If you don't include a fallback to `super` in each `method_missing` definition, the lookup chain will stop at the first `method_missing` it encounters, possibly bypassing something useful further up the chain. The second is the absence of a parallel definition of `respond_to_missing?`. If you send this method name to `respond_to?`, it will return false because no method has been defined with this name on the object.

00:26:29.760 `respond_to_missing?`, when defined correctly, allows you to expand the notion of what method names an object responds to, and it's necessary whenever you're using `method_missing`. Failing to do so means you're likely lying to anyone who interacts with your object about its capabilities.

00:27:06.440 Now, you may see that with the improvements, it doesn’t mislead any longer. The third mistake involves not defining the method being called in the `method_missing` implementation. Method lookup via `method_missing` is slow if you use it multiple times since you must traverse the entire lookup chain each time.

00:27:47.800 For methods defined this way that you end up calling frequently, it's good practice to cache the method, or memoize it, by explicitly defining it using `define_method` within `method_missing`.
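
A minimal sketch that addresses all three mistakes at once. The `Catalog` class and its `find_by_<attribute>` naming are hypothetical and much simpler than what ActiveRecord does.

```ruby
# Hypothetical class that answers `find_by_<attribute>` for known attributes.
class Catalog
  ATTRIBUTES = [:name, :color].freeze

  def initialize(rows)
    @rows = rows
  end

  def method_missing(method_name, *args, &block)
    attribute = finder_attribute(method_name)
    # 1. Fall back to super so the rest of the chain is still consulted.
    return super unless attribute

    # 3. Define a real method so later calls skip method_missing entirely.
    self.class.send(:define_method, method_name) do |value|
      @rows.find { |row| row[attribute] == value }
    end
    public_send(method_name, *args, &block)
  end

  # 2. Keep respond_to? honest about the dynamic methods.
  def respond_to_missing?(method_name, include_private = false)
    !finder_attribute(method_name).nil? || super
  end

  private

  # Returns the attribute named by a finder method, or nil.
  def finder_attribute(method_name)
    match = /\Afind_by_(\w+)\z/.match(method_name.to_s)
    attribute = match && match[1].to_sym
    attribute if ATTRIBUTES.include?(attribute)
  end
end
```

After the first call to `find_by_name`, the method exists on the class and ordinary lookup finds it.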

00:28:32.400 A subtler mistake you can make regarding `method_missing` is this. Imagine you have a mixin that depends on breed codes being defined on an object. For example, we're running a kennel with various breeds of cats. If one of your objects explicitly defines breed codes while another relies on ActiveRecord's `method_missing`, calling `breed_codes` on a cat yields an issue.

00:29:43.039 What happens? Since `breed_codes` is only provided by the superclass's `method_missing`, the lookup chain starts with the eigenclass, then the class and its mixin, where it finds the mixin's explicit definition, which raises a `NotImplementedError`. This explicit definition comes before the implicit one and takes precedence, which creates confusion.

00:30:34.720 To avoid this, consider using `super`. It doesn't matter whether the method is defined explicitly or through `method_missing` higher in the lookup chain; `super` reaches it either way. Alternatively, you can opt for a more explicit route that stays within the mixin, at the cost of the debugging convenience of the `NotImplementedError`.
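
A sketch of the trap and of the `super` fix. All names are hypothetical: `Record` stands in for a superclass that answers `breed_codes` through `method_missing`, and the two mixins differ only in how their placeholder behaves. Rescuing `NoMethodError` here is one possible way to keep the helpful error, not necessarily the approach shown in the talk.

```ruby
# Hypothetical stand-in for a superclass that answers via method_missing.
class Record
  def method_missing(method_name, *args, &block)
    method_name == :breed_codes ? %w[SIA PER] : super
  end

  def respond_to_missing?(method_name, include_private = false)
    method_name == :breed_codes || super
  end
end

# The trap: the mixin's explicit placeholder is found before method_missing.
module BreedReport
  def breed_codes
    raise NotImplementedError, "define breed_codes"
  end

  def report
    breed_codes.join(", ")
  end
end

# One fix: try the rest of the chain (method_missing included) first.
module PoliteBreedReport
  def breed_codes
    super
  rescue NoMethodError
    raise NotImplementedError, "define breed_codes"
  end

  def report
    breed_codes.join(", ")
  end
end

class ShadowedCat < Record
  include BreedReport
end

class Cat < Record
  include PoliteBreedReport
end

class Dog
  include PoliteBreedReport
end
```

`ShadowedCat.new.report` raises, `Cat.new.report` reaches the implicit definition, and `Dog.new.report` still gets the descriptive error because nothing further up the chain answers.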

00:31:10.960 Considering all the work and boilerplate needed, one might argue that it's no longer convenient. However, asking these questions highlights some hidden costs of a `method_missing` driven dynamic interface. It doesn’t always gel well with good object-oriented design in Ruby.

00:31:58.920 Adopting it where you seek easy dynamism makes your code much less extensible later, meaning you should reflect on how much you need `method_missing` after all. As for when to use it, traditional applications include pretty Domain-Specific Languages (DSLs) like Rails' ActiveRecord dynamic finder methods, extensible configuration objects, and delegation.

00:32:46.920 However, I don’t think any of these represent good use cases. Pretty DSLs are often better expressed with option hashes rather than embedding arguments within the method name. You can utilize `OpenStruct` for extensible config objects, and use Ruby's built-in `Delegator` classes for delegation. Both `OpenStruct` and `Delegator` do utilize `method_missing` under the hood, but it's shielded behind a well-maintained interface.

00:33:24.080 You cannot guarantee that this principle will hold true if you implement your own; most of the time, it doesn’t. Thus, how should we manage dynamic method definition when necessary? `define_method` is slightly slow, so if you're working where performance during load time is critical, you might reluctantly resort to copying and pasting.

00:34:13.920 Dynamic method definitions using `define_method` can certainly mask poor class design. Since it offers more flexibility and a clearer syntax than `method_missing`, I generally view it as more favorable; its syntax is straightforward.

00:34:57.760 You just pass in a method name and a block defining the method. Let's explore how it can clean up code. Here we have some repetitive method definitions. `define_method` lets us pull them into an `each` block, both reducing the line count and illuminating the fundamental structure of the code.
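
A minimal sketch of pulling repetitive definitions into an `each` block. Only `photo_url` is named in the talk; the other attachment names and the URL format are invented for illustration.

```ruby
class Cat
  attr_reader :attachments

  def initialize(attachments)
    @attachments = attachments
  end

  # Replaces hand-written photo_url, pedigree_url and vet_record_url methods.
  [:photo, :pedigree, :vet_record].each do |attachment|
    define_method("#{attachment}_url") do
      "/cats/#{attachments.fetch(attachment)}"
    end
  end
end
```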

00:35:43.920 Using dynamism to extract repetition can clarify and highlight that repetition is happening, which is essential. Suppose we subsequently isolate the attachment definition code into its own module for future reuse. Despite the fact that many pictures of cats cannot be seen in this photo, great job to RJS.

00:36:29.840 It's hiding a little in the bag and out of the screen, so it’s very well hidden. This code is not atrocious, but we can enhance its extensibility. If someone using this module wishes to override `photo_url`, their redefinition will reside on the same class—if they want to include our definition, they'll lack access to `super`.

00:37:05.880 Thus, they will either need to copy and paste our definition, leading to a brittle solution that breaks during upgrades, or they might have to rely on a workaround like Rails' `alias_method_chain`. There exists a straightforward way to address this.

00:37:33.960 With dynamic module inclusion, you can define instance methods on a dynamically created module using `Module.new`. This module can be anonymous, in which case you'd include the variable in which you stashed this dynamically created module. Conversely, if you're not keen on anonymous modules, you can store the module in a constant and then include it.

00:38:31.839 This strategy places the module between the object and the rest of the lookup chain, enabling convenient access to `super`, and it makes it easy to extend the method later.
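
A sketch of dynamic module inclusion with an anonymous module. `Attachable` and `has_attachment` are hypothetical names, and the URL format is invented.

```ruby
# Hypothetical macro module; `has_attachment` is an illustrative name.
module Attachable
  def has_attachment(name)
    # Define the method on a fresh anonymous module rather than on the class.
    generated = Module.new do
      define_method("#{name}_url") do
        "/uploads/#{name}/#{id}"
      end
    end
    # Including it places the module just behind the class in the lookup chain.
    include generated
  end
end

class Cat
  extend Attachable
  attr_reader :id

  def initialize(id)
    @id = id
  end

  has_attachment :photo

  # The override lives on the class, so `super` reaches the generated method.
  def photo_url
    "https://cdn.example.test" + super
  end
end
```

To use a named module instead, assign the result of `Module.new` to a constant before including it.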

00:39:14.639 Next up, let's talk about Ruby hook methods. There are two categories of hook methods: the normal ones and the weird ones. The normal ones, upon which you will frequently rely, include things like `inherited`, `included`, and `extended`, which trigger whenever a class inherits from, includes, or extends a class/module.

00:40:36.360 The weird ones, which are less common, include methods such as `method_added`, `method_removed`, `singleton_method_added`, and `method_undefined`. If you want to dive deeply, you’ll discover even more fringe ones. My singular advice about hook methods, both normal and weird: don’t.
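
For readers who have not met these hooks, here is a small sketch of when a few of them fire. The classes and the `EVENTS` log are hypothetical; it illustrates the mechanism and is not a recommendation to use it.

```ruby
EVENTS = []

module Tracked
  # Fires when a class or module includes Tracked.
  def self.included(base)
    EVENTS << [:included, base]
  end

  # Fires when an object (here, a class) extends Tracked.
  def self.extended(base)
    EVENTS << [:extended, base]
  end
end

class Animal
  # Fires when a subclass of Animal is created.
  def self.inherited(subclass)
    super
    EVENTS << [:inherited, subclass]
  end

  # One of the "weird" ones: fires for each instance method defined
  # on Animal or on any of its subclasses.
  def self.method_added(method_name)
    EVENTS << [:method_added, method_name]
  end
end

class Cat < Animal
  include Tracked
  extend Tracked

  def purr; end
end
```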

00:41:21.719 Regarding `self.included`, I recognize that what I’m about to say may be controversial. To illustrate, we’ve seen this pattern adopted in the wild quite commonly; however, I find it problematic. Essentially, it offers an illusion of Ruby supporting multiple inheritance.

00:41:53.840 In truth, we’re distorting the language to compress something that should be a class into a module to inherit from it using Ruby’s mixin functionality. This ends up twisting the language into convoluted forms to accommodate something with both class and instance methods, making it no longer an object.

00:42:35.920 What we should be doing instead is transforming this object into a first-class object and attaching it to a calling class using a simple class-level macro. This is precisely what the `CarrierWave` library implements. That photo example I used earlier is almost directly pulled from the inner workings of CarrierWave.

00:43:33.680 Using this approach allows us to neatly access a bag of functionality while cleverly separating the actual processing mechanics into their distinct objects. The result protects the single responsibility principle and prevents unnecessary bloat on an object’s surface.
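
A sketch of the class-level macro idea in plain Ruby. CarrierWave's own macro is `mount_uploader`; the code below does not reproduce its implementation, and `Attachment`, `HasAttachments`, and `attaches` are hypothetical names.

```ruby
# Hypothetical first-class object that holds the attachment behaviour.
class Attachment
  def initialize(owner, name)
    @owner = owner
    @name = name
  end

  def url
    "/uploads/#{@name}/#{@owner.id}"
  end
end

# Hypothetical macro; it adds a single reader that returns the helper object.
module HasAttachments
  def attaches(name, with: Attachment)
    define_method(name) do
      with.new(self, name)
    end
  end
end

class Cat
  extend HasAttachments
  attr_reader :id

  def initialize(id)
    @id = id
  end

  attaches :photo
end
```

`Cat` gains one method, `photo`, and everything about URLs and processing stays on the `Attachment` object instead of spreading across the model's surface.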

00:44:37.239 So, to summarize, I hope you take away the notion that it's all data to Ruby, so it should be to you as well. However, as you embark on your bright metaprogramming future, do remember your future maintenance coder, who will ultimately be you in about four months.

00:45:06.640 Reflect on whether you can search for the code, extend the code, and debug the code, as you will need to do at minimum the last one. Always assess whether you are using dynamism for appropriate reasons or if you could express the same thing more elegantly using Ruby's language structure.

00:45:54.800 This is me again. I'm Betsy Haibel. You can find me on the internet in several locations. Presently, I work for a wonderful DC-based e-commerce company called Optoro. Just a quick note, they kindly paid for my trip here, and they’re fantastic employers. We're hiring junior developers, so see me or one of my co-workers, some of whom are speaking here as well.

00:46:29.559 If you’re not a junior developer, don’t worry; we can cover all your laptop needs at Blinq.com. Additionally, consider checking out my colleagues’ talks tomorrow: Chris Hoffman is presenting on mathematical coercion in Ruby, Brock Wilcox is doing intriguing work with time travel and debugging, and Josh Schmida will delve into translating algorithms between Haskell and Ruby.

00:47:14.080 My slides are technically posted on GitHub Pages at the URL below, but as you can see, they suffer from some image issues. I will be rectifying that following this talk. All cat pictures, including the ones you cannot entirely see, are credited to my neighbor, Nikki Murray.

00:47:59.440 Does anyone have any questions?

Betsy Haibel

@bhaibel

RubyConf 2014