Messenger: The (Complete) Story of Method Lookup

by Jay McGavren

In the talk titled 'Messenger: The (Complete) Story of Method Lookup' presented by Jay McGavren at RubyConf 2015, the complexities of Ruby's method lookup process are explored. The discussion is divided into two main parts: methods on classes and methods within modules, emphasizing how Ruby looks for methods during the execution of code.

Key Points Discussed:

  • Method Lookup Basics:

    • Understanding where Ruby searches for methods when they are called on an object, starting with the object's singleton class, then proceeding to the class, and finally the superclass if necessary.
    • The ancestors method is used to observe the order in which Ruby will look for methods and how the inherited method lookup process works.
  • Singleton Methods:

    • Singleton methods defined specifically on a single object can be used for testing purposes, replacing functionality temporarily to create predictable outputs in tests.
  • Method Overriding:

    • If a method is defined in both a subclass and its superclass, the subclass's method will override the superclass's method. Use of the super keyword allows for invoking the overridden method from the superclass.
  • Class Methods and Top-Level Methods:

    • Class methods behave as singleton methods on class objects, and top-level methods are private instance methods defined on Object, accessible from any class.
  • Modules as Method Containers:

    • Modules can be mixed into classes, and the method lookup works similarly to inheritance, either using include or prepend to influence the method resolution order.
  • Refinements:

    • Introduced in Ruby 2.0, refinements provide a way to safely alter methods for specific contexts without affecting global behavior, as demonstrated by changing the capitalize method in a localized scope.

Conclusion and Takeaways:

  • Ruby's method lookup process can be intricate but follows a systematic approach, starting with the singleton class of an object, proceeding through the class hierarchy, and accounting for methods defined in modules.
  • Understanding method overriding, inclusion, and the use of refinements significantly improves code organization and allows for customizable behavior without widespread impacts on existing classes.
  • Utilizing the ancestors method can aid developers in understanding how Ruby resolves method calls, which is crucial for effective coding practices in Ruby.
00:00:15.320 All right, I want to make time for questions, so I'll move at a brisk pace. If I'm overwhelming you, just raise a hand, and I'll slow it down a little bit. So, how y'all doing? I'm Jay McGavren, and I wrote this thing, and this is Messenger: The Complete Story of Method Lookup in Ruby.
00:00:24.320 This talk's going to take place in two major parts. The first part is all about methods on classes, including some places you might not expect them to be. The second part is all about modules.
00:00:33.719 For the classes part, we're going to start in a slightly unusual place: singleton methods. These are methods defined on only a single object. If you've ever stubbed methods out for a test, you've used singleton methods. For example, we have a WebSpider class that sends out an actual HTTP GET request across the network and returns whatever HTML it gets back from the remote server.
00:00:48.399 You probably don't want that in your test, though, so in this test right here, you can see it creates a new WebSpider object. It calls `get`, sending an actual network request. You don't want your test to have to wait for that, and you don't want whatever random response you get back from the remote server. You want it quick and predictable. A good alternative for that is writing a singleton method to override it. Here's a quick singleton method: we create a WebSpider object and define a new `get` method that just always returns a static HTML string. It's fast and predictable.
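The stubbing idea described above can be sketched as follows. This is a minimal illustration, not the code from the slides: the body of `WebSpider#get`, its `url` parameter, and the HTML string are assumptions made so the example runs without a network.

```ruby
# Hypothetical stand-in for the WebSpider class from the talk. The real
# class would send an HTTP GET request; this one only raises so that the
# example never touches the network.
class WebSpider
  def get(url)
    raise NotImplementedError, "imagine a real HTTP GET to #{url} here"
  end
end

spider = WebSpider.new

# Define a singleton method: it exists on this one object only and is
# found before WebSpider#get during method lookup.
def spider.get(_url)
  "<html><body>Stubbed page</body></html>"
end

puts spider.get("http://example.com")

# Lists the singleton methods defined on this particular object.
p spider.singleton_methods
```

Other `WebSpider` instances are unaffected and still use the method defined on the class.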
00:01:20.439 Now, let's take a look at how method lookup works with those. To create a singleton method, we need an object for starters. Before we can create an object, we're going to need a class. Here's my class. When we create an instance of my class, behind the scenes, Ruby will create a singleton class specific to that object.
00:01:41.119 You can access any object's singleton class via the `singleton_class` method. So, there's the output you'll see if you call `singleton_class` on our new object. Now, why does Ruby do this? The reason is consistency. The same logic that lets you call methods on an object's class also lets you call methods on its singleton class. When we define singleton methods on an object, they're added to its singleton class, and that class makes the methods available exclusively to that object.
00:01:56.560 So, we define a singleton method here and it'll live on our singleton class. When you call a method on an object, Ruby dispatches a message to that object, looking for that particular method. Ruby's first stop in its search for a method is always going to be the object's singleton class because that's what the object refers to first. As soon as Ruby finds a method with a matching name, it'll invoke it.
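A small sketch of where a singleton method ends up. The CamelCase spelling `MyClass` and the return string are assumptions; the transcript only says "my class".

```ruby
class MyClass
end

object = MyClass.new

# Every object has its own singleton class.
p object.singleton_class

# A singleton method defined on the object...
def object.my_method
  "my_method from the singleton class"
end

# ...is stored on the singleton class, not on MyClass itself.
p object.singleton_class.instance_methods(false)
p MyClass.instance_methods(false)

puts object.my_method
```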
00:02:27.560 Now, what would happen if we got rid of that singleton method and defined a method by the same name on the class instead? If we call `my_method` on the object, Ruby will look on the singleton class first, but there's no method by that name there. Where can the method be found? Each class maintains a pointer to the next class that Ruby should look in for methods.
00:02:45.259 So, the `ancestors` method is your cheat sheet for understanding the places that Ruby is going to look for a given method. You can access that list of classes that Ruby is going to look on via the `ancestors` method. So here, we create an instance of my class, call `singleton_class` on that to get the singleton class, and call the `ancestors` method on that, getting this array in response.
00:03:01.680 Our singleton class is up first, and it's followed by my class. That's the place where Ruby is going to look next when it doesn't find that method on the singleton class. When Ruby fails to find `my_method` on the singleton class, it gets directed to the next class on the chain, my class.
00:03:08.439 It goes there, invokes the method, and we're good to go. Now suppose we were to move `my_method` again to a superclass of my class. If we define `my_method` up there on my superclass and say that my class is a subclass of that, we can create an instance of my class.
00:03:27.799 If we call `ancestors` on that singleton class again, we can see all the places that Ruby will look for `my_method`. It starts, of course, with the singleton class, proceeds to my class as before, and the new addition here is my superclass. We've got our output from the `ancestors` method up there, and you can see it exactly mirrors the places that Ruby looks for the method, starting with the singleton class, moving on to my class, and then to my superclass where it finds `my_method` and invokes it.
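The chain described here can be inspected directly. In this sketch the names `MyClass` and `MySuperclass` are assumed spellings of the classes on the slides, and it relies on `ancestors` listing the singleton class itself, which is the behavior of Ruby 2.1 and later.

```ruby
class MySuperclass
  def my_method
    "my_method from MySuperclass"
  end
end

class MyClass < MySuperclass
end

object = MyClass.new

# The first three stops in the lookup chain: the singleton class,
# then MyClass, then MySuperclass.
chain = object.singleton_class.ancestors
p chain.first(3)

# Ruby walks that chain and finds my_method on MySuperclass.
puts object.my_method
```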
00:04:02.799 Now, what happens if a method appears in more than one place in the lookup chain? For example, if we define it on my superclass and then define it on my class as well, if a subclass method has the same name as a superclass method, it'll override that superclass method. Ruby will simply invoke the first method with a name matching the one it's looking for and stop. The `my_method` on my superclass never gets invoked.
00:04:24.560 But what if you want to call that overridden method? You can just use the `super` keyword within the overriding method. That'll cause Ruby to resume its search on the next class in the chain, so it proceeds to my superclass, finds `my_method` there, and invokes that as well.
00:04:51.599 Okay, slides are all well and good, but I find that doing stuff in a terminal makes things a little bit clearer. Let's play around with some classes from the terminal. Here we've got a superclass and a subclass that derives from it. Whoops, I skipped right past it. I'm sorry, it's a pre-recorded video. You all don't want to watch me type in real-time; trust me.
00:05:07.880 So, we've got a superclass with `my_method` on it that simply prints the class name, and then we've got a subclass where we'll override `my_method` and call `super` as well. We'll create an instance of my class, and then we're going to override `my_method` on the singleton class as well and call `super` there. We'll access the singleton class and print the ancestors list for it and then invoke the method.
00:05:43.560 So, if we drop back to a shell and invoke it, you see our ancestors list starts with the singleton class, proceeds to the subclass, and then the superclass follows that. It exactly reflects the order that the print statements follow when you call `my_method`: the singleton class goes first, `super` invokes `my_method` on my subclass, and then it invokes `my_method` on my superclass.
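A rough reconstruction of the terminal demo, under the assumption that the classes are named `MySuperclass` and `MySubclass`. Instead of printing inside each method as the talk does, each method here returns an array of labels so the call order is easy to see in one line.

```ruby
class MySuperclass
  def my_method
    ["MySuperclass"]
  end
end

class MySubclass < MySuperclass
  # Overrides the superclass method, then continues the lookup with super.
  def my_method
    ["MySubclass"] + super
  end
end

object = MySubclass.new

# Override once more on the singleton class and call super there too.
def object.my_method
  ["singleton class"] + super
end

# The start of the lookup chain...
p object.singleton_class.ancestors.first(3)

# ...matches the order in which the three methods run.
call_order = object.my_method
p call_order
```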
00:06:19.199 Now, what would happen if you were to call a method that doesn't exist? First, Ruby will search the entire ancestors chain looking for the method you call. So, if we call a method named `non_existent` and pass it the arguments one and two, Ruby will proceed through the entire method lookup chain looking for that non-existent method.
00:06:54.760 But when it doesn't find it, it'll call another method named `method_missing`, and the search will start over in the singleton class. It'll proceed through the entire lookup chain, and there's a default version of `method_missing` on the `BasicObject` class, which all Ruby classes inherit from, which raises the exception that you're probably used to seeing: a `NoMethodError`. But you can also override `method_missing` in your own classes.
00:07:36.200 So up here on my superclass, we override `method_missing`, taking three parameters: the name of the method that was invoked and two arguments. So first, Ruby will look for the method name that you called, the non-existent method. It will go right past that `method_missing` definition looking for it, and when it doesn't find it, it'll start back over at the singleton class, looking for a method named `method_missing`. It'll find it there on my superclass and invoke it.
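A minimal sketch of the two passes through the chain. The class names are assumed spellings, and the `*args` splat is a substitution for the two explicit parameters mentioned in the talk, so the method accepts any number of arguments.

```ruby
class MySuperclass
  # Ruby calls this when the first pass through the lookup chain finds
  # no method with the requested name. Production code usually defines
  # respond_to_missing? alongside it; that is omitted here for brevity.
  def method_missing(name, *args)
    "You called #{name} with #{args.inspect}"
  end
end

class MyClass < MySuperclass
end

# First pass: non_existent is not found anywhere in the chain.
# Second pass: method_missing is found on MySuperclass.
puts MyClass.new.non_existent(1, 2)

# Without an override, the default method_missing raises NoMethodError.
begin
  Object.new.non_existent(1, 2)
rescue NoMethodError => error
  puts "Default behavior: #{error.class}"
end
```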
00:08:05.560 As I said, Ruby passes the name of the method that you attempted to call to it as the first argument, plus any arguments you called it with. Okay, let's take a look at class methods. You know class methods—these are methods you can call on the class itself without needing to create an instance of it first.
00:08:43.560 So up here, we define a class method on my class, saying `def self` and the name of the method we want to define instead of just `def` and the method name. Then we can invoke that without creating an instance of my class first. We just say `MyClass.class_method`. There are alternate ways to define class methods — you can use the class constant instead of `self` within the class body or define a class method outside of the class altogether, again using the class constant.
00:09:22.799 Now, let's take a moment and compare that class method declaration where we use the class constant to a singleton method declaration. They look pretty similar, right? The truth is, in Ruby, a class method is just a singleton method on the class; a class is just another object, an instance of the `Class` class. Since the class is an object, it has a singleton class of its own.
00:09:53.160 That singleton class lets us define singleton methods on it. Of course, once you call a class method, Ruby will find it on the singleton class and invoke it. You can confirm for yourself that my class is an instance of class by calling `class` on it, and there you see the `Class` class in response. You can also call `singleton_class` on the class object if you want to take a look at that singleton class.
00:10:36.160 So how does this notation work, where you say `def self` and the class method name within the class body? Within the body of a class, Ruby sets `self` to point to that class, so `def self.class_method` is exactly equivalent to `def MyClass.class_method`, and the result is the same: a singleton method on the class that you can call.
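The three ways of defining a class method can be put side by side. The method names other than `class_method` are made up for this sketch.

```ruby
class MyClass
  # Inside the class body, self is MyClass...
  def self.class_method
    "defined with def self"
  end

  # ...so naming the constant explicitly is equivalent.
  def MyClass.constant_class_method
    "defined with def MyClass"
  end
end

# The same form also works outside the class body.
def MyClass.outside_class_method
  "defined outside the class body"
end

# MyClass is itself an object, an instance of Class.
p MyClass.class

# All three are singleton methods on that object.
p MyClass.singleton_methods.sort

puts MyClass.class_method
```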
00:11:01.760 What about methods defined at the top level within a Ruby source file? You can just declare methods and call them without surrounding them in a class or module, and you can call them immediately after defining them without needing to create instances of any class first. The way it works behind the scenes is pretty simple: top-level methods just get defined as private methods on the `Object` class.
00:12:00.560 Since all other classes inherit from `Object`, you can call methods defined at the top level from any instance method of any class. Here, we create my class, define `my_method` within it, and we could call that method defined at the top level from within that instance method. Ruby will start a search at the singleton class, proceed all the way up to the `Object` class where it finds `my_method` and invokes it.
00:12:45.840 But then how can you call the method at the top level? After all, what we're invoking here is a private method, and we're not within a class. Well, the secret is Ruby just sets `self` to an instance of the `Object` class when you're at the top level, meaning you don't have to define a recipient when you call the method.
00:13:06.200 You can see here that we print out `self`, and we get `main` in response, which, if we check the class of that object, the class is `Object`. At the top level, `self` gets set to an instance of `Object`, and since `self` is the implicit receiver of that method call and it's an instance of `Object`, the whole thing just works.
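A short sketch of the top-level behavior, assuming the code is run as a script file. Interactive shells such as irb treat top-level definitions differently. The method `call_it` is a hypothetical name added for illustration.

```ruby
# Defined at the top level, so it becomes a private method on Object
# when this file is run as a script.
def my_method
  "top-level my_method"
end

class MyClass
  # Any instance method of any class can reach it through Object.
  def call_it
    my_method
  end
end

# At the top level of a script, self is the object that prints as "main".
p self
p self.class

# Reports whether my_method was defined as a private method on Object.
p Object.private_method_defined?(:my_method)

puts MyClass.new.call_it
```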
00:13:43.320 All right, so that's everything for classes. Now let's move on to modules, which work behind the scenes a lot like classes. You can define methods on them, and you can also use a module kind of like a superclass by mixing the module into a class.
00:14:17.160 Doing so will add the module to the ancestors list of the class. If we take a look at the ancestors of my class, you'll see my class comes up first, followed by my module, as if it were a superclass. In fact, internally, Ruby treats the module as if it were a class, which means the method lookup works the same with a mixin as with a superclass.
00:14:43.680 So we create an instance of my class down there and invoke `my_method` on it. When we create that instance, it creates a new singleton class. Ruby starts its search there at the singleton class, moves on to my class because that was the next in the ancestors list, and then moves on to my module where it finds `my_method` and invokes it.
00:15:41.840 Since mixins use the same lookup mechanism as classes, all the same rules apply. For example, method overriding: we can override `my_method` within my class, and Ruby will encounter that first when you invoke the method. The `my_method` on my module never gets invoked. `super` works as well: you can use `super` within the overriding method in my class, and that will invoke `my_method` from my module, again just as if it were a superclass.
00:16:34.400 So you can override a method from a module with a method from a class, but not vice versa, at least if you use `include`. So let's suppose that we had a method on my class that throws an exception, and you want to override it using my module. Unfortunately, if you use `include` in my class, it's not going to work.
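A sketch of overriding a mixed-in method and reaching it again through `super`. `MyModule` and `MyClass` are assumed spellings, and the return strings are invented for the example.

```ruby
module MyModule
  def my_method
    "MyModule#my_method"
  end
end

class MyClass
  include MyModule

  # Found before the module's version, because MyClass precedes
  # MyModule in the ancestors list. super continues the lookup there.
  def my_method
    "MyClass#my_method, then " + super
  end
end

# The class comes first, followed by the included module.
p MyClass.ancestors.first(2)

puts MyClass.new.my_method
```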
00:17:39.200 The reason is that the `include` method adds the module to the ancestors list after the class, meaning that `my_method` on my class overrides the method on the module, and that's what Ruby encounters first. It invokes that and throws an exception. We can confirm all of this if we call the `ancestors` method on my class — you'll see that my class is first in the list, followed by my module.
00:18:25.520 However, if you were to use the `prepend` method instead of `include` to mix the module in, that'll add it to the lookup chain before the class. Let's use `prepend` and call `ancestors` again, and you can see the difference: my module appears first, followed by my class. `prepend` allows the module method to override the class method.
00:19:10.360 So we use `prepend` here, which makes the module appear first in the lookup chain, causing `my_method` on my module to override `my_method` from my class. It finds that first, invokes it, and no exception gets thrown.
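The difference between the two mixin methods can be seen by putting them side by side. `IncludingClass` and `PrependingClass` are hypothetical names; the talk uses a single class and swaps `include` for `prepend`.

```ruby
module MyModule
  def my_method
    "MyModule#my_method"
  end
end

class IncludingClass
  include MyModule   # module lands after the class in the chain

  def my_method
    raise "IncludingClass#my_method was reached"
  end
end

class PrependingClass
  prepend MyModule   # module lands before the class in the chain

  def my_method
    raise "PrependingClass#my_method was reached"
  end
end

p IncludingClass.ancestors.first(2)
p PrependingClass.ancestors.first(2)

# With prepend, the module's method is found first: no exception.
puts PrependingClass.new.my_method

# With include, the class's method still wins and raises.
begin
  IncludingClass.new.my_method
rescue RuntimeError => error
  puts "include did not override: #{error.message}"
end
```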
00:19:58.960 All right, back to the console to play around with modules a little bit. So we've got two modules here: one of which we include into my class, and the other of which we prepend. We're going to define a method named `my_method` that simply prints the module name, we're going to define that same method within the prepended module but we'll use `super` there, and we'll define it on the class as well and use `super` there as well.
00:20:56.720 We'll create a new instance of my class, call `ancestors` on it, and then invoke `my_method`. If we drop back to the shell and run that, you'll see that once again our ancestors list matches the order of output when we call the method: the prepended module comes first, then the class, and finally the included module.
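A rough reconstruction of this console demo. The module names `IncludedModule` and `PrependedModule` are assumptions, and the methods return arrays of labels rather than printing, so the call order shows up as a single value.

```ruby
module IncludedModule
  # Last stop in this example, so it does not call super.
  def my_method
    ["IncludedModule"]
  end
end

module PrependedModule
  def my_method
    ["PrependedModule"] + super
  end
end

class MyClass
  include IncludedModule
  prepend PrependedModule

  def my_method
    ["MyClass"] + super
  end
end

# Prepended module, then the class, then the included module.
p MyClass.ancestors.first(3)

# The call order follows the same sequence.
call_order = MyClass.new.my_method
p call_order
```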
00:21:47.480 Okay, let's wrap up our tour of modules with refinements. The refinements feature was added in Ruby 2.0 and improved in Ruby 2.1. The goal of it, as hopefully you all saw in James Adam's talk earlier, was to make monkey patching safe again. Let's go with an example just to make things a little clearer: let's suppose you want to change the way that the `capitalize` method on string instances works.
00:22:48.420 So, let's say you've got a bunch of movie titles and you want them in title case, where each word is capitalized. Unfortunately, the default version of `capitalize` only capitalizes the first word. You can reopen the string class if you want; this sounds like a good idea, right? And redefine the `capitalize` method — that is, you monkey patch it.
00:23:11.840 And this works fantastically well in your movie title class. You can see that it takes the string 'The Matrix' there, all in lowercase, and capitalizes each word of it. Unfortunately, you may have other portions of your apps, such as this `Sentence` class down here, that's assuming you still have the original implementation of `capitalize`, which only capitalizes the first letter, so it doesn't work so well for sentences.
00:23:56.080 It's situations like this that refinements were created for. You can take your monkey patch from the string class and convert it to a refinement. I won't go into details on all the syntax of refinements; just Google 'Ruby refinements' and you'll find several excellent tutorials.
00:24:11.000 Basically, this is a refinement of the string class. A refinement is basically a module that gets prepended to an existing class, but only within lexical scopes (files, classes, or modules) that explicitly activate it. It's a bit of a mouthful, so let's just see what it does in action.
00:25:12.720 Now that your refinement is defined, you can activate it within the `MovieTitle` class. That'll override the `capitalize` method, but only within `MovieTitle`. Outside of the movie title, in areas where we haven't explicitly activated the refinement, we still get the original version of `capitalize` and only the first letter of our string gets capitalized.
00:25:49.520 So now the version of `capitalize` you get depends on your starting point. When the method is called in a scope where refinements haven't been activated, the `Sentence` class, for example, Ruby proceeds to the `capitalize` method on the string class, the original version, and invokes that. But in a lexical scope where the refinement has been activated, such as `MovieTitle`, Ruby prepends the refinement module to the string class, and that's the version of `capitalize` that Ruby encounters first.
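A sketch of the `capitalize` example, assuming Ruby 2.1 or later. The module name `TitleCase`, the title-casing logic, and the bodies of `MovieTitle` and `Sentence` are all assumptions; the slides are not part of the transcript.

```ruby
# Hypothetical refinement module; the name is not from the talk.
module TitleCase
  refine String do
    # Capitalize every word instead of only the first one.
    def capitalize
      split(" ").map { |word| word[0].upcase + word[1..-1].downcase }.join(" ")
    end
  end
end

class Sentence
  def initialize(text)
    @text = text
  end

  # No refinement is active here: the original String#capitalize runs.
  def to_s
    @text.capitalize
  end
end

class MovieTitle
  using TitleCase   # active only until the end of this class body

  def initialize(text)
    @text = text
  end

  # The refined capitalize is found first inside this scope.
  def to_s
    @text.capitalize
  end
end

puts MovieTitle.new("the matrix").to_s
puts Sentence.new("the matrix has you").to_s
```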
00:26:38.640 So that's the version we'll invoke, and we get each word capitalized. All right, one more time back to the terminal to play around with refinements. So here we've got a class and a refinement of that class, and on my class, we'll define `my_method`, which will simply print the name of that class. We'll then override that method within the refinement, so we print 'my_refinement' and then call `super` because, since a refinement is just a module, `super` works just like you're used to.
00:27:31.720 We'll create a new instance of my class and call `my_method`, and then we'll activate the refinements and call `my_method` again. All works fantastically well. We'll also put in a couple of markers so that we can tell whether the refinements are active or not.
00:28:25.240 Okay, so now we'll drop back to the shell and invoke it, and you can see that without refinements active, we get just my class all by itself. But after we activate the refinement with `using`, we get `my_refinement` first and then `super` invokes `my_method` on my class.
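A rough reconstruction of this last demo, assuming it is run as a script file so that `using` is called at the top level. `MyClass` and `MyRefinement` are assumed spellings, and the methods return arrays of labels instead of printing.

```ruby
class MyClass
  def my_method
    ["MyClass"]
  end
end

module MyRefinement
  refine MyClass do
    # super continues to the refined class's own my_method.
    def my_method
      ["MyRefinement"] + super
    end
  end
end

object = MyClass.new

# Refinement not active yet: only the class's method runs.
before = object.my_method
p before

# Activate the refinement from here to the end of the file.
using MyRefinement

# Same object, same call, but the refined method is now found first.
after = object.my_method
p after
```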
00:28:59.560 All right, just about done! I want to recap everything we've covered real quick. We'll have a few minutes for questions. I want to thank a few folks who have helped with this talk, and then I've got a few resources you can go to to learn more.
00:29:46.320 To recap, you can call the `ancestors` method on any Ruby class to get a list of its ancestors. Every Ruby object has a singleton class that Ruby will look on for methods first. If a class inherits from a superclass, that gets added to the ancestors chain right after the class.
00:30:50.040 If you include a module in the class, that will add the module to the ancestors chain between the class and its ancestors. If you prepend a module, that will add that module to the lookup chain prior to the class, allowing you to override methods.
00:31:26.360 In lexical scopes where a refinement is activated, it gets prepended to the ancestors chain before the class it refines. You can define methods anywhere along this chain that you want. Ruby always starts its search for those methods at the singleton class and then proceeds along the chain looking for those methods.
00:32:23.200 As soon as it finds a method, it'll invoke it. If a method's defined at more than one point along the chain, Ruby will stop when it finds the first occurrence of that method and invoke that. That's how method overriding is implemented.
00:33:09.720 All right, that's all I have. So what questions do you have?